Recently I was working on a project that required running Erlang on AWS Lambda. While Lambda officially supports only Java, Node.js, and Python, it does allow packaging arbitrary binaries and executing them from one of the supported languages.

When compiling binaries for Lambda, AWS specifically recommends building on one of the AMIs listed in the documentation for the Lambda execution environment.

I fired up an EC2 instance, downloaded the Erlang source, and compiled it.

$ sudo yum install gcc gcc-c++ glibc-devel make
$ wget http://erlang.org/download/otp_src_18.3.tar.gz
$ tar xvf otp_src_18.3.tar.gz
$ cd otp_src_18.3
$ ./configure
$ make
$ sudo make install
$ erl
Erlang/OTP 18

Eshell V7.3  (abort with ^G)
1> 

The Erlang shell started, so far so good.

Running the binary on Lambda

I packaged everything into a zip file, uploaded my build to Lambda, and executed my function.

And it all worked the first time! Just kidding.

Instead I got this cryptic error.

Crash dump is being written to: /var/task/erl_crash.dump…Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}}}…

It’s at this point I would like to mention how terrible debugging AWS Lambda is. Since there is no way to SSH into an active Lambda environment, the only way to inspect it is to write versions of your function that run shell commands and report the results back.
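As a rough illustration of that workflow, here is a minimal sketch of such a debugging function. It assumes a Python 3 Lambda runtime; the handler name, the run_command helper, and the "commands" key in the event are hypothetical names chosen for illustration.

```python
import subprocess


def run_command(command, timeout=10):
    """Run one shell command and return its exit code and combined output."""
    try:
        completed = subprocess.run(
            command,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into stdout so nothing is lost
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"command": command, "exit_code": None, "output": "timed out"}
    return {
        "command": command,
        "exit_code": completed.returncode,
        "output": completed.stdout.decode("utf-8", errors="replace"),
    }


def handler(event, context):
    """Hypothetical Lambda entry point: run every command listed in the event."""
    commands = event.get("commands", ["uname -a"])
    return [run_command(command) for command in commands]
```

Invoking it with an event such as {"commands": ["epmd -debug"]} would return the command's output in the function result, which is about as close to a shell as Lambda gets.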

Moving on…

After some googling of the error failed_to_start_child,net_kernel, it seemed the problem was with epmd (the Erlang Port Mapper Daemon), and that I should be able to reproduce the issue by running epmd -debug.

$ epmd -debug
epmd: epmd running - daemon = 0
epmd: error opening stream socket: Operation not permitted

Now we’re getting somewhere: apparently it can’t open a socket. Grepping through the Erlang source for error opening stream socket produced only a single result, even better! The exact line can be found in the Erlang source.

I’ve reproduced the block that produced the error below.

if ((listensock[i] = socket(sa->sa_family,SOCK_STREAM,0)) < 0)
  {
    switch (errno) {
        case EAFNOSUPPORT:
        case EPROTONOSUPPORT:
            continue;
        default:
            dbg_perror(g,"error opening stream socket");
            epmd_cleanup_exit(g,1);
    }
  }

The error comes from the call to the socket function. Right above that block is a conditional block that enables IPv6 handling if IPv6 support was detected at compile time.

#if defined(EPMD6)
      size_t salen = (sa->sa_family == AF_INET6 ?
              sizeof(struct sockaddr_in6) :
              sizeof(struct sockaddr_in));
#else

At this point I took a leap and guessed there was an IPv6 difference between the AWS AMI and the Lambda execution environment. I wrote a short C program to test this assumption by opening an IPv4 and an IPv6 socket and checking whether each succeeded.

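#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Report whether socket() succeeded; on failure print the raw errno value. */
void checkSock(int sock) {
  if (sock < 0) {
    printf("Socket fail: %i\n", errno);
  } else {
    printf("Socket success\n");
    close(sock); /* release the descriptor, we only needed to know it opened */
  }
}

/* Try to open one TCP socket for each address family. */
void checkAll() {
  printf("\nChecking IPV4\n");
  int ipv4 = socket(AF_INET,SOCK_STREAM,0);
  checkSock(ipv4);
  printf("\nChecking IPV6\n");
  int ipv6 = socket(AF_INET6,SOCK_STREAM,0);
  checkSock(ipv6);
}

int main() {
  checkAll();
  return 0;
}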

Running this from the AWS AMI

$ ./test-sockets
Checking IPV4
Socket success

Checking IPV6
Socket success

And from AWS Lambda

$ ./test-sockets
Checking IPV4
Socket success

Checking IPV6
Socket fail: 1

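So there it is: IPv6 is enabled on the AWS AMI but not in the Lambda container. Errno 1 is EPERM, the same “Operation not permitted” that epmd reported, and since epmd only skips EAFNOSUPPORT and EPROTONOSUPPORT it treats that failure as fatal. When I compiled Erlang on the AMI, the build enabled the IPv6 code paths, which then error out when run in the Lambda environment.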
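To make the errno value easier to read, the same probe can be sketched in Python, with each failure classified the way the epmd switch quoted earlier treats it. This is only an illustration: classify, probe, and SKIPPED_BY_EPMD are hypothetical names, and the "fatal" label simply mirrors the default branch of that C block.

```python
import errno
import os
import socket

# Errors that the quoted epmd block skips with `continue`.
SKIPPED_BY_EPMD = {errno.EAFNOSUPPORT, errno.EPROTONOSUPPORT}


def classify(error_number):
    """Return (symbolic name, message, how the quoted epmd switch reacts)."""
    name = errno.errorcode.get(error_number, "UNKNOWN")
    action = "skip family" if error_number in SKIPPED_BY_EPMD else "fatal"
    return name, os.strerror(error_number), action


def probe(family):
    """Try to open a TCP socket for one address family and describe the result."""
    try:
        sock = socket.socket(family, socket.SOCK_STREAM)
    except OSError as exc:
        return classify(exc.errno)
    sock.close()
    return ("OK", "socket opened", "continue startup")


if __name__ == "__main__":
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        print(label, probe(family))
```

A failure with EAFNOSUPPORT would have been skipped quietly by epmd; it is the EPERM result that ends up in the default branch and kills the daemon.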

The “Fix”

For now I’ve forced Erlang to compile without IPv6 support and filed an issue with the AWS Lambda team about the inconsistency between the AMI and the Lambda environment.

Since I couldn’t explicitly disable IPv6 via a config option, I opted to modify the source as part of my build script by running the following before running ./configure.

# comments out the line that sets the EPMD6 variable.
sed -i '/#  define EPMD6/c\//#  define EPMD6' ~/otp_src_18.3/erts/epmd/src/epmd_int.h
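One weakness of the sed one-liner is that it silently does nothing if that line ever changes in a later release. Below is a minimal sketch of the same patch in Python that fails loudly instead. It assumes the header still contains the exact line the sed command matches; disable_epmd6 and patch_file are hypothetical helper names.

```python
import re

# The line the sed command targets, spacing copied from that command.
# Unlike the sed pattern, this is anchored to the whole line.
DEFINE_PATTERN = re.compile(r"^#  define EPMD6[ \t]*$", re.MULTILINE)


def disable_epmd6(source):
    """Comment out the EPMD6 define; return (new text, number of lines changed)."""
    return DEFINE_PATTERN.subn("//#  define EPMD6", source)


def patch_file(path):
    """Patch the header in place, aborting if the define was not found once."""
    with open(path) as handle:
        original = handle.read()
    patched, count = disable_epmd6(original)
    if count != 1:
        raise SystemExit("expected exactly one EPMD6 define, found %d" % count)
    with open(path, "w") as handle:
        handle.write(patched)
```

Calling patch_file with the path to erts/epmd/src/epmd_int.h in the unpacked source tree would stand in for the sed step in a build script.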

Hopefully this write-up saves other Erlang users a few hours if they attempt to run it on Lambda as well.