Erlang on AWS Lambda
Recently I was working on a project that required running Erlang from AWS Lambda. While Lambda officially supports only Java, Node.js, and Python, it does allow packaging arbitrary binaries and executing them from one of the supported languages.
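As a rough illustration of that wrapper approach, a Python handler can shell out to a bundled binary with subprocess. This is only a sketch under stated assumptions: the erlang/bin/erl location inside the deployment package and the HOME override are assumptions made for this example, not details of the build described in this post.

```python
import os
import subprocess

# Assumed location of the bundled Erlang release inside the deployment
# package; Lambda unpacks the Zip file under /var/task.
ERL_PATH = "/var/task/erlang/bin/erl"


def build_erl_command(expression, erl_path=ERL_PATH):
    """Return the argv list that evaluates one Erlang expression and exits."""
    return [erl_path, "-noshell", "-eval", expression, "-s", "init", "stop"]


def handler(event, context):
    """Lambda entry point: run the bundled erl binary and return its output."""
    expression = event.get("expression", 'io:format("hello from erl~n").')
    # Assumption: erl wants HOME to be set, so point it at the writable /tmp.
    env = dict(os.environ, HOME="/tmp")
    completed = subprocess.run(
        build_erl_command(expression),
        env=env,
        capture_output=True,
        text=True,
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Returning stderr alongside stdout matters here, because that is where a failing binary usually explains itself.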
When compiling your binaries for Lambda, AWS specifically recommends compiling from one of the Lambda AMIs specified in the documentation for the Lambda Execution environment.
I fired up an EC2 instance, downloaded the Erlang source, and compiled it.
The Erlang shell ran, so far so good.
Running the binary on Lambda
I packaged everything up into a Zip file, uploaded my build to Lambda, and executed my function.
And it all worked the first time! Just kidding.
Instead I got this cryptic error:
Crash dump is being written to: /var/task/erl_crash.dump…Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}}}…
It’s at this point I would like to mention how terrible debugging AWS Lambda is. Since there is no way to SSH into an active Lambda environment, the only way to inspect it is to deploy versions of your function that run shell commands and report back the results.
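A throwaway debugging function of that kind can be sketched as follows. It is a minimal example, assuming a Python runtime; the handler name, the "commands" event key, and the default commands are all made up for illustration.

```python
import subprocess


def run_command(command, timeout=10):
    """Run one shell command and capture its exit code, stdout and stderr."""
    completed = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {
        "command": command,
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


def handler(event, context):
    """Lambda entry point: run each requested command and return the results."""
    # Hypothetical event shape: {"commands": ["uname -a", "ls -la /var/task"]}
    commands = event.get("commands", ["uname -a", "ls -la /var/task"])
    return [run_command(command) for command in commands]
```

Passing the commands in through the event means one deployed function can be reused for many probes instead of being re-uploaded for each one.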
Moving on…
After some googling of the error failed_to_start_child,net_kernel, it seemed the problem was with epmd, and I should be able to reproduce the issue by running epmd -debug.
Now we’re getting somewhere, so apparently it can’t open a socket?
Grepping through the Erlang source for "error opening stream socket" produced only a single result, even better! The exact line can be found in the Erlang source.
I’ve reproduced the block that produced the error below.
The error is produced by the call to the socket function. Right above that call is a conditional block that enables IPv6 if IPv6 support was detected at compile time.
At this point I took a leap and guessed there was an IPv6 difference between the AWS AMI and the Lambda execution environment. I wrote a short C program to test this assumption by opening an IPv4 and an IPv6 socket and checking whether each call succeeded.
Running this from the AWS AMI
./test-sockets
Checking IPV4
Socket success
Checking IPV6
Socket success
And from AWS Lambda
./test-sockets
Checking IPV4
Socket success
Checking IPV6
Socket fail: 1
So there it is: IPv6 is enabled on the AWS AMI but not in the Lambda container. When I compiled Erlang on the AMI, the build enabled the IPv6 code paths, which then fail when run in the Lambda environment.
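For anyone who wants to repeat the check without compiling anything, the same probe can be sketched in Python and run directly as a Lambda function. This is an illustrative equivalent, not the C program used above; the function names and the shape of the returned dictionary are invented for this example.

```python
import socket


def can_open_socket(family):
    """Try to create a TCP socket of the given address family.

    Returns (True, None) on success or (False, errno) on failure.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM)
    except OSError as exc:
        return False, exc.errno
    sock.close()
    return True, None


def handler(event, context):
    """Lambda entry point: report which address families can create sockets."""
    results = {}
    for name, family in (("ipv4", socket.AF_INET), ("ipv6", socket.AF_INET6)):
        ok, err = can_open_socket(family)
        results[name] = {"ok": ok, "errno": err}
    return results
```

Only socket creation is tested, with no bind or connect, because that is the call that fails in epmd.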
The “Fix”
For now I’ve forced Erlang to compile without IPv6 support and filed an issue with the AWS Lambda team about the inconsistency between the AMI and the Lambda environment.
Since I couldn’t explicitly disable IPv6 via a config option, I opted to modify the source as part of my build script by running the following before running ./configure.
Hopefully this write-up saves other Erlang users a few hours if they attempt to run Erlang on Lambda as well.