Unified and Elixir
While we were developing an interactive B2B marketplace platform, it became apparent that a scalable order-writing backend would need to handle highly concurrent database reads and writes, given the volume of our customers’ requests. We needed it to update inventory counts as close to real time as possible to keep users from placing orders that couldn’t go through. Other parts of the application were originally written in Ruby on Rails for simple CRUD features and forums. However, it became clear that the sheer audience size for the application would affect both speed and hosting costs. After much deliberation and research, we decided to try the Elixir programming language with the Phoenix framework for this particular project. We then made a strategic decision to port the entire application, not just the order writer, from Rails to Elixir.
Elixir has been gaining popularity over the past couple of years for its powerful pattern matching, fault tolerance, and concurrency. However, these are all things that it gains from being built on top of Erlang, which has been around since the mid-1980s.
So Why Then Should We Choose Elixir Over Erlang?
Well, the purpose of this blog post isn’t to weigh the two against each other, but rather to provide information to readers who may be contemplating a use case. For an excellent writeup on the subject we recommend reading “Why Elixir” by The Erlangelist. His primary reasons boil down to the differences in Elixir’s code organization:
- Metaprogramming capabilities (writing macros that modify the AST at compile-time, allowing you to extend the language to include any control constructs you desire)
- The pipe operator `|>`
- Polymorphism (via protocols)
- Mix for managing every aspect of Elixir OTP applications from creation and compilation to testing and managing dependencies
- Subjectively beautiful, Ruby-inspired syntax (many people, myself included, balk when first encountering Erlang’s Prolog-inspired syntax)
An example of Elixir’s syntax, showcasing the pipe operator being used in parsing a JSON response into a map.
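As a rough illustration of the pipe operator, here is a minimal sketch that turns a JSON string into a map. It assumes Elixir 1.18 or later, which ships a built-in `JSON` module; on older versions a library such as Poison or Jason would typically fill that role. The response body and its field names are made up for the example.

```elixir
# Hypothetical response body; in a real app this would come from an HTTP client.
body = ~s(  {"sku": "ABC-123", "quantity": 4, "debug": true}  )

order =
  body
  |> String.trim()                  # strip surrounding whitespace
  |> JSON.decode!()                 # assumption: Elixir 1.18+ built-in JSON module
  |> Map.take(["sku", "quantity"])  # keep only the fields we care about

IO.inspect(order)
```

Each step receives the result of the previous one as its first argument, so the code reads top to bottom instead of inside out.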
So the real question is, why use Elixir and Phoenix over Ruby on Rails for our backend? Until now, we’ve used Rails for many of our simple CRUD projects because it’s very easy to be productive in. Rails is geared around pragmatism (for instance, it embodies convention over configuration), the Rails open source community is _huge_ so there are lots of nice packages that save development time (e.g. the excellent Devise gem), and the syntax is easy to grok (and, again, is subjectively beautiful). Enter Elixir! Its creator, José Valim, was a Rails core developer who wanted to marry the benefits of Erlang with the productivity of Ruby.
Where Elixir shines for our needs is in writing concurrent applications. Erlang was developed in the 1980s for telephony systems, which handle massive numbers of phone calls at once, necessitating concurrency. Such systems need to handle server failures and dropped connections without the network going down (necessitating distributed computing and fault tolerance). According to the History of Erlang, after computer scientists experimented with using other languages for telecom applications from 1982 to 1985, they concluded that they would need a “very high level symbolic language in order to achieve productivity gains!” Then, after experimenting from 1985 to 1986 with the existing symbolic languages that were sufficiently high level (Lisp, Prolog, Parlog), they determined that the language must contain
“primitives for concurrency and error recovery, and the execution model must not have back-tracking … It must also have a granularity of concurrency such that one asynchronous process is represented by one process in the language.”
Thus, Erlang was born, designed from the start around concurrency, scalability, and fault-tolerance.
The concurrency model of Erlang is built around actors and message passing. The easiest way of looking at this is that each actor is a lightweight process that shares no memory with any other process. Instead, processes must communicate by passing messages to each other, and those messages contain copies of any data that needs to be shared. This is necessary because sharing memory between processes can leave things in an inconsistent state should one of the processes crash. Message passing is slower by design than sharing memory, but it sets the stage for Erlang’s fault tolerance.
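To make the message-passing idea concrete, here is a minimal sketch using only `spawn`, `send`, and `receive` from Elixir’s standard library. The `:reserve` and `:reserved` message shapes and the SKU are invented for illustration; they are not taken from our application.

```elixir
parent = self()

# Spawn a lightweight process that waits for a single message and replies.
child =
  spawn(fn ->
    receive do
      {:reserve, sku, qty, reply_to} ->
        # The reply is copied into the caller's mailbox; no memory is shared.
        send(reply_to, {:reserved, sku, qty})
    end
  end)

send(child, {:reserve, "ABC-123", 2, parent})

result =
  receive do
    {:reserved, sku, qty} -> {sku, qty}
  after
    # Give up after one second instead of blocking forever.
    1_000 -> :timeout
  end

IO.inspect(result)
```

The two processes never touch each other’s data directly; everything they know about one another arrives as a message.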
“Some studies proved that the main sources of downtime in large scale software systems are intermittent or transient bugs (source: http://dslab.epfl.ch/pubs/crashonly.pdf ). Then, there’s a principle that says that errors which corrupt data should cause the faulty part of the system to die as fast as possible in order to avoid propagating errors and bad data to the rest of the system. ” – Learn You Some Erlang
Since every process is lightweight and designed around being able to crash without affecting other processes, errors are handled by letting processes die “as fast as possible”. Processes can then be quickly restarted by supervisors (https://hexdocs.pm/elixir/Supervisor.html), which are processes that exist solely to monitor other processes and automatically restart their child processes when they fail. Supervisors can supervise other supervisors, forming a hierarchical process structure known as a supervision tree.
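Here is a small sketch of a supervisor restarting a crashed child. `InventoryCounter` and `RestartDemo` are hypothetical modules written for this example, and the polling helper exists only so the script can observe the restart; a real application would not normally need it.

```elixir
defmodule InventoryCounter do
  # Hypothetical worker: an Agent holding a stock count.
  use Agent

  def start_link(initial_count) do
    Agent.start_link(fn -> initial_count end, name: __MODULE__)
  end

  def value, do: Agent.get(__MODULE__, & &1)
end

defmodule RestartDemo do
  # Polls until the supervisor has registered a replacement process.
  def await_restart(name, old_pid) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_restart(name, old_pid)
    end
  end
end

{:ok, _supervisor} =
  Supervisor.start_link([{InventoryCounter, 10}], strategy: :one_for_one)

old_pid = Process.whereis(InventoryCounter)

# Simulate a crash by killing the worker outright.
Process.exit(old_pid, :kill)

new_pid = RestartDemo.await_restart(InventoryCounter, old_pid)

# A fresh process has taken over, started again from its initial state.
IO.inspect(new_pid != old_pid)
IO.inspect(InventoryCounter.value())
```

With the `:one_for_one` strategy only the crashed child is restarted; its siblings, if there were any, would be left alone.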
“Packing Erlang or Elixir processes onto cores is easy because they are small and are like packing physical objects. If we want to pack sand in barrels it’s easy. The grains of sand are so small that it’s easy to completely fill the barrels. Packing huge boulders is difficult, they don’t pack well and much space is wasted” – Managing Two Million Web Servers
This concurrency model lends itself very well to distributed computing; since the processes are all lightweight and don’t require shared memory, they can run on separate computers as long as they’re networked to each other and thus able to pass messages. Rather than having to buy better hardware to scale your application, you can just buy more hardware as needed. This makes Erlang inherently highly scalable, as you can just “throw hardware at it”.
“If half of Erlang’s greatness comes from its concurrency and distribution and the other half comes from its error handling capabilities, then the OTP framework is the third half of it.” – Learn You Some Erlang
OTP is a set of core libraries that package proven practices for the error-prone and time-consuming parts of Erlang applications, leaving you to write only the unique parts of your program and sparing you a lot of headaches along the way. To truly understand everything OTP does for you, check out this excellent chapter from Learn You Some Erlang.
While Rails still boasts a *much* larger community (a rough indicator being the 130,000 active Ruby repositories on GitHub as opposed to 1,700 Elixir repositories), I think that the Elixir and Phoenix ecosystem will catch up in due time given the benefits that they bring to the table, not to mention the joy they are to work in.
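As one example of what OTP takes off your hands, here is a minimal `GenServer` sketch that serializes stock reservations so two requests cannot both claim the last units. The `Inventory` module, its `reserve/3` function, and the SKU are hypothetical and are not our production code; `GenServer` itself handles the process loop, message plumbing, and synchronous replies.

```elixir
defmodule Inventory do
  use GenServer

  # Client API

  def start_link(stock), do: GenServer.start_link(__MODULE__, stock)

  def reserve(pid, sku, qty), do: GenServer.call(pid, {:reserve, sku, qty})

  # Server callbacks

  @impl true
  def init(stock), do: {:ok, stock}

  @impl true
  def handle_call({:reserve, sku, qty}, _from, stock) do
    case Map.get(stock, sku, 0) do
      available when available >= qty ->
        # Enough stock: decrement the count and confirm.
        {:reply, :ok, Map.put(stock, sku, available - qty)}

      _ ->
        # Not enough stock: leave the state untouched.
        {:reply, {:error, :insufficient_stock}, stock}
    end
  end
end

{:ok, pid} = Inventory.start_link(%{"ABC-123" => 3})

first = Inventory.reserve(pid, "ABC-123", 2)
second = Inventory.reserve(pid, "ABC-123", 2)

IO.inspect({first, second})
```

Because a single process owns the stock map and handles one call at a time, the second reservation sees the updated count and is rejected.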
Of course, the language choice was only the first decision to make; we also needed to decide on a DBMS (database management system). We considered using a NoSQL store such as Couchbase, Redis, or Cassandra because of their reputation for speed. After researching the options, including what other companies use at scale, we tentatively decided on PostgreSQL with a master/slave read/write cluster. Setting this up on Amazon RDS is straightforward, and it can be easily configured to scale automatically by creating new read replicas to keep up with demand.
The first thing we did after deciding on Elixir and PostgreSQL was to run some benchmarks. We’ve used Postgres a lot in the past, so we were already familiar with it and in love with its stability and performance. We found that it would easily suit our needs, and Elixir blew us away with its speed as well as how fun it is to code in!
(The above benchmarks are specific to the application. You can read more here.)
We will publish our own benchmarks in Part 2 of this series; in the meantime, others have published theirs. The Phoenix Showdown benchmarks show a 1300% increase in throughput compared to Rails, along with a sixth of the latency. Depending on the use case for each application, companies have found that anywhere from 20 to 50 servers running a Rails application at 100% CPU usage are roughly equivalent to 1 server running Elixir / Phoenix at 15-20% usage.
Part 1 Conclusion
For this project, we had already written some backend functionality in Rails for handling user authentication, managing sessions, and serving some assets, as well as for talking to our Discourse backend, which we use for the community portion of our app. After we started writing the order-writing backend in Elixir, it became clear we could quickly port everything over from the original Rails server to our Elixir OTP application. This added only a few weeks to our development time while reducing hosting costs by decreasing the number of servers used. We’ll go into more depth about this transition in Part 2. Stay tuned!