Three Decades with Erlang
A Personal Odyssey
Posted: 2023-12-14
My Brief History with Erlang
As 2024 approaches, I am closing in on 30 years of Erlang programming. My journey with this language began in 1994, when Erlang itself was still in its early stages of evolution. Over these decades, I have seen the growth and transformation of Erlang, and its impact on the world of telecom and beyond. This experience has given me a unique perspective on the development of Erlang and its runtime system, ERTS.
Erlang's Genesis: Engineering Modern Telecom
The Erlang Runtime System (ERTS) was born from Ericsson's need to create a robust and efficient telecommunications infrastructure. This runtime system was designed to handle the demanding and concurrent requirements of both mobile and fixed phone networks.
The groundwork established by Joe Armstrong, Mike Williams, and Robert Virding, under Bjarne Däcker's mentorship, shaped the core principles and architecture of Erlang. They recognized the importance of independent concurrent processes for telecom applications. Their focus on concurrency from the beginning was a strategic decision that aligned perfectly with the parallel nature of telecommunication operations, where multiple tasks need to run simultaneously and independently.
Unlike traditional systems, where concurrency often leads to complex issues like deadlocks and race conditions, BEAM, the virtual machine at the core of ERTS, allows each process to operate independently. This independence is key in ensuring that the failure of one process does not cascade into a system-wide failure.
Another aspect of BEAM's design contributing to its robustness is its soft real-time capabilities. In telecom networks, 'soft real-time' refers to the system's ability to process and respond to inputs within a reasonable timeframe. This is essential for services like voice calls or data transmission, where delays or interruptions can degrade the quality of service.
BEAM also features a sophisticated error-handling mechanism. It allows individual processes to fail and restart without affecting the overall system. This approach, often called "let it crash," is a radical departure from traditional error handling but proves highly effective in maintaining system integrity. By localizing failures and managing them effectively, BEAM ensures that the larger system continues to operate smoothly.
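To make the idea concrete, here is a minimal sketch of "let it crash" using only a monitor and a manual restart. The module name `let_it_crash` and the worker protocol are hypothetical and chosen for illustration; production systems would normally rely on OTP supervisors rather than hand-written restart logic like this.

```erlang
-module(let_it_crash).
-export([demo/0]).

%% Hypothetical worker: echoes messages back, but crashes on 'bad'.
worker() ->
    receive
        {_From, bad} ->
            %% No defensive error handling: the process simply dies.
            exit(bad_input);
        {From, Msg} ->
            From ! {self(), {ok, Msg}},
            worker()
    end.

%% Start a worker and monitor it, so the caller is told if it dies.
start_worker() ->
    spawn_monitor(fun worker/0).

demo() ->
    {Pid1, Ref1} = start_worker(),
    Pid1 ! {self(), bad},
    %% The crash is isolated: the caller only receives a 'DOWN' message.
    receive
        {'DOWN', Ref1, process, Pid1, bad_input} -> ok
    after 1000 ->
        exit(no_down_message)
    end,
    %% "Restart": start a fresh worker with clean state.
    {Pid2, Ref2} = start_worker(),
    Pid2 ! {self(), hello},
    Reply = receive
                {Pid2, {ok, hello}} -> restarted_ok
            after 1000 ->
                timeout
            end,
    %% Clean up the second worker before returning.
    demonitor(Ref2, [flush]),
    exit(Pid2, kill),
    Reply.
```

Calling `let_it_crash:demo()` should return `restarted_ok`: the first worker dies without affecting the caller, and its replacement answers normally.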
Combining these features - efficient management of concurrency, soft real-time processing, and robust error handling - makes BEAM uniquely suited for the demands of global communication networks. It enables systems built on Erlang and running on BEAM to offer high availability and reliability. As such, BEAM supports a wide array of services that range from everyday communications to data transfers.
Erlang's Early Years: Foundations and Evolution
But let's go back to the origins of Erlang and its developmental journey. The language and its name took shape sometime during 1986 and 1987.
Initially, Erlang was implemented in Prolog, but this early version faced limitations, especially in terms of speed and efficiency. When Erlang started to get real users around 1989, these issues had to be addressed. This led to the development of JAM (Joe's Abstract Machine), which was heavily influenced by WAM (the Warren Abstract Machine), the virtual machine for Prolog. JAM represented a significant advancement in Erlang's development, laying the groundwork for what would eventually evolve into BEAM.
JAM overcame some of Erlang's initial performance challenges. However, as Erlang began to be used more extensively, the need for a more efficient and robust system became apparent. This led to the development of BEAM, starting in 1993 as a compiler to C. You can read more about this in "A History of Erlang" by Joe Armstrong.
BEAM: Advancing Erlang's Core
BEAM brought several improvements over JAM, focusing on speed of execution, with features such as a threaded code dispatcher and mapping of virtual machine registers to CPU registers.
Thus, the evolution from Prolog-based Erlang to JAM, and subsequently to BEAM, illustrates a path of continuous refinement and adaptation. Each stage was driven by the changing needs of the systems Erlang aimed to support, with each new development building upon the strengths of its predecessors. You can read more about how BEAM works in The BEAM Book.
HiPE: Pushing Erlang's Performance Envelope
My involvement in the evolution of Erlang's ecosystem began with my work on the Jericho compiler. This project was part of my master's thesis and aimed to improve the efficiency and execution speed of Erlang code. Jericho, a JIT native code compiler for Erlang, was written in C and translated JAM code into SPARC v9 assembler. This translation improved performance, addressing one of the key limitations of JAM.
The pursuit of optimizing Erlang's performance further led to my doctoral research and the creation of the High-Performance Erlang (HiPE) group and compiler. Unlike Jericho, HiPE was written in Erlang and was designed to convert both JAM and BEAM bytecode into machine code. This capability was not limited to a single architecture; HiPE supported multiple architectures, including x86, SPARC, and ARM.
The advancements brought by the HiPE group to ERTS were substantial. During this period, our focus extended to garbage collection strategies, exploring the effectiveness of approaches such as hybrid and common heaps for processes. We also pioneered a just-in-time compilation method, implemented on a per-function basis.
I was fortunate to collaborate with a team of esteemed colleagues in the HiPE project, including Kostis Sagonas, Mikael Pettersson, Richard Carlsson, Tobias Lindahl, Per Gustafsson, and Jesper Wilhelmsson. This collaboration extended to working closely with the OTP team, where we fine-tuned inter-process communication, introduced binary searches for extensive pattern matching, and redesigned BEAM's tagging scheme for enhanced efficiency.
The work of my colleagues in the HiPE team also led to notable contributions to the language and the virtual machine. These include the introduction of Core Erlang and EDoc, implementation of bit syntax, refinements in floating-point arithmetic, the addition of type specs, and the development of the Dialyzer for static code analysis. Each of these contributions played a crucial role in enhancing the robustness and efficiency of the BEAM ecosystem.
Diversifying with Scala and Simics: Broadening Horizons
Before going into the next era of Erlang's development, I would like to highlight two significant chapters in my journey that predate this period: my involvement with Scala at EPFL (École Polytechnique Fédérale de Lausanne) from 2003 to 2004, and my work at Virtutech from 2004 to 2005. These experiences, while seemingly a detour from Erlang, broadened my perspective on programming languages and virtual machines.
In 2003, I joined EPFL as a project manager for Scala, a language that was then in its nascent stages of development. Scala, designed to be a scalable and efficient language, aims to bridge the gap between object-oriented and functional programming paradigms. My role in this project was not just administrative; it was an opportunity to think about language design. It also gave me experience in the management of a complex software project.
One of the most interesting aspects of Scala was its functional programming characteristics, some of which were inspired by Erlang. Erlang’s approach to concurrency, fault tolerance, and distributed computing offered valuable lessons relevant to Scala's development. My experience with Erlang provided me with insights that I could bring to the Scala project. It was a chance to see how concepts from Erlang could be adapted and applied in a different context and to a language with a distinct set of goals and design principles.
My tenure at EPFL was a period of rich learning and cross-pollination of ideas. It was like being at a junction where the roads of Erlang and Scala intersected, allowing for an exchange of ideas that enriched both my understanding and the development of Scala. This experience underscored the value of exploring diverse programming paradigms and the importance of functional programming in modern software development.
Following my role in the Scala project at EPFL, I transitioned to Virtutech in 2004, working on virtual machine technology. At Virtutech, I worked on Simics, a high-performance virtual platform that simulates the hardware of different computer systems. Simics was a powerful tool for developers, allowing them to mimic the behavior of complex systems and test software in a controlled environment.
My focus at Virtutech was on the performance optimization of Simics. This task involved deep dives into the intricacies of system architecture and the challenges of accurately simulating hardware behavior at high speeds. The experience honed my skills in understanding and improving system performance, a crucial aspect of virtual machine technology and the Erlang runtime system.
Working on Simics gave me a unique perspective on the significance of performance in virtual systems. It was akin to fine-tuning a high-performance engine, where every adjustment and enhancement could significantly improve how the system operated. This knowledge proved invaluable in my subsequent endeavors with Erlang and BEAM.
These experiences at EPFL and Virtutech shaped my approach to system design and optimization. They underscored the importance of a holistic understanding of software and hardware in creating efficient and robust systems. As I returned to the world of Erlang, the skills and insights gained from these roles enriched my contributions to the Erlang community, particularly in system performance and optimization.
Erlang's Stabilization Era: Consolidation Over Innovation
The period between 2006 and 2014 was when the focus noticeably shifted from innovation to stabilization. This era was marked by a concerted effort to make BEAM more stable and efficient, an endeavor that, while crucial, came at the cost of slowing down on the innovation front.
During this time, the OTP (Open Telecom Platform) team, operating under the age-old adage of "the golden rule" – that is, 'whoever has the gold makes the rules' – followed a path largely dictated by Ericsson's immediate needs. Ericsson, having become heavily dependent on Erlang for its revenue-generating projects, preferred a conservative approach. Their goal was clear: ensure the smooth running of existing systems and avoid any disruptions that might arise from overly ambitious innovations. It's a bit like being at a rock concert where the band decides to play only ballads; necessary for a breather, but you do miss the high-energy numbers.
In this environment, much of the OTP team's work centered on enhancing the efficiency and stability of BEAM. This included the introduction of the multi-core version, which could now deliver true parallelism, something that most Erlang programs could take advantage of immediately and automatically without rewriting a single line of application code.
Several enhancements to ETS (Erlang Term Storage) tables and other performance optimizations were also introduced. However, this shift also meant dialing back on some of the experimental features that HiPE had introduced. Features like hybrid heaps, parameterized modules, namespaces, and the just-in-time aspects of the HiPE compiler and loader were gradually removed to simplify maintenance.
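As a rough illustration of why existing programs benefited automatically, consider the sketch below. The module `pmap_sketch` and its functions are hypothetical and written for this example. The code only spawns processes and exchanges messages; nothing in it refers to cores or threads, so the same source runs unchanged whether the runtime has one scheduler or many.

```erlang
-module(pmap_sketch).
-export([pmap/2, demo/0]).

%% Naive parallel map: one process per element, results collected in
%% the original order. There is no error handling; this is a sketch.
pmap(F, Xs) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- Xs],
    %% Ref is already bound in each iteration, so every receive
    %% selectively waits for the matching worker's reply.
    [receive {Ref, Res} -> Res end || Ref <- Refs].

demo() ->
    %% Number of scheduler threads currently available to run processes.
    Schedulers = erlang:system_info(schedulers_online),
    Squares = pmap(fun(X) -> X * X end, [1, 2, 3, 4]),
    {Schedulers >= 1, Squares}.
```

How the spawned processes are distributed across schedulers is decided by the runtime, not by the application code.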
Eventually, this would lead to the complete phasing out of HiPE a number of years later, when it was replaced by a simpler JIT compiler in 2020, bringing the native code story back full circle. It was a bit like saying goodbye to a favorite experimental band that had once headlined festivals but was now playing in small clubs. While its contributions were significant and propelled BEAM forward, the need for a stable, easily maintainable system took precedence in the strategic decisions of the OTP team.
This phase in Erlang's history highlights a common trajectory in technology development where, after a period of rapid innovation and experimentation, a consolidation phase is necessary. It underscores the balancing act between pushing the boundaries of innovation and ensuring the reliability and stability of technology, especially when it forms the backbone of infrastructure like telecommunications.
Klarna: Challenging and Enhancing Erlang's Limits
During this Stabilization Era, my professional journey found me at Klarna, where I was deeply involved in working with Erlang's runtime system. At Klarna, we were not just passive observers of the developments in the Erlang community; instead, we were actively pushing the Erlang Runtime System (ERTS) to its limits. This period was marked by a rigorous process of identifying, understanding, and resolving various challenges within ERTS, contributing to its overall stability and efficiency.
One of the notable challenges we encountered at Klarna was the limitation of process memory in ERTS, even on 64-bit systems. We discovered that a single Erlang process could not hold more than 32 GB of data, a constraint that posed significant challenges given the scale at which we were operating. This finding was crucial as it highlighted a fundamental limitation in the system, prompting us to dig deeper into Erlang's memory management mechanisms.
Another issue we observed was related to the behavior of the schedulers. In some instances, the schedulers in ERTS could become a bit too eager to enter a sleep state, which, while efficient in some contexts, could lead to performance bottlenecks under certain workloads. Addressing this required a careful rebalancing of the scheduler's responsiveness, ensuring that they remained alert enough to handle fluctuating demands efficiently.
Additionally, we tackled challenges with long-running built-in functions (BIFs). These functions could occasionally block the schedulers for extended periods. In extreme cases, this could trigger the HEART mechanism, Erlang's watchdog, to erroneously conclude that the system was unresponsive and initiate a restart. Such scenarios were not just theoretical concerns but real issues that we needed to address to maintain the reliability of our systems.
We also grappled with issues related to memory usage, particularly concerning large binaries. References to these large binaries could get inadvertently 'stuck' in processes, leading to substantial, and often unnecessary, memory consumption. This was a subtle yet significant issue, as it could lead to gradual memory bloat, impacting the system's overall performance and stability.
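One well-known way a small piece of data can pin a large binary in memory is through sub-binaries. The sketch below shows that mechanism and one possible remedy. It is not necessarily the exact scenario we hit at Klarna, and the module name `bin_refs` is hypothetical. It uses the standard library functions `binary:referenced_byte_size/1` and `binary:copy/1`.

```erlang
-module(bin_refs).
-export([demo/0]).

demo() ->
    %% A 1 MB binary. Binaries larger than 64 bytes are stored outside
    %% the process heap and are reference counted.
    Big = binary:copy(<<0>>, 1024 * 1024),
    %% Matching out a slice creates a sub-binary that still points
    %% into, and therefore keeps alive, the whole 1 MB binary.
    <<Slice:256/binary, _/binary>> = Big,
    Held = binary:referenced_byte_size(Slice),
    %% binary:copy/1 creates an independent binary holding only the
    %% slice, so the large binary can be reclaimed once unreferenced.
    Copy = binary:copy(Slice),
    HeldAfterCopy = binary:referenced_byte_size(Copy),
    {byte_size(Slice), Held, HeldAfterCopy}.
```

If a long-lived process keeps `Slice` in its state, the full megabyte stays allocated; keeping `Copy` instead retains only the 256 bytes that are actually needed.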
One interesting system limitation resulted in our systems consistently crashing after a code upgrade. Yes, we did hot code loading of new releases at that time. The only hint as to what was going wrong was a sad face, ":(", printed directly to the terminal. This did not end up in any logs, so we first had to reproduce the error locally before we could see it. After some searching in the code, we found that the system printed the sad face and exited if the number of exception handlers exceeded a hard-coded limit. The developer of that code had very graciously left a comment: "This should probably be dynamically allocated."
My time at Klarna during this era was not just about addressing these challenges; it was also about contributing to the broader Erlang ecosystem. The issues we identified and the solutions we developed helped in enhancing the robustness of ERTS. It was a period that underscored the importance of real-world applications in revealing the limitations of a system and the collaborative nature of problem-solving in the open-source community. The experience at Klarna was like being in a high-stakes game where every discovery and improvement not only benefited us but, I think, also enriched the entire Erlang community.
Elixir's Emergence: Invigorating the Erlang Ecosystem
The emergence of Elixir in 2011 marked a new chapter in the story of Erlang. Created by José Valim, Elixir is a dynamic, functional language built on the Erlang VM (BEAM) and designed for building scalable and maintainable applications with an initial focus on web technologies. Elixir’s arrival had a profound impact on the Erlang community, infusing it with fresh energy and attracting a new wave of developers.
Elixir managed to bridge a gap in the backend development community. Its modern syntax and tooling, combined with the robustness of the Erlang VM, made it appealing to a broader audience, including those who might have found Erlang's syntax and conventions challenging. This influx of new talent and ideas brought about a rejuvenation in the community, sparking renewed interest and innovative approaches to solving backend problems.
From my perspective, Elixir's contribution to the Erlang ecosystem is akin to a revitalizing rain shower over a fertile field. It not only nourished the existing landscape but also encouraged new growth, diversifying and enriching the ecosystem. The presence of Elixir helped in popularizing some of Erlang's core principles, such as concurrency and fault tolerance, to a wider audience, thereby reinforcing the relevance of these ideas in modern software development.
The BEAM Book: Chronicle of a Virtual Machine
Following the emergence of Elixir and the energizing effect it had on the Erlang community, I started writing "The BEAM Book." This project is a comprehensive documentation of the BEAM virtual machine, designed to provide an in-depth look into its inner workings, principles, and capabilities.
The BEAM Book is an effort to encapsulate the rich history and technical intricacies of BEAM. It's intended to serve as both a historical record and a technical guide, offering insights into the design decisions, architectural nuances, and the operational mechanics of BEAM.
My work on this book is driven by a desire to share knowledge and insights gathered over years of working closely with Erlang and BEAM. It's meant to be a resource for those who want to understand the depths of BEAM's capabilities, for developers who build systems on it, and for enthusiasts who wish to explore the underpinnings of this powerful virtual machine.
The BEAM Book project is also a reflection of my commitment to the Erlang and Elixir communities. It's a way of giving back, of contributing to the collective knowledge base that has been so instrumental in the growth and success of these technologies. This endeavor is like charting the course of a great river - capturing its origins, its meandering journey, and the many ways it has enriched the lands through which it flows.
My nearly 30 years in this field have been a testament to the dynamic, ever-evolving nature of software development, where each new discovery builds upon the last.
I encourage you, the readers, to reflect on your own experiences with Erlang, BEAM, or Elixir. Consider how these technologies have influenced your professional path and how you have interacted with these communities. Whether your role has been that of a developer, a scholar, or an enthusiast, your experiences and contributions are valuable chapters in this ongoing story. It is through our collective efforts and shared knowledge that these technologies continue to thrive and evolve, underpinning some of the most innovative and robust systems in the world today.
I realize that each experience, each challenge, and each triumph has not only shaped my understanding of this technology but has also prepared me for my next endeavor. I am excited to announce that I will be building upon this history in my upcoming talk titled "30 Years On and In the Beam: Mastering Concurrency." In this presentation, I plan to cover some technical aspects of BEAM, exploring some of its peculiarities and offering insights into designing effective concurrent systems.
This talk is an opportunity to share knowledge, engage with new ideas, and continue learning from the community that has been a big part of this journey. I invite you to join me in this discussion, where we can learn more about concurrent programming and explore the BEAM together.
For more details and to register for the event, please visit "30 Years On and In the Beam: Mastering Concurrency." I look forward to the opportunity to connect, share, and learn together. Hope to see you there!