23 Aug The road to databaseless and why to follow it!
A few weeks ago, Lightbend introduced Akka Serverless. It is a framework for developing cloud-native applications, combined with a managed cloud platform to deploy them on. What makes it unique is that it allows serverless applications to be stateful. From the documentation:
(…) eliminates the need for plumbing code to handle database access or connections. The managed platform relieves you from configuring and maintaining the orchestration platform, and the data stores. Akka Serverless auto-scales services, and handles network partitions and failures.
What this means is that, from a development perspective, the application seems to have no database. Yet the application can load and store its state in a highly scalable and resilient way. Hence: databaseless applications.
This release (at the time of writing in public beta) is truly a milestone. It announces the beginning of a new era of software development. If you think this statement is suspiciously extravagant, please bear with me. Let’s take a short trip through software history to better comprehend the significance of this innovation.
Carl Sagan once said: “If you wish to make an apple pie from scratch, you must first invent the universe.”
The idea of taking something that exists and making something bigger out of it has been around for quite some time, and it is very successful. A few examples from everyday life:
- bakeries buy their flour, instead of growing, harvesting and processing wheat
- sandwich bars buy their bread and hardly ever raise their own livestock to produce cheese or pork
- car-engines (fuel or electric) are used by different manufacturers
- fashion brands usually don’t herd sheep
The very same idea applies to software, and to technology in general. The fast-paced technological evolution of the last decades is certainly a product of such an approach.
The goal of every piece of software is to solve some business problem. Nowadays, software developers can choose from a range of programming languages, tools and frameworks to accomplish the intended goal. Lots of those languages and frameworks have claimed that ‘with framework X, the developer can focus on the business requirements, without being distracted by the technical boilerplate code’. But is that really the truth?
Today it is possible (and sometimes taken for granted) to run a program on a single laptop, or on thousands of computers distributed across data centers around the globe, without changing a single line of application code. All this goodness didn’t appear out of thin air. How did we end up here?
In the beginning there were transistors
All things considered, software is nothing more than a list of instructions for some hardware.
As it turns out, writing CPU-instructions by hand is not that obvious. It’s also not a lot of fun. Fortunately for us, software professionals, we don’t need to write those instructions by hand (anymore).
Already decades ago, smart folks thought about how the development of software can be made easier, quicker and more efficient. They did it by adding layers of abstraction on top of each other.
For starters, operating systems hide the hardware-specific quirks, so that software doesn’t need to be aware which specific hardware is executing the code. That is already a big improvement over writing hardware-specific applications. But that is often not good enough: we want software that runs on any operating system.
Things didn’t stop there. For example, for most applications it doesn’t make sense to manually manage the memory that they use. Because of that, most modern programming languages have a mechanism for automated memory management (smart pointers, garbage collection). It leaves the developer with more time to focus on what matters: the business requirements, instead of solving general technical problems that don’t add actual value.
Furthermore, since multi-core processors have become the norm, the need for applications to make use of parallel processing has exploded. To cope with that, suitable abstractions were created too: for example, Akka (inspired by the actor model and Erlang) allows the developer to not worry about which thread runs which logic, or how those threads are created (or even on which computer they are running).
And so, layer after layer is added to the ‘onion of abstractions’, so that the capabilities of the fast-evolving hardware can be harnessed to the full extent.
For a lot of scenarios, having software on a single computer just isn’t cutting it: we need multiple computers and ways of communicating between them. Network communication is built using the same abstraction tricks: it is a stack of protocols, each of which is an abstraction of the one below it. Programming languages, in turn, provide abstractions on top of those protocols (RPC, location transparency).
Now that software can run on multiple machines, new problems pop up. For instance, we now need to deal with all those servers: they need to be installed, maintained, secured, updated, … The complexity and sheer amount of manual labor that can be involved led to the ‘invention’ of ‘the cloud’.
Cloud native software
All told, cloud infrastructure is basically an abstraction of (networked) computers (and a lot more). If we want to write software that runs on ‘some machine in the cloud’, it is important that the tools and languages that we use facilitate this.
Tools like Docker and orchestration runtimes like Kubernetes have been true game changers in this regard. They make it possible to create software that is easy to spin up, without knowing up front on which machine(s) it will run.
Today it is easy to start an additional instance of an application on a virtual server with a simple mouse-click. But even that is sometimes not enough. Some applications need to be able to scale up (or down) dynamically, when the load on the system requires it.
This brings us to the next abstraction: Cloud vendors now have ways to deploy software which only uses resources when it is actually being used: applications are being started based on real-time user demand. When things get busy, additional instances of an application can be created automatically (*). As an owner of such an application, you only pay for the actual usage of the computer, not for the computer itself. We call it ‘Serverless Computing’.
Interesting as it might seem, there is one major drawback to serverless computing: it makes it notoriously hard to deal with state.
Serverless functions work very well for any stateless computation: pure functions for which the output is defined only by the input. On the other hand, the most trivial application that needs state beyond what is in the input, becomes excessively difficult to implement in a serverless approach.
Unfortunately, the reality is that almost every application needs to deal with some form of state. Even though serverless computing makes it easy to scale the ‘compute’ part of an application, that is not all that useful if the ‘state’ doesn’t scale with it. As a result, serverless computing risked being reduced to a marginal technology: only useful for a very limited set of use cases.
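To make the difference concrete, here is a minimal sketch in Python. It is not tied to any serverless platform: the names `to_celsius` and `CartInstance` are made up for this illustration, and the two objects merely stand in for two instances of the same function that a platform might start under load.

```python
from typing import Dict, List


# Stateless: the output depends only on the input, so any instance can
# serve any call.
def to_celsius(fahrenheit: float) -> float:
    return (fahrenheit - 32) * 5 / 9


# Stateful: the result depends on earlier calls. The dictionary below lives in
# the memory of one instance only, so it is lost when that instance is
# stopped, and it is invisible to every other instance.
class CartInstance:
    def __init__(self) -> None:
        self.carts: Dict[str, List[str]] = {}

    def add_item(self, cart_id: str, item: str) -> List[str]:
        self.carts.setdefault(cart_id, []).append(item)
        return list(self.carts[cart_id])


# Two instances, as if the platform had scaled out (hypothetical setup).
instance_a = CartInstance()
instance_b = CartInstance()

first = instance_a.add_item("cart-1", "bread")
# The next request for the same cart happens to land on the other instance,
# which has never heard of the bread.
second = instance_b.add_item("cart-1", "cheese")
```

The second call only knows about the cheese. Keeping such state correct across instances is exactly the part that needs something outside the function itself.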
Until now: Akka Serverless introduces yet another layer of abstraction: that of state persistence in a serverless application. The problem of storing and retrieving state in a serverless function (which is a hard one) is solved, and we as developers can just relax and make use of it. We can create databaseless (**) stateful functions.
Granted, the idea is not completely new: there are other technologies around that allow similar (but more limited) functionality, for example in the area of stateful stream processing (ksqlDB, Apache Flink). However, those are tied to one specific technology and a relatively limited set of use cases.
There is no magic involved: as with all evolutions in software development, this is built on other existing technologies and abstractions. Akka Serverless is based on the Akka Cloud Platform, which is supported by Google Cloud, AWS and Microsoft Azure (***).
So what does it mean for an application to be databaseless, and still have state, and be scalable at the same time? Obviously there is some sort of database involved. But the thing is, as a developer, you don’t have to worry about it at all, much like you don’t have to worry about memory management in a ‘standalone’ JVM application.
The SDK provides a way for the application to operate on (read/write) state. Loading, storing and propagating the (updated) state is taken care of by the framework. It is done in a way that is scalable and resilient: the storage of the state should not become a bottleneck, no matter how much load the system experiences. Akka Serverless applications can scale to as many (virtual) machines as you can spare, and make efficient use of them.
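The division of labor can be sketched as follows. This is deliberately not the Akka Serverless SDK: `ManagedStore`, `Runtime` and `add_to_counter` are hypothetical names, and an in-memory dictionary stands in for the durable store that a managed platform would operate. The sketch only illustrates the idea that the framework loads the state, hands it to the business logic, and stores the result.

```python
from typing import Callable, Dict


# Hypothetical stand-in for the managed data store. On a real platform this
# would be a durable, replicated store that the developer never sees.
class ManagedStore:
    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def load(self, entity_id: str) -> int:
        return self._data.get(entity_id, 0)

    def save(self, entity_id: str, state: int) -> None:
        self._data[entity_id] = state


# Hypothetical runtime: it wraps a handler so that loading and storing state
# happens around the call, outside the developer's code.
class Runtime:
    def __init__(self, store: ManagedStore,
                 handler: Callable[[int, int], int]) -> None:
        self._store = store
        self._handler = handler

    def handle(self, entity_id: str, command: int) -> int:
        state = self._store.load(entity_id)
        new_state = self._handler(state, command)
        self._store.save(entity_id, new_state)
        return new_state


# The only part an application developer would write: business logic that
# takes the current state and a command, and returns the updated state.
def add_to_counter(current: int, amount: int) -> int:
    return current + amount


store = ManagedStore()
# Two runtime instances backed by the same managed store.
runtime_a = Runtime(store, add_to_counter)
runtime_b = Runtime(store, add_to_counter)

runtime_a.handle("counter-1", 5)
total = runtime_b.handle("counter-1", 3)
```

Because the state travels through the store rather than through the memory of one instance, it does not matter which instance handles which request. The handler itself contains no database code at all, which is the sense in which the application is ‘databaseless’.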
In our next blog-post, we’ll zoom in on the details and get our hands dirty with it.
Every additional ‘layer of abstraction’ is based on the same idea: solving a hard problem once and for all, so that the solution can be reused in a generic, abstract way. This is as true for software development as it is for sandwich development. It leads to better use of resources: developers can focus on ‘business’ problems, instead of solving technical problems that have already been solved by someone else.
This doesn’t necessarily mean sacrificing freedom or flexibility: you can still write software in Assembly if you really need to. It’s a matter of selecting the right tool for the job and deciding what creates most value: are you solving a business-need, or are you (re-)inventing a proverbial wheel? We can all agree that Assembly is not the appropriate tool if the job requires a scalable distributed cloud application.
History has shown that higher levels of abstraction in (software development) tools lead to boosts in productivity. Why wouldn’t that be the case for cloud-native software? The general problem of scalable application state, which is something every cloud-native application has to deal with, has been solved. It’s hard to overestimate the importance of this feat.
Sure, maybe not all use cases are covered by Akka Serverless today. But much like everyone eventually stepped away from Assembly in favor of higher-level languages (once they matured), the innovation of Akka Serverless makes it possible to step up the development of cloud-native software to a higher level of abstraction too.
Maybe one day, software will truly be cloud native, in the sense that it gets deployed to ‘some’ cloud provider, without the developers even knowing which exact one. Or even, why not, all of them. We believe that Akka Serverless brings us one step closer to that goal.
(*) In such scenarios, the startup time of an application becomes important. That is where things like GraalVM and Scala Native come in, which brings us full circle: writing software in a high-level language and compiling it to native code.
(***) At the time of writing, Akka Serverless is only available on GCP. Other cloud platforms will surely follow. So it basically abstracts away the cloud platform too.
Curious for more?
Stay tuned for a practical in-depth follow-up on this article, where we explore what it means to actually build a full-fledged application with Akka Serverless.
Want to get started already? Go ahead and register for this workshop organised by Lightbend.