QCon 2020 started with a dark cloud of COVID-19 on the horizon. I was monitoring the news - and the QCon website - and made the decision on Sunday March 1st to travel. Had it been one week later... But the QCon organization reassured us that a number of measures had been taken and that it was safe. So we went ahead, and the conference started Monday March 2nd.
I was in London for QCon 2018, and found the conference talks refreshing: (almost) no our-product-solves-all-your-problems pitches, but speakers who talked about their own experiences in implementing solutions. I really like that, as they address the dos and also the don'ts, illustrated with examples from their own work.
Below I will list my key takeaways. As my focus is on microservices, serverless and architecture, other topics are not addressed. However, note that the conference has many more topics, many of which my internal knowledge junkie still needs to have a look at.
Compared to 2018, I found the microservices talks to be more mature. Companies like Monzo and Sky Bet international have microservices-based platforms that were implemented successfully. However, it is also increasingly clear that building distributed systems is still difficult: more difficult than building that monolith. That was illustrated by the Segment case, where they moved from a monolith to microservices and back again. Their trade-off was simple: they experienced some problems in their monolith and thought microservices were the solution. It turned out not to be, so they moved back to a (better) monolith. It takes some guts to admit that on the main stage. At the same time, it is perhaps the best example of how microservices solutions should be treated: they are more difficult to develop and operate than a monolith and should only be considered if the expected advantages outweigh the additional costs.
Or, as Sam Newman simply put it: 'microservices should not be the default choice'. This caused quite a stir, because 'one of the microservices founding fathers was recommending against it'! However, having followed Sam Newman for quite some time, I think that he has always been consistent in his approach to microservices: 'consider if you really need them in your situation'. The thing is that making these types of decisions requires some good old-fashioned engineering skills to weigh the pros and cons of microservices. The same skills it takes to make a proper modular design for a monolith! And for doing that, Domain Driven Design is the most important engineering tool in your toolbox. Instead of a gut feeling that is fueled by ... whatever can be found on the internet ;-)
The claim is being made that Serverless computing is the next logical step after microservices and will dominate cloud computing in the future. So, some talks at QCon also covered Serverless, but the one I found most interesting was by Sean Walsh from Lightbend. The point was made that Serverless computing is only partially supported by the offerings we know as 'Functions as a Service'. These FaaS platforms support a limited set of use cases:
- Embarrassingly parallel processing tasks
- Low traffic applications: enterprise IT services, and spiky workloads
- Stateless web applications: serving static content from S3 (or similar)
- Orchestration functions: integration/coordination of calls to third-party services
- Composing chains of functions: stateless workflow management, connected via data dependencies
- Job scheduling
Sean observed that, in order to build general purpose applications, the FaaS offerings must be extended with better support for:
- Managing in-memory durable session state across individual requests, e.g. user sessions, shopping carts, caching
- Low-latency serving of dynamic in-memory models, e.g. serving of machine learning models
- Real-time stream processing, e.g. recommendation, anomaly detection, prediction serving
- Distributed resilient transactional workflows, e.g. saga pattern, workflow orchestration, rollback/compensating actions
- Shared collaborative workspaces, e.g. collaborative document editing, blackboards, chat rooms
Sean presented the CloudState platform which, among other things, helps to build stateful functions (https://cloudstate.io). The platform leverages existing technologies like Kubernetes/Knative, gRPC and Akka Cluster, and it is worthwhile watching how this will influence the FaaS world (and lift some of its current limitations)!
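To make the idea of a 'stateful function' a bit more concrete, here is a minimal Python sketch based on the shopping cart example from the list above. It is explicitly not the CloudState API: `CartState`, `add_item` and `InMemoryRuntime` are hypothetical names, and the in-memory dictionary only stands in for whatever durable state management a real platform would provide.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class CartState:
    """Per-cart session state that must survive across individual requests."""
    items: Dict[str, int] = field(default_factory=dict)


def add_item(state: CartState, product: str, quantity: int) -> CartState:
    """The function itself: current state plus a command in, new state out."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    items = dict(state.items)  # copy, so the previous state is left untouched
    items[product] = items.get(product, 0) + quantity
    return CartState(items)


class InMemoryRuntime:
    """Hypothetical stand-in for the platform: keeps state per cart between calls."""

    def __init__(self) -> None:
        self._states: Dict[str, CartState] = {}

    def invoke(self, cart_id: str, product: str, quantity: int) -> CartState:
        current = self._states.get(cart_id, CartState())
        updated = add_item(current, product, quantity)
        self._states[cart_id] = updated
        return updated
```

The point of the sketch is the division of labour: the function only contains business logic, while loading and storing the state per cart is left to the runtime. A plain FaaS offering would force the function to fetch and save that state itself on every request.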
Teams ... are people ... are important
One wouldn't expect it, but in a conference on software, there were around 15 talks on 'how to get your team to perform better'. Which makes sense, because we all know that in the end the teams' performance is crucial to what the company can deliver. It is all about humans - as Anjuan Simmons illustrated in his keynote about Technical Leadership and the Underground Railroad.
What is important to understand is that the IT world is rapidly transforming from traditional waterfall delivery into a DevOps world. The first delivers products 1 - 4 times per year. The hardcore DevOps enthusiasts are targeting 1 - 4 times per day. We all understand that, in order to achieve this, we have to automate everything that's involved in this process, like deployment pipelines and testing. What we forget every now and then is that this also imposes many different requirements on the team members involved. So next year, I'm thinking about dragging some team leads along to QCon 2021.
It is nice to hear success stories, but it's especially great if you can learn something from them. Talks that got my attention were from Tesla, Sportsbook and Monzo bank.
Monzo bank is a UK-based 'internet bank' that implemented its whole IT in ~1600 microservices. They started in 2015 offering only a pre-paid debit card and an app, but are currently offering many more banking services. What was surprising is that ALL of their microservices were written in Go and had a strict source code layout, including requirements that must be met before a microservice would go into production. While this seems to contradict the a-microservice-can-be-written-in-the-language-the-team-likes principle, Monzo found that it helps them in controlling the complexity. Another benefit - especially for a fast-growing organisation like theirs - is that developers can easily switch between teams...
Ian Thomas from Sportsbook talked about the challenge we all would like to have: re-build an existing platform from scratch - and then better! And then they did what is normally not recommended: they started with a microservices approach and were successful. And of course, along the way they did very sensible things like putting an SRE team in place that removed a lot of the having-to-run-services burden from the developers. But in the end, I feel that the main reason why they succeeded with this microservices-from-scratch approach is that the Domain model / bounded contexts were already known...
The Tesla talk was announced as 'the first public talk ever about the Tesla Virtual Power Plant'. Tesla already has their first Virtual Power Plant operational in Australia. In short, they put a gigantic battery somewhere in the South of Australia and complemented that with a large number of solar panels and some windmills. The battery charges and is then used to deliver to the power grid at the moments the most energy is needed. It is called 'peak-shaving' and it's a big thing for network operators: it is the peaks that cause them headaches. The Tesla VPP helps in a big way and already delivered good results.
The next step was to include households in the VPP grid. These households typically have some solar panels and a Tesla battery. The battery is charged by the solar panels, so the stored energy can also be used overnight. Now, the Tesla VPP can also access all of these batteries and use them to deliver energy to the public power grid to - again - contribute to peak shaving. Now, there is of course quite an optimization question behind this: you would not want your battery to be empty because all energy was delivered to the power grid ... just when you need to charge your car. Your VPP application will have to balance all stakeholder needs.
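The balancing question can be illustrated with a small Python sketch. This is a deliberately naive model and certainly not how Tesla does it: the names `HomeBattery` and `plan_discharge`, the per-household reserve and the proportional split are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HomeBattery:
    name: str
    stored_kwh: float   # energy currently in the battery
    reserve_kwh: float  # energy the household wants to keep, e.g. to charge a car

    def available_kwh(self) -> float:
        """Energy that may go to the grid without touching the reserve."""
        return max(0.0, self.stored_kwh - self.reserve_kwh)


def plan_discharge(batteries: List[HomeBattery], grid_demand_kwh: float) -> Dict[str, float]:
    """Spread the grid request proportionally over what each battery can spare."""
    total_available = sum(b.available_kwh() for b in batteries)
    if total_available == 0 or grid_demand_kwh <= 0:
        return {b.name: 0.0 for b in batteries}
    # Never ask for more than the batteries can spare in total.
    fraction = min(1.0, grid_demand_kwh / total_available)
    return {b.name: b.available_kwh() * fraction for b in batteries}
```

With this rule, a household whose battery is at or below its reserve contributes nothing, and no battery is ever drained below the level its owner asked to keep. A real system would of course also have to take forecasts, prices and grid constraints into account.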
What struck me most was that the core of their application platform was an Akka/Scala platform. Here, the batteries would be modelled as actors, but more importantly, a node of the platform can fail without losing (the state of) an actor. The Akka platform provided part of the robustness they needed. A similar challenge that needed to be handled was the 'loss of connection to a battery for a longer period of time'. It is - of course - not acceptable to remotely discharge a battery without knowing its status. This situation was taken into account from the start, when designing the application. Sounds to me like a great thing to do, because adding these kinds of capabilities afterwards may quickly turn into a nightmare... Nice!
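The rule 'never discharge a battery whose status is unknown' could be sketched as a simple guard on the age of the last status report. The Python below is only an illustration under my own assumptions: `BatteryStatus`, `may_discharge` and the five-minute threshold are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Optional

MAX_STATUS_AGE_S = 300.0  # assumed threshold (5 minutes), not taken from the talk


@dataclass
class BatteryStatus:
    charge_kwh: float
    reported_at_s: float  # timestamp of the last status report, in seconds


def may_discharge(status: Optional[BatteryStatus], now_s: float,
                  max_age_s: float = MAX_STATUS_AGE_S) -> bool:
    """Allow a remote discharge only when a recent status report exists."""
    if status is None:
        return False  # never heard from this battery at all
    age_s = now_s - status.reported_at_s
    # Reject stale reports, reports from the future (clock skew) and empty batteries.
    return 0 <= age_s <= max_age_s and status.charge_kwh > 0
```

Making 'unknown' an explicit state that blocks the action, instead of silently reusing the last known value, is exactly the kind of decision that is cheap at design time and painful to retrofit.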
Kubernetes - the future
To most of us, Kubernetes is the Cloud application platform of choice. But I remember at QCon 2018, Ian Dobson from Container Solutions gave a talk where he illustrated with examples that Kubernetes was far from a mature platform at that time - and he was right. Now, 2 years later, there was a keynote by Katie Gamanji (American Express, and member of the CNCF Technical Oversight Committee), where she talked about where Kubernetes is going and how it is maturing. The approach that the CNCF is taking is to standardize various interfaces with Kubernetes to make it more flexible. Typical, early examples of these interfaces are the Container Runtime Interface - CRI - and the Container Network Interface - CNI. The CRI makes it possible to develop plugins for runtimes other than Docker, e.g. containerd and even AWS Firecracker. Likewise, CNI makes it possible to run Kubernetes on top of networking solutions like Flannel and Cilium. Other interfaces are the Service Mesh Interface (SMI), the Container Storage Interface (CSI) and the ClusterAPI (for cluster provisioning).
These developments ensure that Kubernetes can (1) focus on its core task of container orchestration and (2) be used on a wide variety of technologies. You only need the right plugins for the interfaces ...
So, where Ian Dobson had some legitimate concerns about Kubernetes in 2018, Katie Gamanji gave a convincing outline in 2020 that Kubernetes is maturing very quickly - and IMO in the right direction.
Bilgin Ibryam (Red Hat) approached the maturity of Kubernetes from a different angle. He focussed on 'how is Kubernetes application development evolving and where is it going?'. He outlined the monolith to microservices to serverless evolution and illustrated how various non-functionals are addressed. He grouped the 'distributed application needs' into 4 categories: lifecycle management (e.g. deployments), advanced networking needs (e.g. circuit breakers), resource binding (e.g. protocol adapters) and stateful abstractions (e.g. handling application state). His observation was that lifecycle management is covered by the Kubernetes platform. The advanced networking needs are addressed by Service Meshes and API Gateways. Knative supports serverless workloads and various resource bindings. The only thing that still needs to be addressed is the stateful abstractions. There, Bilgin referred to Microsoft Dapr: a 'platform for building distributed applications'. The Dapr platform offers various resource bindings, state stores, messaging infrastructures and monitoring/tracing options. Dapr focusses on making microservices development easy - and it is definitely on my todo list.
Some random observations:
- More and more software developers discuss infra
- Things move to the edge: a LOT of machine learning seems to move to the edge
- Streaming data: processing is moving to the edge, and there were many more stream processing examples in the conference talks
- There was a solid security track
- ... and a lack of testing, which I find strange, as it is one of the pillars for doing successful DevOps. Very strange...
Quotes ... with a smile:
And every now and then, speakers present you with a remarkable quote ...
- BBC doing data analysis of 'on demand series watchers': people first watch the last episode (outcome of data analysis, first thought to be a mistake)
- Belgium and Budapest: both cities in Europe - speaker Anjuan Simmons spoofing an American tourist ;-)
- I just compared REST to Stalin - because I can - Mark Rendle
- You Java people - whatever you use - I really have no idea - is it Spring? - but I love you ❤️ - Mark Rendle