The Cloud Native Computing Foundation’s first KubeCon + CloudNativeCon of the year took place in the Bella Center, Copenhagen. A giant greenhouse of a building with snaking industrial pipework and connecting concrete bridges, it's a vast container made of glass letting in light and a suitable setting for an industry that’s evolved rapidly from the release of Docker’s superstar container technology back in 2013.
Attendance has rocketed to 4,300, according to Dan Kohn, executive director of the CNCF, almost triple the attendance a year ago in Berlin. That’s not surprising, as the cloud native computing industry is meeting the business world’s demand for more scalable, agile applications and services that can be run across multiple geographical locations in distributed environments.
What’s impressive about the cloud native industry is that from a standing start roughly four years ago, it’s close to building an open cloud platform that it wants to share with the whole business world. It’s not quite there yet and needs a few more layers, but thanks to the foresight of the Linux Foundation in establishing the Cloud Native Computing Foundation (CNCF), the industry’s tottering steps were shepherded well.
The industry’s health wasn’t always such a given. Google’s David Aronchick recalls standing on a little stage presenting Kubernetes at the first CNCF event to just 50 to 100 developers.
Aronchick was the product manager on Kubernetes, the open source container orchestration system that has become a key component in cloud native computing’s growth.
At the Copenhagen event, Aronchick is presenting again but in a vast hall of thousands of engineers and developers and this time he’s updating everyone on Kubeflow, the hot toolkit for deploying open-source systems for Machine Learning at scale. Kubeflow is an example of open technology that is being built on top of Kubernetes and that was a key message at the event.
As chair of the CNCF’s Technical Oversight Committee, Alexis Richardson’s keynote was focused on the future. He thinks it will be packed full of developers. In his presentation he estimates that there will be 100 million developers by 2027 up from today’s 24 million.
The expectation is that we’ll see them all creating ubiquitous services on the cloud and devices. The vision then for the CNCF, and the community around it, is to build all the foundational layers to create an open cloud platform for developers to simply run their code at scale.
In a sense, it’s a future where everyone has the potential to have their own Tony Stark Iron Man lab, albeit from a software perspective, where code can be written and run on top of an agile infrastructure that abstracts away all the complexity and allows you to present your application to the world at large. The developer focuses on making the best application while the infrastructure deals securely with the demands.
The CNCF was set up and tasked with incubating the ‘building blocks’ required to make an open source cloud native ecosystem successful. You can see all the current incubated projects in CNCF’s new ‘interactive landscape’ (https://landscape.cncf.io/).
A perusal of the site’s interactive catalogue also gives an idea of the problems facing engineers and developers who have to decide which products to use, as there’s been an explosion of third-party technologies.
Kubernetes was the first project to be incubated by the CNCF. Donated by Google, it’s an open-source system for automating the deployment, scaling and management of containerised applications. The CNCF has many projects at the early sandbox or incubation stage in many critical areas, such as monitoring (Prometheus), logging (fluentd) and tracing for diagnosing problems (OpenTracing).
At the Copenhagen event, the CNCF highlighted Vitess and NATS as two of its recent incubation additions. Vitess was originally an internal project at YouTube and is a database clustering system that scales MySQL using Kubernetes. For example, it’s being used at Slack for a major MySQL infrastructure migration project. NATS is a more mature project that fills the gap for a cloud native open source messaging technology.
To understand the importance of Kubernetes we need to return to containers briefly. Containers, by design, use fewer resources than virtual machines (VMs) as they share an OS and run ‘closer to the metal’. For developers, the technology has enabled them to package, ship and run their applications in isolated containers that run virtually anywhere. When continuous integration/continuous delivery software (e.g. Jenkins) and practices are added into the mix, companies benefit from nimble and responsive automation that significantly speeds up development. For example, any changes that developers make to the source code will automatically trigger the creation, testing and deployment of a new container to staging and then into production.
The idea of a container allowing one process only to run inside it has also led on to microservices. This is where applications are broken down into their processes and placed inside a container, which makes a lot of sense in the enterprise world where greater efficiencies are constantly being sought.
However, this explosion of containerised apps has created the need for a way to manage or ‘orchestrate’ thousands of containers.
A number of container orchestration products have appeared. Some have been adapted for containers, such as Apache Mesos; some were created specifically for containers, such as Docker’s Swarm; and some were built for particular cloud providers, such as Amazon’s ECS. But just over a year after Docker sprinted out of the blocks, Kubernetes popped up. This offered a less complicated and more efficient way to manage clusters (groups of hosts running containers) that spanned hosts across public, private, or hybrid clouds and, most importantly, it was open source.
Kubernetes is essentially the culmination of the lessons learned by the Google engineers who developed Borg, an internal platform that used containers to run everything at the company. It’s also the technology behind its Google Cloud service.
“Three years ago Kubernetes was just getting started,” says Sheng Liang, CEO of Platform as a Service company Rancher Labs. “It wasn’t even clear what technology was going to take over. There was [Docker] Swarm, [Apache] Mesos, and Mesos was very mature back then, was very popular, so we built a container management product that back then was only one that was agnostic to the orchestration frameworks […] the end users were confused and to be honest so were we knowing what was going to be the standard.”
David Aronchick, who product-managed Kubernetes for Google, would probably agree: “Thinking back to those days of the original Kubernetes and Kubecon,” says Aronchick in his keynote. “It’s crazy to think about how many ways there were to run containers. Crontab, orchestrator, Bash (looking at you OpenShift on Bash), everything was bespoke. You ran it yourself and had to deal with everything yourself. But Kubernetes brought a transformation, because it gave everyone a common platform that they could trust, they knew what the APIs are and they could focus on the next level up and that really transformed the entire industry that we’re operating in.”
To say that Kubernetes has had quite a rapid rise is like saying NASA’s Saturn V rocket was quite powerful. Arguably, that rise has a lot to do with the quality engineering that Google offers and the evangelising efforts of community member, Kelsey Hightower.
In March this year, Kubernetes ‘graduated’ from CNCF’s incubation stage, which was an indication that Kubernetes was mature and “resilient enough to manage containers at scale across any industry in companies of all sizes,” according to Chris Aniszczyk, COO of CNCF.
Highlighting the scale of its use, JD.com, the largest retailer in China, has over 20,000 servers running Kubernetes and, Kohn says, the largest cluster has over 5,000 servers.
On the show floor of the Copenhagen event, it was clear that this stamp of maturity also came with a crown, as Kubernetes has clearly won the battle to be the container orchestrator of choice for developers and vendors alike.
That’s not to say that other products aren’t being used. Alex Nehaichik, a software engineer at Wargaming, the online gaming service that runs popular titles such as World of Tanks, told us they are still hedging their bets and using other products, including HashiCorp’s Vault (for secrets management) and Nomad.
But the reason he’s here is because they are looking into running some of their services on Kubernetes to see how it compares. That’s where a lot of companies are right now, shopping around, doing the research and looking at migration options.
But migration is a non-trivial process. Sarah Wells, Technical Director for Operations and Reliability at the Financial Times, described the FT’s migration as “changing horses in a roaring river” in her keynote. Wells explained how the FT moved from an existing containerised system, stepping it up to Kubernetes, which enabled them to go from 12 to 2,200 releases a year while running 150 microservices. It’s that speed of release that makes the move beneficial for big companies. “When you move from one change a week to many changes a day,” says CNCF’s Alexis Richardson, “you get a lot more confidence in how you work, and you can start doing things you didn’t dream of before so it empowers you to innovate.” (Sorry, not sorry, Kelsey.)
It’s also saved money for the FT. Wells says that while it was a risk and EC2 costs were higher while they ran old and new systems in parallel, the FT has seen an 80% reduction in EC2 costs since the migration. The new platform is also more stable: her team had only two nodes go down in the first month, rather than 17.
We asked Brandon Philips, CTO of CoreOS, who has been around this industry since the start, to explain why this shift has occurred so quickly. CoreOS was acquired by Red Hat to bolster OpenShift, its Platform as a Service.
Philips was at the event to talk about its new Operator Framework, which is another example of a new product making it easier to build against and extend Kubernetes for applications. Prior to Kubernetes and containerisation, Philips says “You got a whiteboard and drew out your thing: here’s the web server and here’s the database. After that you’d write a bunch of Bash scripts, source some Linux packages and wire stuff together and the thing that you’ve drawn on the whiteboard no longer exists; it’s translated into a bunch of scripts and recipes that you’ve followed and that gets modified over time.”
However, it’s now possible to translate that diagram directly into an API: “You say this is going to be a deployment, this is a service and I’m going to tie them together with this metadata and you tell Kubernetes this is what I want and the system just makes it happen,” says Philips. “This is quite a shift for businesses, because, back in the day, you’d say I want a VM and you’d be given your SSH credentials […] but now you just deploy the app and the app appears.”
“This is the shift that has caused cloud to be so popular,” says CoreOS’ CTO, “because developers are empowered. The big reason that this thing,” Philips told us, pointing around at the bustling show floor at KubeCon, “is taking off so quickly is bringing that to open source and bringing it in a way that people can design an application to be API-driven as well. The cloud only said, here are the nouns that are API driven: databases, caches, load-balancers. With Kubernetes it’s anything that you find important to your business.”
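To make the idea of tying a deployment and a service together "with this metadata" a little more concrete, here is a minimal Python sketch of label matching. It is a simplified model rather than Kubernetes code: the pod names, labels and the `matches` and `endpoints` helpers are all made up for illustration.

```python
# Simplified, hypothetical model of how a Service finds the pods created by a
# Deployment: the Service's selector is compared against each pod's labels.

def matches(selector, labels):
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(selector, pods):
    """Return the names of the pods that the selector picks out."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Pods as a Deployment might have stamped them out (names are invented).
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]

# The Service only states which labels it cares about.
service_selector = {"app": "web"}

print(endpoints(service_selector, pods))
```

The point of the sketch is that nothing is wired together by hand: the service declares what it wants, and whichever pods carry the right labels become its backends.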
As an example of Kubernetes’ pervasiveness, Rancher Labs was demonstrating its new Rancher 2.0 enterprise platform, which CEO Sheng Liang says “is 100% built on Kubernetes now”. Going forward, he and many other vendors expect Kubernetes to become entrenched as infrastructure: “We will worry less and less about it,” says Liang. “And be interested in building stuff on top.”
Liang believes that Kubernetes is going to be so successful that all infrastructure providers, such as Google Cloud, Amazon Cloud, Azure Cloud, even VMware, will support Kubernetes out of the box: “I think the point has already come, at least for the clouds. All the major clouds have announced a shift of support for Kubernetes as a Service. Amazon hasn’t publicly released it yet, but they’ve announced that they’re adding it in private beta. They announced it last November as the EKS service.”
To ram that message home the CNCF has also announced a new Kubernetes for Developers course and certified exam.
According to Dan Kohn, executive director at the CNCF, there are now 55 Kubernetes distributions and implementations. Gaining better observability into Kubernetes was a key issue last year. Prometheus, which is used for monitoring, is currently being assessed, Kohn says, to see whether it’s ready to join Kubernetes in graduated status, while fluentd, used for logging, is the next likely candidate after that.
Better interfaces, better security
As often seems to be the case in cloud native computing, disaggregation in pursuit of performance gains tends to lead to more complex issues to solve initially. When dealing with microservices, for instance, connecting them together so that they offer the functionality of the previous monolithic system has had its challenges.
However, the CNCF has tackled these routing issues by pulling in a number of projects for incubation. Linkerd and Envoy (originally an internal project at Lyft), for instance, both offer ‘service mesh’ functionality: a proxy that sits between microservices and routes their requests.
The CNCF also supports a universal RPC framework for Kubernetes pod communication called gRPC and a DNS and service discovery tool called CoreDNS, which manages how processes and services in a cluster can find and talk to one another.
This year, the CNCF is moving on to other challenges. Although Kubernetes abstracts away a lot of the complexity of managing containers at scale, it still needs to integrate with services such as networking, storage and security to supply a comprehensive container infrastructure.
Alexis Richardson, Chair of the TOC at CNCF, says that the priorities are better interfaces, storage, security and easy on-ramps for developers.
Probably one of the most popular on-ramps is Helm, a package manager. This is another CNCF-supported project, one that helps to simplify running applications and services in a Kubernetes cluster for developers. Helm uses a ‘chart’ format which holds a collection of files detailing the resources needed for a particular application or service to run inside a Kubernetes cluster.
In regard to improving interfaces, the CNCF is focused on creating an open standard for companies to use, which is why it’s spinning out OpenMetrics from Prometheus, the open source monitoring system. Richardson says they want to evolve the exposition formats from Prometheus which are used to expose metrics to Prometheus servers “and standardise it so anyone can do it for other projects as well.”
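For readers who haven't seen it, the Prometheus text exposition format is simple enough to sketch in a few lines of Python. The `render_metric` helper below is our own illustration, not part of any Prometheus client library, and the metric name and values are invented. It also skips details such as escaping special characters in label values.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs. Label values are
    assumed not to need escaping, which a real exporter would handle.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Labels are written as key="value" pairs inside curly braces.
            pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


print(render_metric(
    "http_requests_total",
    "Total HTTP requests.",
    "counter",
    [({"method": "get", "code": "200"}, 1027), ({"method": "post", "code": "500"}, 3)],
))
```

A Prometheus server scrapes plain text like this over HTTP, which is part of why the format is a plausible candidate for standardisation across other projects.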
Additionally, the CNCF is working hard on standardising the way that events are described by creating consistent metadata attributes in a common specification called OpenEvents (although it appears it may now be called CloudEvents). Events are important because they provide valuable data about actions to businesses, on the developer side (e.g. indicating new commits for auto-testing) and on the customer-facing side (e.g. customer activities like creating a new account).
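As a rough illustration of what "consistent metadata attributes" means in practice, the Python sketch below wraps two very different events in the same envelope. The attribute names are borrowed from the later CloudEvents 1.0 specification and are an assumption on our part, since the specification was still in flux at the time of the event. The event types, sources and payloads are invented.

```python
import json
import uuid
from datetime import datetime, timezone


def wrap_event(event_type, source, data):
    """Wrap a payload in a CloudEvents-style envelope (assumed 1.0 attribute names)."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),  # unique per event
        "source": source,         # where the event came from
        "type": event_type,       # what kind of event it is
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,             # the event-specific payload
    }


# A developer-side event and a customer-facing event share one shape.
commit_event = wrap_event("com.example.repo.push", "/repos/example/app", {"commit": "abc123"})
signup_event = wrap_event("com.example.account.created", "/web/signup", {"plan": "free"})

print(json.dumps(commit_event, indent=2))
```

Because the envelope is the same in both cases, a consumer can route or filter on `type` and `source` without knowing anything about the payload inside.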
The CNCF’s work on open standards is steadily bearing fruit and has enabled cloud providers, for example, to improve their own interfaces and monitoring systems. Google Cloud, for instance, released Stackdriver Kubernetes Monitoring. Google’s Craig Box explained that this “ingests Prometheus data” and pulls it together with metrics, logs, events and metadata from your Kubernetes environment to give developers more oversight of their clusters, site reliability engineers a centralised place for maintenance and security engineers all the auditing data they need.
Not surprisingly, security was a hot topic in Copenhagen. From the CNCF’s perspective, Richardson highlighted a few foundation-hosted projects, such as the Secure Production Identity Framework for Everyone (SPIFFE) project, which offers container authentication and end-to-end encryption for untrusted networks, and Open Policy Agent (OPA), which handles the policy and authorisation side of things.
Addressing the security issues, Brandon Philips, CTO of CoreOS at Red Hat, says there are essentially three pillars of security: “The first is just security of the infrastructure software. In regard to Red Hat that’s something that CoreOS focuses on. So making sure that the operating system, container runtime and the Kubernetes API server and all this stuff stays up to date and secure. That’s just about making automation happen around all those pieces.”
Philips says for a long time people have actually been very bad at this: “They would forget to run apt-get update and upgrade. So the thesis of the CoreOS company was: we’re going to secure stuff by automating that basic operational cleanliness of making sure updates can apply. That’s one pillar of security. This is where companies essentially just ignore the problem, and then they eventually get hacked.”
The second pillar is application security. This is where containers have a very particular advantage, says Philips: “One of the problems with VMs – we have customers that used to have this problem – people would request VMs or file a ticket to get a VM that would show up and then IT would have no idea what happens after that; it’s just this black box. And you end up caring for inventory of hundreds of VMs or thousands of VMs. You have no idea what’s going on inside of them. But there’s probably software that’s getting out of date, middleware software that’s getting out of date.”
Philips says that containers supply more transparency about what’s inside: “You’re able to say, ‘Here’s some metadata about the container. I’m going to introspect that container and dig through what JAR files exist.’” This is how something like the Equifax hack happens, he told us, “because you’re not paying attention to what is actually in your application, because you have no idea. This is really nobody’s fault except for the application developer and he’s never been a security expert.”
The third pillar is application infrastructure security: “This is network policies, and making sure the application can’t talk to this application, or that secrets get injected. So like database connection strings and so on.” Kubernetes essentially provides APIs for that, says Philips: “And then those APIs can be managed by the person in charge of the app, but they can also have overrides above that, where the infrastructure people can say, ‘Actually, you can’t talk to anybody outside of your application. You can’t talk to our super-secret secure database. You can’t talk to the HR database. You can only talk inside of this particular set of application pieces.’”
“CoreOS is always trying to productise this, and then the application security stuff is a knock-on effect. We’ve added security scanning to containers and bubble up metadata that is actionable. So sending you an email, like, ‘You have vulnerable software in the container image. Maybe you should not be the next Equifax.’”
Outside of the three current pillars, there are the emerging security vendors, says Philips. “And Kubernetes is starting to build in stuff to make it possible for the compliance officers inside of these companies to do their part of the job; make sure that application developer mistakes don’t turn into organisational mistakes.”
An example of the mistakes that could occur was vividly demonstrated by Liz Rice, software engineer and technology evangelist for Aqua Security, in her keynote. Her main point was not that containers are wide open, but rather that the default settings can create unforeseen opportunities. For instance, most containers are run as root. According to MicroBadger, the project that enables you to inspect Dockerfiles hosted on Docker Hub, 86% don’t have a USER line and are therefore running as root by default. This can be fixed by making changes to the Docker image itself so it runs as non-root. She demonstrated this with an NGINX Dockerfile by binding to a different port and changing file permissions and ownerships.
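The kind of check behind that 86% figure can be approximated in a few lines of Python. This is a rough sketch, not MicroBadger's method: the `effective_user` and `runs_as_root` helpers and both sample Dockerfiles are hypothetical, and the code assumes that a base image defaults to root unless the Dockerfile says otherwise.

```python
def effective_user(dockerfile_text):
    """Return the user set by the last USER instruction in the final build stage.

    Assumption: every FROM starts again as root, because the base image's
    own default user cannot be known from the Dockerfile alone.
    """
    user = "root"
    for line in dockerfile_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        instruction, argument = parts[0].upper(), parts[1].strip()
        if instruction == "FROM":
            user = "root"      # a new build stage resets the user
        elif instruction == "USER":
            user = argument    # later USER lines override earlier ones
    return user


def runs_as_root(dockerfile_text):
    """True when the image would start its process as root (or UID 0)."""
    name = effective_user(dockerfile_text).split(":")[0]
    return name in ("root", "0")


# Invented examples: one with no USER line, one adjusted to run as non-root.
default_dockerfile = "FROM nginx\nCOPY site /usr/share/nginx/html\n"
hardened_dockerfile = (
    "FROM nginx\n"
    "RUN chown -R nginx:nginx /var/cache/nginx\n"
    "USER nginx\n"
    "EXPOSE 8080\n"
)

print(runs_as_root(default_dockerfile), runs_as_root(hardened_dockerfile))
```

The hardened example also hints at why Rice had to change the port and file ownerships: a non-root process cannot bind to low-numbered ports or write to root-owned directories by default.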
Running containers as root isn’t necessarily an issue, but as Rice says: “You might not think that anything is going to happen, but nobody thought Meltdown or Spectre was going to happen, right?” If a future vulnerability enables an attacker to escape a container with root then they can do what they like on the host machine, which is an unnecessary risk.
Rice also went on to demonstrate that there’s nothing to stop someone from mounting the host’s root directory so it’s available in a container. It’s not a smart move, she admits, but at this low level it’s the fact that it’s available at all that’s the issue. This enabled Rice to change entries in the manifest to create a pod for mining crypto-currency, all without a service account or credentials of any kind.
Rice says there’s work in progress to support rootless containers and user namespaces, but as you’d expect from someone working for a commercial security company, she did say that there are plenty of extra paid-for measures for auditing containers during build and runtime.
In a different approach, Google’s Craig Box announced that the company was open-sourcing gVisor, a sandboxed container environment. Companies are looking to run heterogeneous (mixed CPUs and GPUs) and less trusted workloads and this new type of container appeals to that as it’s designed to provide a secure isolation boundary between the host OS and the application running inside the container.
Box says that gVisor is used for “intercepting application system calls and acting as a guest kernel all while running in userspace.” He demonstrated this on a VM that was vulnerable to the Dirty COW exploit, where an attacker had managed to change the password file in a container. “The exploit is causing a race condition in the kernel,” Box explained, “by alternating very quickly between two system calls and that will eventually give it access.” However, even though the container had the correct permissions to make the system calls, you could see that runsc, the gVisor runtime, had stopped them and the exploit hadn’t worked.
There wasn’t much elaboration on what better storage would entail from Alexis Richardson during his future-gazing keynote for the Technical Oversight Committee, except to say that the CNCF “weren’t done until it can feed storage into the platform.”
Michael Ferranti, VP of product marketing at Portworx, a company specialising in persistent storage for containers, sees storage as the vital missing piece of the cloud native puzzle.
The community may be excited about transforming enterprise IT from a VMware-based virtual machine model to a container model, but the people sitting on the boards of global enterprises don’t care about that: “What they care about is getting faster to market with applications,” Ferranti explained. “I need to make sure that my data is secure [is what they will say]. I don’t want to read about my company in a data breach in the Wall Street Journal. I need to make sure that wherever my users are they can always access my application. What containers and microservices enable is solving all of those problems.”
But according to Ferranti, quoting Gartner, “90% of enterprise applications are stateful, they have data – it’s your database, your transaction processes. So if you can’t solve the data problem for those types of applications and for containers, you’re only talking about 10% of the total deployable applications in an enterprise that can actually move to containers. Now that’s not a transformation; that’s an incremental add-on.”
The problem with storage is that data has gravity: moving petabytes of data from one location to another takes a lot of time. It also exposes data to risks during transport and because it’s hard to move, you tend to run your application in one location. Ferranti says this is what happened to Amazon essentially: it had a lot of problems with its east region at one particular point in time, so lots of people had outages because they were dependent on that region.
Ferranti says that Portworx makes it possible to run applications, including mission critical data, in multiple clouds and hybrid clouds between environments, which means you can have a copy in one location as your production system and a disaster recovery site in another place. It seems to be doing well from its early adoption of containers too, picking up business from corporate giants such as Comcast, T-Mobile and Verizon.
However, the issue, or one of them at least, that the CNCF has is that persistent storage systems have typically existed outside of cloud native environments, creating the potential for vendor lock-in from provider-managed services. Although Alexis Richardson didn’t mention it in his keynote, he was likely thinking of Rook, the distributed storage orchestrator, as a major part of the solution.
Rook was given an early, inception stage status by CNCF in January of this year and the CNCF has indicated that Rook is focused on “turning existing battle-tested storage systems, such as Ceph, into a set of cloud-native services that run seamlessly on-top of Kubernetes.”
Ceph is a distributed storage platform with one particularly significant characteristic: as more units are added to the system, its aggregate capability, in terms of transactions, data in and out, IOPS (input/output operations per second) and bandwidth, continues to expand.
In December of last year, Allen Samuels, advisory board member for Ceph, said that the community is deeply involved in a redesign of the lowest level interfaces of Ceph. This will remove it from being on top of the filesystem. So instead of using a native filesystem, it’s going to use a storage block and manage that itself. As Rook is seeking to provide file, block, and object storage services that feed into Kubernetes, that makes a lot of sense.
Serverless & new developments
Other interesting developments of note from KubeCon + CloudNativeCon included the announcement of Ballerina, a cloud native programming language backed by WSO2 that is designed to make it really easy to write integration services, and a growing interest in serverless, which now has a working group that’s been involved in OpenEvents (now called CloudEvents), the previously mentioned project standardising event specifications.
Serverless continues the disaggregation of applications in the enterprise world and a number of companies are attempting to make it more approachable, in particular Serverless, Inc, whose founder and CEO, Austen Collins, ran a few talks on the subject.
He describes serverless as two things: functions and events. The rationale is that events happen all the time. Everything emits an event, for example, when you upload a file to an S3 bucket in the cloud. These events will often be things that you want to react to because they are important to your business, but you don’t want to have something that’s always ready, idling away in the background costing you money. This is where serverless functions become relevant: they can replace certain aspects of microservices, typically the ones that receive an event from somewhere in the cloud.
Omri Harel, senior software developer at Iguazio, the company behind an open source serverless framework called Nuclio, told us that functions can also be used in a traditional CI/CD pipeline: “Let’s say a pull request gets opened on GitHub and GitHub fires off a web hook and that can arrive at some service located in the cloud or alternatively it could arrive at a serverless function.”
The way that Harel and many serverless companies explain the technology focuses on a common theme in cloud native: making life easier for developers. However, the name itself is a misnomer: “It’s not that there isn’t a server,” Harel explains. “It’s that you don’t feel one. You just don’t think about it.” One key aspect of serverless is that you don’t have to think about provisioning servers, how they communicate, dependency issues and so on, says Harel: “What serverless frameworks aim to allow you to do as a developer is to not worry about that, simply write code, write a function, tell the framework about that function and deploy it. Where? You don’t care. How many? You don’t care either. Most serverless frameworks offer some sort of auto-scaling so if your function is invoked many times then its number of instances will scale up appropriately. If it’s not invoked at all, then it has no instances running and then that also means you don’t pay. You never pay for idle when you use serverless.” All the major cloud providers have their own cloud functions systems: for example, Amazon has Lambda and Google has Google Cloud Functions. If you are looking for an open source framework that you can deploy on a Kubernetes cluster that you own and operate, then you benefit from the autoscaling that Kubernetes has itself.
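The scale-up and scale-to-zero behaviour Harel describes boils down to a small calculation. The Python sketch below is one plausible way to express it, not how Nuclio or any particular framework does it. The function name and its parameters (requests in flight, how many requests one instance can handle at once, and an upper limit) are assumptions made for illustration.

```python
import math


def desired_instances(in_flight_requests, per_instance_concurrency, max_instances):
    """Decide how many function instances should be running right now."""
    if in_flight_requests <= 0:
        return 0  # nothing invoked: no instances, so nothing to pay for
    needed = math.ceil(in_flight_requests / per_instance_concurrency)
    return min(needed, max_instances)  # scale up with demand, within a cap


# Idle, a trickle of traffic, a busy period and a spike beyond the cap.
for load in (0, 1, 25, 1000):
    print(load, desired_instances(load, per_instance_concurrency=10, max_instances=50))
```

The first case is the one that matters for the bill: with no invocations the answer is zero instances, which is what "you never pay for idle" means in practice.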
While many at KubeCon were trying to promote a definition of serverless to interested parties, Kelsey Hightower, software advocate at Google and high profile figure in the community, used his keynote to suggest that it might be a little too early to limit this event-driven architecture. “Typically,” Hightower says, “if you listen to people they will tell you something happens in the cloud and it calls their function.” He felt this was constraining it to a platform or two, when serverless events could be democratised by standardising the wrapping and transport of events using CloudEvents so they could flow through any system you want to use.
He demonstrated this by running a Hello World demo, where he translated text from English to Danish in Amazon S3, which is a common serverless example. This would normally all occur in the cloud, but he devised a way to do it from on-premises and have S3 call his function. Without diving too deeply into the details, which you can watch here (https://www.youtube.com/watch?v=_1-5YFfJCqM), whenever Amazon S3 received anything in a specific bucket, it was told to send the event to the IP address of a special open source broker called the event gateway (by Serverless). Next, the gateway wrapped the event in the CloudEvents structure and passed it along to Hightower’s application. Finally, he used the normal libraries that are available in Lambda or any cloud provider to process the event.
There were some notification and authentication issues to overcome, but he was able to demonstrate that once the event hit the gateway he could, at that point, decide who got the event next, which could be on-premises, his laptop, a function or even a container. The point was that serverless has huge potential and the industry shouldn’t be too hasty to scope down what it was for.
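The routing step, where the gateway decides who gets the event next, can be modelled with a very small in-memory stand-in. The `Gateway` class below is hypothetical and is not the API of the real Event Gateway. The event type, subscriber names and handlers are invented to mirror the laptop and on-premises destinations from the demo.

```python
class Gateway:
    """Hypothetical in-memory stand-in for an event gateway."""

    def __init__(self):
        self.routes = {}  # event type -> list of (subscriber name, handler)

    def subscribe(self, event_type, name, handler):
        """Register a destination for one type of event."""
        self.routes.setdefault(event_type, []).append((name, handler))

    def emit(self, event):
        """Deliver the event to every subscriber of its type and collect replies."""
        return {name: handler(event) for name, handler in self.routes.get(event["type"], [])}


gateway = Gateway()
gateway.subscribe("object.created", "laptop", lambda e: f"laptop saw {e['data']['key']}")
gateway.subscribe("object.created", "on-prem", lambda e: f"on-prem saw {e['data']['key']}")

# An event shaped loosely like the bucket notification in the demo.
event = {"type": "object.created", "source": "bucket/uploads", "data": {"key": "hello.txt"}}
print(gateway.emit(event))
```

Because delivery is keyed on the event's type rather than on where the event originated, swapping a cloud function for a laptop or a container is just a change of subscriber.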
There’s a lot of excitement around Kubernetes, containerisation and the growth of cloud native, but as Rancher’s Sheng Liang commented in our interview with him: “Container orchestration hasn’t quite become mainstream yet. It’s probably mainstream within web companies, but it’s getting to more traditional banks and insurance companies, I mean, they are all talking about it, but if you objectively measure the workload that they have working on Kubernetes, it’s probably still quite low.”
As we stopped in at the Rancher stand to say goodbye to Liang before heading for the airport, he reminded us that cloud native is still only a market in the tens of millions, which in the enterprise world is pocket change.
But in the future that figure is going to grow significantly.
To quote a recent remark from Satya Nadella, CEO of Microsoft, who, in turn, was quoting Mark Weiser, chief scientist at Xerox PARC and father of ubiquitous computing: “The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.” Cloud native has the breathtaking ambition of hiding complex infrastructures and systems to empower the daily lives of developers. If it succeeds, which seems likely, it will enable them to release ubiquitous services on a global scale that will, in turn, change everyone’s daily lives, and that explosion of creativity will be powered by open source software.