Hello everyone, and welcome to our new blog series covering one of the most important areas of innovation for modern computing: containerization.
You probably have a lot of questions: What are containers and how do they work? What are Docker, Kubernetes, Google Container Engine, and Managed VMs? How do these relate to each other? How can I build powerful applications with containers and get them to production scale? How do I extract business value from this shift? Well, ask no further. Consider this an introduction to container technology and what its benefits could mean for you.
As computing models have evolved, we’ve seen a few pivotal moments where a clear shift in approach occurred. Glossing over ancient history, in the last 10 years we’ve seen one such change in the form of virtualization. The move to virtualization brought an enormous increase in overall resource utilization, along with corresponding reductions in time-to-value and in the repetitive work required to deliver services. Multi-tenancy, API-based administration, and the advent of public clouds further increased these benefits. The most significant breakthrough was making resources available on demand, in minutes, at a granularity as small as a single core of a physical computer. But why virtualize an entire machine when you only need a tiny part of one?
Google confronted this problem early. Driven by the need to develop software faster and more cheaply, and to operate at a scale never seen before, we created a higher level of abstraction enabling finer-grained control. We developed an addition to the Linux kernel called cgroups, which we used to build an isolated execution context called a container: a kind of virtualized, simplified OS which we used to power all of Google’s applications. Fast forward a few years, and now the fine folks at Docker have leveraged this technology to create an interoperable format for containerized applications.
Why containers?
So what does a container provide that a VM does not?
Simple deployment: By packaging your application as a singularly addressable, registry-stored component that deploys with a single command, a container radically simplifies the deployment of your app no matter where you’re deploying it (see the sketch after this list).
Rapid availability: By abstracting just the OS rather than the whole physical computer, this package can “boot” in ~1/20th of a second compared to a minute or so for a modern VM.
Leverage microservices: Containers allow developers and operators to further subdivide compute resources. If a micro VM instance seems like overkill for your app, or if scaling an entire VM at a time seems like a big step function, containers will make a big, positive impact on your systems.
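To make “one-command-line deployable” concrete, here’s a minimal sketch of pulling and running a registry-stored container with Docker. The image name my-org/my-app is hypothetical; any image pushed to a Docker registry works the same way.

```shell
# Fetch the packaged application from a registry, then run it in the
# background with its port mapped; the image name is hypothetical.
docker pull my-org/my-app:1.0
docker run -d -p 8080:8080 my-org/my-app:1.0
```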
What are the implications of some of these advantages?
An obvious one is that a developer has on their laptop plenty of compute power to run multiple containers, making for easier and faster development. While it is certainly possible to run several virtual machines on a laptop, it’s far from fast, easy, or lightweight.
Similarly, release administration is easier: pushing a new version of a container is a single command. Testing is cheaper as well: in a public cloud where a VM is billed for a minimum of 10 minutes of compute time (or, gasp, a whole hour?!), a single test that runs for a few seconds still costs you that full minimum, and if you’re running thousands of programmatically driven tests per day, this adds up quickly. With containers, you could run thousands of simple tests on the same VM at the same cost, amounting to large savings for your production applications.
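As a rough sketch of what that looks like in practice, each run below starts a fresh, disposable container in a fraction of a second, so thousands of runs fit inside a single VM’s minimum billing window. The image and test script names are hypothetical.

```shell
# Run the suite in a clean container each time; --rm discards the
# container when the tests finish. Image and script names are illustrative.
for i in $(seq 1 1000); do
  docker run --rm my-org/my-app:1.0 ./run_tests.sh
done
```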
Another implication is the composability of application systems using this model, especially with applications using open source software. While it might be a daunting systems administration (not to mention pronunciation) task for a developer to install and configure MySQL, memcached, MongoDB, Hadoop, GlusterFS, RabbitMQ, node.js, nginx, etc. together on a single box to provide a platform for their application, it is much easier and vastly lower risk to start a few containers housing these applications with some very compact scripting. Consider the amount of error-prone, specialized, boilerplate work this model eliminates. Add to this public registries of canonical implementations for these types of core building blocks, and you have the beginnings of a real ecosystem for quality components.
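As an example of that “very compact scripting” (a sketch using the canonical public images on Docker Hub; the container names and the placeholder password are illustrative), a handful of commands stands up a database, a cache, a message queue, and a web server with no per-machine installation:

```shell
# Start each core building block from its public, canonical image;
# the names db, cache, queue, and web are illustrative.
docker run -d --name db -e MYSQL_ROOT_PASSWORD=secret mysql
docker run -d --name cache memcached
docker run -d --name queue rabbitmq
docker run -d --name web -p 80:80 nginx
```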
In the long term, the most important value this technology promises is a portable, consistent format for running applications on diverse hosts. If you’re building applications today you have access to “bare-metal” and virtualized on-premises infrastructure, public and private clouds, as well as a multitude of available PaaS options. You might have as many as six different ways to package and deploy software! By standardizing on a container format, providers in each of these models can deliver a uniform experience, allowing workloads to easily flow to where they’re cheapest and fastest to run, while avoiding lock-in with a single platform provider.
Docker
There are plenty of detailed introductions to containers and the Docker container implementation specifically; suffice it to say that this approach is “A Good Idea”. That said, it’s not one without challenges. While the increased granularity of containers provides marvelous benefits, it doesn’t substantially increase the efficiency of any running workload, and some workloads require thousands of computers to run the application. Docker today is really only designed to operate on a single computer. How can containers and the workloads they host be coordinated, distributed, and managed as they consume infrastructure resources? How can they operate in multi-tenant network environments? How are they secured?
Perhaps an even more fundamental question from a system design standpoint: are we even talking about the right abstraction? Most developers and their sponsors that I talk to aren’t interested in this specific container on this specific computer — they want their service, the application itself, to be up, running, delivering value, monitored and maintained. And they want all of that irrespective of the trivial (oh, the wishful thinking) details like which containers on which computers are doing what.
Kubernetes
Google has iteratively tackled this problem, building a cluster management, networking, and naming system (the first version was called Borg; its successor, Omega) to allow container technology to operate at Google scale. We now start over two billion containers per week: more than three thousand every second. We’ve recently applied our operational experience and technical depth around containers to bring Kubernetes (sometimes shortened to “K8s” on public forums) to the world as an open source project.
Kubernetes provides a layer of abstraction that allows developers and administrators to work collectively on improving the behavior and performance of the desired service, rather than on any of its individual component containers or infrastructure resources.
What does a Kubernetes cluster provide that a single container does not? By refocusing control at the service level rather than the container level, Kubernetes allows for intelligent, active management of the service as a whole. A service which scales, self-heals, and is easily updated is another of those “Good Ideas”. For example, at Google we apply machine learning techniques to ensure that a given service is operating at top efficiency.
If containers have empowered individual developers by reducing the drudgery of deployment, Kubernetes empowers teams by minimizing the quagmire of coordination. Kubernetes allows teams to compose services out of lots of containers, and to have those containers follow rules upon deployment to ensure proper operation. It’s remarkably easy for one part of a service to screw up another! With Kubernetes, these conflicts are systematically avoided. At Google, this kind of improved coordination has enhanced developer productivity, service availability, and enabled real agility at scale.
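As a concrete sketch of that service-level control, the commands below declare a desired state (three replicas of a service behind one stable endpoint) and let the cluster keep it true. This uses the kubectl command-line tool; the name my-app and the image are hypothetical, and exact commands vary by Kubernetes version.

```shell
# Declare the service: Kubernetes keeps three replicas running,
# replacing any container that fails. Names and image are illustrative.
kubectl create deployment my-app --image=my-org/my-app:1.0
kubectl scale deployment my-app --replicas=3
kubectl expose deployment my-app --port=80 --target-port=8080
```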
While we’re still in the early days, Kubernetes has already seen substantial adoption, both by customers and by a broad group of companies. Red Hat, VMware, CoreOS, and Mesosphere are among those eager to help customers extract the business benefits of container technology delivered at scale by Kubernetes.
Container Engine
Google Container Engine is “containers as a service” hosted on Google Cloud Platform. Based on Kubernetes, Container Engine provides the means for developers to get up and running quickly with containers, and to deploy, manage and scale within the boundaries they set. We’ll be drilling into Container Engine in more detail in the upcoming series of posts.
Deployment options
We see containerization as the beginning of a big shift in computing models, and we plan to play a major role in this revolution. As you come to grips with containers, investigate the deployment options below and find the best fit for your requirements.
Want to run a managed cluster of computers for running dozens of containers? Take Google Container Engine for a spin (see the sketch after this list).
Want to build your own clusters, either on public infrastructure or on your own systems? Give Kubernetes a shot.
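For the Container Engine option above, here’s a minimal sketch of creating a managed cluster with the gcloud command-line tool, assuming it is installed and authenticated; the cluster name and node count are illustrative, and flags vary by version.

```shell
# Create a small managed Kubernetes cluster, then point kubectl at it.
# Cluster name and node count are illustrative.
gcloud container clusters create my-cluster --num-nodes=3
gcloud container clusters get-credentials my-cluster
```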
Above all else, we’re extremely interested in your experiences, needs, and ideas (even your pull requests!) around this new area of innovation, so please don’t hesitate to reach out to me. We’re participating in as many conferences and meetups as we can, and we’re eager to connect with you about how container technology can make a big impact on your goals and ambitions for 2015 and beyond. I’m looking forward to talking to you about it!