Why We Need to Rethink Authorization for Cloud Native
Companies have moved to cloud native software development so that they can increase development speed, improve product personalization, and differentiate their buyer experiences in order to innovate and win more customers. In doing so, enterprises have also redefined how they build and run software at a fundamental level. For instance, modern cloud native applications may now be composed of dozens of microservices, housed in containers and hosted on immutable, dynamically scaling platforms like Kubernetes.
Because of these radical changes, the ways that companies implement fundamental app requirements — security, compliance and operations — have changed dramatically. The basic requirements have remained largely the same: even in the cloud, you don’t want to run random binaries you downloaded from the internet in production, let apps that process credit card information talk to apps that don’t, or let data go unencrypted or unprotected. Meanwhile, you still need to maintain four nines (99.99%) of availability for application SLAs.
But how companies actually enforce the policies that underpin those requirements has changed dramatically compared to the pre-cloud era. For instance, in the cloud, if you want to control which actions are allowed to be executed, that decision today depends on a host of contextual factors: which person or machine is taking the action, what resource the action is impacting, what data would be released within that resource, what time of day or location the action is happening, the load the action will place on the software, the IT person who is currently on-call, and much more. For these reasons, and given the general power and complexity of cloud native environments, companies have had to completely rethink how they make and enforce such policy-based decisions.
Why cloud native has reshaped security and authorization
To understand why authorization (the policies governing which people and which machines can do what in your environment) has changed in the cloud, it’s critical to understand the fundamental building blocks that make up cloud native environments and how authorization shapes and links them. In general, there are three such components:
- CI/CD pipelines that are responsible for assembling and deploying chunks of code.
- Platforms, like Kubernetes, that are responsible for running that code.
- Applications that are comprised of those chunks of code and are running on those platforms.
Each of those cloud native components must address authorization to solve a broad range of security, compliance and operational problems — issues that are standard fare for every development team. But in containerized, cloud native applications, companies must approach authorization differently compared to the pre-cloud era.
The new world of application-level authorization
Take, for instance, application-level authorization. In the cloud, many applications have moved to microservices architectures, meaning that each application is composed of dozens or hundreds of individual services — each of which has its own set of APIs. This means that security teams can no longer rely on authorization at a single point, like a gateway or firewall that sits in front of a monolithic application. Instead, policy must be enforced between all the various services, to address the substantially larger attack surface created by all of those APIs. At the same time, applications are also now deployed across any number of clusters, as well as across multiple clouds — which means that IT teams also cannot rely on a single, static app environment for security. In addition, with many more people now requiring access to production environments — including developers, IT, security, platform teams, and support engineers — the rules around privileged access can no longer be managed in hard-coded Active Directory groups or coarse static roles.
And just as always, authorization must be enforced at every layer of the application stack: microservices, gateways, the frontend, databases and more. Even a simple policy, like preventing contractors from accessing customer PII, requires multiple enforcement points. In order for the app itself to allow the right access, display the right information and provide the appropriate UI experience to a contractor (for example), the frontend, backend and database all must be policy-aware.
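As a rough illustration of what "policy-aware" could mean here, the sketch below defines the contractor rule once and consults it from three hypothetical enforcement points. The role names, the set of PII fields and the helper functions are all assumptions made for illustration; they are not taken from any particular system.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical set of fields treated as customer PII in this sketch.
PII_FIELDS = frozenset({"email", "phone", "address"})


@dataclass(frozen=True)
class User:
    name: str
    role: str  # illustrative roles: "employee" or "contractor"


def may_view_pii(user: User) -> bool:
    """The single policy decision: contractors may not see customer PII."""
    return user.role != "contractor"


def database_columns(user: User, columns: list[str]) -> list[str]:
    """Database layer: only select the columns the user may read."""
    if may_view_pii(user):
        return list(columns)
    return [c for c in columns if c not in PII_FIELDS]


def backend_response(user: User, record: dict) -> dict:
    """Backend layer: strip PII from the API response as a second check."""
    if may_view_pii(user):
        return dict(record)
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def frontend_widgets(user: User) -> list[str]:
    """Frontend layer: hide UI elements that would display PII."""
    widgets = ["order_history"]
    if may_view_pii(user):
        widgets.append("contact_details")
    return widgets
```

Each layer asks the same question (`may_view_pii`) but enforces the answer in its own way, which is why a single rule still needs several enforcement points.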
Authorization has always been a critical part of applications. Indeed, teams have typically built bespoke authorization either directly into apps, or as a centralized system that various applications must interface with for access control. The challenge in the cloud is that authorization policy has become substantially more difficult to create and enforce in a consistent way across all the components of microservices applications. One reason for this is that microservices (and policies) are often written in many different programming languages, eliminating the possibility of a single shared library for authorization. Pulling that common code into a centralized authorization service is also problematic, because the extra network hop adds latency that can be too detrimental to overall app performance.
Meanwhile, there are other complicating factors — such as the fact that teams often want to choose components that are “plug-and-play,” so that they can create best-of-breed solutions by swapping in new technologies whenever needed. Those new technologies need to plug into whatever authorization scheme is being used by the application, which means that authorization can become a bottleneck when dev teams swap out one piece of technology for another — because they must first solve authorization for the new piece.
The diversity and heterogeneity of containerized apps means that companies need to find new, unified ways to manage those apps in production. Monitoring, logging, deployment and authorization all need to be standardized, so that they can work across clouds, containers and code. Practically speaking, this has become a de facto requirement for truly embracing the benefits of cloud native development. Indeed, application components cannot truly be plug-and-play in the cloud native sense until all of the typical cross-cutting concerns are handled in a unified way: authorization, monitoring, logging, deployment, and so on.
Authorization for cloud native platforms has also fundamentally changed
The authorization story is similar for the cloud native platforms, like Kubernetes, on which applications run. Public cloud environments, in general, now have thousands of APIs, and the public cloud providers have realized that traditional access control measures (RBAC, for instance) provide nowhere near the level of granularity required to truly implement robust authorization and secure them. Naturally, public cloud providers have all developed their own bespoke solutions to authorization — each built within (and dedicated to) its own cloud environment.
Kubernetes, the platform used by many organizations in production, also has APIs that need securing. Unlike the public clouds, Kubernetes has its own revolutionary model: a unified API for all cluster resources, which users and vendors can extend to manage customer-built and customer-purchased software natively through the Kubernetes API. Authorization for the Kubernetes API must check the full configuration of each resource before it is deployed or updated, whether that resource was developed by the Kubernetes community, by the end user, or by a third-party vendor. As with the public clouds (but for different reasons), the Kubernetes API requires a form of authorization that is significantly more granular and expressive than the tools that developers are most familiar with, such as RBAC, ABAC, ACLs or even the IAM policies developed by public clouds. Here’s a writeup detailing why RBAC is not enough for Kubernetes.
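To make "checking the full configuration" concrete, here is a minimal sketch of a check that inspects the body of a Pod manifest rather than only who is creating it. The trusted registry name is hypothetical, and the function is an illustration of the idea, not how Kubernetes or any specific admission controller implements it.

```python
from __future__ import annotations

# Hypothetical allow-list of registries; a real setup would load this from configuration.
TRUSTED_REGISTRIES = ("registry.example.com/",)


def admission_violations(resource: dict) -> list[str]:
    """Return reasons to reject a Pod manifest; an empty list means admit."""
    if resource.get("kind") != "Pod":
        return []  # this sketch only inspects Pods

    spec = resource.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])

    violations = []
    for container in containers:
        image = container.get("image", "")
        # str.startswith accepts a tuple of prefixes.
        if not image.startswith(TRUSTED_REGISTRIES):
            name = container.get("name", "?")
            violations.append(f"container {name!r} uses untrusted image {image!r}")
    return violations
```

A role-based rule can say who may create Pods, but it has no way to express a constraint on the `image` field inside the Pod, which is the kind of granularity the paragraph above describes.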
All of this is to say that authorization at the platform level also requires a rethink, because of the higher orders of complexity and sheer volume of resources and data involved. Authorization at the platform level must not only be unified; it must also account for the complexity and power of modern platforms, with a level of granularity and expressiveness beyond what earlier tools have offered.
Authorization for CI/CD has also evolved
Finally, there is the CI/CD pipeline, which is responsible for building, reviewing and deploying code in cloud native environments. CI/CD has become, in effect, the cloud native equivalent of change management, because every code change to production environments (applications and platforms) goes through it. And, while application code changes have often gone through CI/CD, it is only recently — with the advent of infrastructure-as-code — that platform changes have as well. This is why CI/CD has become an incredibly powerful point of control for all cloud native development processes, and an important artifact to protect with authorization policies.
Today, CI/CD systems are used heavily by personnel across many areas of the company: operators, developers, security teams, and others. Indeed, applications are even built to make it possible for less technical employees to contribute to the source-code repositories that sit behind the CI/CD pipeline. Want to make an exception to a security, compliance or operational policy? Just add a row to the right YAML file in the right repository, and the exception is made.
At the same time, CI/CD pipelines also implement a tremendous amount of automation — and not just for running unit tests or Makefiles. Indeed, automation is used to build many versions of software, facilitate manual approvals before deployment, or deploy software across a fleet of servers — all while monitoring progress and enabling rollbacks when things go poorly.
For these reasons, CI/CD pipelines are essentially just as powerful as applications and platforms; they control what code runs on which applications, and which applications run on which clusters and platforms. Consequently, the inherent complexity, power and large user base of the CI/CD pipeline make it a crucial place to control authorization, all the more so given that many organizations don’t have just one pipeline; they have many. As such, companies need to implement unified authorization across CI/CD pipelines. For example, context-aware policies can grant access to the right on-call developer depending on the time of day, which cluster they are operating in, which code repository they are pulling code from, and so forth. Moreover, it is critical that authorization policy leverages (and accounts for) the level of automation that is inherent in the CI/CD environment.
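A context-aware pipeline rule of this kind might look roughly like the following sketch. The on-call roster, repository names, cluster names and business-hours window are all invented for illustration; a real pipeline would pull them from its own on-call and configuration systems.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class DeployRequest:
    user: str
    repository: str
    cluster: str
    requested_at: datetime


# Illustrative context (assumptions, not real systems).
ON_CALL = {"prod-eu": "dana"}
REPOS_FOR_CLUSTER = {
    "prod-eu": {"git.example.com/shop/infra"},
    "staging": {"git.example.com/shop/infra", "git.example.com/shop/experiments"},
}
BUSINESS_HOURS = (time(9, 0), time(17, 0))


def allow_deploy(req: DeployRequest) -> bool:
    """Decide whether a deployment may proceed, based on its context."""
    # The repository must be approved for the target cluster.
    if req.repository not in REPOS_FOR_CLUSTER.get(req.cluster, set()):
        return False
    # Non-production clusters: any approved repository may deploy at any time.
    if not req.cluster.startswith("prod-"):
        return True
    # Production during business hours: allowed.
    start, end = BUSINESS_HOURS
    if start <= req.requested_at.time() < end:
        return True
    # Production outside business hours: only the current on-call developer.
    return ON_CALL.get(req.cluster) == req.user
```

The same user and repository can therefore get different answers depending on the cluster and the time of the request, which is what distinguishes this from a static role assignment.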
Companies need unified authorization across the cloud native stack
The components of cloud native development (CI/CD, platforms and applications) do not exist in individual vacuums. They interact extensively, by design. Truly governing app behavior, limiting risk, or maintaining compliance therefore requires a unified authorization standard between and across them. Indeed, in the cloud, you can hardly enforce authorization policies on the platform unless you also enforce them in the CI/CD pipeline. For instance, no developer or engineer wants to have their proposed infrastructure-as-code changes accepted and deployed by the CI/CD pipeline, only to have them rejected when they are deployed to the platform. Instead, teams are always looking for ways to shift authorization and policy left, as early in the development process and as close to the teams trying to make changes as possible: to CI/CD pipelines, to developers’ laptops and even to the applications that sit on top of source repositories in Git.
Meanwhile, the distinction between the application and the platform has become grayer than ever before. The database is part of the application, of course, but now that every cloud provider offers and runs databases for you, that distinction blurs. Likewise, Lambdas and other serverless compute platforms have dissolved the idea that your source code is ever completely under your control. This means that any unified authorization solution for the application must also apply to the platform and the CI/CD pipeline. The need inevitably arises for unified authorization across the entire cloud native stack.
Open Policy Agent (OPA) was designed for unified authorization
OPA — an open-source, general-purpose policy engine — was designed to do exactly that: implement unified authorization across every layer of the cloud native stack. Initially developed by Styra and since donated to the CNCF, OPA has become the de facto standard for cloud native policy and allows companies to build and enforce consistent authorization policy anywhere, no matter the environment.
While OPA is just one tool that developers have at their disposal, it is one that leading enterprises like Netflix, Capital One, Goldman Sachs and Atlassian use to solve authorization in their organizations. Netflix, for example, uses OPA to solve service-level authorization across the dozens of back-end services that support its popular online streaming services. Atlassian, meanwhile, used OPA to build a global authorization platform to centrally manage and audit authorization decisions for its suite of services — like Jira, Trello and Confluence (previously, many of Atlassian’s services had their own individual ways of implementing authorization).
In essence, OPA provides a file format for a policy that uses a unified language (Rego) that is flexible enough to handle all of the policies you need (including context-aware policies) across applications, platforms and CI/CD pipelines — or anywhere else. Then, the OPA policy engine can make decisions according to the policy language and can be injected with arbitrary context to make dynamic decisions. For instance, OPA policies can provide elevated access to the right on-call personnel, ensure that infrastructure-as-code changes in the CI/CD pipeline are in compliance with operational policies for the platform, or ensure that application decisions don’t put undue load on the cluster. OPA also provides a toolset that brings policy into the “-as-code” era, enabling companies to manage an entire lifecycle for policy, just as they would with software.
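As a minimal sketch of how a service might hand context to OPA and read back a decision, the code below uses OPA's REST Data API, which accepts a POST to `/v1/data/<policy path>` with the context wrapped under an `input` key and returns the decision under `result`. The server address and the policy path `cicd/deploy/allow` are assumptions for illustration, and the sketch presumes a Rego rule has already been loaded at that path.

```python
import json
import urllib.request

# Assumed local OPA address and a hypothetical policy path.
OPA_URL = "http://localhost:8181"
POLICY_PATH = "cicd/deploy/allow"


def build_opa_request(input_doc: dict) -> urllib.request.Request:
    """Build a POST to OPA's Data API with the context wrapped under "input"."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{POLICY_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decision_from_response(raw: bytes) -> bool:
    """Interpret OPA's reply; a missing "result" means the rule was undefined, so deny."""
    return json.loads(raw).get("result", False) is True


def is_allowed(input_doc: dict) -> bool:
    """Ask a running OPA for a decision (needs an OPA server reachable at OPA_URL)."""
    with urllib.request.urlopen(build_opa_request(input_doc), timeout=2) as resp:
        return decision_from_response(resp.read())
```

Because the calling service only sends context and reads a yes or no, the policy itself can change without touching the service's code. Treating an undefined result as a denial is a design choice of this sketch, not something OPA mandates.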
Just as importantly, OPA was designed to address the architectural requirements of cloud native. Specifically, OPA decouples policy from each underlying software service (involving application, platform, CI/CD, or any tooling) so that companies can use the same policy framework, regardless of what kind of software they’re using. Also, OPA runs at the edge — allowing teams to achieve the same availability and performance as if policies were hard-coded into software systems.
As companies embrace cloud native, software-defined development strategies to deliver immense value at unprecedented speed, they are running headlong into the challenge of solving authorization among and between the core components of the cloud native stack. For many companies, OPA represents a way to unify authorization and policy across every cloud environment, and to bring authorization itself into the cloud native era.
This article first appeared in The New Stack on December 9, 2020.