Ask security professionals to name the biggest threat to their organization’s cloud environment and most won't hesitate to give a one-word answer: misconfigurations. Technically, they're not incorrect, yet they're defining "misconfiguration" much too narrowly. They're likely thinking of an Amazon S3 bucket that's left exposed or a misconfigured security group rule.
While identifying and remediating misconfigurations must continue as a priority, it's important to understand that misconfigurations are but one means to the ultimate end for attackers: control plane compromise, which has played a central role in every major cloud breach to date.
Considering the steady cadence of news headlines tying cloud breaches to misconfigurations over the last several years, it's understandable that finding and fixing misconfigurations has been the primary focus of security professionals and their solutions vendors.
But these stories almost always bury the lede. All of these attacks began with an initial penetration event: a single resource misconfiguration, an application vulnerability, or application programming interface (API) keys left in source code, any of which attackers can find using automated tooling.
That's just the beginning of the story. Once a hacker gains a foothold in an environment, what they're really after are the API keys that let them operate against the cloud provider's API control plane. These API keys let them learn about the environment, move laterally, and find and extract data while evading detection by security tools.
The cloud control plane is the collection of APIs that a cloud service provider like Amazon, Google or Microsoft provides to developers so they can configure and control the cloud environments they work in every day. When developers build applications in the cloud, they're also building the infrastructure for the applications as opposed to buying a pile of infrastructure and shoving apps into it. The process of building cloud infrastructure is done with code, which means developers own that process. They're the ones using the APIs to make or destroy servers and make or access storage.
The lesson for all business leaders and security professionals: They must shift their strategic focus from traditional security approaches like intrusion detection and network security to prevention and secure cloud architecture design. Making this shift requires addressing five fundamentals, beginning with knowing the environment and all of the ways that attackers might exploit it.
- Know the company’s environment.
Resource misconfigurations will always slip past guardrails and into the runtime environment; that's unavoidable. Find and remediate them before attackers can exploit them. Knowing the environment requires making sure resources aren't misconfigured. Security pros also need to think like a hacker to understand which vulnerabilities in the environment an attacker could exploit after gaining initial penetration.
The vast majority of cloud exploits stem from imperfect design and architecture; cloud security is mainly a design problem, not a maintenance problem. That makes it very different from data center security.
- Focus on prevention and secure design.
Because the security team must know its environment to thwart attackers and prevent security events from occurring, the team must implement secure designs that start not with the security team but with the people working in the cloud every day: developers.
Inherently secure cloud architecture denies attackers the ability to learn about the environment and move laterally, even if they gain initial penetration. Secure design focuses on the configuration and use of identity and access management (IAM) resources as well as resource access policies.
When designing cloud infrastructure, it’s generally faster and more convenient to configure cloud resources in highly insecure ways, and as a result, overly broad roles and permissions are commonplace in enterprise cloud environments. It may take more time to understand the granular configuration any given resource needs to serve its function, and nothing more, but that work is necessary to produce an inherently secure cloud environment.
The effort required to get the design secure up front pays off. Making needed changes once an environment is deployed and in operation can be a lot more complicated and time-consuming and often requires painful application rework.
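The gap between an overly broad permission and a narrowly scoped one can be checked mechanically. The sketch below is illustrative only: it assumes policies shaped like AWS IAM JSON policy documents (`Statement`, `Effect`, `Action`, `Resource`), and the function name `find_broad_statements` and the bucket name `example-app-assets` are hypothetical. It flags only the most obvious wildcards and is not a substitute for a full policy analyzer.

```python
from typing import Any


def find_broad_statements(policy: dict[str, Any]) -> list[str]:
    """Return one finding per wildcard used in an Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    findings = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue  # this sketch only inspects Allow statements
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources

        for action in actions:
            # "*" grants every action; "service:*" grants every action in a service
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: broad action {action!r}")
        for resource in resources:
            if resource == "*":
                findings.append(f"statement {index}: broad resource {resource!r}")
    return findings


# Convenient but broad: every S3 action on every resource.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}

# Scoped to what the application needs (hypothetical bucket name).
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-app-assets/*",
        }
    ],
}

print(find_broad_statements(broad_policy))   # two findings
print(find_broad_statements(scoped_policy))  # no findings
```

Running a check like this before deployment is cheaper than discovering the broad role after an application already depends on it.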
- Empower developers.
When a company focuses on prevention and secure design, who better to prevent misconfigurations and design flaws than the developers and engineers building these systems in the cloud? Give them the tooling that guides them in designing environments that are inherently secure against today’s control plane compromise attacks.
When developers build applications in the cloud, they also build the infrastructure for their applications, and they’re making configuration decisions that have a big impact on the security posture of the environment. Because developers (and DevOps) now own the process of building and managing cloud infrastructure, the security team’s role becomes one of security architect.
Security teams need to understand how to design secure cloud architecture — and how to identify flawed designs that open the door to control plane compromise attacks. It’s their job to work closely with developers and DevOps teams to ensure that environments are designed in secure ways. But the way to do this isn’t with the usual security checklists and policy manuals; it’s with automation using Policy-as-Code (PaC).
- Adopt Policy-as-Code.
In a completely software-defined world, the security team’s role becomes that of the domain expert that imparts knowledge to the people building apps — the developers — to ensure they're working in a secure environment. PaC lets the team express security and compliance rules in a programming language that an application can use to check the correctness of configurations.
PaC has been designed to check other code and running environments for unwanted conditions or errors. It empowers all cloud stakeholders to operate securely without any ambiguity or disagreement on what the rules are and how they should get applied at both ends of the software development life cycle (SDLC). At the same time, PaC automates the process of constantly searching for and remediating misconfigurations.
No other approach succeeds at this in the long run because the problem space keeps growing: the number of cloud services, the number of deployments, and the number of resources all keep increasing. Organizations must automate to relieve security professionals of having to spend their days manually monitoring for misconfigurations and to enable developers to write code that is flexible, that can be changed over time, and that can incorporate new knowledge.
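To make the idea of rules expressed in code concrete, here is a minimal sketch of a Policy-as-Code check. It is an assumption-laden illustration, not how any particular PaC product works: the resource dictionaries, their field names (`type`, `public_read`, `port`, `source_cidr`), and the rule functions are all hypothetical, and real deployments would typically use a dedicated policy engine instead of hand-written Python.

```python
from typing import Callable, Optional

Resource = dict
Rule = Callable[[Resource], Optional[str]]  # returns a message on violation, else None


def no_public_bucket(resource: Resource) -> Optional[str]:
    """Storage buckets must not be publicly readable."""
    if resource.get("type") == "storage_bucket" and resource.get("public_read"):
        return "storage bucket allows public read"
    return None


def no_open_ssh(resource: Resource) -> Optional[str]:
    """Firewall rules must not expose SSH to every address."""
    if resource.get("type") != "firewall_rule":
        return None
    if resource.get("port") == 22 and resource.get("source_cidr") == "0.0.0.0/0":
        return "SSH (port 22) is open to the whole internet"
    return None


# One shared rule set: the same rules can run against planned
# infrastructure before deployment and against the running environment.
RULES: list[Rule] = [no_public_bucket, no_open_ssh]


def evaluate(resources: list[Resource]) -> list[str]:
    """Apply every rule to every resource and collect the violations."""
    violations = []
    for resource in resources:
        for rule in RULES:
            message = rule(resource)
            if message:
                violations.append(f"{resource.get('name', '<unnamed>')}: {message}")
    return violations


# Hypothetical planned resources, e.g. parsed from infrastructure code.
planned = [
    {"type": "storage_bucket", "name": "logs", "public_read": False},
    {"type": "storage_bucket", "name": "uploads", "public_read": True},
    {"type": "firewall_rule", "name": "admin-ssh", "port": 22, "source_cidr": "0.0.0.0/0"},
]

for violation in evaluate(planned):
    print(violation)
```

Because the rules are ordinary code, they can be reviewed, versioned, and extended as new services and new attack knowledge appear, which is the flexibility the argument above depends on.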
- Measure what matters.
Because cloud environments are constantly changing, teams must constantly measure the effectiveness and deficiencies of their cloud security strategy. Successful organizations quantify how effective they are at preventing hacks that could potentially happen and use that data to improve their processes. Because cloud resources are mutable, the team wants to prevent misconfigurations up front and to catch new ones when they inevitably appear.
This measuring discipline will not just help manage and reduce risk; it will also help the organization realize the full potential of the cloud. Developers are under enormous pressure to build and ship applications quickly. Cloud security has often become the rate-limiting factor for how fast they can go in the cloud and, more broadly, how successful the organization’s digital transformation can be.
Companies don't want developers waiting around for security approvals, and they don't want the engineers to spend the bulk of their hours on manual cloud security tasks such as auditing environments, remediating vulnerabilities, or doing the rework that often results when teams wait until infrastructure gets built to identify security issues. Measuring the impact cloud security systems and policies have on these lines of business will help identify and smash roadblocks that hurt productivity levels and create in-fighting.
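As one hedged illustration of what measuring might look like, the sketch below computes two candidate metrics from a list of findings: the share of misconfigurations caught before deployment, and the mean time to remediate those that reached the running environment. The `Finding` record, the stage labels `pre_deploy` and `runtime`, and the sample timestamps are all invented for the example; which metrics actually matter will depend on the organization.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Finding:
    stage: str                           # "pre_deploy" or "runtime" (hypothetical labels)
    detected: datetime
    resolved: Optional[datetime] = None  # None while the finding is still open


def prevention_rate(findings: list[Finding]) -> float:
    """Fraction of findings caught before they reached the running environment."""
    if not findings:
        return 0.0
    caught_early = sum(1 for f in findings if f.stage == "pre_deploy")
    return caught_early / len(findings)


def mean_hours_to_remediate(findings: list[Finding]) -> Optional[float]:
    """Average hours from detection to fix for resolved runtime findings."""
    durations = [
        (f.resolved - f.detected).total_seconds() / 3600
        for f in findings
        if f.stage == "runtime" and f.resolved is not None
    ]
    return mean(durations) if durations else None


# Invented sample data, for illustration only.
findings = [
    Finding("pre_deploy", datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 10)),
    Finding("pre_deploy", datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 9, 30)),
    Finding("pre_deploy", datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 11)),
    Finding("runtime", datetime(2024, 1, 4, 9), datetime(2024, 1, 4, 15)),
    Finding("runtime", datetime(2024, 1, 5, 9)),  # still open
]

print(f"caught before deployment: {prevention_rate(findings):.0%}")
print(f"mean hours to remediate at runtime: {mean_hours_to_remediate(findings)}")
```

Tracking figures like these over time shows whether prevention is actually improving and how much engineering time is still going to after-the-fact rework.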
While it's good practice to log everything that happens in a cloud environment and analyze those logs for unwanted activity, control plane compromise attacks happen so fast that the best the security team can count on is to discover that it was hacked shortly afterward. Most victims don't discover they were breached until their data shows up on the dark web — or the hacker starts bragging about their exploits.
Don't become yet another statistic that prompts a new wave of bad news headlines. Embracing the five fundamentals of cloud security will help the organization broaden its cloud security focus beyond the narrow view of the risk misconfigurations pose and focus on preventing control plane compromise.
Josh Stella, chief architect, Snyk