Breaking Down the Codecov Attack: Finding a Malicious Needle in a Code Haystack

April 28, 2021 Nimrod Stoler

Earlier this month, San Francisco-based technology company Codecov discovered that attackers had compromised its software platform — used by more than 29,000 customers worldwide to test software code — in the latest digital supply chain attack to make headlines. While that was troubling enough, there was an added hitch. Although the attack was identified and reported in April, the tampering reportedly started back in January. And it may have continued undetected had it not been for some astute observations by a customer.

The ripple effects and long-term ramifications of this attack have not yet been determined, and the investigation is ongoing. However, based on reports, we can examine the Tactics, Techniques and Procedures (TTPs) used by attackers to place a needle in a haystack of code, surreptitiously infect Codecov’s CI/CD pipeline and potentially gain access to thousands of customer networks. What’s clear is that this attack, like so many before it, followed a familiar path: find and steal credentials to reach the intended target.

The Codecov Breach

Codecov produces an array of code testing software, and the software reportedly impacted in this attack was made specifically for CI/CD pipelines. After tests run at a customer organization, the pipeline will often download a script directly from Codecov’s servers. The script collects the code coverage results produced by the tests and reports them back to Codecov’s servers.

An online statement by Codecov details the initial breach discovery: “On Thursday, April 1, 2021, we learned that someone had gained unauthorized access to our Bash Uploader script and modified it without our permission. The actor gained access because of an error in Codecov’s Docker image creation process that allowed the actor to extract the credential required to modify our Bash Uploader script.”

This uploader tool works with popular development platforms like GitHub. The pipelines that run it use secrets and other credentials that enable interaction between applications and other tools in the CI/CD pipeline, along with access to cloud resources.

After gaining a foothold and obtaining the necessary credentials, the attackers created a backdoor by hiding a single line of malicious code within the approximately 1,900 lines of code that made up the uploader. They did so with relative ease, as two-factor authentication was not required to access the uploader, according to reports.

Each time a developer downloaded the Codecov testing script, the malicious software would begin running on the customer organization’s test machines. This allowed the attackers to export the secrets, credentials and other sensitive data stored in the victim’s continuous integration environments and send them to an attacker-controlled server outside of the customer’s infrastructure.

The particulars of this attack share commonalities with the large-scale SolarWinds breach and present an interesting perspective on the CI/CD pipeline and how efforts to protect these dynamic environments often fall short. The Codecov supply chain attack was clearly designed to exploit weaknesses and scale efforts for maximum impact. It took advantage of the fact that Codecov’s script was an unusually large one, based heavily on environment variables (sets of dynamic name-value pairs used by Linux and Windows operating systems) that often contain hard-coded API keys, database credentials and more. These secrets are stored and used heavily within CI/CD pipelines. And because there is often very limited security oversight around how they are managed and protected, these secrets are easy targets for attackers who pinpoint and harvest them.
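To get a sense of how much a third-party script could see, a team might first audit its own CI environment. The following is a minimal sketch, not a complete audit tool: the name pattern and the function `find_secret_like_names` are illustrative assumptions, and the check only looks at variable names, never values.

```python
import os
import re

# Hypothetical heuristic: name fragments that commonly indicate a secret.
SECRET_NAME_PATTERN = re.compile(
    r"(TOKEN|SECRET|PASSWORD|PASSWD|API_?KEY|ACCESS_?KEY|CREDENTIAL)",
    re.IGNORECASE,
)


def find_secret_like_names(environ=None):
    """Return the sorted names of environment variables that look like secrets.

    Only names are inspected and returned, so the audit itself does not
    copy secret values anywhere.
    """
    if environ is None:
        environ = os.environ
    return sorted(name for name in environ if SECRET_NAME_PATTERN.search(name))
```

Calling `find_secret_like_names()` inside a CI job lists the variable names that any script run by that job, including one downloaded from a vendor, would be able to read.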

For months, the attackers apparently had code execution access to every system that was using the Codecov script. They avoided arousing suspicion by hiding the malicious code inside a much larger body of code and by concentrating their efforts on those environment variables. Based on the line of script the attackers used, one that specifically centers on sending Git repository URLs to the attacker-controlled server, it appears that GitHub was the focus.
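A single line like this is hard to spot by eye, but a crude scan can narrow the search. The sketch below flags any line of a shell script that both invokes a network tool and dumps the environment or lists Git remotes. The two patterns and the function `flag_suspicious_lines` are assumptions chosen for illustration; they are not the indicators published for this attack, and a determined attacker could evade them.

```python
import re

# Hypothetical indicators: common network tools, and commands that dump
# the environment or list Git remote URLs.
NETWORK_TOOL = re.compile(r"\b(curl|wget|nc)\b")
DATA_SOURCE = re.compile(r"\$\(\s*(?:env|printenv)\b|\bgit\s+remote\b")


def flag_suspicious_lines(script_text):
    """Return (line_number, line) pairs worth a manual look.

    A line is flagged only when a network tool and a sensitive data
    source appear together on the same line.
    """
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if NETWORK_TOOL.search(line) and DATA_SOURCE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Run against the full text of a downloaded script, this returns a short list of candidate lines for a person to review rather than a verdict.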

The Discovery

It’s not abnormal for an organization to consider code downloaded from a business partner to be trustworthy, and so customers don’t always pay much attention to the granular details of the code, such as its digital fingerprint. The fact that Codecov attaches a signature to its proprietary code, however, is ultimately what led to the discovery of the attack, months after the initial breach. When the signature on one machine didn’t match up to that on another, a customer raised a red flag and the attack was revealed. Had the attackers changed the code signature, they might well have been able to operate unnoticed for an indefinite amount of time.
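Once a mismatch like this is noticed, the next step is finding what changed. As a rough sketch, assuming a trusted copy of the script is available for comparison, Python's standard `difflib` module can list the lines that appear only in the downloaded copy. The function name `added_lines` is illustrative.

```python
import difflib


def added_lines(trusted_text, downloaded_text):
    """Return lines present in the downloaded copy but not the trusted one."""
    diff = difflib.ndiff(trusted_text.splitlines(), downloaded_text.splitlines())
    # ndiff prefixes lines unique to the second sequence with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]
```

For a script of roughly 1,900 lines with one inserted line, the result is a single-element list, which is far easier to review than the whole file.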

The Takeaway

Development environments are complex, with numerous places where secrets and credentials can be, and often are, inadvertently exposed. For example, while code repositories such as GitHub are an essential part of the development process, making code public can inadvertently expose credentials and allow attackers to include malicious code within the builds. An organization’s code and intellectual property can be tampered with or stolen from repositories if credentials are compromised.

While there is no one vendor or tool that can completely prevent digital supply chain attacks like this from happening, there are steps organizations can take to strengthen their security postures and minimize risk:

Perform Code Signature Checks. By simply checking the software’s digital fingerprint to verify its integrity, the “dwell time” of this attack could have been limited to mere days or even hours, rather than months.
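A fingerprint check of this kind can be sketched in a few lines. The example assumes the vendor publishes a SHA-256 digest of the script through a channel separate from the download itself; the function names are illustrative, and the check is only as trustworthy as the source of the published digest.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Compare the downloaded bytes against the digest the vendor published."""
    return sha256_hex(data) == published_hex.strip().lower()
```

A pipeline would read the downloaded script as bytes, call `matches_published_digest`, and refuse to execute the script when the result is `False`.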

Mandate Multi-Factor Authentication (MFA). From the attackers’ point of view, the Codecov breach was a lot simpler because they didn’t need a second authentication factor or another piece of approval to insert their code.

Do Not Store Credentials and Secrets in Environment Variables. The reasons are numerous: most notably, it is extremely hard to track usage of environment variables, because they are passed down to child processes, which allows for unintended access and breaks the principle of least privilege. Instead, if an application requires a secret be handed over in an environment variable, use a secrets manager to help ensure only authenticated users get access to the clear text secrets.
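The inheritance problem can be reduced even without a secrets manager by starting third-party tools with a deliberately small environment. The following is a minimal sketch using the `env` parameter of Python's `subprocess.run`; the allowlist `SAFE_VARS` and the function names are assumptions for illustration, and a real tool may need additional variables to run.

```python
import os
import subprocess

# Hypothetical allowlist: only these variables are passed to the child.
SAFE_VARS = ("PATH", "HOME", "LANG")


def minimal_env(environ=None, allowed=SAFE_VARS):
    """Build an environment containing only allowlisted variables."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in allowed if name in environ}


def run_untrusted(command, environ=None):
    """Run a command so that it inherits no secrets from the parent."""
    return subprocess.run(
        command,
        env=minimal_env(environ),  # replaces, rather than extends, the environment
        capture_output=True,
        text=True,
        check=False,
    )
```

With this approach, a variable such as a repository token stays in the parent process, and a script launched through `run_untrusted` cannot read it from its own environment.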

Implement Threat Detection Capabilities. CI/CD pipelines are highly automated, which means there is little human interaction, making it easier for attackers to fly under the radar. Having threat detection capabilities in place could help detect anomalies and potential breaches more easily and earlier.
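One simple form of anomaly detection is comparing the destinations a CI job contacts against a list of expected hosts. The sketch below assumes the pipeline already records outbound request URLs somewhere, for example in proxy logs; the allowlist contents and the function `unexpected_destinations` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts a build job is expected to contact.
ALLOWED_HOSTS = frozenset({"codecov.io", "github.com"})


def unexpected_destinations(urls, allowed=ALLOWED_HOSTS):
    """Return the URLs whose host is missing from the allowlist.

    Entries that cannot be parsed into a host are flagged as well,
    since they also deserve a closer look.
    """
    flagged = []
    for url in urls:
        host = urlsplit(url).hostname
        if host is None or host not in allowed:
            flagged.append(url)
    return flagged
```

A request to an unfamiliar IP address from a build job would surface immediately in such a report, instead of blending into months of routine automated traffic.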

Finally, with cybersecurity, most things boil down to effective and consistent communication. Developers simply don’t always have the awareness of inherent security issues along CI/CD pipelines, nor do they often have a clear sense of what their tools are actually doing with credentials and secrets. Making the dangers of vagueness and ambiguity clear to developers is a core part of Shift Left, and it’s an important — if not always welcome — way forward to more stringent security.

Developers may sometimes look at security specialists as just one more layer that makes their jobs harder. But there’s obviously a compelling need for this important layer. And while putting in the time and effort to clearly identify the risks involved and the measures needed to mitigate them may mean extra steps now, it could very well result in fewer security headaches later.


