Breaking Down the Codecov Attack: Finding a Malicious Needle in a Code Haystack

April 28, 2021 Nimrod Stoler

Earlier this month, San Francisco-based technology company Codecov discovered that attackers had compromised its software platform — used by more than 29,000 customers worldwide to test software code — in the latest digital supply chain attack to make headlines. While that was troubling enough, there was an added hitch. Although the attack was identified and reported in April, the tampering reportedly started back in January. And it may have continued undetected had it not been for some astute observations by a customer.

The ripple effects and long-term ramifications of this attack have not yet been determined, and investigation is still ongoing. However, based on reports, we can examine the Tactics, Techniques and Procedures (TTPs) used by attackers to place a needle in a haystack of code, surreptitiously infect Codecov’s CI/CD pipeline and potentially gain access to thousands of customer networks. What’s clear is that this attack, like so many before it, followed a familiar path: target and steal credentials to get to the intended target.

The Codecov Breach

Codecov produces an array of code testing software, and the software that was reportedly impacted during this attack was made specifically for CI/CD pipelines. When developers at a customer organization finish testing, they will often download a script directly from Codecov’s servers, which will check the code coverage of the testing apparatus. It will then report back to Codecov’s servers.

An online statement by Codecov details the initial breach discovery: “On Thursday, April 1, 2021, we learned that someone had gained unauthorized access to our Bash Uploader script and modified it without our permission. The actor gained access because of an error in Codecov’s Docker image creation process that allowed the actor to extract the credential required to modify our Bash Uploader script.”

This uploader tool works with popular development platforms like GitHub. They use secrets and other credentials that enable interaction between applications and other tools in the CI/CD pipeline, along with access to cloud resources.

After gaining a foothold and obtaining the necessary credentials, the attackers created a backdoor by hiding a single line of malicious code within the approximately 1,900 lines of code that made up the uploader. They did so with relative ease, as two-factor authentication was not required to access the uploader, according to reports.

Each time a developer downloaded the Codecov testing script, the malicious software would begin running on the customer organization’s test machines. This allowed the attackers to export the secrets, credentials and other sensitive data stored in the victim’s continuous integration environments and send them to an attacker-controlled server outside of the customer’s infrastructure.

The particulars of this attack share commonalities with the large-scale SolarWinds breach and present an interesting perspective on the CI/CD pipeline and how efforts to protect these dynamic environments often fall short. The Codecov supply chain attack was clearly designed to extrapolate weaknesses and scale efforts for maximum impact. It took advantage of the fact that Codecov’s script was an unusually large one, based heavily on environment variables — sets of dynamic name-value pairs used by Linux and Windows operating systems — that often contain hard-coded API keys, database credentials and more. These secrets are stored and used heavily within CI/CD pipelines. And because there is often very limited security oversight around how they are managed and protected, these secrets are easy targets for attackers who pinpoint and harvest them.

For months, the attackers apparently had code execution access into each and every system that was using the Codecov script. They avoided arousing suspicion by hiding the malicious code inside a larger code series — and by concentrating their efforts on those environment variables. Based on the line of script the attackers used — one that specifically centers on sending Git repository URLs to the attacker-controlled server — it appears that GitHub was the focus.

The Discovery

It’s not abnormal for an organization to consider code downloaded from a business partner to be trustworthy, and so customers don’t always pay much attention to the granular details of the code, such as its digital fingerprint. The fact that Codecov attaches a signature to its proprietary code, however, is ultimately what led to the discovery of the attack, months after the initial breach. When the signature on one machine didn’t match up to that on another, a red flag was sent up by a customer and the attack revealed. Had the attackers changed the code signature, they might well have been able to operate unnoticed for an indefinite amount of time.

The Takeaway

Development environments are complex, with numerous places where secrets and credentials can be, and often are inadvertently exposed. For example, while code repositories such as GitHub are an essential part of the development process, credentials can be inadvertently exposed by making code public and allowing attackers to include malicious code within the builds. An organization’s code and intellectual property can be tampered with or stolen from repositories if credentials are compromised.

While there is no one vendor or tool that can completely prevent digital supply chain attacks like this from happening, there are steps organizations can take to strengthen their security postures and minimize risk:

Perform Code Signature Checks. By simply checking the software’s digital fingerprint to verify its integrity, the “dwell time” of this attack could have been limited to mere days or even hours, rather than months.

Mandate Multi-Factor Authentication (MFA). By looking at this attack process from the attackers’ point of view, the Codecov breach was made a lot simpler because they didn’t need a second authentication factor or another piece of approval to insert their code.

Do Not Store Credentials and Secrets in Environment Variables. The reasons are numerous: most notably, it is extremely hard to track usage of environment variables, as they are passed down to child processes that allow for unintended access and break the principle of least privilege. Instead, if an application requires a secret be handed over in an environment variable, use a secrets manager to help ensure only authenticated users get access to the clear text secrets.

Implement Threat Detection Capabilities. CI/CD pipelines are highly automated, which means there is little human interaction, making it easier for attackers to fly under the radar. Having threat detection capabilities in place could help detect anomalies and potential breaches easier and earlier.

Finally, with cybersecurity, most things boil down to effective and consistent communication. Developers simply don’t always have the awareness of inherent security issues along CI/CD pipelines, nor do they often have a clear sense of what their tools are actually doing with credentials and secrets. Making the dangers of vagueness and ambiguity clear to developers is a core part of Shift Left, and it’s an important — if not always welcome — way forward to more stringent security.

Developers may sometimes look at security specialists as just one more layer that makes their jobs harder. But there’s obviously a compelling need for this important layer. And while putting in the time and effort to clearly identify the risks involved and the measures needed to mitigate them may mean extra steps now, it could very well result in fewer security headaches later.

Can You Stop a Cyborg Attack? Get Inside a Biohacker’s Mind at RSA 2021

With a consuming curiosity, obsession with lock picking – both physical and abstract – and sharp technical ...

Codecov Breach Learning: Engage Developers to Protect the DevOps Pipeline

Regardless of what industry you’re in, software is a driving force behind digital innovation. But what happ...

Up Your Security I.Q. by Checking Out Our Collection of Curated Resources.

Breaking Down the Codecov Attack: Finding a Malicious Needle in a Code Haystack

The Codecov Breach

The Discovery

The Takeaway

Previous Article

Next Article

STAY IN TOUCH

Breaking Down the Codecov Attack: Finding a Malicious Needle in a Code Haystack

The Codecov Breach

The Discovery

The Takeaway

Previous Article

Next Article

Recommended for You

Modern enterprises are facing an identity explosion. Fueled by cloud adoption, DevOps acceleration, and now agentic AI, the number of human and machine identities is growing faster than most...

Alex was the kind of IT administrator who kept everything humming smoothly behind the scenes at QuantumAxis Corp. Servers, user accounts, random requests at 4:55 PM on Fridays—he put out the fires...

Scattered Spider has emerged as one of the most disruptive advanced persistent threats in recent years, breaching major organizations across telecom, gaming, transportation, and retail. In the...

Technology is moving at the speed of light, and two forces—quantum computing and AI agents—are poised to shake up cybersecurity. We’re not talking about some far-off future; this is happening now....

In July 2024, Google introduced a new feature to better protect cookies in Chrome: AppBound Cookie Encryption. This new feature was able to disrupt the world of infostealers, forcing the malware...

The line between human and machine is blurring—and it’s not a question of whether machines can do more, but how far we’re willing to let them go. The frontier lies in tackling the chaos and...

Machine identities—like the API keys, certificates, and access tokens that secure machine-to-machine connections—are swarming businesses. Yet, many teams still reach for manual tools while their...

Running Kubernetes on Amazon EKS? You’re likely already using cert-manager—the open source standard for TLS and mTLS certificate automation in Kubernetes clusters. Today, we’re excited to announce...

This blog is the second part of a two-part series on post-quantum cryptography (PQC). In Part 1, we explored how the Harvest Now, Decrypt Later (HNDL) strategy has moved from crypto-conspiracy...

Unless you lived under a rock for the past several months or started a digital detox, you have probably encountered the MCP initials (Model Context Protocol). But what is MCP? Is this just a...

This blog is the first part of a two-part series on post-quantum cryptography (PQC). In this piece, we explore why quantum threats are no longer theoretical. In Part 2, we’ll cover practical steps...

If the mere mention of identity governance and administration (IGA) stresses you out, you’re in good company. Managing digital identities and access privileges is a significant challenge that only...

The identity is the main attack vector for cybercriminals, with cybercriminals using stolen identity to infiltrate the organization, move laterally and vertically throughout the organization, and...

“The modern ‘software as a service’ (SaaS) delivery model is quietly enabling cyber attackers and—as its adoption grows—is creating a substantial vulnerability that is weakening the global...

The Model Context Protocol (MCP) is an open standard and open-source project from Anthropic that makes it quick and easy for developers to add real-world functionality — like sending emails or...

Have you ever been on a tight deadline, and suddenly, your organization’s core services go dark because a TLS certificate expired without warning? It’s a nightmare scenario no team wants to face....

As 2025 unfolds, U.S. federal agencies are navigating significant operational shifts that are impacting their overarching cybersecurity strategies. Government security leaders have always...

As organizations modernize IT infrastructure, many are adopting platforms like OpenShift Virtualization to run both traditional virtual machines (VMs) and containerized workloads on a single,...

For years, cybersecurity has been chasing a future where passwords no longer exist. And yet, here we are in 2025—still resetting them, reusing them and getting breached because of them. The...

Imagine walking into your next board meeting and saying, “We need to secure all the non-humans.” You can probably picture the reactions: furrowed brows, confused glances—not exactly a solid...