
The good, the bad and the ugly of the XZ vulnerability

Author: Craig McLuckie


Apr 1, 2024


The recent CVE-2024-3094 (the XZ vulnerability disclosed by Red Hat on March 29, 2024) has sparked many discussions here at Stacklok and with our friends in the community. We see a sea change in how hostile actors are operating. This will necessitate a change in how we think about security practices for open source communities, and we need to collectively do a better job of understanding and supporting the communities that form the backbone of our technology stack.

Background: What is the XZ vulnerability?

In short, a community project (XZ, a commonly used compression toolset) was deliberately compromised by an individual who was trusted as a maintainer by their community. It appears that they worked diligently to establish credibility in the ecosystem and then used that credibility to introduce a remote code execution backdoor into SSH that only manifests in certain environments. They also used their position in the community to apply pressure to have the changes they had introduced included in a cross section of environments. Ironically, it was only through serendipity and some sloppiness in the implementation that a Microsoft researcher was able to discover the issue.

For those not familiar with the exploit, there are several well-written analyses of the attack payload. In short, the attack appears to have been professionally executed, targeting OpenSSH as popularly deployed by package managers:

The packaged release code included an additional script beyond the public code, which activated the payload by extracting it from obfuscated binary test files checked into the repository.
The extracted payload activates only when building as part of a Debian or RPM package on x86-64.
The affected library is not a direct dependency of OpenSSH, but is commonly linked with OpenSSH by distros that integrate OpenSSH with systemd via libsystemd (which depends on liblzma, the library shipped as part of XZ).
When built into a library, the exploit will check several evasion conditions before activating.
If activated, the exploit hooks the RSA_public_decrypt pointer to run the attack code.
It appears that the attack code would extract an encrypted command from a target certificate and execute the resulting command with the permissions of OpenSSH (typically, root).

Given the specificity of the target and the sophistication of the attack, it’s been speculated that this may have been the work of a nation-state attacker.

The Good

It was caught prior to widespread production deployment. A security researcher, Andres Freund (AndresFreundTec), noticed odd performance behavior with sshd and serendipitously caught the issue within a month of its introduction. As far as we know, the discovery came before the backdoor reached production operating systems, though there were signals that the hostile actor or actors were actively working to get the compromised code into multiple environments. We, as a community, were fortunate that this was discovered early, as it prevented potentially widespread damage. Even so, the potential impact and the novel approach (which really speaks to an advanced and persistent attacker) make this a significant moment for the OSS community.
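
For readers who want to confirm which side of that line a given host sits on, here is a rough triage sketch. It assumes you have already captured the text printed by `xz --version` and by `ldd` on your sshd binary; the helper names and the sample strings are illustrative, and the affected releases listed (5.6.0 and 5.6.1) are the ones named in the public advisories for CVE-2024-3094. Your distribution's own advisory remains the authoritative source.

```python
import re
from typing import Optional

# Releases named in the public advisories for CVE-2024-3094.
AFFECTED_XZ_VERSIONS = {"5.6.0", "5.6.1"}


def parse_xz_version(version_output: str) -> Optional[str]:
    """Pull the first x.y.z version number out of `xz --version` output."""
    match = re.search(r"\b(\d+\.\d+\.\d+)", version_output)
    return match.group(1) if match else None


def is_affected(version_output: str) -> bool:
    """True when the reported version is one of the affected releases."""
    return parse_xz_version(version_output) in AFFECTED_XZ_VERSIONS


def links_liblzma(ldd_output: str) -> bool:
    """True when `ldd` output for a binary lists liblzma as a linked library."""
    return any(
        line.strip().startswith("liblzma.so") for line in ldd_output.splitlines()
    )


if __name__ == "__main__":
    # Illustrative sample text, not captured from a real machine.
    sample_version = "xz (XZ Utils) 5.6.1\nliblzma 5.6.1"
    sample_ldd = "\tliblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f0000000000)"
    print(is_affected(sample_version), links_liblzma(sample_ldd))
```

A version match alone does not prove that the backdoor is active on a host, since activation depended on how the package was built and linked, but it is a reasonable first filter for deciding where to look more closely.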

The Bad

At Stacklok, we have been talking about the transition from hackers "sneaking in through open windows" with a CVE-oriented attack model to "bribing people at window manufacturers to ensure that all windows are manufactured to be less secure". This is the first concrete example of something that we believe represents a potential sea change in OSS security. It is worth emphasizing that this pattern of exploit is neither simple to effect nor easy to address.

The attacker’s methodical approach to gaining trust, and to working with multiple people to circumvent checks and balances, highlights the limitations of existing security tools in identifying such threats.

Existing SCA (Software Composition Analysis) tools that rely on the CVE as the primary ‘currency’ for security simply don’t provide as much security for this type of incident as organizations may hope. Certainly now that the issue is understood, tools will address the problem, but the community is vulnerable in the meantime. We need innovative solutions to support the open source community and bring new capabilities to address these challenges.

We are looking hard at the Trusty and Minder roadmaps in light of this issue. We can see ways in which the tools will, over time, help both communities and organizations navigate this type of situation. It will take time for new tools to emerge and be adopted at sufficient scale to move the needle here. The most important transition will be moving from making ‘good/bad’ decisions about software on the basis of the presence or absence of a CVE to making consumption decisions based on aggregate health and community support, and drawing the community’s attention to projects that are critical and need more love.

The Ugly

The really tough truth is that open source maintainers are doing critical work, in many cases for the love of technology and without a lot of recognition or reward.

One of the worst possible outcomes of a situation like this is layering even more pressure on under-served and under-appreciated maintainers who are already struggling.

Without really speaking to the involved individuals it is hard to know what exactly transpired here, but it seems plausible that the drag and thanklessness of OSS maintainer roles created an opening for a hostile actor to gain a toehold in a critical environment.

Given the seriousness of the threat and the risk this represents to national interests, we face the very real prospect of something that looks and feels a little like ‘McCarthyism’. The outcome cannot be putting even more pressure on under-appreciated people. We need to provide better support to maintainers across the board, and enterprise organizations need to step forward and give back to OSS communities that support them.

Looking forward: How to prevent attacks like this from affecting your organization

Here are a few concrete things you might think about as you navigate the fallout of this incident for your organization:

Find safety in numbers

This specific attack targeted a critical piece of technology that has been under-maintained. We need to bring more resources to critical projects that are under-maintained and train our engineers to make good moment-to-moment choices around the intrinsic sustainability of a package. Packages that have healthy communities around them will, in aggregate, be safer than packages that have dwindling community engagement and support. Let’s help engineers understand how to assess the health of a community, and gravitate towards projects that have stronger health metrics.
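
One simple starting point for assessing community health is how concentrated a project's recent work is. The sketch below is an illustration only: the function names and the two-commit cutoff are assumptions, not established metrics, and the input is assumed to be a list of author identifiers for recent commits (for example, the output of `git log --format=%ae` split into lines).

```python
from collections import Counter


def top_contributor_share(commit_authors: list[str]) -> float:
    """Fraction of commits made by the single most active author."""
    if not commit_authors:
        return 0.0
    counts = Counter(commit_authors)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(commit_authors)


def active_contributors(commit_authors: list[str], min_commits: int = 2) -> int:
    """Number of authors with at least `min_commits` commits (cutoff is arbitrary)."""
    counts = Counter(commit_authors)
    return sum(1 for n in counts.values() if n >= min_commits)


if __name__ == "__main__":
    # Hypothetical commit history: one author carries most of the load.
    authors = ["alice@example.org"] * 8 + ["bob@example.org"] * 2
    print(top_contributor_share(authors), active_contributors(authors))
```

A high share for one author is not a defect in itself, but it is a hint that a project depends on very few people and may deserve extra support before you rely on it heavily.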

Move beyond a "binary world" of good and bad

Today, the presence or absence of a CVE is too often the only indicator we use for the health of a piece of software we are consuming. It is critical that we start to transition to assessing a piece of software based on the provenance and attestations associated with it. We need to understand and invest in metrics around community health and sustainability.
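
To make the contrast with a binary check concrete, here is a small sketch of an assessment that returns a list of concerns instead of a single good/bad flag. Everything in it is hypothetical: the `PackageSignals` fields, the thresholds and the wording of the concerns are placeholders for whatever signals your organization can actually collect, and this is not how Trusty or any other specific tool scores packages.

```python
from dataclasses import dataclass


@dataclass
class PackageSignals:
    """Hypothetical signals gathered about a dependency."""
    known_cves: int
    has_provenance_attestation: bool
    active_maintainers: int
    days_since_last_release: int


def concerns(signals: PackageSignals) -> list[str]:
    """Return every concern found, rather than a single pass/fail answer."""
    found = []
    if signals.known_cves > 0:
        found.append("known CVEs")
    if not signals.has_provenance_attestation:
        found.append("no provenance attestation")
    if signals.active_maintainers < 2:  # illustrative threshold
        found.append("fewer than two active maintainers")
    if signals.days_since_last_release > 365:  # illustrative threshold
        found.append("no release in over a year")
    return found


if __name__ == "__main__":
    # A package with zero CVEs can still raise questions worth reviewing.
    print(concerns(PackageSignals(0, False, 1, 30)))
```

The point of the design is that a package with no CVEs no longer passes automatically; it is reviewed on the same footing as any other package that raises a concern.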

Find better ways to support software maintainers

Look at the work that folks like Tidelift are doing to support communities by directing contributions directly to maintainers. Please ask your vendors how intrinsically sustainable and community-aligned their approach to OSS engagement is. Are you buying services from an organization that is ‘strip mining’ OSS communities, or from one that is giving back by paying contributors for the work they are doing? Ask questions like ‘how many OSS-only maintainers do you hire’ when assessing who to buy technology from.

Proactively prepare for sophisticated threats

We now live in a world of ‘professional spies compromising software’. We must all stand vigilant and support the community in discovering and addressing their actions. At Stacklok, for example, we have implemented and are enriching a service (Trusty) that registers a watch on public repositories, and then generates trust heuristics every time a new project is published. If we see something suspicious, we flag it to our analysts and initiate a takedown.

As we and others offer up increasingly rich heuristics around the reputation of packages, it would be good to encourage the development of community-based tools to support discovery and isolation of malicious content.