Decentralized Identity Attack Surface – Part 2

December 15, 2022 Shaked Reiner

Introduction

This is the second part of our Decentralized Identity (DID) blog series. In case you’re not familiar with DID concepts, we highly encourage you to start with the first part. This time we will cover a different DID implementation — Sovrin. We will also see what a critical (CVSS 10) DID vulnerability looks like by reviewing the one we found in this popular implementation.

Sovrin

The Sovrin network is a public service network that enables anyone to get a self-sovereign identity (DID) on the internet. It is operated and governed by the Sovrin Foundation (a nonprofit organization).

Sovrin is probably one of the more popular DID networks currently in production. It recently celebrated the five-year anniversary of its MainNet network, which has processed more than 150k transactions.

Sovrin is a deployment of Hyperledger Indy, which provides a distributed ledger and various tools specifically for identity purposes. Hyperledger Indy is one of the projects of the Hyperledger Foundation, which focuses on blockchain software and lives under the Linux Foundation umbrella.

One key difference between the Indy ledger and any other ledger you’re probably familiar with is that Indy is permissioned. That means that no one can just spin up a node and start writing to the Indy ledger. Every network deployment (e.g., Sovrin) needs to have a governance document that defines specifically who is permitted to run a node. This results in the following:

  1. Decentralization is impaired, since not just anyone can be a part of the network and a finite number of nodes is enforced.
  2. Node-client trust is no longer an issue. The client can know in advance who the nodes are and verify that every piece of data they get from the blockchain is signed and agreed on by all nodes.

The permissioned model allows Indy to have different roles and privilege levels in the network, in which some can only observe the ledger, some can write to it and others can perform administrative operations.

The Indy distributed solution for identity purposes is very elaborate, and if you want to expand your knowledge of it outside the scope of our post here, I encourage you to do so using the code or this presentation.

Diving In

Now that we have our context all set up, let’s begin to dig into the technical stuff, which will mainly be the Indy node. The code of the node is what actually drives the network, and Sovrin is entirely based on Hyperledger Indy.

Every time you start to look at a new system, it can be helpful to map out the inputs and outputs of the system. This gives you a general “feel” of the system and also maps out most of the attack surface for you. When we talk about the Indy node (or any node in a blockchain environment for that matter), we have a few inputs:

  • The Blockchain/Ledger – The node needs to read information off the chain (be it identity information or any other kind).
  • Request Handlers – It is crucial for the network’s operation that the nodes can handle requests from both clients and other nodes.
    • Client requests – These will be requests for both reading and writing data from/to the chain.
    • Node requests – These are usually more operational, whether they are related to the network’s structure itself, mempool information, message propagation, etc.

The Indy node developers made it easier on us by having a very organized structure for both the Python request classes and their files and folders. This makes it easy to correlate the code with the description in Indy’s documentation.

$ tree /indy_node/server/request_handlers
.
├── action_req_handlers
│   ├── __init__.py
│   ├── pool_restart_handler.py
│   └── validator_info_handler.py
│   ...
├── config_req_handlers
│   ├── __init__.py
│   ├── auth_rule
│   │   ├── __init__.py
│   │   ├── abstract_auth_rule_handler.py
│   │   ...
│   ...
├── domain_req_handlers
│   ├── __init__.py
│   ├── attribute_handler.py
│   ...
├── pool_req_handlers
│   ├── __init__.py
│   └── node_handler.py
├── read_req_handlers
│   ├── __init__.py
│   ├── get_attribute_handler.py
│   ...
└── utils.py

Every request class type has a few abstract functions for different kinds of validations and verifications that are invoked in different parts of the request-handling process. Going over these handlers while reading the documentation allowed us to have a much better understanding of the system and how the nodes operate within it.
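To illustrate the idea of staged validation hooks, here is a minimal sketch of a handler base class with abstract checks that run at different points of request handling. Every name in it (ToyRequestHandler, ToyAttributeHandler, static_validation, apply, handle) is hypothetical and chosen for illustration; it is not Indy's or Plenum's actual class hierarchy.

```python
from abc import ABC, abstractmethod


class ToyRequestHandler(ABC):
    """Illustrative base class; names are hypothetical, not Indy's API."""

    @abstractmethod
    def static_validation(self, request):
        """Check the request's shape without consulting any state."""

    @abstractmethod
    def dynamic_validation(self, request):
        """Check the request against the current state."""

    @abstractmethod
    def apply(self, request):
        """Perform the request once all checks have passed."""

    def handle(self, request):
        # Each hook is invoked at a different stage of request handling.
        self.static_validation(request)
        self.dynamic_validation(request)
        return self.apply(request)


class ToyAttributeHandler(ToyRequestHandler):
    def __init__(self):
        self.state = {}

    def static_validation(self, request):
        if "key" not in request or "value" not in request:
            raise ValueError("request must contain 'key' and 'value'")

    def dynamic_validation(self, request):
        if request["key"] in self.state:
            raise ValueError("attribute already exists")

    def apply(self, request):
        self.state[request["key"]] = request["value"]
        return True
```

When reviewing a system built this way, each hook is a separate place where user-controlled input may be touched, which is why it pays to read every one of them.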

What Does a DID CVSS 10 Vulnerability (CVE-2022-31020) Look Like?

One of the request handlers stood out in the initial review: pool_upgrade_handler.py. The node’s code tried to handle a parameter, PACKAGE, that was not found in the documentation. Naturally, we had to dig in and understand if there was anything of interest in this request.

pkg_to_upgrade = operation.get(PACKAGE, getConfig().UPGRADE_ENTRY)

Based on the documentation, a POOL_UPGRADE command is a command to upgrade the Pool (sent by Trustee). It upgrades the specified Nodes (either all nodes in the Pool, or some specific ones).
This handler’s additional_dynamic_validation() function, however, seems able to upgrade packages with different names on the pool nodes, whereas the documentation doesn’t mention a package name at all, implying that only the Indy package can be upgraded.

def additional_dynamic_validation(self, request: Request, req_pp_time: Optional[int]):
        self._validate_request_type(request)
        identifier, req_id, operation = get_request_data(request)
        status = '*'

        pkg_to_upgrade = operation.get(PACKAGE, getConfig().UPGRADE_ENTRY)
        targetVersion = operation[VERSION]
        reinstall = operation.get(REINSTALL, False)

        if not pkg_to_upgrade:
            raise InvalidClientRequest(identifier, req_id, "Upgrade package name is empty")

        try:
            res = self.upgrader.check_upgrade_possible(pkg_to_upgrade, targetVersion, reinstall)
        except Exception as exc:
            res = str(exc)
...

Our POOL_UPGRADE handler is implemented in the PoolUpgradeHandler class, which inherits from WriteRequestHandler. The base class is imported from Plenum (an Indy project on which the node is based). If we take a look at it, we can see that our additional_dynamic_validation() function is called as part of the dynamic validation (which, as the name suggests, happens before the request is performed), right after an authorize function:

def dynamic_validation(self, request: Request, req_pp_time: Optional[int]):
        self._validate_request_type(request)
        self._validate_ledger_is_not_frozen(request)
        self.authorize(request)
        self.additional_dynamic_validation(request, req_pp_time)

This gives us a bit of context, and we can now focus on what happens with our package name, a user-controlled input, during the validation stage in self.upgrader.check_upgrade_possible(). To make a long story short, the name is passed “as is” through a few inner functions: check_upgrade_possible() → curr_pkg_info() → _get_curr_info(). The last one runs a system command that executes the package manager (dpkg) with the supplied package name as an argument in order to get its information.

@classmethod
def _get_curr_info(cls, package):
    cmd = compose_cmd(['dpkg', '-s', package])
    return cls.run_shell_command(cmd)

Being naïve, we assumed there would be some sanitization or validity checks on our supplied package name in either the compose_cmd() or the run_shell_command() function.

def compose_cmd(cmd):
    if os.name != 'nt':
        cmd = ' '.join(cmd)
    return cmd

def run_shell_command(cls, command, timeout=TIMEOUT):
        try:
            ret = subprocess.run(command, shell=True, check=True, stdout=subprocess.PIPE, timeout=timeout)
            ret_bytes = ret.stdout
        except subprocess.CalledProcessError as ex:
            ret_bytes = ex.output
        except Exception as ex:
            raise Exception("command {} failed with {}".format(command, ex))
        ret_msg = ret_bytes.decode(locale.getpreferredencoding(), 'decode_errors').strip() if ret_bytes else ""
        return ret_msg

Lo and behold, no validations, no checks and no sanitization. We can supply any package name our heart desires, and it’ll be concatenated with dpkg -s to finally be executed on the node! Needless to say, we have a command injection vulnerability on our hands. Just slap a semicolon and a reverse shell as the package name, and you get control of any Indy node you like.
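To make the injection concrete, here is a minimal sketch that reproduces only the string handling from the excerpts above, without executing anything. The payload and the helpers is_valid_package_name() and safe_dpkg_argv() are our own illustrations, not part of Indy. The allow-list pattern is an assumption modeled on Debian's package naming rules, and the actual fix in Indy may look different.

```python
import re


def compose_cmd(cmd):
    # Same joining logic as the excerpt above (non-Windows branch only).
    return ' '.join(cmd)


# Hypothetical attacker-supplied package name.
payload = "some-package; id"
command = compose_cmd(['dpkg', '-s', payload])
print(command)  # prints: dpkg -s some-package; id
# With shell=True, the shell would treat this as two commands:
# `dpkg -s some-package` followed by `id`.

# Assumed allow-list: lowercase letters, digits, '+', '-' and '.',
# at least two characters, starting with a letter or digit.
PACKAGE_NAME_RE = re.compile(r'[a-z0-9][a-z0-9+.\-]+')


def is_valid_package_name(name):
    # fullmatch() rejects anything outside the pattern, including newlines.
    return PACKAGE_NAME_RE.fullmatch(name) is not None


def safe_dpkg_argv(package):
    # Build an argument list intended for subprocess.run(argv, shell=False),
    # so the package name is never interpreted by a shell.
    if not is_valid_package_name(package):
        raise ValueError("invalid package name: {!r}".format(package))
    return ['dpkg', '-s', package]
```

Either measure alone (validating the name, or passing an argument list with no shell involved) would have blocked this particular payload; using both is the more defensive choice.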

So far, we know arbitrary code can be executed using the vulnerable POOL_UPGRADE request handler, but who exactly has sufficient permissions to do that? (Remember the authorize function?) The documentation says that you must be a Trustee to upgrade nodes in the pool. A Trustee is a privileged entity in the Indy network, and you must be explicitly given this permission. Decentralized systems keep reminding us and illustrating better than ever that “code is law.” This case is no different, so let us check whether the implementation is aligned with the documentation.

Since authorize() is not implemented in either the abstract base class (WriteRequestHandler), or our PoolUpgradeHandler class, anyone can trigger this vulnerability and execute code on any node in the network. Authorization does happen later in PoolUpgradeHandler. As expected, only Trustees can perform an actual upgrade, but this is already too late for authorization of the vulnerability trigger, which happens during the validation/checks of the package-to-be-upgraded name. As if this were not bad enough, the node also propagates the POOL_UPGRADE request. Once you send this payload to one node, it will forward it to all the nodes in the current pool, and you will effectively own the entire network.
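The ordering problem can be sketched with a toy model. The classes below (ToyWriteHandler, ToyFixedHandler, Unauthorized) and the request layout are hypothetical and heavily simplified; they are not Plenum's or Indy's real code. They only show how a no-op authorize() lets untrusted input reach the risky validation step, and one possible way of checking the role first, which is not necessarily how the Indy team fixed the issue.

```python
class Unauthorized(Exception):
    pass


class ToyWriteHandler:
    """Toy stand-in for a write request handler; not Plenum's real class."""

    def __init__(self):
        self.checked_packages = []  # records input that reached the risky step

    def authorize(self, request):
        # Mirrors the situation described above: nothing is enforced here.
        pass

    def additional_dynamic_validation(self, request):
        # In the real node, this step eventually reaches a shell command.
        self.checked_packages.append(request["package"])

    def dynamic_validation(self, request):
        self.authorize(request)
        self.additional_dynamic_validation(request)


class ToyFixedHandler(ToyWriteHandler):
    def authorize(self, request):
        # Hypothetical role check performed before any input is processed.
        if request.get("role") != "TRUSTEE":
            raise Unauthorized("only a Trustee may request a pool upgrade")


request = {"role": None, "package": "some-package; id"}

vulnerable = ToyWriteHandler()
vulnerable.dynamic_validation(request)  # input reaches the risky step unchecked

fixed = ToyFixedHandler()
try:
    fixed.dynamic_validation(request)
except Unauthorized as exc:
    print("rejected:", exc)
```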

Vulnerability Impact

It is probably clear to anyone who has spent even a little bit of time in the security industry that an unauthenticated RCE is pretty much “game over.” It can shut down or destroy the entire network in a heartbeat. But what can we actually do with it within the context of our DID system?

Even though Indy is a permissioned blockchain, it is still public (i.e., anyone can read anything). And since the nodes don’t hold users’ private data, it would probably not be too appealing for an attacker to steal information from the nodes. However, what will be completely detrimental to the system is breaking the consensus, and, of course, we can do that.

After getting code execution, an attacker is able to steal the node’s private keys, change actual data in the blockchain itself (stored locally in a RocksDB) or leave a backdoor in the Indy-node code. Of course, the attacker needs to do that on a number of nodes that can break the consensus. Having said that, this is not a proof-of-work blockchain where you need to compromise 51% of the nodes. Hyperledger Indy uses the RBFT consensus protocol, which allows the network to operate properly even with a small number of malicious nodes. This number varies between deployments, but for Sovrin, it is only nine.
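As a back-of-the-envelope check, Byzantine fault-tolerant protocols of the kind RBFT belongs to need n ≥ 3f + 1 nodes to tolerate f malicious ones, so once f + 1 nodes are compromised the protocol's guarantees no longer hold. The sketch below assumes a pool of 25 validator nodes; that figure is our assumption, picked because it is consistent with the nine quoted above, and is not a number taken from Sovrin's documentation.

```python
def max_tolerated_faults(n):
    # Largest f that satisfies n >= 3f + 1.
    return (n - 1) // 3


def nodes_needed_to_break_consensus(n):
    # One more than the protocol is designed to tolerate.
    return max_tolerated_faults(n) + 1


n = 25  # assumed pool size, for illustration only
print(max_tolerated_faults(n), nodes_needed_to_break_consensus(n))  # prints: 8 9
```

Compare that with the roughly 13 nodes that a simple majority of the same pool would require.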

Then the attacker can steal an identity — not by stealing the user’s private key but by rotating the identity’s public key in the ledger to a new one, paired with a private key controlled by the attacker.

Conclusion

In this second part of our DID attack surface posts, we’ve had a quick overview of the popular Sovrin DID network and the project it’s based on, Hyperledger Indy. We also saw how a rather simple, well-established bug class can be found in a fairly new technology: a DID network.

As with every new technology, it is in our interest to have security considerations involved in its development as early as possible. In the past couple of years, we’ve been witnessing a paradigm shift in which Decentralized Finance (DeFi) app developers understand the huge risk vulnerabilities can pose to their systems (mainly because it’s easy to quantify in dollars), and they are willing to invest all types of resources to better secure their systems (and our lives). The best example of that is the Immunefi bug bounty platform. Just take a gander at their website and see how much these companies are willing to put in for security. We hope that such investment in security will soon be more common, whether it is in non-financial decentralized systems like the DID networks we covered here or in more traditional centralized platforms.

When it comes to DID security, you can’t simply secure the new decentralization/identity-related code; you must also secure standard components like web interfaces or request handlers. These may not be as exciting as the innovative components, but we saw that a security issue in one of them can be just as detrimental to the entire system. Finally, we hope to encourage others in the security community to delve into the DID world and help us make it more secure.

Disclosure Timeline

  • May 2, 2022 – vulnerability reported to Hyperledger Indy
  • May 6, 2022 – Indy team created a security advisory draft
  • May 31, 2022 – Indy team issued a fix
  • June 1, 2022 – GitHub issued CVE-2022-31020 for this issue
  • August 2022 – Indy team deployed the fix in the networks they work with
  • September 2, 2022 – public disclosure
