GhostHook – Bypassing PatchGuard with Processor Trace Based Hooking

June 22, 2017 Kasif Dekel

In this article, we’ll present a new hooking technique that we have found during our research work.

Hooking techniques give you the control over the way an operating system or a piece of software behaves. Some of the software that utilizes hooks include: application security solutions, system utilities, tools for programming (e.g. interception, debugging, extending software, etc.), malicious software (e.g. rootkits) and many others.

Please note, this is neither an elevation nor an exploitation technique. This technique is intended for post-exploitation scenario where the attacker has control over the asset. Since malicious kernel code (rootkits) often seeks to establish persistence in unfriendly territory, stealth technology plays a fundamental role.

Technical Description

The GhostHook technique we discovered can provide malicious actors or information security products with the ability to hook almost any piece of code running on the machine.

Let’s start by explaining the primary technology involved in this technique, Intel® PT:

Intel® Processor Trace (Intel PT) is an extension of Intel® Architecture that captures information about software execution using dedicated hardware facilities that cause only minimal performance perturbation to the software being traced.

This information is collected in data packets. The initial implementations of Intel PT offer control flow tracing, which generates a variety of packets to be processed by a software decoder.

The packets include timing, program flow information (e.g. branch targets, branch taken/not taken indications) and program-induced mode related information (e.g. Intel TSX state transitions, CR3 changes). These packets may be buffered internally before being sent to the memory subsystem or another output mechanism that is available in the platform.

Debug software can process the trace data and reconstruct the program flow. Here’s a list of a change-of-flow instructions which Intel PT traces:

Type	Instructions
Conditional Branch	JA, JAE, JB, JBE, JC, JCXZ< JECXZ, JRCXZ, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ, LOOP, LOOPE, LOOPNE, LOOPNZ, LOOPZ
Unconditional Direct Branch	JMP (E9 xx, EB xx), CALL (E8 xx)
Indirect Branch	JMP (FF /4), CALL (FF /2)
Near Ret	RET (C3, C2 xx)
Far Transfers	INTn, INTO, IRET, IRETD, IRETQ, JMP (EA xx, FF /5), CALL (9A xx, FF /3), RET (CB, CA xx), SYSCALL, SYSRET, SYSENTER, SYSEXIT, VMLAUNCH, VMRESUME

Intel PT was initially released as part of “Broadwell” (5^th-generation) CPU and was expanded on “Skylake” (6^th-generation) CPU.

So basically, Intel PT provides low overhead hardware that executes tracing on each hardware thread using dedicated hardware (implemented entirely in hardware) in the CPU’s Performance Monitoring Unit (PMU). Intel PT can trace any software the CPU runs including hypervisors (except for SGX secure containers).

This technology is primarily used for performance monitoring, diagnostic code coverage, debugging, fuzzing, malware analysis and exploit detection.

There are three types of tracing:

Tracing of the entire user-mode/kernel-mode (current privilege level).
Tracing a single process (Page Map Level 4).
Instruction Pointer tracing, and this is what we will take advantage of.

To enable tracing, all you have to do is set the proper values inside the IA32_RTIT MSRs according to the tracing type.

Although this technology can be used for legitimate, valuable purposes, one can also take advantage of the buffer-is-going-full notification mechanism to try to take control of a thread’s execution.

The basis of this proposed technique is to make the CPU branch to our piece of code. How can we achieve that with Intel PT?

Allocate an extremely small buffer for the CPU’s PT packets.

This way, the CPU will quickly run out of buffer space and will jump the PMI handler.

The PMI handler is a piece of code controlled by us and will perform the “hook”.

The PMI handler is invoked when the buffer is full or about to be full and can be registered via:

HalSetSystemInformation(HalProfileSourceInterruptHandler, sizeof(PMIHANDLER), (LPVOID)&amp;hookroutine);

Start the Intel PT to trace a critical range of code in the kernel. For example, the LSTAR MSR, which is the system-call entry point of the kernel. It’s address can be obtained as follows:

ULONG64 LSTAR = ((ULONG64(*)())"\xB9\x82\x00\x00\xC0\x0F\x32\x48\xC1\xE2\x20\x48\x09\xD0\xC3")();

This will produce a naked function with the following instructions:

As mentioned, LSTAR is the kernel’s RIP SYSCALL entry (MSR entry 0xc0000082) for 64-bit software.

So basically, we will intercept the nt!KiSystemCall64 function, which is the entry point for service functions on Windows, until the end of the nt!KiSystemServiceUser.

Once a user-mode (SYSCALL) or a kernel-mode (ZW functions) thread will branch inside this code region, the CPU will trace its execution.

As mentioned before, we are allocating a tiny buffer for the CPU to get it filled almost immediately. We can also use the PTWRITE instruction to make it more precise. PTWRITE will allow us to write data to a processor trace packet, and once the buffer is full, the CPU will interrupt the execution and will call the PMI handler (controlled by us) in the context of the running thread. It is possible to completely alter the execution context at this point, which is exactly the same as what one could do via traditional opcode-replacement-based patching for a given location. Proof of Concept:

The technique described above, in the current implementation, will result in a race-condition manner. This stack trace demonstrates hooking the service function nt!NtClose, called by a user-mode application.

The same thing can be done for example to an IDT routine:

Registering to PMI is transparent to the current PatchGuard implementation. Because this technique uses hardware to gain control of a thread’s execution and kernel code/critical kernel structures aren’t being patched, it would be extremely difficult for Microsoft to detect and defeat this technique. Thus, the proposed method should be preferable (despite adding significant complexity to the implementation). Moreover, the suggested technique should be future proof and reliable across kernel versions.

Microsoft’s response:

“The engineering team has finished their analysis of this report and determined that it requires the attacker already be running kernel code on the system. As such, this doesn’t meet the bar for servicing in a security update however it may be addressed in a future version of Windows. As such I’ve closed this case.”

Microsoft does not seem to realize that PatchGuard is a kernel component that should not be bypassed, since PatchGuard blocks rootkits from activities such as SSDT hooking, not from executing code in kernel-mode.

Red Team Insights on HTTPS Domain Fronting Google Hosts Using Cobalt Strike

In this blog post, we will cover HTTPS domain fronting Google hosts (google.com, mail.google.com, etc.) usi...

Shadow Admins – The Stealthy Accounts That You Should Fear The Most

Shadow Admin accounts are accounts in your network that have sensitive privileges and are typically overloo...

Up Your Security I.Q. by Checking Out Our Collection of Curated Resources.

GhostHook – Bypassing PatchGuard with Processor Trace Based Hooking

Previous Article

Next Article

STAY IN TOUCH

GhostHook – Bypassing PatchGuard with Processor Trace Based Hooking

Previous Article

Next Article

Recommended for You

In July 2024, Google introduced a new feature to better protect cookies in Chrome: AppBound Cookie Encryption. This new feature was able to disrupt the world of infostealers, forcing the malware...

Unless you lived under a rock for the past several months or started a digital detox, you have probably encountered the MCP initials (Model Context Protocol). But what is MCP? Is this just a...

The Model Context Protocol (MCP) is an open standard and open-source project from Anthropic that makes it quick and easy for developers to add real-world functionality — like sending emails or...

TL;DR In this post, we introduce our “Adversarial AI Explainability” research, a term we use to describe the intersection of AI explainability and adversarial attacks on Large Language Models...

Introduction The term “Agentic AI” has recently gained significant attention. Agentic systems are set to fulfill the promise of Generative AI—revolutionizing our lives in unprecedented ways. While...

In the past two years, large language models (LLMs), especially chatbots, have exploded onto the scene. Everyone and their grandmother are using them these days. Generative AI is pervasive in...

Cryptojacking malware—a type of malware that tries to steal cryptocurrencies from users on infected machines. Curiously, this kind of malware isn’t nearly as famous as ransomware or even...

Introduction Identity providers (IdPs) or Identity and Access Management (IAM) solutions are essential for implementing secure and efficient user authentication and authorization in every...

You might not recognize the term “OAuth,” otherwise known as Open Authorization, but chances are you’ve used it without even realizing it. Every time you log into an app or website using Google,...

While Kubernetes’ Role-based access control (RBAC) authorization model is an essential part of securing Kubernetes, managing it has proven to be a significant challenge — especially when dealing...

TL;DR ByteCodeLLM is a new open-source tool that harnesses the power of Local Large Language Models (LLMs) to decompile Python executables. Furthermore, and importantly, it prioritizes data...

Generated using Ideogram Abstract Privacy is a core aspect of our lives. We have the fundamental right to control our personal data, physically or virtually. However, as we use products from...

Recently, we researched a project on Portainer, the go-to open-source tool for managing Kubernetes and Docker environments. With more than 30K stars on GitHub, Portainer gives you a user-friendly...

As large language models (LLMs) become more advanced and are granted additional capabilities by developers, security risks increase dramatically. Manipulated LLMs are no longer just a risk of...

In software development, CI/CD practices are now standard, helping to move code quickly and efficiently from development to production. Azure DevOps, previously known as Team Foundation Server...

tl;dr: Large language models (LLMs) are highly susceptible to manipulation, and, as such, they must be treated as potential attackers in the system. LLMs have become extremely popular and serve...

Over the short span of video game cheating, both cheaters and game developers have evolved in many ways; this includes everything from modification of important game variables (like health) by...

Following our post “A Brief History of Game Cheating,” it’s safe to say that cheats, no matter how lucrative or premium they might look, always carry a degree of danger. Today’s story revolves...

During a recent customer engagement, the CyberArk Red Team discovered and exploited an Elevation of Privilege (EoP) vulnerability (CVE-2024-39708) in Delinea Privilege Manager (formerly Thycotic...

Golang applications that use HTTPS requests have a built-in SSL verification feature enabled by default. In our work, we often encounter an application that uses Golang HTTPS requests, and we have...