Debugging the Technical Interview

This is the second part of my adoption of …ing the technical interview. A blog series by Aphyr about writing programs in funny, non-standard ways. Again this is a writeup of a CTF challenge I created. Specifically, the dive in the lake challenge of LakeCTF, organized by the CTF of EPFL polygl0ts.

Since, the last technical interview in Java was apparently not enough to get me hired as a performance engineer, I need to go to the binary level. Also, I got the feedback that the code last time would be hard to debug. Therefore, even though shipping a binary, let’s include debug information in the DWARF format.

The ~~idea~~interview question is to write a simple flag checker. Let’s get some inputs for the flag. I only recently transitioned from Java to C, so I don’t know how to read-in strings yet. There are no real types in C anyway, so I was told. Also, I didn’t get the remarks of the interviewer about error handling. Now, the flag checker will be implemented in f, I don’t yet know what to put in there, so let’s start with a simple add of the inputs.

#include <stdlib.h>

__attribute__((noinline))
long f(long a, long b, long c) {
    long flag = a + b + c;
    return flag;
}

int main(int argc, char** argv) {
    long x = strtol(argv[1], NULL, 10);
    long y = strtol(argv[2], NULL, 10);
    long z = strtol(argv[3], NULL, 10);
    return f(x, y, z);
}

We need to take care of the debugging symbols now – I don’t want to fail the interview again! I don’t trust the compiler on this! Let’s see what it produced. Combining performance with maintainability I use:

clang main.c -O3 -S -g3 -o main.S

And well… that is not a lot.

(partial screenshot of the DWARF debugging symbols in main.S)

I can do better. Let’s open up the DWARF standard. A quick read through the 477 pages reveals on page 223: DWARF opcodes.

(partial screenshot of some opcodes in the dwarf documentation)

That is nice! I now propose DDD – Debug Driven Development. This demonstrates leadership, I need to show that I am worthy of being hired as a senior engineer, who drives the development with modern agile approaches. DDD is about first writing the debugging symbols and then the actual source code. Let’s get started. We are reading a flag, so let’s first make sure it is ascii. I want to get hired as a performance engineer, so let’s do this fast. In the Google Doc I am sharing my thoughts with my interviewer.

a & 0x8080808080808080 == 0
b & 0x8080808080808080 == 0
c & 0x8080808080808080 == 0

In DWARF opcodes we first need to get the registers. I use the DW_OP_breg<i> where <i> is the number of the register. This is the numbering of the registers:

regs = [(0, 'rax'), (1, 'rdx'), (2, 'rcx'), (3, 'rbx'), (4, 'rsi'), (5, 'rdi'), (6, 'rbp'), (7, 'rsp'), (8, 'r8'), (9, 'r9'), (10, 'r10'), (11, 'r11'), (12, 'r12'), (13, 'r13'), (14, 'r14'), (15, 'r15'), (16, 'rip'), (17, 'xmm0'), (18, 'xmm1'), (19, 'xmm2'), (20, 'xmm3'), (21, 'xmm4'), (22, 'xmm5'), (23, 'xmm6'), (24, 'xmm7'), (25, 'xmm8'), (26, 'xmm9'), (27, 'xmm10'), (28, 'xmm11'), (29, 'xmm12'), (30, 'xmm13'), (31, 'xmm14')]

DW_OP_breg<i> takes an offset, that is always zero here. DW_OP_breg0=0x70, the first argument a is stored in rdi, so I get 0x7500 as the instruction that pushes rdi on the stack. Also, I need the constant 0x8080808080808080 on our stack three times to and it with our flag inputs. Luckily, the assembler in clang/llvm can convert .byte DW_OP_<something> for me. For brevity I wrote a function asm that prepends the DW_OP and puts an optional argument encoded in uleb128 behind it. This sets up the constants and does the first and.

out = ''
out += asm('const8u')
out += f'\t.quad\t{0x8080808080808080}\n'
out += asm('dup')
out += asm('dup')
out += asm('breg' + str(regs.index('rdi')), 0)
out += asm('and')

Every good systems-level programming has some match and replace (in python) that only covers the special case at hand, ask academia. So this beautiful code inserts our custom debug symbols in the assembly and lets the compiler do the rest of the work. Well, I can’t write thaaaat beautiful code in a Google Doc. Maybe the only thing notable here, is that I need to calculate the size of my code and insert it to the DWARF assembly.

def size(out: str) -> int:
    return out.count('.byte') + \
            leb_and_uleb_sizes(out) + \
            out.count('.short') * 2 + \
            out.count('.quad') * 8


with open('./main.S', 'r') as f:
    assembly = f.read()

start = '.Ldebug_loc0:\n'
front, back = assembly.split(start, 1)
front += start
front += """\t.byte	4                               # DW_LLE_offset_pair
\t.uleb128 .Ltmp0-.Lfunc_begin0           #   starting offset
\t.uleb128 .Lfunc_end0-.Lfunc_begin0      #   ending offset
"""
end = '.Ldebug_loc1:\n'
front += f'\t.byte\t{size(out)}\n'
back = back.split(end, 1)[1]
back = '\t.byte\t0\n' + end + back

with open('./main.S', 'w') as f:
    f.write(front + out + back)

subprocess.check_call(["clang", "main.S", "-o", "a2.out"])

Also we need do the actual check and then jump based on the result for our ascii check. DWARF provides a not equal DW_OP_ne that pushes 0 or 1 on the stack. And, a branch instruction DW_OP_bra that jumps a specified offset in the DWARF assembly, if the last entry on the stack is 1.

out = ''
out += asm('const8u')
out += f'\t.quad\t{0x8080808080808080}\n'
out += asm('dup')
out += asm('dup')
out += asm('breg' + str(regs.index('rdi')), 0)
out += asm('and')

out += asm('ne')
out += asm('bra')
out += '\t.short\t26\n'

In the technical interview, it is of utmost importance that I lay out my thought process but also get the programming done. Now is when I get the programming done. These are the flag checks we want to implement. The <var> variables are the concrete inputs of the functions, and the <var>_t variables are the correct targets.

def to_int(s: str) -> int:
    return sum((ord(c) << (8 * i) for i, c in enumerate(s)))


flag = 'EPFL{_1tTookS3venDwarf5}'

a & 0x8080808080808080 == 0
b & 0x8080808080808080 == 0
c & 0x8080808080808080 == 0

a_t = to_int(flag[:8])
b_t = to_int(flag[8:16])
c_t = to_int(flag[16:])

(a_t >> (8 * 7) & 0xff) % 2 == (a >> (8 * 7) & 0xff) % 2
(b_t >> (8 * 7) & 0xff) % 7 == (b >> (8 * 7) & 0xff) % 7
(c_t >> (8 * 7) & 0xff) % 9 == (c >> (8 * 7) & 0xff) % 9

a & b & c == a_t & b_t & c_t
a*a + b*b == (a_t*a_t + b_t*b_t) % 2**64
c*c + b*b == (b_t*b_t + c_t*c_t) % 2**64

This is writing some rather boring assembly for a simple stack machine. At the end I materialize an non existent variable as the result of our computation using DW_OP_stack_value. This value as a C boolean is the result of our flag check. The basic structure of the code is do a check. If the check is successful multiply a prime number to output. Then go to the next check. If the result (the first variable I pushed) is the target prime number, return 1, otherwise return 0.

out = ''
out += asm('const1u')
out += '\t.byte\t1\n'
out += asm('breg' + str(regs.index('rdi')), 0)
out += asm('breg' + str(regs.index('rsi')), 0)
out += asm('and')
out += asm('breg' + str(regs.index('rdx')), 0)
out += asm('and')
out += asm('const8u')
out += '\t.quad\t7219272754963824708\n'
out += asm('ne')
out += asm('bra')
out += '\t.short\t3\n'
out += asm('const1u')
out += '\t.byte\t3\n'  # multiply by 3
out += asm('mul')

out += asm('const8u')
out += f'\t.quad\t{0x8080808080808080}\n'
out += asm('dup')
out += asm('dup')
out += asm('breg' + str(regs.index('rdi')), 0)
out += asm('and')

out += asm('const1u')
out += '\t.byte\t0\n'
out += asm('ne')
out += asm('bra')
out += '\t.short\t26\n'

out += asm('breg' + str(regs.index('rsi')), 0)
out += asm('and')
out += asm('bra')
out += '\t.short\t9\n'

out += asm('breg' + str(regs.index('rdx')), 0)
out += asm('and')
out += asm('bra')
out += '\t.short\t3\n'
out += asm('const1u')
out += '\t.byte\t3\n'  # multiply by 3
out += asm('mul')

out += asm('breg' + str(regs.index('rdi')), 0)
out += asm('dup')
out += asm('mul')
out += asm('breg' + str(regs.index('rsi')), 0)
out += asm('dup')
out += asm('mul')
out += asm('plus')
out += asm('const8u')
out += '\t.quad\t18116903027530606121\n'
out += asm('ne')
out += asm('bra')
out += '\t.short\t3\n'
out += asm('const1u')
out += '\t.byte\t3\n'  # multiply by 3
out += asm('mul')

out += asm('breg' + str(regs.index('rdx')), 0)
out += asm('dup')
out += asm('mul')
out += asm('breg' + str(regs.index('rsi')), 0)
out += asm('dup')
out += asm('mul')
out += asm('plus')
out += asm('const8u')
out += '\t.quad\t16612709672999228116\n'
out += asm('ne')
out += asm('bra')
out += '\t.short\t3\n'
out += asm('const1u')
out += '\t.byte\t7\n'  # multiply by 7
out += asm('mul')
out += asm('const1u')
out += '\t.byte\t189\n'
out += asm('eq')
out += asm('stack_value')

FAQ:

Question:

The challenge is broken!!!11111eleven11! The binary is empty and it segfaults when I run it.

Answer:

Dive deeper 🤿! (…But after teams solved it they all gave very positive feedback and had fun)

Question:

How many teams solved the challenge?

Answer: 49 teams, it was one of the easier revs

Question:

How did players solve it?

Answer:

Staring at it and then translating the constraints to z3.
Asking ChatGPT and getting trolled, then proceeding with 1.
Writing a custom Binary Ninja architecture for DWARF opcodes and then using the decompilation (very cool)

Question:

Was the challenge tested.

Yes, as a good CTF the challenge was tested. Here is the conversation with the challenge tester.

Me: Hi $TesterName, how are you doing?

Me: Still got r2 installed?

Tester Victim: Hi

Tester Victim: Yes

Tester Victim: What for

Tester Victim: Are you playing the $Competition btw

Me:: Nice, that makes you perfectly well prepared to playtest my lake challenge $LinkToChallengeRepo

Tester Victim: shit it was a trap

Tester Victim: ok though

Me: thank you 🐞

Days pass, pwn2own happens, life goes on

Tester Victim: i hate u

Tester Victim: was fun