I was one of the Tea Deliverers at DEF CON 29 CTF Quals.
I had the luck to play in this famous CTF event with some of the best hackers out there, and I definitely couldn’t miss this opportunity even though it’s a 5-day vacation here in China for the International Workers’ Day.
This is NOT a write-up NOR a blog for a general viewer, but merely a record of what I did and what I thought about this first ever CTF I’ve done.
0x00
I honestly had no idea who I was going to work with before the event. I was simply asked by my supervisor during my internship at Chaitin if I want to be in a CTF team, and I said yes. Later I was connected to @zTrix, the CTO of the company, and later joined a team that is called Tea Deliverers, which is obviously one the best CTF teams in the entire world.
Fun fact, the company founders got to know each other at CTF events, and they founded this security company together because of their experiences as teammates.
I signed myself in to be one of the “web gangs,” and told the group leader I am probably good at XSS (although I have probably done more binary reversing than XSS, both I’m not good at). Spoiler alert, only later to found out that there’s practically no web questions in this CTF, sad.
We had a fancy dinner the eve before the match at a seafood/hotpot/BBQ buffet place, which is pretty notable cause who doesn’t like free food.
The competition starts at 8 AM China local time, and lasts for 48 hours non-stop, so I better get some good sleep.
Day 1
I arrived at our “base,” which is just an office room in the company, slightly after the release of the first challenge, say-hellooo.
say-hellooo
8:00 AM
This challenge simply asks to call the event host @Zardus and ask for the flag. Finding the phone number was rather easy, by going to the host’s twitter, where he links his personal website, and where his CV can be found. On his CV there’s his phone number. I called over and after some chitchat, he said that the flag is “hellospacehackers, all lower case.” And…… it’s not correct.
But of course it is correct, only I was stupid to not realize that when he said “space” it refers to the whitespace instead of the word. After reminded by my teammate, I got the first flag for the team, and spoiler alert, my only flag during the event.
baby-a-fallen-lap-ray
8:05 AM
Although I didn’t sign-up for a pwn question, and I practically had no pwn experiences ever other than some basically stack overflow knowledge, I still took a look at the challenge and tried to find out what this is about, since this challenge is the only one released at the time.
Inside the package there is one executable in elf format, which is the entry point to the service, called manchester, and a few binaries that starts with the magic header sephiALD, and a mysterious file p that contains strings of the menu when we connect to the server.
There was one link that points to the source code of a challenge in the previous year’s DEF CON Finals, and it’s seemingly the same thing, where manchester is actually an implementation of a Manchester dataflow machine. Thought that someone must have a disassembler since this question comes from a previous year, but I can’t find any. So I think it might be a good idea to start with a disassembler using the assmbler.py provided.
Firstly it is easy to build an unpacker and disassembler, but it is a bit confusing to change the raw operations into the forms of a .tass file, mainly I couldn’t figure out how the arguments work. One teammate who has read the paper walked me through how parallelism and dataflow works in this machine, and I was able to get a grasp on the idea. Yet recover the assembled binary back into human readable .tass might still be to much for me.
Teammates who are reversing this tells me that the opcodes and some internal structures are switched, however they are able to quickly figure out the new opcodes are and updated the table. Also they figured out that the mysterious file p was ran by the vm, and vm was ran by the manchester, so figuring out how manchester works was the first step, and we still had to reverse vm and then reverse p and pwn p.
Noticed that there was a graphing function to draw a dataflow of the program. With my current disassembler I could use that function to draw out a graph. However it was simply way too large and makes no sense at all.
I unfortunately has no other good ways to continue this problem, so I hand over my code to my teammates and worked on other problems.
nooombers
11:30 AM
I took a look at the challenge while doing baby-a-fallen-lap-ray, and people somehow just miraculously figure out what each operation mean. After some conversation I still don’t get how they manage to figure that out, probably by comparing it against common signature algorithms, or just eyeballing. Didn’t know what was happening so I left the challenge.
exploit-for-dummies
4:40 PM
Trivia, but definitely not trivial. The trivia part was rather easy, there were only 25 questions and you didn’t need to get all of them in order to pass the 5000 score mark. The flag was read into the memory but at a random mmapped location with a random offset. These two addresses are erased from the stack, but there was one number that equals to the offset xoring some random numbers that were stored both on stack and heap.
If we manage to crash the program, the front-end will spin up a gdb, and we can input an address that starts with 0x and gdb will print out a string located at that place. Nonetheless, it is definitely impossible to guess plainly of where the string is stored at.
My first intuition was there might be some exploits in that version of gdb, and we could write something using the save mechanism of the trivia game to pwn gdb. After some discussions with the teammates, we all agreed on that we should focus on gdb because there seemed to not have any bugs in that trivia game other than that file handle close crash.
After some play with gdb, I was able to get the address of the aforementioned number using 0x0+environ trick. However I got stuck because there were seemingly nothing I can do anymore. Written to core file is not an option as a crash would simply overwrite it.
Teammates found out some files that gdb would go and read, including iofclose.c, .gdb_history, and trivia.debug. But weirdly all of us couldn’t figure out how this relates to shellcoding. I unfortunately lost interest in this challenge, so I went on with the new challenges.
rick
10:00 PM
Two hours since the release of the problem with no significant progress made by my teammates. I jumped in to see where we at.
Level 1
I opened the OpenGL game, and there was one big building, a lever and a strange black cube that fell into the ground, with “Level 1” written on the corner of the program window. I played around and figured out the basic controls. I peeked into the door crack and saw there’s a yellow lever inside, so it must be that we need to enter the building.
Having wireshark in the background, I also see that the game connects to the server when it starts but sent nothing to the server. And the server will automatically disconnect with KO after 15 seconds.
Still, didn’t want to touch Ghidra and IDA, I used scanmem to try to find the variables that controlled the player coordinates. However I was not able to find the exact variable, and I could only filter down to ~50 locations in memory. Setting them all at once apparently breaks the game.
One of my teammates was pretty smart and point out we could just go and look the cross-references to gluLookAt, since it calculates the transformation matrix for the camera. Indeed I found six helper functions that get the eye coordinates and the look vector. Using gdb to set the value was way too much of a hassle, so I wrote a frida script that intercepts one of the functions, print and change the values at my will.
Teleported into the building just to see a giant rick roll (which is what I anticipated seeing rick as the title of the challenge). The image had Rick Astley saying “Don’t Cheat,” and apparently teleporting was not an option. Somehow I thought it might be that the clue is hidden in the textures, that if you manage to open the door without cheating another image would appear. So I dumped all the textures just to find nothing there.
Then there were teammates figuring out that the button ‘E’ does something, by back-tracing from the call to send, and observed there were some conditions needed to be met in order to enter the function that sends something to the server. I simply hooked that function to let it always return 1, and whoosh the door opens, with the lever and block outside turns white, but nothing else happened.
Further reversing of the function revealed that it go through all the levers in the map and checks if the player is in the bounding box of something. Returns 1 if it indeed is. However there were some other things done in the function that simply hooking this function will not do. I therefore hooked into the actual low level checking function. And boom, I was in level 2.
Level 2~9
Didn’t know what happened, but wireshark told me that an interaction with the server had been done. It was not too long after that that we finally figure out that you can simply press ‘E’ on the levers to trigger them. Being a gamer for so long not realizing that, I was disappointed in myself.
Able to manually fly through the first 6 levels as a gaming boi within 15 seconds each level, starting from 7 there were too many levers to do manually. So I wrote some script to automate the process, with hardcoded levers to press.
Level 10
And then I arrived at level 10, where I realized that the logic is random each time you play it. I, because the lack of experience, thought that this must be the final level, cause it’s different from the rest.
Still didn’t want to do much work because I was pretty tired at that time, I figure out one combination of levers that would work to pass level 10, and just put my script there running, hoped that by randomization it would be sooner or later that this combination works again.
And indeed it worked, only that I was met by level 11.
Level 11~37
Okay, it’s time to fuzz, I thought. While others were still reversing the protocol, I started to write my algorithm to automatically switch the levers at random, and see if the door opens. I first let each lever has a certain probability \(p\) to be on, but the effect is not so good and only gets me through the first 17 levels. Later by observation I realized only the lower end of levers are constantly being switched and not the higher-indexed ones. Then I realized that since each function only detects the lever once, each lever actually has no equal probability of being switched, but it’s rather a geometric distribution.
Tried a lot of ways to fix, including generating a random number that represents 1 and 0s of the lever, and also tried to fix the probability by giving each of them different Bernoulli distribution and see how it goes. Finally I settled down on a way which recorded the index of the last lever being pulled, to make sure that each lever has the chance of being pulled at least once. And check the state of the final gate to make sure we get to the next level as soon as we have an acceptable answer. I also play around with different probability distribution because apparently there were some levels that requires a lot of levers being pulled and some requires less.
I also optimizes a lot of other things. I first teleport around to press the levers, only to realize that was pretty inefficient. Later I simply had the ‘E’ hold down for all the time and tell the function to return true at my intended lever index, by keeping an internal counter. I also hooked glutSetTimer to make the arguments being 0 so each refresh is way quicker and a lot more trials can be done within 15 minutes.
Level 37~100
I was able to get to the 37th level, but no further progresses can be made by fuzzing. I kind of already know that this should be using a CSP or BSP solver, because LiveOverflow once done a challenge that was pretty similar to this. In the meanwhile, my fellow teammates finally reversed the protocol and realize that it is simply a tree structure that goes from the levers to the final gate. So all they have to do is to write the exploit now.
At 6:45 AM the second day, we got the flag.
Day 2
There were still no web questions released, so I took a look at various questions.
pza999
6:50 AM
Downloaded the package, just to realize there’s another QEMU VM kernel image. Have no idea what to do.
segnalooo
7:00 AM
Another pwn. The reversing part seemed straightforward, and the input seemed to take in a hex number. I tried “0000” locally and it goes through to the next line of output, but when I tried the same on the server it says “invalid hex,” and won’t let me proceed. Again didn’t what to do next.
After less than 3 hours of sleep, I woke up from a camp bed at 9:30 AM, and jumped back to work.
coooinbase
9:40 AM
src is actually a .gz file, after decompression there were a .html the front end, a .rb the backend and .sh which runs the QEMU VM. The front-end takes input and feed into the ruby backend, the backend check conditions on the input and bson→base64 encoded it, and fed it into the virtual machine. However, I had no idea how to statically analyze the kernel image (it’s not an elf), and I’m not a pwn technician, so I can do really no help here.
smart-cryptooo
10:30 AM
Challenge was tagged machine learning and easy, seems suitable for a noob like me. Trained the model for some time and was able to encrypt and decrypt custom message successfully. The file name indicates that the original text was from philosophy.html in OOO’s website. Certainly there were more to it.
After taking a look at the python file, it was pretty easy to figure out the entire encryption and decryption process, and how the models are trained. The default message and key size are 64 bits, where one bit is actually a float64 number, and we’ll use these numbers from now on.
Alice and encryption
Alice is a 4-layer model that encrypts a message with a key. It takes in a plain text message (1×64 vector) and a key (1×64 vector), and outputs the encrypted message (1x64 vector). The encryption procedure of an entire text goes like this:
First split the text into groups of 8-byte message, use space to pad the end. Then each 8-byte message gets converted into a big-endian 64-bit number. Each message then will be converted into a 1×64 feature vector, where one bit corresponds to one feature, and if the bit is a 1, the feature is a -1, otherwise the feature is a 1. Every 16 (bunch_size) messages will be grouped into a bunch, and encrypt using the same key by putting these 16 messages alongside with the key into the Alice model. A random 8-byte (64-bit) key will also be generated to use to encrypt the next bunch of messages. And this randomly generated key will be encrypted with the current key and be appended to the current bunch of encrypted messages.
Bob and decryption
The Bob model has exactly the same structure as Alice. The only difference is that it functionally takes in an encrypted message and that key that was used to encrypt it, and outputs the decrypted message. The decryption procedure is the reverse of the above procedure by using the bob model, nothing special.
The magic Eve
How to prevent the model just being lazy and don’t encrypt at all? That’s when the eve model steps in. So the Eve is identically to the Alice and Bob model, excepts that it has new input layer before the original input layer, where it takes in only the encrypted message and densely connected it to the original input layer. Now we define the loss of the eve model as the absolute difference between the output of the eve model and the corresponding plain message, meaning that if we could train eve model to have a small loss, then we would be “breaking” the encryption.
ABE model
Now we define a new ABE model, such that we train the above three models at the same time. We randomly generated a plain message and a key, encrypt it using the Alice model, decrypt it using the Bob model, and we define the Alice-Bob Loss as the difference between the original message and the decrypted message. We feed that encrypted intermediate message into eve, and calculate the Eve Loss. The ABE model loss is the defined as \(\text{Alice-Bob Loss} + \frac{\left(32 - \text{Eve Loss}\right)^2}{ 32^2 }\), \(32\) is half the message size. Don’t know where it got the equation from, but apparently this model is a Adversarial Network, as it wants Alice-Bob Loss to be as small as possible, in the meanwhile the Eve Loss as large as possible.
Now with the above sorted out, I started to experiment with a lot of things. First, I confirmed that encryption and decryption depends heavily on the weights/training time instead of the model. Meaning if I encrypted a message with a Bob model that I trained for 5 minutes, the model that I trained for 30 minutes cannot be used to decrypt it with the correct key, and vice versa. Second, I realized that if you didn’t train the models long enough, the decryption could be broken easily. Third, even if the keys aren’t right, the entropy of the message would stay roughly the same, which means if you get the decrypted message being mostly 0x00 or 0xFF, then it is most likely that the model weights affected it, instead of the key being wrong.
So I first tried to observe the training process. For each epoch, I decrypt the first 8 bytes (message) of the given encrypted message and a random key, see when it would spit out a decrypted text that looked of right entropy. I observed for a long time but it doesn’t seem to give off any useful information.
I then though it might be possible to use the eve model to do something. I knew that eve model must already have a large enough loss for the ABE model to work, so it won’t work if I tried to use it to decrypt the entire message. However, that is under the assumption that the keys are different each time. What if the eve model could work if the decryption keys are the same, and the weights would do the magic?
In the meanwhile, my teammates were able to find some repeating messages (8-bytes) in the original html, and were able to figure out where the changes were made by comparing the location of the repeating messages in the same bunch of the plain text and the given encrypted message. They concluded that one change happened at around 12,000 bytes.
This give me much hope as I could be sure that no changed were made before 10,000 bytes, which meant I had about 1,250 training points to use. My idea was, to first use the first 16 messages to train an eve model so that it decrypts these 16 messages and consequently the first generated key stored at the 17th message. And then use the next 16 messages to train another eve model that would do the same job. After finding the keys for the first 1,250 messages, I could use these to train a Bob model that works for this dataset, and use it to decrypt the entire text.
However, someone beat me to it and got the flag. When I look at his exploit, there were NO machine learning stuff used. The exploit simply calculated some matrix and solved a linear matrix equation to get a transformation matrix between the encrypted message and the plain message. I was like, WHAT?
threefactooorx
2:30 PM
While doing smart-cryptooo, I jumped out to see what’s going on with this literally “one-and-only” web challenge. There was only a website to upload a html file, and one file to download which has an .crx extension. I realized that it must be to write a html exploit that cracks this chrome plugin. Unzipping the plugin we find two scripts, one background JS file that seemingly returns the flag, and another JS that is obfuscated. However, it really wasn’t that much of an obfuscation and with a bit of manual restoring of the strings it is pretty clear what we need to do in order for the script to print the flag for us.
There were too many other people working on this challenge, and they were all faster than me since I was late in the game. Not for too long they got the flag.
looocked-ooout
12:15 AM, after midnight
Before I went to sleep, I wanted to take a look at if there’s anything I still could do for the team. And yet another pwn. I think it might be to pwn the given cid binary, which is seemingly an mp3 loader, maybe craft a special mp3 or something. Dragged it into Ghidra, analyzed, decompiled, shook my head, and closed all the windows.
0xFF
Woke up after the event was over. Really nice to see that our team was at the third place, meaning a qualification for the finals (although I won’t be there). Also really surprised to see that there were even a time (an hour before the end) we were in the first place, although I contributed nothing to it.
After playing in such a professional, competitive, and finesse CTF event, I think that I still got a lot to learn, more specifically practicing pwn and reversing. One thing I feel okay about myself is that I can keep up with the pace, and at least I still provided some insights for my teammates in some of the questions, even though it’s my first time in a CTF. Also, the format is a jeopardy instead of attack-defense, which is another whole new area that I had absolutely no experiences in.
CTF is definitely interesting something I want to experience again. Before then, see you next time.
P.S. I have this idea where I think it might be interesting if I can come up with an easy CTF/ARG for the university students to play in, if you have the same idea, contact me!
