Introduction
Hi all, today I will showcase a new (at least I do not believe I have seen it in writeups before, nor does it seem to be in how2heap) method of exploiting double free. In essence, this technique aims to make tcache dup great again (it’s actually quite similar to house of botcake, but is far simpler, requiring one fewer chunk allocated).
Here are the conditions to apply this attack:
- Double free (this is the main primitive)
- Able to allocate up to 9 chunks at any given time
- Heap leak (to perform tcache poisoning)
- Target address is 0x10 aligned
And here is the advantage of the attack:
- Only write-on-alloc required, no edit functionality
- Only single size allocate required (ie, if the challenge only allows you to allocate chunks of size N, this will still work so long as n <= 0x78, excluding metadata). This is what sets it apart most from house of botcake, as that requires that you allocate chunks of differing sizes
- No requirements of valid free size (unlike fastbin dup into stack)
- Works in all latest libc, tested all from 2.34 up to 2.42
Technique
So let us look at https://elixir.bootlin.com/glibc/glibc-2.35/source/malloc/malloc.c#L3860. This is the 2.35 source, but just search for the string and you can find it in other libc versions. Code says:
3860 |
|
Let’s also look at the tcache_put function referenced,
3174 | static __always_inline void |
As can clearly be seen, there are NO protections against double free anywhere during the malloc! That means, we can bypass the double free checker in tcache, and bring back the good old malloc_dup that died long ago in 2.29!
So let’s come up with a plan. Let us start by making a target, something 0x10-aligned on stack because that is still checked in tcache_get.
1 | setvbuf(stdout, NULL, _IONBF, 0); |
Now we want to get something into fastbin first, so we can actually get an a->b->a chain going, to get a double free. This is just standard fastbin_dup. We must malloc a total of 9 times, to obtain 7 chunks to fill tcache, then 2 chunks to dump into double free. Then we free first 7.
1 | printf("Alloc 9 chunks...\n"); |
Performing the fastbin dup,
1 | printf("Perform fastbin dup...\n"); |
Now our tcache has 7 chunks, while our fastbin has a->b->a! Let us clear the tcache, so we can access fastbin,
1 | printf("Clearing tcache...\n"); |
So now, our tcache is empty, while our fastbin still has a->b->a. Now it is time for the magic, once we alloc a from the fastbin, b->a is moved into tcache! And since we have write-on-alloc, we can just perform classic tcache poisoning!
1 | uint64_t* a = malloc(0x30); |
Now tcache has b->a->target! We take out b->a, and finally, we get our arbitrary allocation!
1 | printf("Alloc b = %p\n", malloc(0x30)); |
Further explanations
If you are like me, while studying this technique, you may have realized that after the first alloc of a, the count in tcache should only be 2 (hint: are you sure?). This means that, only b->a should picked up from tcache, not the target. But this technique still works. Why? We need to dive further into this!
So we start having fastbin with a->b->a. Where a.fd = b, and b.fd = a; this is done by our double free previously. But you must note how fastbin does NOT actually have a count field! Instead, it’s a pure singly linked list! So we can actually express it like this:
fastbin: a->b->a->b->a->b->… (note that the fastbin is circular, technically it only has 2 distinct elements a and b, but since it’s self referential, we can take it as an infinite loop)
tcache: NULL, count = 0
a.fd = b
b.fd = a
Note how tcache DOES have a count field, which increases by 1 every time something is prepended to the linked list. This is to efficiently check the limit of 7, without having to traverse the list (O(n) (if no count field) vs O(1) (with count field) operation) each time.
Then we allocate away a, as we know a’s fd is not reset.
fastbin: b->a->b->a->b->…
tcache: NULL, count = 0
a.fd = b
b.fd = a
So now, it’s time for the moving of fastbin to tcache. Chunk b is pushed to the head of tcache’s linked list, and b’s fd is set to previous head of tcache, which is NULL. Fastbin head is set to a. You may be familiar with this algorithm as one to reverse a linked list.
fastbin: a->b->NULL
tcache: b->NULL, count = 1
a.fd = b
b.fd = NULL
Next step, move head of fastbin to head of tcache, set a.fd to previous head of tcache which happens to also be b,
fastbin: b->NULL
tcache: a->b->NULL, count = 2
a.fd = b
b.fd = NULL
Finally, taking b into fastbin, and setting b.fd to previous head of tcache which is a,
fastbin: NULL
tcache: b->a->b->a->b->a->…, count = 3 - but because of count = 3, it’s effectively b->a->b->NULL
a.fd = b
b.fd = a
And that, is how the supposed b->a in fastbin is transformed into b->a->b with count = 3 in tcache! So during the tcache poisoning, by replacing a.fd with target, we get,
fastbin: NULL
tcache: b->a->target->[corruption], count = 3
a.fd = target
b.fd = a
And there we have it!
Conclusion
I feel this technique is a very good complement to house of botcake in making tcache dup great again. With this, there should be no longer any need to use fastbin_dup_into_stack. Here is the complete code for the PoC:
1 |
|
also, do i get to name this house of fern or something
A small bonus
So while studying libc, I also came across potential new method for WWW2exec. It’s just a reskin of common exit handler, initial+24
override method, but I realize that first function pointer (ie what’s there before override) can reveal the PTR_MANGLE
cookie already, this is helpful especially in new libc where TLS is no longer at constant offset from libc.
The first function pointer goes to _dl_fini
, which is in ld, and as of libc 2.41, is still at constant offset from libc base, so just a libc leak is sufficient. But the issue with this method is, initial+24
is not 0x10 aligned (at least in my libc), so printf leaking of initial+24
is impossible, as the value at initial+16
is 0x4, many many null bytes. But I think it is still applicable to some challenge which alloc a struct
instead of char*
. So here is a simple PoC:
1 |
|
Of course, offsets need to be changed for your libc, I am using 2.35 on Debian 12.