zerofs.ko is a driver module of a custom filesystem. The kernel and the module is compiled by randstruct plugin, which I found in the magic string – vermagic=4.13.0 SMP mod_unload modversionsRANDSTRUCT_PLUGIN_3c73df5cc8285309b74c8a4caaf831205da45096402d3b1a80caab1d7fa1b03a`. run.sh and /init show that the kernel is protected by SMEP, SMAP, KASLR, kptr_restrict and dmesg_restrict.
zerofs.ko
I found the module may be modified from simplefs after the game. By reversing zerofs.ko, I knew the blocksize is 4096 bits. The first block of the image is the superblock. It consists of magic, block_size, inode_count and free_blocks bitmap.
The second block records all of the inodes in an array. ino is inode number, and dno is the block number of the image.
There is a root inode which ino is 1. It indicates the root dictionary. There is a block corresponding to root dictionary to indicates files in the dictionary. It is an array of zerofs_dir_record structure.
Vulnerabilitie
There isn’t any bound or size check in read and write function. If the filesize we set in image is bigger than blocksize(0x1000), there will be an out-of-bound read/write when invoking copy_to_user/copy_from_user.
After mounting the image, I could trigger out-of-bound read by read the file 666. I tried to find CRED struct in leaked memory. Fortunately, I found some by searching the uid. It took me some time to locate CRED struct because of the radomization of structures.
I still didn’t know which CRED is valid and which process the CRED belongs to although I could find some CRED structures. The exploit is not stable, so I run the exploit serval times. After leaking the memory, the exploit will check if it gets root privilege in a loop. If so, it invokes system("sha256sum /root/flag");.
The last step is to write the CRED. I invoked llseek to set offset to the CRED, and invoked write to modify the CRED, setting uid to 0.
There is a qemu-system-x86_64 binary with a launch script, a linux kernel, a initramfs and some dependencies.
We can get an interactive shell by executing launch.sh.
1
2
3
4
5
6
7
8
9
10
11
__ __ _____________ __ __ ___ ____
/ //_// ____/ ____/ | / / / / / | / __ )
/ ,< / __/ / __/ / |/ / / / / /| | / __ |
/ /| |/ /___/ /___/ /| / / /___/ ___ |/ /_/ /
/_/ |_/_____/_____/_/ |_/ /_____/_/ |_/_____/
Welcome to Tencent Keenlab
Tencent login: root
# uname -r
4.8.0-52-generic
#
The custom vulnerable device
luanch.sh shows there are two custom device named vdd.
1
2
$ ./qemu-system-x86_64 -device help 2>&1 | grep VDD
name "VDD", bus PCI, desc "KeenLab virtualized Devices For Testing D"
we can use some commands to find these devices and their io port/memroy.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# lspci
00:00.0 Class 0600: 8086:1237
00:01.3 Class 0680: 8086:7113
00:03.0 Class 0200: 8086:100e
00:01.1 Class 0101: 8086:7010
00:02.0 Class 0300: 1234:1111
00:05.0 Class 00ff: 1234:2333
00:01.0 Class 0601: 8086:7000
00:04.0 Class 00ff: 1234:2333
# cat /proc/iomem
...
fe900000-fe9fffff : 0000:00:04.0
fea00000-feafffff : 0000:00:05.0
...
# cat /proc/ioports
...
c000-c0ff : 0000:00:04.0
c100-c1ff : 0000:00:05.0
...
OOBW
In vdd_mmio_write, there is a out-of-bound write vulnerability which copys QEMU heap memory to guset physical memory when we set dma_len larger than sizeof(dam_buf).
First, we allocate a buffer and get it’s physical address. Then we set dma_state->dst to our buffer and set dma_len larger than sizeof(dma_buf). Finally, we trigger phys_mem_write by writel(0, piomem + 32). By searching the output, we can find libc addresses and program addresses then calculate the base address of program/libc.
Becasue the QEMU is launched with --nographic -append 'console=ttyS0', so we can simply invoke system(cmd) to run a command in host machine and the output will show in console.
To invoke system(cmd), We need to:
set opaque->dma_state->phys_mem_read to system
set opaque->dma_buf to cmd
make sure opaque->dma_state->cmd != 0.
In vdd_linear_write, when addr == 0, a buffer will be allocated with size of opaque->dma_len. And the data in opaque->dma_state->src with length of opaque->dma_len will be copied to opaque->buf, then copied to opaque->dma_state->dst