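Linux io_uring Privilege Escalation Vulnerability
Vulnerability Information
Vulnerability name: Linux io_uring Privilege Escalation Vulnerability
Vulnerability ID: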
- CVE: CVE-2023-2598
Vulnerability type: Privilege escalation
Severity: High
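Description: io_uring is a system call interface in the Linux kernel. It now supports almost all system calls, not only the read() and write() it started with, and it lets applications initiate system calls asynchronously. The vulnerability is in the io_sqe_buffer_register function, which handles the mapping between virtual pages and physical addresses. The root cause is a logic error: the check that the pages come from the same folio does not verify that they are consecutive, so the same page mapped multiple times passes the check. The flaw is exploited locally: an unprivileged user who can run code on the system gains out-of-bounds access to physical memory, which can lead to data disclosure and to full privilege escalation to root. Because io_uring is part of the Linux kernel and ships in many Linux distributions, the impact is broad.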
Vendor: Linux
Product: io_uring
Source: https://github.com/SpongeBob-369/CVE-2023-2598
Type: CVE-2023:github search
Repository files
- .vscode
- README.md
- bzImage
- images
- my_exp
- my_exploit.c
- rootfs
- rootfs_new.cpio
- run.sh
Source overview
CVE-2023-2598
What’s io_uring?
io_uring is a system call interface for Linux. It now supports almost all system calls, not only the read() and write() it started with. It lets an application initiate system calls that are performed asynchronously.
Submission and Completion Queues
At the core of every io_uring implementation sit two ring buffers: the submission queue (SQ) and the completion queue (CQ). These ring buffers are shared between the application and the kernel.
The application obtains a submission queue entry (SQE), which describes a syscall to be performed, by calling io_uring_get_sqe. It then performs an io_uring_enter syscall to tell the kernel that there is work waiting in the submission queue.
After the kernel performs the operation, it puts a completion queue entry (CQE) into the completion queue ring buffer, which the application can then consume.
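The submission and completion flow described above can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: toy_ring, toy_entry, toy_ring_push, toy_ring_pop and toy_kernel_process are hypothetical names, the layout is not the kernel's actual SQ/CQ layout, and the memory barriers a real shared ring needs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical entry type: stands in for an SQE or a CQE. */
struct toy_entry {
    uint64_t user_data;  /* tag copied from the request to its completion */
    int32_t  value;      /* opcode for an SQE, result for a CQE */
};

/* Single-producer/single-consumer ring. head and tail only ever grow
 * and are masked when used as an index. */
struct toy_ring {
    uint32_t head;       /* next slot the consumer reads */
    uint32_t tail;       /* next slot the producer writes */
    struct toy_entry entries[RING_ENTRIES];
};

static bool toy_ring_full(const struct toy_ring *r)
{
    return r->tail - r->head == RING_ENTRIES;
}

static bool toy_ring_push(struct toy_ring *r, struct toy_entry e)
{
    if (toy_ring_full(r))
        return false;
    r->entries[r->tail & RING_MASK] = e;
    r->tail++;           /* publish the entry to the consumer */
    return true;
}

static bool toy_ring_pop(struct toy_ring *r, struct toy_entry *out)
{
    if (r->head == r->tail)
        return false;    /* ring is empty */
    *out = r->entries[r->head & RING_MASK];
    r->head++;
    return true;
}

/* Plays the kernel's role: drain the SQ and post one CQE per SQE.
 * Stops early if the CQ has no room, so no request is lost. */
static unsigned toy_kernel_process(struct toy_ring *sq, struct toy_ring *cq)
{
    struct toy_entry sqe;
    unsigned done = 0;

    while (!toy_ring_full(cq) && toy_ring_pop(sq, &sqe)) {
        struct toy_entry cqe = { .user_data = sqe.user_data, .value = 0 };
        toy_ring_push(cq, cqe);
        done++;
    }
    return done;
}
```

In this model the application pushes an entry onto the SQ, the call to toy_kernel_process stands in for io_uring_enter, and the application then pops the matching entry (same user_data) from the CQ.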
Vulnerability
The function io_sqe_buffer_register implements the mapping of virtual pages and physical addresses.
We should clarify some concepts first.
The application requests the registration of a buffer through io_uring_register. The call chain is as follows:
io_uring_register_buffers -> io_uring_register -> io_sqe_buffers_register
The source code of the function io_sqe_buffers_register is as follows:
[Code listing for io_sqe_buffers_register omitted]
This function calls io_sqe_buffer_register, which contains a logic bug. The source code of the function io_sqe_buffer_register is as follows:
[Code listing for io_sqe_buffer_register omitted]
Here I only mention a few important points.
- imu refers to the virtual address/pages of the buffer.
- page refers to a physical address/page.
- folio means a group of physically contiguous pages handled as one unit. It removes the ambiguity that arises when a function receives a page that belongs to a larger contiguous range and it is unclear whether the function should act on that single page or on the whole range.
- struct iovec is just a structure that describes a buffer by its start address and its length. Nothing more.
- An io_mapped_ubuf is a structure that holds the information about a buffer that has been registered to an io_uring instance.
[Code listing for struct io_mapped_ubuf omitted]
The member bvec holds bio_vec entries. A bio_vec is a struct like iovec, but for physical memory.
[Code listing omitted]
The code that checks whether the pages are from the same folio doesn’t actually check whether they are consecutive. They can all be the same page mapped multiple times. During the iteration, page_folio(page) returns the same folio again and again, so the check passes. This is an obvious logic bug. Let’s continue with io_sqe_buffer_register and see what the fallout is.
[Code listing omitted]
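The folio check and its missing condition can be illustrated with a small userspace model. This is a sketch, not kernel code: toy_page and both function names are hypothetical, and a page is reduced to a physical frame number plus a folio identifier. The stricter variant adds a consecutiveness test in the spirit of what the text says is missing; it is not claimed to match the upstream patch line for line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the check needs. */
struct toy_page {
    unsigned long pfn;       /* physical frame number */
    unsigned long folio_id;  /* identifies the folio the page belongs to */
};

/* Models the buggy check: it only asks whether every page belongs to
 * the same folio as the first one. */
static bool same_folio_only(const struct toy_page *pages, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (pages[i].folio_id != pages[0].folio_id)
            return false;
    }
    return true;
}

/* Stricter check: same folio AND each page physically follows the
 * previous one. */
static bool same_folio_and_consecutive(const struct toy_page *pages, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (pages[i].folio_id != pages[0].folio_id ||
            pages[i].pfn != pages[i - 1].pfn + 1)
            return false;
    }
    return true;
}
```

With an array that repeats one page three times (same pfn, same folio_id), same_folio_only accepts the pages as a single contiguous region while same_folio_and_consecutive rejects them. Three genuinely adjacent pages of one folio pass both checks.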
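A single bio_vec is allocated, as nr_pages = 1. The size of the buffer that is written in pimu->iov_len and pimu->bvec[0].bv_len is the one passed by the user in iov->iov_len. Since only one physical page actually backs the buffer, the kernel ends up treating the memory that physically follows that page as part of the buffer, which gives out-of-bounds access to physical memory.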
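The arithmetic behind this fallout can be written down as a short sketch. It assumes 4 KiB pages and a page-aligned buffer, and the names toy_bvec, toy_coalesce and toy_pages_beyond_first are hypothetical. The model only counts how many physical pages past the first one a coalesced segment claims to cover.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Simplified view of one bio_vec-like segment (hypothetical names). */
struct toy_bvec {
    unsigned long first_pfn; /* physical frame backing the segment start */
    size_t        len;       /* length the kernel believes is contiguous */
};

/* Build the coalesced segment as the text describes it: one segment
 * that starts at the first page and carries the full user length. */
static struct toy_bvec toy_coalesce(unsigned long first_pfn, size_t iov_len)
{
    struct toy_bvec bv = { .first_pfn = first_pfn, .len = iov_len };
    return bv;
}

/* Number of physical pages past the first one that the segment spans.
 * Assumes the segment starts at a page boundary. */
static size_t toy_pages_beyond_first(const struct toy_bvec *bv)
{
    size_t pages = (bv->len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
    return pages ? pages - 1 : 0;
}
```

For example, if one physical page is mapped 500 times in a row and the 500-page virtual range is registered as a buffer, the segment has len = 500 * 4096 but only its first page belongs to the buffer, so toy_pages_beyond_first reports 499 neighbouring physical pages as reachable.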
Exploitation
[Exploit code listing omitted]
The main idea of the exploit above is to exhaust the available credential objects and occupy as much buddy-allocator memory as possible, so that when we spray processes (and with them their credentials), a target credential lands within the 500 consecutive pages reachable through the buffer. We can then find the sprayed credentials within those 500 pages and spawn a root shell.