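Linux io_uring Privilege Escalation Vulnerability
Vulnerability information
Vulnerability name: Linux io_uring Privilege Escalation Vulnerability
Vulnerability ID: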
- CVE: CVE-2023-2598
Vulnerability type: Privilege escalation
Severity: High
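Description: io_uring is a system call interface in the Linux kernel that now supports almost all system calls, not only the read() and write() it started with. It lets an application start system calls that are performed asynchronously. The vulnerability is in the io_sqe_buffer_register function, which maps the virtual pages of a user buffer to physical pages. The root cause is a logic error: the check that the pages all come from the same folio does not verify that they are contiguous, so the same page mapped multiple times passes the check. A local attacker can exploit this to access physical memory beyond the end of the buffer and escalate privileges to root. Because io_uring is part of the Linux kernel and ships in many Linux distributions, the impact is broad.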
Vendor: Linux
Product: io_uring
Source: https://github.com/SpongeBob-369/CVE-2023-2598
Type: CVE-2023:github search
Repository files
- .vscode
- README.md
- bzImage
- images
- my_exp
- my_exploit.c
- rootfs
- rootfs_new.cpio
- run.sh
Source overview
CVE-2023-2598
What’s io_uring?
io_uring is a system call interface for Linux. It now supports almost all system calls, not only the read() and write() it supported initially. It enables an application to initiate system calls that are performed asynchronously.
Submission and Completion Queues
At the core of every io_uring implementation sit two ring buffers: the submission queue (SQ) and the completion queue (CQ). These ring buffers are shared between the application and the kernel.
The application gets a submission queue entry (SQE), which describes the syscall it wants performed, by calling io_uring_get_sqe. It then performs an io_uring_enter syscall to tell the kernel that there is work waiting in the submission queue.
After the kernel performs the operation, it puts a completion queue entry (CQE) into the completion queue ring buffer, which the application can then consume.
Vulnerability
The function io_sqe_buffer_register implements the mapping of virtual pages and physical addresses.
We should clarify some concepts first.
The application requests a buffer registration through io_uring_register. The call chain is as follows:
io_uring_register_buffers
->io_uring_register
->io_sqe_buffers_register
The source code of function io_sqe_buffers_register is as follows:
[code listing omitted: io_sqe_buffers_register]
This function calls io_sqe_buffer_register, which is where we find a logic bug. The source code of function io_sqe_buffer_register is as follows:
[code listing omitted: io_sqe_buffer_register]
Here I only mention a few important points.
- imu means virtual address/page.
- page means physical address/page.
- folio means a group of pages that are physically contiguous. It removes the ambiguity that arises when a function takes a page as a parameter and that page belongs to a contiguous range of pages, so it is unclear whether the function should act on the whole range or on the single page.
- struct iovec is just a structure that describes a buffer, with the start address of the buffer and its length. Nothing more.
- An io_mapped_ubuf is a structure that holds the information about a buffer that has been registered to an io_uring instance.
[code listing omitted: struct io_mapped_ubuf]
The member bvec is a struct bio_vec, which is like iovec but for physical memory.
[code listing omitted: struct bio_vec]
The code that checks if the pages are from the same folio doesn’t actually check if they are consecutive. The buffer can be the same page mapped multiple times. During the iteration, page_folio(page) would return the same folio again and again, passing the checks. This is an obvious logic bug. Let’s continue with io_sqe_buffer_register and see what the fallout is.
[code listing omitted: io_sqe_buffer_register, continued]
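A single bio_vec is allocated, as nr_pages = 1. The size of the buffer that is written in pimu->iov_len and pimu->bvec[0].bv_len is the one passed by the user in iov->iov_len. The kernel therefore treats one physical page as if it were a physically contiguous region of iov_len bytes, so accesses through the registered buffer can reach the physical memory that follows that page.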
Exploitation
[code listing omitted: exploit]
The main idea of the above exploit is to first exhaust the process credentials that are already available and to occupy as much buddy-allocator memory as possible, so that when we then spray processes (and with them new credentials), a target lands within a window of 500 consecutive pages. We can then find one of the sprayed credentials within those 500 pages and spawn a root shell.