What is QEMU?
QEMU is an open-source machine emulator and virtualizer. It can emulate a complete machine in software or, together with an accelerator such as KVM, run guest operating systems at near-native speed.
What is para-virtualization?
Para-virtualization is a virtualization technique that allows guest operating systems to directly communicate with the hypervisor, thus providing better performance compared to full virtualization. In para-virtualization, the guest operating system is modified to use special hypercalls or API calls to communicate with the hypervisor.
RDMA (Remote Direct Memory Access) is a technology that allows data to be transferred directly from the memory of one computer to the memory of another over a network, largely bypassing the CPUs and operating system kernels of both machines on the data path. This provides low-latency, high-bandwidth data transfers with low CPU overhead.
CVE-2021-3608 is a security vulnerability in QEMU's emulation of VMware's para-virtualized RDMA (pvrdma) device, affecting QEMU versions before 6.1.0. It allows an attacker with administrative privileges in a guest virtual machine to crash the QEMU process on the host, resulting in a denial of service. The flaw is neither a memory leak nor a heap buffer overflow: while handling a guest write to the PVRDMA_REG_DSRHIGH register, the ring initialization code can end up accessing an uninitialized pointer, which crashes QEMU or causes undefined behavior.
In this attack scenario, the guest virtual machine is the attacker: it drives the emulated pvrdma device so that QEMU operates on an uninitialized pointer, which crashes the QEMU process on the host. Because the para-virtualized RDMA device is controlled directly by the guest, the guest needs no cooperation from the host to trigger the flaw. The attack can be mitigated by upgrading to a patched QEMU version or by not exposing the pvrdma device to untrusted guests.
In the case of CVE-2021-3608, security researchers found the uninitialized-pointer access by analyzing the QEMU source code that handles para-virtualized RDMA connections. From the source they could determine how a guest could trigger the flaw, and they then provided recommendations on how to patch it or mitigate the risk of exploitation.
#include "qemu/osdep.h"
#include <glib/gprintf.h>
#include <utime.h>
#include "9p-iov-marshal.h"
#include "qemu/bswap.h"
/*These #include directives pull in the headers this file depends on.
qemu/osdep.h is a QEMU header that provides operating-system-dependent definitions and functions; QEMU source files include it first.
glib/gprintf.h is a header from the GLib library that declares GLib's printf-style string formatting functions.
utime.h is a POSIX header (not part of the ISO C standard library) that declares utime(), which sets a file's access and modification times.
9p-iov-marshal.h is the QEMU header that declares the iovec marshalling functions defined in this file. 9P is a network protocol used for sharing files and other resources between computers.
qemu/bswap.h is a QEMU header with byte-swapping helpers, used to convert between big-endian and little-endian byte order.
The code below relies on functions and data structures declared in these headers.*/
static ssize_t v9fs_packunpack(void *addr, struct iovec *sg, int sg_count,
                               size_t offset, size_t size, int pack)
{
    int i = 0;
    size_t copied = 0;
    size_t req_size = size;
/*This code is defining a function called v9fs_packunpack with several parameters, including a memory address, an array of input/output vectors, the number of vectors in the array, an offset, a size, and a flag called pack.
The function first initializes several variables including i, copied, and req_size. i is used as a counter variable for a loop, copied is used to keep track of how much data has been copied, and req_size is used to store the size of the data to be copied.
The return type is ssize_t, a signed integer type that can hold either a byte count or a negative error code.
The purpose of this function is to pack or unpack data from an array of input/output vectors. The pack flag indicates whether to pack or unpack the data. If pack is 1, the function packs the data from the memory address into the input/output vector array. If pack is 0, the function unpacks the data from the input/output vector array into the memory address.
The function uses a loop to iterate through each vector in the input/output vector array, copying data into or out of the vector as needed. The offset and size parameters are used to determine which portion of the data to copy. The copied variable is used to keep track of how much data has been copied so far. The function continues to copy data until the requested size has been copied or the end of the input/output vector array is reached.
Finally, the function returns the number of bytes that were copied.*/
    for (i = 0; size && i < sg_count; i++) {
        size_t len;
        if (offset >= sg[i].iov_len) {
            /* skip this sg */
            offset -= sg[i].iov_len;
            continue;
        } else {
            len = MIN(sg[i].iov_len - offset, size);
            if (pack) {
                memcpy(sg[i].iov_base + offset, addr, len);
            } else {
                memcpy(addr, sg[i].iov_base + offset, len);
            }
            size -= len;
            copied += len;
            addr += len;
            if (size) {
                offset = 0;
                continue;
            }
        }
    }
/*This code is a loop that iterates over each vector in the input/output vector array and copies data into or out of the vector as needed.
The loop iterates as long as there is still data to be copied (size is not zero) and the index i is less than the number of vectors in the array (sg_count).
For each vector, the loop first checks if the requested offset is greater than or equal to the length of the current vector (sg[i].iov_len). If it is, the loop skips this vector and moves on to the next one.
If the requested offset is less than the length of the current vector, the loop calculates the amount of data to copy (len) as the minimum of the remaining size and the remaining bytes in the current vector (sg[i].iov_len - offset).
If the pack flag is set to 1, the function copies data from the memory address to the input/output vector. If pack is 0, the function copies data from the input/output vector to the memory address.
The loop updates the copied variable to keep track of how much data has been copied so far, and updates the memory address to point to the next location to be copied to or from.
If there is still data to be copied, the loop sets the offset to 0 and continues to the next vector.
Once all data has been copied or the end of the input/output vector array has been reached, the function returns the number of bytes that were copied.*/
    if (copied < req_size) {
        /*
         * We copied less than the requested size; error out.
         */
        return -ENOBUFS;
    }
    return copied;
}
/*This code is the final part of the v9fs_packunpack function.
After the loop has finished copying data, the function checks whether the total amount copied (copied) is less than the requested size (req_size). If so, the input/output vector array did not have enough room past the given offset to satisfy the whole request. In this case, the function returns the error code -ENOBUFS ("no buffer space available").
Otherwise copied equals req_size (it can never exceed it, because each iteration copies at most the remaining size), which means all of the requested data was copied into or out of the input/output vector array. In this case, the function returns the number of bytes copied, which is stored in the copied variable.*/
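To see the offset handling in isolation, here is a minimal standalone sketch that mirrors only the unpack direction of the loop above. It is an illustration under assumptions rather than QEMU code: the name sg_copy_out is hypothetical, and it uses the POSIX struct iovec from <sys/uio.h> with byte pointers instead of arithmetic on void pointers.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not a QEMU function): copy `size` bytes, starting
 * `offset` bytes into the scatter-gather list, to the flat buffer `dst`.
 * Returns the byte count on success or -ENOBUFS if the list is too short.
 */
static ssize_t sg_copy_out(void *dst, const struct iovec *sg, int sg_count,
                           size_t offset, size_t size)
{
    unsigned char *out = dst;
    size_t copied = 0;

    assert(dst != NULL && sg != NULL);

    for (int i = 0; copied < size && i < sg_count; i++) {
        if (offset >= sg[i].iov_len) {
            /* The start position lies beyond this segment: skip it. */
            offset -= sg[i].iov_len;
            continue;
        }
        size_t avail = sg[i].iov_len - offset;
        size_t len = avail < size - copied ? avail : size - copied;

        memcpy(out + copied,
               (const unsigned char *)sg[i].iov_base + offset, len);
        copied += len;
        offset = 0; /* later segments are read from their first byte */
    }
    return copied < size ? -ENOBUFS : (ssize_t)copied;
}
```

With two 4-byte segments holding "abcd" and "efgh", a request for 4 bytes at offset 2 spans both segments, while a request for 4 bytes at offset 6 runs off the end of the list and fails with -ENOBUFS.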
static ssize_t v9fs_unpack(void *dst, struct iovec *out_sg, int out_num,
                           size_t offset, size_t size)
{
    return v9fs_packunpack(dst, out_sg, out_num, offset, size, 0);
}
/*This code defines the v9fs_unpack function, a wrapper around v9fs_packunpack with the pack flag set to 0. It copies data from the input/output vector array (out_sg) to a memory location (dst), passing the offset and size parameters straight through.
The function returns the result of v9fs_packunpack: the number of bytes copied, or -ENOBUFS if the vector array holds fewer than size bytes past offset.*/
ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
                  const void *src, size_t size)
{
    return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
}
/*This code defines the v9fs_pack function, another wrapper around v9fs_packunpack, this time with the pack flag set to 1. It copies data from a memory location (src) into the input/output vector array (in_sg), passing the offset and size parameters straight through. The cast to (void *) only removes the const qualifier so that src fits the shared helper's signature; in the pack direction the source buffer is read, not written.
The function returns the result of v9fs_packunpack: the number of bytes copied, or -ENOBUFS if the vector array has room for fewer than size bytes past offset.*/
ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
                            int bswap, const char *fmt, va_list ap)
{
    int i;
    ssize_t copied = 0;
    size_t old_offset = offset;
    for (i = 0; fmt[i]; i++) {
        switch (fmt[i]) {
        case 'b': {
            uint8_t *valp = va_arg(ap, uint8_t *);
            copied = v9fs_unpack(valp, out_sg, out_num, offset, sizeof(*valp));
            break;
        }
        case 'w': {
            uint16_t val, *valp;
            valp = va_arg(ap, uint16_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le16_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 'd': {
            uint32_t val, *valp;
            valp = va_arg(ap, uint32_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le32_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 'q': {
            uint64_t val, *valp;
            valp = va_arg(ap, uint64_t *);
            copied = v9fs_unpack(&val, out_sg, out_num, offset, sizeof(val));
            if (bswap) {
                *valp = le64_to_cpu(val);
            } else {
                *valp = val;
            }
            break;
        }
        case 's': {
            V9fsString *str = va_arg(ap, V9fsString *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "w", &str->size);
            if (copied > 0) {
                offset += copied;
                str->data = g_malloc(str->size + 1);
                copied = v9fs_unpack(str->data, out_sg, out_num, offset,
                                     str->size);
                if (copied >= 0) {
                    str->data[str->size] = 0;
                } else {
                    v9fs_string_free(str);
                }
            }
            break;
        }
        case 'Q': {
            V9fsQID *qidp = va_arg(ap, V9fsQID *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "bdq", &qidp->type, &qidp->version,
                                        &qidp->path);
            break;
        }
        case 'S': {
            V9fsStat *statp = va_arg(ap, V9fsStat *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "wwdQdddqsssssddd",
                                        &statp->size, &statp->type,
                                        &statp->dev, &statp->qid,
                                        &statp->mode, &statp->atime,
                                        &statp->mtime, &statp->length,
                                        &statp->name, &statp->uid,
                                        &statp->gid, &statp->muid,
                                        &statp->extension,
                                        &statp->n_uid, &statp->n_gid,
                                        &statp->n_muid);
            break;
        }
        case 'I': {
            V9fsIattr *iattr = va_arg(ap, V9fsIattr *);
            copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
                                        "ddddqqqqq",
                                        &iattr->valid, &iattr->mode,
                                        &iattr->uid, &iattr->gid,
                                        &iattr->size, &iattr->atime_sec,
                                        &iattr->atime_nsec,
                                        &iattr->mtime_sec,
                                        &iattr->mtime_nsec);
            break;
        }
        default:
            g_assert_not_reached();
        }
        if (copied < 0) {
            return copied;
        }
        offset += copied;
    }
    return offset - old_offset;
}
/*This function is part of QEMU's implementation of the 9P protocol (9P itself originated in the Plan 9 operating system). It unmarshals data from a series of iovec buffers according to a format string.
The function takes a pointer to an array of iovec structures, the number of iovecs in the array, an offset to the start of the data to be unmarshalled, a bswap flag that says whether multi-byte values should be converted from little-endian wire order to the host CPU's byte order, a format string describing the layout of the data, and a va_list of pointers that receive the decoded values.
The function loops over the characters of the format string, switching on each one to decide how to unmarshal the next piece of data. Integer fields are copied out of the iovec buffers with v9fs_unpack; composite types such as strings, QIDs, and stat structures are decoded by calling v9fs_iov_unmarshal recursively with a nested format string. If the bswap flag is set, each multi-byte integer is passed through le16_to_cpu, le32_to_cpu, or le64_to_cpu.
The function returns the total number of bytes unmarshalled from the iovec buffers. If an error occurs during unmarshalling, the function returns a negative error code.*/
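The format-string idea can be shown on a plain byte buffer, without iovecs. The following is a minimal sketch under assumptions, not the QEMU function: flat_unmarshal and field_width are hypothetical names, only the letters b, w, and d are supported, and the input is always treated as little-endian.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Width in bytes of one format letter; 0 means "unknown letter". */
static size_t field_width(char c)
{
    switch (c) {
    case 'b': return 1;
    case 'w': return 2;
    case 'd': return 4;
    default:  return 0;
    }
}

/*
 * Hypothetical flat-buffer decoder: each variadic argument is a pointer
 * that receives one decoded field. Returns the number of bytes consumed,
 * or -1 on a short buffer or an unknown format letter.
 */
static ssize_t flat_unmarshal(const uint8_t *buf, size_t len,
                              const char *fmt, ...)
{
    size_t off = 0;
    va_list ap;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        size_t width = field_width(fmt[i]);
        if (width == 0 || len - off < width) {
            va_end(ap);
            return -1;
        }
        /* Assemble the value byte by byte, least significant first. */
        uint32_t val = 0;
        for (size_t k = 0; k < width; k++) {
            val |= (uint32_t)buf[off + k] << (8 * k);
        }
        switch (fmt[i]) {
        case 'b': *va_arg(ap, uint8_t *) = (uint8_t)val; break;
        case 'w': *va_arg(ap, uint16_t *) = (uint16_t)val; break;
        default:  *va_arg(ap, uint32_t *) = val; break;
        }
        off += width;
    }
    va_end(ap);
    return (ssize_t)off;
}
```

Decoding the seven bytes 07 34 12 78 56 34 12 with the format "bwd" yields 0x07, 0x1234, and 0x12345678 and consumes all seven bytes. Assembling values byte by byte makes the sketch independent of host endianness, which is the job the bswap flag and the le*_to_cpu helpers do in the real code.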
ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
                           int bswap, const char *fmt, ...)
{
    ssize_t ret;
    va_list ap;
    va_start(ap, fmt);
    ret = v9fs_iov_vunmarshal(out_sg, out_num, offset, bswap, fmt, ap);
    va_end(ap);
    return ret;
}
/*This function is a convenience wrapper around v9fs_iov_vunmarshal(). It takes a variable number of arguments, each a pointer to the variable that receives one value unmarshalled from the out_sg scatter-gather list, as described by the format string fmt. The arguments are passed to v9fs_iov_vunmarshal() using the va_list interface.
The bswap parameter specifies whether multi-byte values are converted from little-endian to host byte order. If it is non-zero, the conversion is applied; on a little-endian host the conversion leaves the values unchanged.
The function returns the number of bytes unmarshalled, or a negative error code if an error occurs.*/
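What a little-endian-to-host conversion does can be sketched in a few lines. The helpers below are hypothetical illustrations (QEMU's real ones live in qemu/bswap.h and are implemented differently); the sketch detects the host byte order at run time and swaps only on a big-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Interpret `raw`, loaded verbatim from little-endian data, in host order. */
static uint32_t sketch_le32_to_cpu(uint32_t raw)
{
    return host_is_big_endian() ? swap32(raw) : raw;
}

/* Read a 32-bit little-endian field from a wire buffer. */
static uint32_t read_wire_u32(const uint8_t *wire)
{
    uint32_t raw;

    assert(wire != NULL);
    memcpy(&raw, wire, sizeof(raw));
    return sketch_le32_to_cpu(raw);
}
```

Reading the wire bytes 78 56 34 12 gives 0x12345678 on any host: a little-endian machine uses the loaded value as is, and a big-endian machine swaps it first.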
ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
                          int bswap, const char *fmt, va_list ap)
{
    int i;
    ssize_t copied = 0;
    size_t old_offset = offset;
    for (i = 0; fmt[i]; i++) {
        switch (fmt[i]) {
        case 'b': {
            uint8_t val = va_arg(ap, int);
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'w': {
            uint16_t val = va_arg(ap, int);
            if (bswap) {
                val = cpu_to_le16(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'd': {
            uint32_t val = va_arg(ap, uint32_t);
            if (bswap) {
                val = cpu_to_le32(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 'q': {
            uint64_t val = va_arg(ap, uint64_t);
            if (bswap) {
                val = cpu_to_le64(val);
            }
            copied = v9fs_pack(in_sg, in_num, offset, &val, sizeof(val));
            break;
        }
        case 's': {
            V9fsString *str = va_arg(ap, V9fsString *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "w", str->size);
            if (copied > 0) {
                offset += copied;
                copied = v9fs_pack(in_sg, in_num, offset, str->data, str->size);
            }
            break;
        }
        case 'Q': {
            V9fsQID *qidp = va_arg(ap, V9fsQID *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap, "bdq",
                                      qidp->type, qidp->version,
                                      qidp->path);
            break;
        }
        case 'S': {
            V9fsStat *statp = va_arg(ap, V9fsStat *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "wwdQdddqsssssddd",
                                      statp->size, statp->type, statp->dev,
                                      &statp->qid, statp->mode, statp->atime,
                                      statp->mtime, statp->length,
                                      &statp->name,
                                      &statp->uid, &statp->gid, &statp->muid,
                                      &statp->extension, statp->n_uid,
                                      statp->n_gid, statp->n_muid);
            break;
        }
        case 'A': {
            V9fsStatDotl *statp = va_arg(ap, V9fsStatDotl *);
            copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
                                      "qQdddqqqqqqqqqqqqqqq",
                                      statp->st_result_mask,
                                      &statp->qid, statp->st_mode,
                                      statp->st_uid, statp->st_gid,
                                      statp->st_nlink, statp->st_rdev,
                                      statp->st_size, statp->st_blksize,
                                      statp->st_blocks, statp->st_atime_sec,
                                      statp->st_atime_nsec,
                                      statp->st_mtime_sec,
                                      statp->st_mtime_nsec,
                                      statp->st_ctime_sec,
                                      statp->st_ctime_nsec,
                                      statp->st_btime_sec,
                                      statp->st_btime_nsec, statp->st_gen,
                                      statp->st_data_version);
            break;
        }
        default:
            g_assert_not_reached();
        }
        if (copied < 0) {
            return copied;
        }
        offset += copied;
    }
    return offset - old_offset;
}
/*This function, v9fs_iov_vmarshal, marshals data according to a format string fmt and writes it into an array of struct iovec called in_sg, starting at the given offset. The values to be marshalled are taken from the variable argument list ap.
The function iterates over the format string character by character and, for each character, marshals the corresponding value from ap. The format characters are as follows:
b: marshals an 8-bit unsigned integer
w: marshals a 16-bit unsigned integer
d: marshals a 32-bit unsigned integer
q: marshals a 64-bit unsigned integer
s: marshals a V9fsString structure as a 16-bit size field followed by that many bytes of string data
Q: marshals a V9fsQID structure, which contains three fields: type, version, and path
S: marshals a V9fsStat structure, which contains fields for size, type, device ID, QID, mode, access time, modification time, length, name, owner user ID, group ID, and modifying user ID, followed by an extension string and numeric user, group, and modifying-user IDs
A: marshals a V9fsStatDotl structure, which contains fields for result mask, QID, mode, owner user ID, group ID, number of links, device ID, file size, block size, number of blocks, access time, modification time, status-change time (ctime), and birth time (each time as seconds plus nanoseconds), followed by generation number and data version
If bswap is set, multi-byte integers are converted from host byte order to little-endian before they are written.
The offset parameter determines where in the in_sg array the marshalled data is written. If the marshalling is successful, the function returns the number of bytes written to the in_sg array. If an error occurs during marshalling, the function returns a negative value.*/
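The 's' layout described above (a 16-bit length followed by the string bytes) can be illustrated with a small encoder over a flat buffer. This is a sketch under assumptions: put_string is a hypothetical name, it takes a NUL-terminated C string instead of a V9fsString, and it always writes the length in little-endian order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical flat-buffer encoder for the 's' layout: a 16-bit
 * little-endian length followed by the bytes, with no terminating NUL.
 * Returns the number of bytes written, or -1 if the string is too long
 * for a 16-bit length or does not fit in `cap` bytes.
 */
static ssize_t put_string(uint8_t *buf, size_t cap, const char *s)
{
    size_t n;

    assert(buf != NULL && s != NULL);
    n = strlen(s);
    if (n > UINT16_MAX || cap < 2 + n) {
        return -1;
    }
    buf[0] = (uint8_t)(n & 0xff); /* low byte of the length first */
    buf[1] = (uint8_t)(n >> 8);
    memcpy(buf + 2, s, n);
    return (ssize_t)(2 + n);
}
```

Encoding "qemu" produces the six bytes 04 00 71 65 6d 75. The length prefix is why the unmarshalling side can allocate size + 1 bytes up front and add its own terminating zero.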
ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
                         int bswap, const char *fmt, ...)
{
    ssize_t ret;
    va_list ap;
    va_start(ap, fmt);
    ret = v9fs_iov_vmarshal(in_sg, in_num, offset, bswap, fmt, ap);
    va_end(ap);
    return ret;
}
/*v9fs_iov_marshal is a convenience function that provides a simpler interface to the v9fs_iov_vmarshal function. It takes a variable number of arguments, which are passed to v9fs_iov_vmarshal along with a format string. The format string specifies the types of the arguments and the order in which they are passed.
Inside v9fs_iov_marshal, the va_list type is used to work with the variable argument list. va_start is called to initialize ap with the address of the first argument following fmt. v9fs_iov_vmarshal is then called with in_sg, in_num, offset, bswap, fmt, and ap. Finally, va_end is called to clean up the argument list.
The function returns the result of v9fs_iov_vmarshal, which is the number of bytes that were copied into the in_sg buffer, or a negative error code if an error occurred.*/
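The pairing of a variadic front end with a va_list worker is a general C idiom, in the same spirit as printf and vprintf. A minimal sketch of the pattern, using the hypothetical names sum_ints and vsum_ints rather than anything from QEMU:

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Worker: consumes an already-started va_list, so that any number of
 * variadic front ends can forward their arguments to it.
 */
static long vsum_ints(int count, va_list ap)
{
    long total = 0;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    return total;
}

/* Variadic front end: start the list, delegate, clean up. */
static long sum_ints(int count, ...)
{
    va_list ap;
    long total;

    va_start(ap, count);
    total = vsum_ints(count, ap);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. Note that variadic arguments undergo the default argument promotions, which is why the marshalling code above reads its 'b' and 'w' values with va_arg(ap, int) rather than with uint8_t or uint16_t.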
The flaw behind CVE-2021-3608 is in the pvrdma ring initialization code, not in the 9P marshalling code walked through above: the ring->pages array of pointers is not completely initialized. Specifically, if the tbl array contains a NULL (zero) entry, the corresponding ring->pages entry is skipped and keeps whatever value the allocation left there. If that uninitialized entry is used later, for example when the pages are released again, QEMU can crash.
Additionally, there is no check that npages is a valid value. If npages is zero or otherwise out of range, ring->pages is allocated with an invalid size, which could lead to memory errors or crashes.
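A common defensive pattern for this kind of code is to zero-initialize the pointer array, validate the page count, and unmap only entries that were actually mapped. The sketch below illustrates that pattern under assumptions and is not the upstream QEMU patch: ring_pages_init, fake_map, and fake_unmap are hypothetical stand-ins for the real ring setup and DMA mapping calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int live_mappings; /* bookkeeping for the stand-in mapper */

/* Hypothetical stand-in for a DMA map call; UINT64_MAX simulates failure. */
static void *fake_map(uint64_t addr)
{
    if (addr == UINT64_MAX) {
        return NULL;
    }
    live_mappings++;
    return malloc(1);
}

/* Hypothetical stand-in for the matching unmap call. */
static void fake_unmap(void *page)
{
    assert(live_mappings > 0);
    live_mappings--;
    free(page);
}

/*
 * Returns an array of `npages` page pointers, or NULL on any failure.
 * Entries whose table slot is zero are left as NULL.
 */
static void **ring_pages_init(const uint64_t *tbl, size_t npages)
{
    void **pages;

    if (tbl == NULL || npages == 0) {
        return NULL; /* reject an invalid page count up front */
    }
    /* calloc zero-fills, so untouched entries read as NULL on mainstream
     * platforms instead of holding leftover heap contents. */
    pages = calloc(npages, sizeof(*pages));
    if (pages == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < npages; i++) {
        if (tbl[i] == 0) {
            continue; /* skipped slot stays NULL */
        }
        pages[i] = fake_map(tbl[i]);
        if (pages[i] == NULL) {
            /* Unwind: release only the entries that were mapped. */
            while (i--) {
                if (pages[i] != NULL) {
                    fake_unmap(pages[i]);
                }
            }
            free(pages);
            return NULL;
        }
    }
    return pages;
}
```

With this shape, a zero table entry followed by a mapping failure no longer feeds an indeterminate pointer to the unmap routine, because the cleanup loop sees NULL for every skipped slot.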