commit | 9abb0e482b03551241d9a729058ae65279e5a384 | |
---|---|---|
author | Android Build Coastguard Worker <[email protected]> | Thu Aug 08 01:15:17 2024 +0000 |
committer | Android Build Coastguard Worker <[email protected]> | Thu Aug 08 01:15:17 2024 +0000 |
tree | 3ae04698b7a8e2f4866ad51b77d76a0b870ee15e | |
parent | 8f5c09e6248870971cbb0177c87b378e3caa7d51 | |
parent | d9e5b4fa989a1339c9e6fcd473eaca5aefbb9a78 | |
Snap for 12199973 from d9e5b4fa989a1339c9e6fcd473eaca5aefbb9a78 to 24Q4-release Change-Id: I10fe8be4499d0a09040e5463f383a5cb73536c8f
The `virtio-queue` crate provides a virtio device implementation for a virtio queue, a virtio descriptor and a chain of such descriptors. Two formats of virtio queues are defined in the specification: split virtqueues and packed virtqueues. The `virtio-queue` crate supports only the split virtqueue format. The `virtio-queue` API is meant to be consumed by virtio device implementations (such as the block device or vsock device). The main abstraction is the `Queue`. The crate also defines a state object for the queue, `QueueState`.
Let’s take a concrete example of how a device would work with a queue, using the MMIO bus.
First, it is important to mention that the mandatory parts of the virtio interface are the following:
Each virtqueue consists of three parts: the descriptor table, the available ring, and the used ring.
Before booting the virtual machine (VM), the VMM does the following set up:
After the boot of the VM, the driver starts sending read/write requests to configure things like:
- `set_size` → sets the size of the queue.
- `set_ready` → moves the queue to the `ready for processing` state.
- `set_desc_table_address`, `set_avail_ring_address`, `set_used_ring_address` → configure the guest addresses of the constituent parts of the queue.
- `set_event_idx` → called as part of feature negotiation in the `virtio-device` crate; it enables or disables the VIRTIO_F_RING_EVENT_IDX feature.

Once the queues are ready, the device can be used.
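The setter flow listed above can be sketched with a stand-in type. `ToyQueueConfig`, its fields, and `layout_is_aligned` are hypothetical names used only for illustration; the signatures and validation performed by the real `Queue` may differ. The power-of-two size rule and the 16/2/4-byte alignments are taken from the split virtqueue rules in the virtio specification.

```rust
/// Hypothetical stand-in for the configurable part of a split virtqueue.
/// It only mirrors the setter names listed above; it is not the crate's `Queue`.
#[derive(Debug, Default)]
pub struct ToyQueueConfig {
    max_size: u16,
    size: u16,
    ready: bool,
    desc_table: u64,
    avail_ring: u64,
    used_ring: u64,
    event_idx_enabled: bool,
}

impl ToyQueueConfig {
    /// Starts with the queue size set to the device's maximum.
    pub fn new(max_size: u16) -> Self {
        ToyQueueConfig { max_size, size: max_size, ..Default::default() }
    }

    /// Accepts only non-zero power-of-two sizes up to `max_size`;
    /// any other value is ignored (an assumption of this sketch).
    pub fn set_size(&mut self, size: u16) {
        if size != 0 && size.is_power_of_two() && size <= self.max_size {
            self.size = size;
        }
    }

    pub fn set_desc_table_address(&mut self, addr: u64) { self.desc_table = addr; }
    pub fn set_avail_ring_address(&mut self, addr: u64) { self.avail_ring = addr; }
    pub fn set_used_ring_address(&mut self, addr: u64) { self.used_ring = addr; }
    pub fn set_event_idx(&mut self, enabled: bool) { self.event_idx_enabled = enabled; }
    pub fn set_ready(&mut self, ready: bool) { self.ready = ready; }

    pub fn size(&self) -> u16 { self.size }
    pub fn is_ready(&self) -> bool { self.ready }

    /// Checks the alignments required for the three queue parts:
    /// descriptor table 16 bytes, available ring 2 bytes, used ring 4 bytes.
    pub fn layout_is_aligned(&self) -> bool {
        self.desc_table % 16 == 0 && self.avail_ring % 2 == 0 && self.used_ring % 4 == 0
    }
}
```

A device model would typically call these setters from its MMIO write handler, one register write at a time, and only start processing once the ready flag has been set.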
The steady state operation of a virtio device follows a model where the driver produces descriptor chains which are consumed by the device, and both parties need to be notified when new elements have been placed on the associated ring to avoid busy polling. The precise notification mechanism is left up to the VMM that incorporates the devices and queues (it usually involves things like MMIO VM exits and interrupt injection into the guest). The queue implementation is agnostic to the notification mechanism in use, and it exposes methods and functionality (such as iterators) that are called from the outside in response to a notification event.
The basic principle of how the queues are used by the device/driver is the following, as also shown in the diagram below:
- The driver increments the available ring `idx` by the number of new entries; the diagram shows the simple case of only one new entry.
- The device reads the `idx` field from the available ring. This can be done directly with `Queue::avail_idx`, but consumers of the crate are not advised to call it, because the iterator over all available descriptor chain heads already calls it behind the scenes.
- The device gets the head index of each descriptor chain corresponding to the read `idx` value.
- The device adds a new entry to the used ring with `Queue::add_used`; the entry is defined in the spec as `virtq_used_elem`, and in `virtio-queue` as `VirtqUsedElem`. This structure holds both the index of the descriptor chain and the number of bytes written to memory while serving the request.
- The device increments the `idx` of the used ring; this is done as part of the `Queue::add_used` call mentioned above.

A descriptor stores four fields. The first two, `addr` and `len`, point to the data in memory to which the descriptor refers, as shown in the diagram below. The `flags` field indicates, for example, whether the buffer is device readable or writable, or whether another descriptor is chained after this one (VIRTQ_DESC_F_NEXT flag set). The `next` field stores the index of the next descriptor if VIRTQ_DESC_F_NEXT is set.
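To make the four descriptor fields concrete, here is a minimal sketch of following a chain through a descriptor table held in a plain slice. `ToyDescriptor` and `walk_chain` are hypothetical names and are not the crate's `Descriptor` or `DescriptorChain`; the two flag values are the ones defined in the virtio specification. The loop guard (rejecting chains longer than the table) is an assumption of this sketch.

```rust
/// Flag values defined by the virtio specification for split virtqueue descriptors.
pub const VIRTQ_DESC_F_NEXT: u16 = 0x1;
pub const VIRTQ_DESC_F_WRITE: u16 = 0x2;

/// Hypothetical in-memory copy of one descriptor table entry.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ToyDescriptor {
    pub addr: u64,  // guest address of the buffer
    pub len: u32,   // length of the buffer in bytes
    pub flags: u16, // VIRTQ_DESC_F_* bits
    pub next: u16,  // index of the next descriptor, valid only if F_NEXT is set
}

impl ToyDescriptor {
    pub fn has_next(&self) -> bool {
        (self.flags & VIRTQ_DESC_F_NEXT) != 0
    }

    /// True if the buffer is writable by the device (read-only for the driver).
    pub fn is_write_only(&self) -> bool {
        (self.flags & VIRTQ_DESC_F_WRITE) != 0
    }
}

/// Collects the descriptors of the chain that starts at `head`.
/// Returns `None` if an index falls outside the table or if the chain is
/// longer than the table, which can only happen when it loops.
pub fn walk_chain(table: &[ToyDescriptor], head: u16) -> Option<Vec<ToyDescriptor>> {
    let mut chain = Vec::new();
    let mut index = head;
    loop {
        let desc = *table.get(index as usize)?;
        chain.push(desc);
        if chain.len() > table.len() {
            return None;
        }
        if !desc.has_next() {
            return Some(chain);
        }
        index = desc.next;
    }
}
```

In a real device the table lives in guest memory, so every field read here would go through the guest memory abstraction and could fail.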
Requirements for device implementation
`DescriptorChain` can be used to parse descriptors provided by the driver, which represent input or output memory areas for device I/O. A descriptor is essentially an (address, length) pair, which is subsequently used by the device model operation. We do not check the validity of the descriptors, and instead expect any validations to happen when the device implementation attempts to access the corresponding areas. Early checks can add non-negligible additional costs, and relying exclusively upon them may lead to time-of-check-to-time-of-use race conditions.

`QueueT` is a trait that allows different implementations of a `Queue` object for single-threaded and multi-threaded contexts. The implementations provided in `virtio-queue` are:
- `Queue` → used in single-threaded contexts.
- `QueueSync` → used in multi-threaded contexts; it is simply a wrapper over an `Arc<Mutex<Queue>>`.

Besides the above abstractions, the `virtio-queue` crate also provides the following ones:
- `Descriptor` → mostly offers accessors for the members of the `Descriptor`.
- `DescriptorChain` → provides accessors for the `DescriptorChain`’s members and an `Iterator` implementation for iterating over the `DescriptorChain`. There is also an abstraction for iterators over just the device-readable or just the device-writable descriptors (`DescriptorChainRwIter`).
- `AvailIter` → a consuming iterator over all available descriptor chain heads in the queue.

The `Queue` allows saving its state through the `state` function, which returns a `QueueState`. `Queue` objects can be created from a previously saved state by using `QueueState::try_from`. The VMM should check for errors when restoring a `Queue` from a previously saved state.
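The reason a restore can fail is easiest to see in a small model. The sketch below uses hypothetical types (`ToyQueueState`, `ToySavedQueue`, `RestoreError`) and a guessed set of checks based on the split virtqueue rules in the virtio specification; the fields and errors of the crate's `QueueState` are not reproduced here and may differ.

```rust
use std::convert::TryFrom;

/// Hypothetical snapshot of a queue; the fields are assumptions of this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct ToyQueueState {
    pub max_size: u16,
    pub size: u16,
    pub ready: bool,
    pub next_avail: u16,
    pub next_used: u16,
    pub desc_table: u64,
    pub avail_ring: u64,
    pub used_ring: u64,
}

#[derive(Debug, PartialEq)]
pub enum RestoreError {
    InvalidSize,
    UnalignedAddress,
}

/// Hypothetical queue that can be saved and restored.
#[derive(Debug)]
pub struct ToySavedQueue {
    state: ToyQueueState,
}

impl ToySavedQueue {
    /// Saves the queue by cloning its current state.
    pub fn state(&self) -> ToyQueueState {
        self.state.clone()
    }
}

impl TryFrom<ToyQueueState> for ToySavedQueue {
    type Error = RestoreError;

    /// Rebuilds a queue, rejecting snapshots that break the split virtqueue rules:
    /// a non-zero power-of-two size within `max_size`, and 16/2/4-byte alignment
    /// for the descriptor table, available ring and used ring.
    fn try_from(state: ToyQueueState) -> Result<Self, Self::Error> {
        if state.size == 0 || !state.size.is_power_of_two() || state.size > state.max_size {
            return Err(RestoreError::InvalidSize);
        }
        if state.desc_table % 16 != 0 || state.avail_ring % 2 != 0 || state.used_ring % 4 != 0 {
            return Err(RestoreError::UnalignedAddress);
        }
        Ok(ToySavedQueue { state })
    }
}
```

Because a saved state may come from an untrusted or outdated source, the caller should handle the `Err` case instead of unwrapping the result.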
A big part of the `virtio-queue` crate consists of the notification suppression support. As already mentioned, the driver can send an available buffer notification to the device when there are new entries in the available ring, and the device can send a used buffer notification to the driver when there are new entries in the used ring. Sending a notification each time is not always efficient; for example, while the driver is processing the used ring, it does not need to receive another used buffer notification. The mechanism for suppressing the notifications is detailed in the following sections from the specification:
The `Queue` abstraction proposes the following sequence of steps for processing new available ring entries:

1. The device disables notifications with `Queue::disable_notification`. Notifications are disabled by the device either if VIRTIO_F_EVENT_IDX is not negotiated and VIRTQ_USED_F_NO_NOTIFY is set in the `flags` field of the used ring, or if VIRTIO_F_EVENT_IDX is negotiated and the `avail_event` value is not updated, i.e. it remains set to the latest `idx` value of the available ring that was already notified by the driver.
2. The device processes the new entries with the `AvailIter` iterator.
3. The device re-enables notifications with `Queue::enable_notification`. Notifications are enabled by the device either if VIRTIO_F_EVENT_IDX is not negotiated and 0 is set in the `flags` field of the used ring, or if VIRTIO_F_EVENT_IDX is negotiated and the `avail_event` value is set to the smallest `idx` value of the available ring that was not already notified by the driver. This way the device makes sure that it won’t miss any notification.

The above steps should be done in a loop to also handle the less likely case where the driver added new entries just before we re-enabled notifications.
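The disable, process, re-enable loop can be sketched against a toy queue. Everything here is hypothetical: `ToyNotifyQueue`, its `late_arrivals` field (which simulates a driver publishing an entry while notifications are off), and the choice that `enable_notification` returns `true` when entries are still pending. The sketch shows only the control flow, not the crate's actual signatures.

```rust
use std::collections::VecDeque;

/// Hypothetical queue model holding descriptor chain head indexes.
pub struct ToyNotifyQueue {
    available: VecDeque<u16>,
    /// Entries the simulated driver publishes while notifications are disabled.
    late_arrivals: VecDeque<u16>,
    notifications_enabled: bool,
}

impl ToyNotifyQueue {
    pub fn new(available: Vec<u16>, late_arrivals: Vec<u16>) -> Self {
        ToyNotifyQueue {
            available: available.into(),
            late_arrivals: late_arrivals.into(),
            notifications_enabled: true,
        }
    }

    pub fn disable_notification(&mut self) {
        self.notifications_enabled = false;
    }

    /// Re-enables notifications and reports whether entries are still pending.
    /// One late arrival is published first to simulate the race with the driver.
    pub fn enable_notification(&mut self) -> bool {
        self.notifications_enabled = true;
        if let Some(head) = self.late_arrivals.pop_front() {
            self.available.push_back(head);
        }
        !self.available.is_empty()
    }

    pub fn pop_available(&mut self) -> Option<u16> {
        self.available.pop_front()
    }

    pub fn notifications_enabled(&self) -> bool {
        self.notifications_enabled
    }
}

/// Drains the queue using the three steps above, repeated until re-enabling
/// notifications finds nothing left to process.
pub fn process_queue(queue: &mut ToyNotifyQueue) -> Vec<u16> {
    let mut handled = Vec::new();
    loop {
        queue.disable_notification();
        while let Some(head) = queue.pop_available() {
            handled.push(head);
        }
        if !queue.enable_notification() {
            break;
        }
    }
    handled
}
```

Without the outer loop, an entry published between the last pop and the re-enable step would sit in the ring until the next unrelated notification.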
For notifications towards the driver, the `Queue` provides the `needs_notification` method, which should be called each time the device adds a new entry to the used ring. Depending on the `used_event` value and on the last used value (`signalled_used`), `needs_notification` returns true to let the device know it should send a notification to the guest.
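The decision can be illustrated with the event index comparison that the virtio specification gives as `vring_need_event`. In the sketch below, `need_event` transcribes that comparison with wrapping 16-bit arithmetic, while `ToyUsedNotifier` and its fields are hypothetical. Two simplifications are assumptions of this sketch: it always notifies when the event index feature is off, and it always notifies on the first call.

```rust
/// The event index check from the virtio specification (`vring_need_event`):
/// notify if `event_idx` lies in the window of entries added since `old_idx`.
pub fn need_event(event_idx: u16, new_idx: u16, old_idx: u16) -> bool {
    new_idx.wrapping_sub(event_idx).wrapping_sub(1) < new_idx.wrapping_sub(old_idx)
}

/// Hypothetical tracker for used buffer notifications.
pub struct ToyUsedNotifier {
    event_idx_enabled: bool,
    next_used: u16,
    /// Used index at the time of the previous check, if any.
    signalled_used: Option<u16>,
}

impl ToyUsedNotifier {
    pub fn new(event_idx_enabled: bool) -> Self {
        ToyUsedNotifier { event_idx_enabled, next_used: 0, signalled_used: None }
    }

    /// Records that one entry was added to the used ring.
    pub fn add_used(&mut self) {
        self.next_used = self.next_used.wrapping_add(1);
    }

    /// `used_event` is the value the driver published to ask for a notification.
    pub fn needs_notification(&mut self, used_event: u16) -> bool {
        if !self.event_idx_enabled {
            return true; // simplification: flag-based suppression is not modelled
        }
        let new = self.next_used;
        match self.signalled_used.replace(new) {
            Some(old) => need_event(used_event, new, old),
            None => true, // first check: notify unconditionally
        }
    }
}
```

The wrapping subtraction matters because ring indexes are free-running 16-bit counters that wrap around at 65536.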
We assume the users of the `Queue` implementation won’t attempt to use the queue before checking that the `ready` bit is set. This can be verified by calling `Queue::is_valid`, which also checks that the three queue parts are valid memory regions. We assume consumers will use `AvailIter::go_to_previous_position` only in single-threaded contexts. We assume the users will consume the entries from the available ring in the way recommended in the documentation, i.e. when the device starts processing the available ring entries, it disables notifications, processes the entries, and then re-enables notifications.
This project is licensed under either of