Frame Pipeline
Overview
The Waxed Frame Pipeline moves rendered frames from plugins to physical displays using DRM/KMS atomic mode setting with explicit synchronization (fences). It combines zero-copy DMA-BUF sharing, triple-buffered swapchains, and GPU-to-display synchronization that does not block the CPU.
Key Characteristics:
- Zero-copy DMA-BUF transfer from GPU to display
- Core-owned swapchain buffers (ABI v5)
- Explicit synchronization via DRM fences
- Triple buffering for consistent frame pacing
- Per-display independent refresh rates
Architecture Overview
FrameHandle Structure
FrameHandle represents a rendered frame from a plugin. In ABI v5, the core owns the DMA-BUF and plugins render directly into core-allocated buffers.
struct FrameHandle {
    void* data = nullptr;                        // CPU-accessible pixel data (if mapped)
    core::utils::UniqueFd dma_buf_fd;            // DMA-BUF file descriptor (RAII-managed)
    uint32_t width = 0;                          // Frame width in pixels
    uint32_t height = 0;                         // Frame height in pixels
    uint32_t stride = 0;                         // Bytes per row
    uint32_t format = 0;                         // Pixel format (0 = BGRA 8-bit)
    uint64_t modifier = DRM_FORMAT_MOD_INVALID;  // DRM format modifier
    void* internal_handle = nullptr;             // Internal handle for cleanup (BufferSlot*)
    core::utils::UniqueFd render_fence_fd;       // Render complete fence (IN_FENCE_FD)
    uint64_t buffer_id = 0;                      // Stable buffer identifier (ABI v4)
    uint64_t generation = 0;                     // Content change counter (ABI v4)
};
Move-Only Semantics
FrameHandle uses RAII and move-only semantics to enforce ownership transfer:
FrameHandle() = default;
FrameHandle(FrameHandle&&) noexcept = default;
FrameHandle& operator=(FrameHandle&&) noexcept = default;
// Deleted copy operations (compile-time enforcement)
FrameHandle(const FrameHandle&) = delete;
FrameHandle& operator=(const FrameHandle&) = delete;
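The move-only rule works because each descriptor member is itself an owning wrapper. The following is a minimal sketch of such a wrapper, assuming POSIX file descriptors; it is a hypothetical stand-in for `core::utils::UniqueFd`, whose real interface is not shown in this document, and `fd_is_open` is an illustrative helper.

```cpp
#include <fcntl.h>    // fcntl, F_GETFD
#include <unistd.h>   // close, pipe
#include <utility>    // std::exchange, std::move

// Hypothetical stand-in for core::utils::UniqueFd: owns at most one descriptor.
class UniqueFd {
public:
    UniqueFd() noexcept = default;
    explicit UniqueFd(int fd) noexcept : fd_(fd) {}
    ~UniqueFd() { reset(); }

    // Moving transfers ownership and leaves the source empty (-1).
    UniqueFd(UniqueFd&& other) noexcept : fd_(other.release()) {}
    UniqueFd& operator=(UniqueFd&& other) noexcept {
        if (this != &other) reset(other.release());
        return *this;
    }

    // Copying would create two owners of one descriptor, so it is deleted.
    UniqueFd(const UniqueFd&) = delete;
    UniqueFd& operator=(const UniqueFd&) = delete;

    int get() const noexcept { return fd_; }
    bool valid() const noexcept { return fd_ >= 0; }

    // Gives up ownership without closing; the caller must close the result.
    int release() noexcept { return std::exchange(fd_, -1); }

    // Closes the current descriptor (if any) and adopts `fd`.
    void reset(int fd = -1) noexcept {
        if (fd_ >= 0) ::close(fd_);
        fd_ = fd;
    }

private:
    int fd_ = -1;
};

// Illustrative helper: true while `fd` refers to an open file.
inline bool fd_is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

With members like this, the defaulted move operations on FrameHandle transfer both descriptors, and a moved-from FrameHandle closes nothing when it is destroyed.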
Field Descriptions
| Field | Type | Purpose |
|---|---|---|
| data | void* | CPU-accessible pixel data (if memory-mapped) |
| dma_buf_fd | UniqueFd | DMA-BUF file descriptor (RAII-managed, closes on destruct) |
| width | uint32_t | Frame width in pixels |
| height | uint32_t | Frame height in pixels |
| stride | uint32_t | Bytes per row (row pitch) |
| format | uint32_t | DRM format code (e.g., DRM_FORMAT_XBGR8888) |
| modifier | uint64_t | DRM format modifier (0 = linear, or tiled modifier) |
| internal_handle | void* | Internal handle (points to owning BufferSlot) |
| render_fence_fd | UniqueFd | Render completion fence (IN_FENCE for DRM) |
| buffer_id | uint64_t | Stable buffer ID for fast-path validation |
| generation | uint64_t | Incremented on content change |
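To illustrate how `width`, `height`, and `stride` relate, here is a small sketch of the address arithmetic for a CPU-mapped frame. It assumes a linear layout with 4 bytes per pixel (as with DRM_FORMAT_XBGR8888); the helper names are hypothetical and tiled modifiers would need a different calculation.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: linear (untiled) layout, 4 bytes per pixel.
constexpr std::uint32_t kBytesPerPixel = 4;

// True if `stride` is large enough to hold one full row of `width` pixels.
constexpr bool stride_is_valid(std::uint32_t width, std::uint32_t stride) {
    return static_cast<std::uint64_t>(stride) >=
           static_cast<std::uint64_t>(width) * kBytesPerPixel;
}

// Byte offset of pixel (x, y) from the start of the mapped buffer.
constexpr std::size_t pixel_offset(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t stride) {
    return static_cast<std::size_t>(y) * stride +
           static_cast<std::size_t>(x) * kBytesPerPixel;
}

// Number of bytes needed to address every row of the frame.
constexpr std::size_t buffer_size(std::uint32_t height, std::uint32_t stride) {
    return static_cast<std::size_t>(height) * stride;
}
```

The stride can exceed `width * 4` when the allocator pads rows for alignment, which is why offsets are computed from `stride` rather than from `width`.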
RenderTarget Structure (ABI v5)
RenderTarget represents a core-allocated DMA-BUF that plugins render into. The core allocates and owns the buffer; the plugin receives a borrowed handle.
struct RenderTarget {
    int dma_buf_fd;        // Core-owned FD (Plugin MUST NOT close)
    uint32_t buffer_id;    // Buffer index: 0, 1, or 2 (for import caching)
    uint32_t width;        // Buffer width in pixels
    uint32_t height;       // Buffer height in pixels
    uint32_t stride;       // Row pitch in bytes
    uint32_t format;       // DRM FourCC format code
    uint64_t modifier;     // DRM format modifier
    int release_fence_fd;  // KMS completion fence (Plugin waits before writing)
    // Cursor handoff fields (ABI v9)
    uint32_t display_id;   // Display identifier (0, 1, 2, ...)
    int32_t cursor_x;      // Display-local cursor X position
    int32_t cursor_y;      // Display-local cursor Y position
};
RenderTarget Ownership Model (v9)
The core allocates DMA-BUF buffers and manages their lifecycle. Plugins receive borrowed references.
Key Rules:
- Core allocates DMA-BUF via Vulkan with external memory export
- Core owns the FD throughout its lifetime
- Plugin receives borrowed FD, must NOT close it
- Plugin waits on release_fence_fd before rendering (if valid)
- Plugin returns render_fence_fd for completion signaling
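The "wait on release_fence_fd before rendering" rule can be sketched with plain `poll()`, since a sync_file descriptor becomes readable once its fence has signaled. The names `FenceWait`, `wait_for_fence`, and `wait_and_close_fence` are hypothetical; treating a negative descriptor as "nothing to wait for" follows the "(if valid)" wording above.

```cpp
#include <cerrno>
#include <poll.h>
#include <unistd.h>

enum class FenceWait { Signaled, TimedOut, Error };

// Waits until `fence_fd` becomes readable or `timeout_ms` elapses
// (-1 waits forever). A negative fd means there is no fence to wait for.
inline FenceWait wait_for_fence(int fence_fd, int timeout_ms) {
    if (fence_fd < 0) return FenceWait::Signaled;

    pollfd pfd{};
    pfd.fd = fence_fd;
    pfd.events = POLLIN;

    for (;;) {
        int rc = ::poll(&pfd, 1, timeout_ms);
        if (rc > 0) {
            // POLLERR / POLLNVAL indicate a failed fence or a bad descriptor.
            if (pfd.revents & (POLLERR | POLLNVAL)) return FenceWait::Error;
            return FenceWait::Signaled;
        }
        if (rc == 0) return FenceWait::TimedOut;
        if (errno != EINTR && errno != EAGAIN) return FenceWait::Error;
        // Interrupted: retry. Note that this restarts the full timeout.
    }
}

// Waits and then closes the fence, for callers that own the descriptor.
inline FenceWait wait_and_close_fence(int fence_fd, int timeout_ms) {
    FenceWait result = wait_for_fence(fence_fd, timeout_ms);
    if (fence_fd >= 0) ::close(fence_fd);
    return result;
}
```

A finite timeout lets a plugin report a stuck display instead of hanging; whether the pipeline expects that behavior is not specified here.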
DMA-BUF Ownership Model
Core-Owned Buffers (ABI v5)
In ABI v5, the core allocates all swapchain buffers. This simplifies ownership:
- Core allocates buffers via Vulkan with DMA-BUF export capability
- Core owns the FD for the lifetime of the buffer
- Plugin borrows the FD via RenderTarget::dma_buf_fd
- Plugin renders into the borrowed buffer
- Plugin returns a render fence for completion signaling
- Core submits the DMA-BUF to DRM for scanout
// Core allocates (DisplayManager)
// Note: device, width, height, memory_type_index, row_pitch and
// swapchain_buffers are defined elsewhere in the core.
auto allocate_swapchain_buffers(DisplayState& display) -> Result<void> {
    for (size_t i = 0; i < 3; ++i) {
        // Declare that the image's memory will be exported as a DMA-BUF
        vk::ExternalMemoryImageCreateInfo external_info = {};
        external_info.handleTypes = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;

        // Create Vulkan image with external memory
        vk::ImageCreateInfo image_info = {};
        image_info.pNext = &external_info;
        image_info.flags = vk::ImageCreateFlagBits::eMutableFormat;
        image_info.imageType = vk::ImageType::e2D;
        image_info.extent.width = width;
        image_info.extent.height = height;
        image_info.extent.depth = 1;
        image_info.mipLevels = 1;
        image_info.arrayLayers = 1;
        image_info.format = vk::Format::eR8G8B8A8Unorm;
        image_info.tiling = vk::ImageTiling::eLinear; // Linear layout, matches modifier 0 below
        image_info.initialLayout = vk::ImageLayout::eUndefined;
        image_info.usage = vk::ImageUsageFlagBits::eColorAttachment |
                           vk::ImageUsageFlagBits::eTransferSrc;
        image_info.sharingMode = vk::SharingMode::eExclusive;
        vk::raii::Image vk_image(device, image_info);

        // Size the allocation from the image's actual requirements
        vk::MemoryRequirements memory_requirements = vk_image.getMemoryRequirements();

        // Export memory with DMA-BUF capability
        vk::ExportMemoryAllocateInfo export_info = {};
        export_info.handleTypes = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;
        vk::MemoryAllocateInfo alloc_info = {};
        alloc_info.pNext = &export_info;
        alloc_info.allocationSize = memory_requirements.size;
        alloc_info.memoryTypeIndex = memory_type_index;

        // Allocate and bind
        vk::raii::DeviceMemory vk_memory(device, alloc_info);
        vk_image.bindMemory(*vk_memory, 0);

        // Export as DMA-BUF FD
        vk::MemoryGetFdInfoKHR get_fd_info = {};
        get_fd_info.memory = *vk_memory;
        get_fd_info.handleType = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;
        int dma_buf_fd = device.getMemoryFdKHR(get_fd_info);

        // Store in swapchain_buffers[i]
        // (vk_image and vk_memory must also be moved into long-lived storage,
        // otherwise they are destroyed at the end of this iteration)
        swapchain_buffers[i].fd.reset(dma_buf_fd);
        swapchain_buffers[i].width = width;
        swapchain_buffers[i].height = height;
        swapchain_buffers[i].stride = row_pitch; // From the image's subresource layout
        swapchain_buffers[i].format = DRM_FORMAT_XBGR8888;
        swapchain_buffers[i].modifier = 0; // Linear
        swapchain_buffers[i].id = i;
    }
    return {}; // Success (exact form depends on the Result<void> type)
}
Transfer Semantics
When submitting to KMS, the core duplicates the FD for the frame slot:
// In render_loop.cpp, render_display()
slot->frame.dma_buf_fd.reset(dup(core_buf.fd.get()));
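This lets the core keep the original FD while the frame slot holds an independent duplicate. Importing the duplicate into DRM gives the kernel its own reference to the underlying buffer; the duplicate FD itself is held by the slot's UniqueFd and is closed when the frame is reset or destroyed, which happens once the framebuffer is released.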
dup() for Caching
Plugins that want to retain access to a buffer must use dup():
// Plugin wants to keep the buffer for next frame
int retained_fd = dup(render_target.dma_buf_fd);
// ... use retained_fd ...
close(retained_fd); // Close when done
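When to close:
- Core: Never closes swapchain buffer FDs while the swapchain exists (owned for its lifetime)
- Plugin: Never closes RenderTarget::dma_buf_fd (borrowed); closes only the FDs it created itself with dup()
- DRM path: The dup'd FD held by the frame slot is closed when the framebuffer is removed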
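The borrowed-versus-retained distinction can be checked with a short sketch. It uses `fcntl(F_DUPFD_CLOEXEC)` instead of plain `dup()` so that the copy is not inherited across `exec`; that substitution, and the helper names `retain_fd` and `is_open`, are assumptions made for illustration rather than anything the pipeline requires.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Duplicates a borrowed descriptor so the caller holds an independent
// reference to the same underlying buffer. Returns -1 on failure.
// The caller owns the result and must close() it.
inline int retain_fd(int borrowed_fd) {
    return ::fcntl(borrowed_fd, F_DUPFD_CLOEXEC, 0);
}

// True while `fd` refers to an open file.
inline bool is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }
```

Closing the retained copy leaves the borrowed descriptor untouched, and the retained copy stays usable even if the owner later closes the original. This is what makes it safe for a plugin to keep a duplicate across frames.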
v5 Render API Flow
The v5 API uses waxed_plugin_render_v5() which receives a RenderTarget:
auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int;
Render Flow Sequence
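As a rough illustration of the sequence, the sketch below shows how a core-side driver might cycle through the three buffers, build a RenderTarget, call the plugin, and collect the returned fence. Everything here is a simplified assumption: the types are reduced stand-ins for the document's structures, `SwapchainDriver`, `FrameResult`, and `fake_plugin_render` are hypothetical names, the descriptor values are placeholders, and returning -1 to mean "no fence" is assumed rather than documented.

```cpp
#include <array>
#include <cstdint>

// Reduced stand-ins for the document's types; only the fields used here.
struct PluginState {
    int frames_rendered = 0;
    std::uint32_t last_buffer_id = 0;
};

struct RenderTarget {
    int dma_buf_fd;           // Borrowed from the core
    std::uint32_t buffer_id;  // 0, 1, or 2
    int release_fence_fd;     // -1 when there is nothing to wait for
};

using RenderFn = int (*)(PluginState*, const RenderTarget*);

struct FrameResult {
    std::uint32_t buffer_id;
    int render_fence_fd;
};

// Hypothetical core-side driver that hands out buffers round-robin.
class SwapchainDriver {
public:
    explicit SwapchainDriver(const std::array<int, 3>& buffer_fds)
        : buffer_fds_(buffer_fds) {}

    // Builds the target for the next buffer, calls the plugin, and returns
    // the buffer index together with the plugin's render fence.
    FrameResult render_next(RenderFn render, PluginState* state,
                            int release_fence_fd) {
        RenderTarget target{buffer_fds_[next_], next_, release_fence_fd};
        int render_fence_fd = render(state, &target);
        FrameResult result{next_, render_fence_fd};
        next_ = (next_ + 1) % 3;
        return result;
    }

private:
    std::array<int, 3> buffer_fds_;
    std::uint32_t next_ = 0;
};

// Fake plugin for illustration: records the call and returns no fence.
inline int fake_plugin_render(PluginState* state,
                              const RenderTarget* target) noexcept {
    state->frames_rendered += 1;
    state->last_buffer_id = target->buffer_id;
    return -1;
}
```

In the real pipeline the returned fence would then be attached to the atomic commit as IN_FENCE_FD, as described under Fence Lifecycle.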
Plugin Implementation Example
extern "C" auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int {
    // 1. Wait for previous scanout to complete
    if (target->release_fence_fd >= 0) {
        sync_wait(target->release_fence_fd, -1); // -1 = infinite wait
        close(target->release_fence_fd);         // We own this fence
    }
    // 2. Import DMA-BUF to Vulkan (vk_image was created for this target beforehand)
    VkMemoryDedicatedAllocateInfo dedicated_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
        .image = vk_image,
    };
    VkImportMemoryFdInfoKHR import_info = {
        .sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
        .pNext = &dedicated_info,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
        .fd = dup(target->dma_buf_fd), // dup() - we don't own the original
    };
    VkMemoryAllocateInfo alloc_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = &import_info,
        .allocationSize = size,
        .memoryTypeIndex = memory_type,
    };
    VkDeviceMemory vk_memory;
    vkAllocateMemory(device, &alloc_info, nullptr, &vk_memory);
    // 3. Bind image to imported memory
    vkBindImageMemory(device, vk_image, vk_memory, 0);
    // 4. Create an exportable semaphore for the render completion fence
    VkExportSemaphoreCreateInfo export_info = {
        .sType = VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO,
        .handleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
    };
    VkSemaphoreCreateInfo sem_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
        .pNext = &export_info,
    };
    VkSemaphore vk_semaphore;
    vkCreateSemaphore(device, &sem_info, nullptr, &vk_semaphore);
    // 5. Record the frame and submit it, signaling the semaphore on completion
    vkBeginCommandBuffer(cmd, &begin_info);
    // ... rendering commands ...
    vkEndCommandBuffer(cmd);
    VkSemaphoreSubmitInfoKHR signal_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO_KHR,
        .semaphore = vk_semaphore,
        .stageMask = VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT_KHR,
    };
    // submit_info is a VkSubmitInfo2KHR that references cmd and signal_info
    vkQueueSubmit2KHR(queue, 1, &submit_info, VK_NULL_HANDLE);
    // 6. Export the pending signal as a sync_file FD (only valid after the submit)
    int render_fence_fd = -1;
    VkSemaphoreGetFdInfoKHR get_fd_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR,
        .semaphore = vk_semaphore,
        .handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
    };
    vkGetSemaphoreFdKHR(device, &get_fd_info, &render_fence_fd);
    // 7. vk_semaphore, vk_image and vk_memory must stay alive until the GPU
    //    work completes; destroy them later or keep them in the per-buffer
    //    import cache. Vulkan took ownership of the dup'd FD at import, so
    //    there is no FD left to close here.
    return render_fence_fd; // Ownership transferred to core
}
Fence Lifecycle
The frame pipeline uses two types of fences for explicit synchronization:
IN_FENCE_FD (Render Completion)
Signals when GPU rendering is complete. DRM waits for this fence before displaying the buffer.
Code Flow:
// Plugin creates render fence
int render_fence_fd = create_sync_file_from_gpu_work();
// Returns render_fence_fd via waxed_plugin_render_v5()
// Core adds to atomic request
drmModeAtomicAddProperty(req, plane_id, in_fence_fd_prop, render_fence_fd);
// DRM waits for fence before scanout
drmModeAtomicCommit(drm_fd, req, flags, nullptr);
OUT_FENCE_PTR (Scanout Completion)
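The kernel returns this fence from the atomic commit. It signals once the commit has taken effect on the display (the page flip has completed), at which point the buffer that was previously on screen is no longer being scanned out. The pipeline uses it as the release fence: a plugin must wait on it before rendering into a buffer that may still be on screen.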
Code Flow:
// In submit_frame(), slot->out_fence_storage is persistent
slot->out_fence_storage = -1;
// Pass pointer to kernel
drmModeAtomicAddProperty(req, crtc_id, out_fence_ptr_prop,
                         reinterpret_cast<uint64_t>(&slot->out_fence_storage));
// Kernel writes fence FD to slot->out_fence_storage during commit
drmModeAtomicCommit(drm_fd, req, flags, nullptr);
// Extract fence (may have AMDGPU encoding)
int fence_fd = extract_fence_from_storage(slot->out_fence_storage);
// Pass to plugin as release_fence_fd in next RenderTarget
target.release_fence_fd = pending_release_fence_fd.release();
Fence State Flow
Buffer ID Caching for Fast Path
Plugins can cache imported DMA-BUFs to avoid re-importing on every frame. The buffer_id field identifies which core buffer is being used (0, 1, or 2).
// Plugin maintains cache
struct CachedImport {
    uint32_t buffer_id;
    VkDeviceMemory vk_memory;
    VkImage vk_image;
    bool valid;
};
std::array<CachedImport, 3> import_cache{}; // All entries start invalid
extern "C" auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int {
    uint32_t buffer_id = target->buffer_id; // 0, 1, or 2
    CachedImport& entry = import_cache[buffer_id];
    if (!entry.valid) {
        // Slow path: Import new DMA-BUF and remember it
        entry.buffer_id = buffer_id;
        entry.vk_memory = import_dma_buf(target);
        // ... create entry.vk_image and bind it to entry.vk_memory ...
        entry.valid = true;
    }
    // Fast path (and after a fresh import): Reuse cached import
    VkImage image = entry.vk_image;
    // ... render into image ...
    return render_fence_fd;
}
Important: Cache invalidation occurs on display configuration changes (hotplug, resolution change). Plugins should clear cache on waxed_plugin_visibility_changed(false, true) or when dimensions mismatch.
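The "dimensions mismatch" check can be sketched without any Vulkan types. The example below assumes the cache key is the buffer layout taken from RenderTarget; `TargetInfo`, `CacheEntry`, and `ImportCache` are hypothetical names, and a real entry would also hold the imported VkDeviceMemory and VkImage and destroy them on invalidation.

```cpp
#include <array>
#include <cstdint>

// Subset of RenderTarget that determines whether a cached import is reusable.
struct TargetInfo {
    std::uint32_t buffer_id;
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t stride;
    std::uint32_t format;
    std::uint64_t modifier;
};

struct CacheEntry {
    TargetInfo info{};
    bool valid = false;
};

class ImportCache {
public:
    // Returns true on a hit. On a miss the new layout is recorded and the
    // caller is expected to (re)import the DMA-BUF for this buffer.
    bool lookup_or_record(const TargetInfo& t) {
        if (t.buffer_id >= entries_.size()) return false;
        CacheEntry& e = entries_[t.buffer_id];
        if (e.valid && same_layout(e.info, t)) return true;
        e.info = t;
        e.valid = true;
        return false;
    }

    // Drops every entry, for example after a hotplug or mode change.
    void clear() {
        for (CacheEntry& e : entries_) e.valid = false;
    }

private:
    static bool same_layout(const TargetInfo& a, const TargetInfo& b) {
        return a.width == b.width && a.height == b.height &&
               a.stride == b.stride && a.format == b.format &&
               a.modifier == b.modifier;
    }

    std::array<CacheEntry, 3> entries_{};
};
```

Comparing the full layout rather than only `buffer_id` means a resolution change is detected even if the plugin never receives an explicit invalidation callback.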
Frame Lifecycle Diagrams
Frame Ownership Transfer
Fence Signaling Flow
Timeline of Frame Through Pipeline
Triple Buffer Slot State
BufferSlot tracks the state of each buffer in the triple-buffered swapchain:
struct BufferSlot {
    FrameHandle frame;                                      // Frame data
    std::atomic<bool> in_use{false};                        // True if display or worker is using
    uint64_t sequence_number{0};                            // Frame sequence counter
    uint64_t acquire_time_ms{0};                            // When we acquired this frame
    uint64_t fence_generation{0};                           // Generation counter for ABA prevention
    std::atomic<int> dump_ref_count{0};                     // Active dump operations count
    waxed::core::utils::UniqueFd release_fence_fd;          // OUT_FENCE (KMS completion)
    waxed::core::utils::UniqueFd pending_release_fence_fd;  // OUT_FENCE for plugin
    int64_t out_fence_storage{0};                           // Persistent storage for OUT_FENCE_PTR
    uint32_t slot_index{UINT32_MAX};                        // Slot index (0, 1, 2)
    uint32_t display_id{UINT32_MAX};                        // Owner display ID
    FenceClosure fence_closure;                             // Pre-allocated fence event data
};
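One way the `in_use` flag could be claimed without a lock is a compare-and-swap over the three slots. The sketch below uses a reduced stand-in for BufferSlot; `Slot`, `SlotPool`, and `kNoSlot` are hypothetical names, and it assumes a single thread calls `acquire()` (the sequence counter is not atomic), which may differ from the real core.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Reduced stand-in for BufferSlot: only the fields this sketch needs.
struct Slot {
    std::atomic<bool> in_use{false};
    std::uint64_t sequence_number{0};
};

constexpr int kNoSlot = -1;

class SlotPool {
public:
    // Claims the first free slot and stamps it with the next sequence
    // number. Returns kNoSlot when all three buffers are busy.
    int acquire() noexcept {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            bool expected = false;
            if (slots_[i].in_use.compare_exchange_strong(
                    expected, true,
                    std::memory_order_acquire, std::memory_order_relaxed)) {
                slots_[i].sequence_number = ++next_sequence_;
                return static_cast<int>(i);
            }
        }
        return kNoSlot;
    }

    // Marks a slot free again, for example once its release fence signaled.
    void release(int index) noexcept {
        slots_[static_cast<std::size_t>(index)].in_use.store(
            false, std::memory_order_release);
    }

    std::uint64_t sequence_of(int index) const noexcept {
        return slots_[static_cast<std::size_t>(index)].sequence_number;
    }

private:
    std::array<Slot, 3> slots_{};
    std::uint64_t next_sequence_{0};
};
```

When `acquire()` returns `kNoSlot`, the caller has to skip the frame or wait for a release, which is the back-pressure that keeps rendering from running more than three buffers ahead of the display.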
Slot States
Summary
The Waxed Frame Pipeline provides:
- Zero-copy rendering via DMA-BUF sharing between GPU and display
- Core-owned buffers simplifying ownership and lifetime management
- Explicit synchronization via IN_FENCE and OUT_FENCE for GPU-Display coordination
- Triple buffering for consistent frame pacing
- Per-display VSync enabling mixed refresh rate configurations