Waxed Display Server

Frame Pipeline

Overview

The Waxed Frame Pipeline manages the flow of rendered frames from plugins to physical displays using DRM/KMS atomic mode setting with explicit synchronization (fences). The pipeline implements zero-copy DMA-BUF sharing, triple-buffered swapchains, and GPU-to-display synchronization without CPU blocking.

Key Characteristics:

  • Zero-copy DMA-BUF transfer from GPU to display
  • Core-owned swapchain buffers (ABI v5)
  • Explicit synchronization via DRM fences
  • Triple buffering for consistent frame pacing
  • Per-display independent refresh rates

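Architecture Overview

[Diagram: three stages connected by DMA-BUF handoffs and fences]

Stages:

  • Plugin (Vulkan GPU): GPU renders to buffer
  • Core Composer (RenderLoop): core owns buffers
  • DRM/KMS (Display): display scans out buffer

Connections shown in the diagram:

  • DMA-BUF: between plugin and core, and between core and DRM/KMS
  • IN FENCE: render complete
  • OUT FENCE: scanout complete
  • release fence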

FrameHandle Structure

FrameHandle represents a rendered frame from a plugin. In ABI v5, the core owns the DMA-BUF and plugins render directly into core-allocated buffers.

struct FrameHandle {
    void* data = nullptr;                          // CPU-accessible pixel data (if mapped)
    core::utils::UniqueFd dma_buf_fd;              // DMA-BUF file descriptor (RAII-managed)
    uint32_t width = 0;                            // Frame width in pixels
    uint32_t height = 0;                           // Frame height in pixels
    uint32_t stride = 0;                           // Bytes per row
    uint32_t format = 0;                           // Pixel format (0 = BGRA 8-bit)
    uint64_t modifier = DRM_FORMAT_MOD_INVALID;    // DRM format modifier
    void* internal_handle = nullptr;               // Internal handle for cleanup (BufferSlot*)
    core::utils::UniqueFd render_fence_fd;         // Render complete fence (IN_FENCE_FD)
    uint64_t buffer_id = 0;                        // Stable buffer identifier (ABI v4)
    uint64_t generation = 0;                       // Content change counter (ABI v4)
};

Move-Only Semantics

FrameHandle uses RAII and move-only semantics to enforce ownership transfer:

FrameHandle() = default;
FrameHandle(FrameHandle&&) noexcept = default;
FrameHandle& operator=(FrameHandle&&) noexcept = default;

// Deleted copy operations (compile-time enforcement)
FrameHandle(const FrameHandle&) = delete;
FrameHandle& operator=(const FrameHandle&) = delete;
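
To make the ownership transfer concrete, here is a minimal sketch of a move-only FD wrapper and a frame-like struct that holds one. This is not the actual core::utils::UniqueFd; the names SketchFd and SketchFrame are assumptions introduced purely for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <unistd.h>
#include <utility>

// Hypothetical stand-in for core::utils::UniqueFd (illustration only).
class SketchFd {
public:
    SketchFd() = default;
    explicit SketchFd(int fd) noexcept : fd_(fd) {}
    ~SketchFd() { reset(); }

    // Moving steals the descriptor and leaves the source empty (-1).
    SketchFd(SketchFd&& other) noexcept : fd_(std::exchange(other.fd_, -1)) {}
    SketchFd& operator=(SketchFd&& other) noexcept {
        if (this != &other) reset(std::exchange(other.fd_, -1));
        return *this;
    }
    SketchFd(const SketchFd&) = delete;
    SketchFd& operator=(const SketchFd&) = delete;

    int get() const noexcept { return fd_; }
    int release() noexcept { return std::exchange(fd_, -1); }
    void reset(int fd = -1) noexcept {
        if (fd_ >= 0) ::close(fd_);  // close the descriptor we currently own
        fd_ = fd;
    }

private:
    int fd_ = -1;
};

// Frame-like struct with the same defaulted-move / deleted-copy pattern.
struct SketchFrame {
    SketchFd dma_buf_fd;

    SketchFrame() = default;
    SketchFrame(SketchFrame&&) noexcept = default;
    SketchFrame& operator=(SketchFrame&&) noexcept = default;
    SketchFrame(const SketchFrame&) = delete;
    SketchFrame& operator=(const SketchFrame&) = delete;
};

static_assert(!std::is_copy_constructible_v<SketchFrame>, "copies are rejected at compile time");
static_assert(std::is_nothrow_move_constructible_v<SketchFrame>, "moves cannot throw");
```

The defaulted move operations only behave correctly because the FD member resets its source on move; a plain int member would leave two structs referring to the same descriptor.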

Field Descriptions

Field            Type      Purpose
data             void*     CPU-accessible pixel data (if memory-mapped)
dma_buf_fd       UniqueFd  DMA-BUF file descriptor (RAII-managed, closes on destruct)
width            uint32_t  Frame width in pixels
height           uint32_t  Frame height in pixels
stride           uint32_t  Bytes per row (row pitch)
format           uint32_t  DRM format code (e.g., DRM_FORMAT_XBGR8888)
modifier         uint64_t  DRM format modifier (0 = linear, or tiled modifier)
internal_handle  void*     Internal handle (points to owning BufferSlot)
render_fence_fd  UniqueFd  Render completion fence (IN_FENCE for DRM)
buffer_id        uint64_t  Stable buffer ID for fast-path validation
generation       uint64_t  Incremented on content change
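
The format and stride fields can be sanity-checked with a few lines of code. The sketch below reproduces the FourCC packing used by drm_fourcc.h and adds a stride check for 32-bit pixel formats. The helper names fourcc, kXbgr8888 and stride_is_plausible are assumptions for illustration, not part of the Waxed API.

```cpp
#include <cassert>
#include <cstdint>

// Same byte packing as the fourcc_code() macro in drm_fourcc.h:
// the first character lands in the lowest byte.
constexpr uint32_t fourcc(char a, char b, char c, char d) {
    return static_cast<uint32_t>(static_cast<unsigned char>(a)) |
           (static_cast<uint32_t>(static_cast<unsigned char>(b)) << 8) |
           (static_cast<uint32_t>(static_cast<unsigned char>(c)) << 16) |
           (static_cast<uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// DRM_FORMAT_XBGR8888 is defined as fourcc_code('X', 'B', '2', '4').
constexpr uint32_t kXbgr8888 = fourcc('X', 'B', '2', '4');
static_assert(kXbgr8888 == 0x34324258u, "XB24 packs to 0x34324258");

// Hypothetical helper: a stride is plausible for a 32-bit format when it
// can hold at least one full row of 4-byte pixels. Drivers may pad rows,
// so a larger stride is also accepted.
constexpr bool stride_is_plausible(uint32_t width, uint32_t stride) {
    return stride >= static_cast<uint64_t>(width) * 4;
}
```

For a 1920-pixel-wide XBGR8888 frame the minimum stride is 7680 bytes; a driver that aligns rows may report more.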

RenderTarget Structure (ABI v5)

RenderTarget represents a core-allocated DMA-BUF that plugins render into. The core allocates and owns the buffer; the plugin receives a borrowed handle.

struct RenderTarget {
    int dma_buf_fd;          // Core-owned FD (Plugin MUST NOT close)
    uint32_t buffer_id;      // Buffer index: 0, 1, or 2 (for import caching)
    uint32_t width;          // Buffer width in pixels
    uint32_t height;         // Buffer height in pixels
    uint32_t stride;         // Row pitch in bytes
    uint32_t format;         // DRM FourCC format code
    uint64_t modifier;       // DRM format modifier
    int release_fence_fd;    // KMS completion fence (Plugin waits before writing)

    // Cursor handoff fields (ABI v9)
    uint32_t display_id;     // Display identifier (0, 1, 2, ...)
    int32_t cursor_x;        // Display-local cursor X position
    int32_t cursor_y;        // Display-local cursor Y position
};

RenderTarget Ownership Model (v9)

The core allocates DMA-BUF buffers and manages their lifecycle. Plugins receive borrowed references.

[Sequence diagram: Core DisplayManager and Plugin Vulkan GPU]

  1. allocate_swapchain_buffers() (core)
       • Vulkan creates DMA-BUF
       • Export FD via vkGetMemoryFdKHR
       • Store in swapchain_buffers[]
  2. RenderTarget setup (core to plugin: borrow FD)
       • dma_buf_fd = swapchain_buffers[id]
       • BORROWED FD (plugin must NOT close)
  3. Render into buffer (plugin to core: render_fence_fd)
       • Import FD to Vulkan
       • Render frame
       • Return render_fence
  4. Submit to KMS (core)
       • Import DMA-BUF
       • Queue scanout
       • Return OUT_FENCE

Key Rules:

  • Core allocates DMA-BUF via Vulkan with external memory export
  • Core owns the FD throughout its lifetime
  • Plugin receives borrowed FD, must NOT close it
  • Plugin waits on release_fence_fd before rendering (if valid)
  • Plugin returns render_fence_fd for completion signaling
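
The "wait on release_fence_fd if valid" rule can be sketched with poll(), since a sync_file descriptor becomes readable once its fence has signaled. The helper name wait_for_fence and its return convention are assumptions for illustration; a plugin may equally use sync_wait() from libsync.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper. Returns true once fence_fd is readable (for a
// sync_file: the fence has signaled), false on timeout or error.
// A negative fd means "no fence", so there is nothing to wait for.
inline bool wait_for_fence(int fence_fd, int timeout_ms) {
    if (fence_fd < 0) return true;

    pollfd pfd{};
    pfd.fd = fence_fd;
    pfd.events = POLLIN;

    for (;;) {
        int rc = ::poll(&pfd, 1, timeout_ms);
        if (rc > 0) return (pfd.revents & POLLIN) != 0;
        if (rc == 0) return false;  // timed out
        if (errno != EINTR && errno != EAGAIN) return false;
        // Interrupted: retry (with the full timeout again, for simplicity).
    }
}
```

Passing -1 as timeout_ms waits indefinitely; a finite timeout lets the plugin skip a frame instead of stalling if scanout is delayed.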

DMA-BUF Ownership Model

Core-Owned Buffers (ABI v5)

In ABI v5, the core allocates all swapchain buffers. This simplifies ownership:

  1. Core allocates buffers via Vulkan with DMA-BUF export capability
  2. Core owns the FD for the lifetime of the buffer
  3. Plugin borrows the FD via RenderTarget::dma_buf_fd
  4. Plugin renders into the borrowed buffer
  5. Plugin returns a render fence for completion signaling
  6. Core submits the DMA-BUF to DRM for scanout

// Core allocates (DisplayManager)
// width, height, device and memory_type_index come from the display and
// device state; how they are obtained is omitted here.
auto allocate_swapchain_buffers(DisplayState& display) -> Result<void> {
    for (size_t i = 0; i < 3; ++i) {
        // Declare that the image will be backed by exportable DMA-BUF memory
        vk::ExternalMemoryImageCreateInfo external_info = {};
        external_info.handleTypes = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;

        // Create Vulkan image with external memory
        vk::ImageCreateInfo image_info = {};
        image_info.pNext = &external_info;
        image_info.flags = vk::ImageCreateFlagBits::eMutableFormat;
        image_info.imageType = vk::ImageType::e2D;
        image_info.extent.width = width;
        image_info.extent.height = height;
        image_info.extent.depth = 1;
        image_info.mipLevels = 1;
        image_info.arrayLayers = 1;
        image_info.format = vk::Format::eR8G8B8A8Unorm;
        image_info.tiling = vk::ImageTiling::eLinear;  // Linear layout, matches modifier 0 below
        image_info.initialLayout = vk::ImageLayout::eUndefined;
        image_info.usage = vk::ImageUsageFlagBits::eColorAttachment |
                          vk::ImageUsageFlagBits::eTransferSrc;
        image_info.sharingMode = vk::SharingMode::eExclusive;
        vk::raii::Image vk_image(device, image_info);

        // Query allocation size and row pitch from the created image
        auto memory_requirements = vk_image.getMemoryRequirements();
        auto layout = vk_image.getSubresourceLayout(
            vk::ImageSubresource{vk::ImageAspectFlagBits::eColor, 0, 0});
        auto row_pitch = static_cast<uint32_t>(layout.rowPitch);

        // Export memory with DMA-BUF capability
        vk::ExportMemoryAllocateInfo export_info = {};
        export_info.handleTypes = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;

        vk::MemoryAllocateInfo alloc_info = {};
        alloc_info.pNext = &export_info;
        alloc_info.allocationSize = memory_requirements.size;
        alloc_info.memoryTypeIndex = memory_type_index;

        // Allocate and bind
        vk::raii::DeviceMemory vk_memory(device, alloc_info);
        vk_image.bindMemory(*vk_memory, 0);

        // Export as DMA-BUF FD
        vk::MemoryGetFdInfoKHR get_fd_info = {};
        get_fd_info.memory = *vk_memory;
        get_fd_info.handleType = vk::ExternalMemoryHandleTypeFlagBits::eDmaBufEXT;

        int dma_buf_fd = device.getMemoryFdKHR(get_fd_info);

        // Store in swapchain_buffers[i]
        swapchain_buffers[i].fd.reset(dma_buf_fd);
        swapchain_buffers[i].width = width;
        swapchain_buffers[i].height = height;
        swapchain_buffers[i].stride = row_pitch;
        swapchain_buffers[i].format = DRM_FORMAT_XBGR8888;
        swapchain_buffers[i].modifier = 0;  // Linear
        swapchain_buffers[i].id = i;
        // NOTE: vk_image and vk_memory are RAII objects; they must be moved
        // into longer-lived storage to outlive this iteration (omitted here).
    }
    return {};  // Success, assuming Result<void> default-constructs to success
}

Transfer Semantics

When submitting to KMS, the core duplicates the FD for the frame slot:

// In render_loop.cpp, render_display()
slot->frame.dma_buf_fd.reset(dup(core_buf.fd.get()));

This allows the core to retain ownership of the original while DRM takes ownership of the duplicate. DRM closes the duplicate when the framebuffer is released.

dup() for Caching

Plugins that want to retain access to a buffer must use dup():

// Plugin wants to keep the buffer for next frame
int retained_fd = dup(render_target.dma_buf_fd);
// ... use retained_fd ...
close(retained_fd);  // Close when done

When to close:

  • Core: Never closes swapchain buffers (owned for lifetime)
  • Plugin: Never closes RenderTarget::dma_buf_fd (borrowed)
  • DRM: Closes the dup’d FD when framebuffer is removed
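
The borrowed-versus-retained distinction can be checked in isolation: closing a duplicate never affects the original descriptor. The sketch below is a hedged illustration; retain_borrowed_fd and fd_is_open are hypothetical helper names, and it uses fcntl(F_DUPFD_CLOEXEC) rather than plain dup() so the duplicate is not inherited across exec.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// True if fd currently refers to an open file description.
inline bool fd_is_open(int fd) {
    return ::fcntl(fd, F_GETFD) != -1;
}

// Hypothetical helper: returns an owned duplicate of a borrowed FD,
// or -1 on failure. The caller must close() the returned descriptor;
// the borrowed one is left untouched.
inline int retain_borrowed_fd(int borrowed_fd) {
    return ::fcntl(borrowed_fd, F_DUPFD_CLOEXEC, 0);
}
```

Both descriptors refer to the same underlying buffer, so the buffer stays alive until every duplicate and the original have been closed.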

v5 Render API Flow

The v5 API uses waxed_plugin_render_v5() which receives a RenderTarget:

auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int;

Render Flow Sequence

  1. RenderLoop render_display
       • Acquire BufferSlot
       • Get CoreBuffer from swapchain_buffers[slot_id]
  2. Build RenderTarget
       • dma_buf_fd = core_buffer.fd.get()
       • buffer_id = slot_id (0, 1, or 2)
       • width, height, stride, format, modifier
       • release_fence_fd = pending_release_fence from previous cycle
       • display_id, cursor_x, cursor_y
  3. PluginManager render_to_target
       • Call waxed_plugin_render_v5
  4. Plugin implementation
       • Wait on release_fence_fd if >= 0
       • Import dma_buf_fd to Vulkan
       • Render frame (GPU commands)
       • Create render completion fence (sync_file)
       • Return render_fence_fd
  5. Store render_fence in slot->frame.render_fence_fd
  6. Submit to KMS (AtomicKMSOutput submit_frame)
       • Import DMA-BUF via drmPrimeFDToHandle
       • Create framebuffer via drmModeAddFB2WithModifiers
       • Add IN_FENCE_FD property if available
       • Add OUT_FENCE_PTR property
       • drmModeAtomicCommit (NONBLOCK | PAGE_FLIP_EVENT)
  7. Display scans out buffer
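
Step 2 of the flow (building the RenderTarget) can be sketched as a small pure function. The RenderTarget layout is copied from the structure documented above; CoreBufferView and build_render_target are hypothetical names used only for this illustration, and the real core code may be organised differently.

```cpp
#include <cassert>
#include <cstdint>

// Layout as documented for RenderTarget.
struct RenderTarget {
    int dma_buf_fd;
    uint32_t buffer_id;
    uint32_t width;
    uint32_t height;
    uint32_t stride;
    uint32_t format;
    uint64_t modifier;
    int release_fence_fd;
    uint32_t display_id;
    int32_t cursor_x;
    int32_t cursor_y;
};

// Hypothetical plain view of one swapchain_buffers[] entry.
struct CoreBufferView {
    int fd;
    uint32_t width;
    uint32_t height;
    uint32_t stride;
    uint32_t format;
    uint64_t modifier;
};

// Fills a RenderTarget for one slot. The FD is copied by value only:
// it stays owned by the core and is merely borrowed by the plugin.
inline RenderTarget build_render_target(const CoreBufferView& buf,
                                        uint32_t slot_id,
                                        int release_fence_fd,
                                        uint32_t display_id,
                                        int32_t cursor_x,
                                        int32_t cursor_y) {
    RenderTarget target{};
    target.dma_buf_fd = buf.fd;                   // borrowed, plugin must not close
    target.buffer_id = slot_id;                   // 0, 1, or 2
    target.width = buf.width;
    target.height = buf.height;
    target.stride = buf.stride;
    target.format = buf.format;
    target.modifier = buf.modifier;
    target.release_fence_fd = release_fence_fd;   // -1 when no fence is pending
    target.display_id = display_id;
    target.cursor_x = cursor_x;
    target.cursor_y = cursor_y;
    return target;
}
```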

Plugin Implementation Example

extern "C" auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int {
    // device, queue, cmd, vk_image, size, memory_type, begin_info and
    // submit_info are assumed to come from the plugin's own state;
    // their setup is omitted here.

    // 1. Wait for previous scanout to complete
    if (target->release_fence_fd >= 0) {
        sync_wait(target->release_fence_fd, -1);  // -1 = infinite wait
        close(target->release_fence_fd);  // We own this fence
    }

    // 2. Import DMA-BUF to Vulkan
    VkMemoryDedicatedAllocateInfo dedicated_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
        .image = vk_image,  // Image that will be bound to the imported memory
    };

    VkImportMemoryFdInfoKHR import_info = {
        .sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
        .pNext = &dedicated_info,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
        .fd = dup(target->dma_buf_fd),  // dup() - we don't own the original
    };

    VkMemoryAllocateInfo alloc_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = &import_info,
        .allocationSize = size,
        .memoryTypeIndex = memory_type,
    };

    VkDeviceMemory vk_memory;
    vkAllocateMemory(device, &alloc_info, nullptr, &vk_memory);  // On success, Vulkan owns the dup'd FD

    // 3. Bind image to imported memory
    vkBindImageMemory(device, vk_image, vk_memory, 0);

    // 4. Create an exportable semaphore for the render completion fence
    VkExportSemaphoreCreateInfo export_info = {
        .sType = VK_STRUCTURE_TYPE_EXPORT_SEMAPHORE_CREATE_INFO,
        .handleTypes = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
    };

    VkSemaphoreCreateInfo sem_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO,
        .pNext = &export_info,
    };

    VkSemaphore vk_semaphore;
    vkCreateSemaphore(device, &sem_info, nullptr, &vk_semaphore);

    // 5. Render frame
    vkBeginCommandBuffer(cmd, &begin_info);
    // ... rendering commands ...
    vkEndCommandBuffer(cmd);
    // submit_info must list vk_semaphore in pSignalSemaphores so that the
    // exported fence signals when this submission finishes
    vkQueueSubmit(queue, 1, &submit_info, VK_NULL_HANDLE);

    // 6. Export the pending signal as a sync_file (FD)
    int render_fence_fd = -1;

    VkSemaphoreGetFdInfoKHR get_fd_info = {
        .sType = VK_STRUCTURE_TYPE_SEMAPHORE_GET_FD_INFO_KHR,
        .semaphore = vk_semaphore,
        .handleType = VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT,
    };

    vkGetSemaphoreFdKHR(device, &get_fd_info, &render_fence_fd);

    // 7. Cleanup
    // vk_semaphore and vk_memory are still referenced by the submitted GPU
    // work. Destroy or free them only after that work has completed, or keep
    // the import cached per buffer_id (see Buffer ID Caching for Fast Path).
    // vkFreeMemory() also releases the dup'd FD imported above.

    return render_fence_fd;  // Ownership transferred to core
}

Fence Lifecycle

The frame pipeline uses two types of fences for explicit synchronization:

IN_FENCE_FD (Render Completion)

Signals when GPU rendering is complete. DRM waits for this fence before displaying the buffer.

[Sequence diagram: Plugin GPU, DRM Kernel, Display]

  1. Submit GPU render
  2. Create sync_file (render_fence_fd)
  3. Add IN_FENCE_FD property
  4. drmModeAtomicCommit
  5. Wait for IN_FENCE
  6. Display buffer

Code Flow:

// Plugin creates render fence
int render_fence_fd = create_sync_file_from_gpu_work();

// Returns render_fence_fd via waxed_plugin_render_v5()

// Core adds to atomic request
drmModeAtomicAddProperty(req, plane_id, in_fence_fd_prop, render_fence_fd);

// DRM waits for fence before scanout
drmModeAtomicCommit(drm_fd, req, flags, nullptr);

OUT_FENCE_PTR (Scanout Completion)

Signals when the buffer has completed scanout and left the screen. The next frame must wait for this fence before rendering to the same buffer.

[Sequence diagram: Plugin GPU, DRM Kernel, Display]

  1. OUT_FENCE_PTR
  2. Wait on fence before rendering
  3. Scanout completes
  4. Signal fence
  5. Render to buffer

Code Flow:

// In submit_frame(), slot->out_fence_storage is persistent
slot->out_fence_storage = -1;

// Pass pointer to kernel
drmModeAtomicAddProperty(req, crtc_id, out_fence_ptr_prop,
                         reinterpret_cast<uint64_t>(&slot->out_fence_storage));

// Kernel writes fence FD to slot->out_fence_storage during commit
drmModeAtomicCommit(drm_fd, req, flags, nullptr);

// Extract fence (may have AMDGPU encoding)
int fence_fd = extract_fence_from_storage(slot->out_fence_storage);

// Pass to plugin as release_fence_fd in next RenderTarget
target.release_fence_fd = pending_release_fence_fd.release();

Fence State Flow

[Flow diagram spanning two consecutive frames, labelled Frame N-1 and Frame N]

  1. OUT_FENCE from previous frame
  2. Wait for OUT_FENCE before writing
  3. Plugin renders to buffer N
  4. Create IN_FENCE during render
  5. Submit to DRM
  6. OUT_FENCE written to slot storage
  7. Store as pending_release_fence, used in next RenderTarget
  8. Wait for OUT_FENCE from Frame N
  9. Plugin renders to buffer N+1
  10. Submit to DRM
  11. OUT_FENCE written to slot storage

Buffer ID Caching for Fast Path

Plugins can cache imported DMA-BUFs to avoid re-importing on every frame. The buffer_id field identifies which core buffer is being used (0, 1, or 2).

// Plugin maintains cache
struct CachedImport {
    uint32_t buffer_id;
    VkDeviceMemory vk_memory;
    VkImage vk_image;
    bool valid;
};
std::array<CachedImport, 3> import_cache;

extern "C" auto waxed_plugin_render_v5(
    PluginState* state,
    const RenderTarget* target
) noexcept -> int {
    uint32_t buffer_id = target->buffer_id;  // 0, 1, or 2
    CachedImport& entry = import_cache[buffer_id];

    if (!entry.valid) {
        // Slow path: Import new DMA-BUF
        entry.buffer_id = buffer_id;
        entry.vk_memory = import_dma_buf(target);
        // ... create entry.vk_image and bind it to entry.vk_memory ...
        entry.valid = true;
    }

    // Fast path: Reuse cached import
    VkImage image = entry.vk_image;

    int render_fence_fd = -1;
    // ... render into image and export render_fence_fd ...

    return render_fence_fd;
}

Important: Cached imports become invalid on display configuration changes (hotplug, resolution change). Plugins should clear the cache on waxed_plugin_visibility_changed(false, true) or when the RenderTarget dimensions no longer match the cached import.
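
The dimension-mismatch check can be sketched without any Vulkan types. In the sketch below, BufferShape, ImportEntry and ImportCache are hypothetical names, and import_token is a placeholder for the cached VkDeviceMemory/VkImage pair; a real plugin would also destroy the Vulkan objects when an entry is dropped.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// The properties that must match for a cached import to be reusable.
struct BufferShape {
    uint32_t width;
    uint32_t height;
    uint32_t stride;
    uint32_t format;
    uint64_t modifier;
};

inline bool same_shape(const BufferShape& a, const BufferShape& b) {
    return a.width == b.width && a.height == b.height &&
           a.stride == b.stride && a.format == b.format &&
           a.modifier == b.modifier;
}

struct ImportEntry {
    BufferShape shape{};
    bool valid = false;
    int import_token = 0;  // placeholder for the real Vulkan handles
};

class ImportCache {
public:
    // Returns the entry if it is valid and still matches the target;
    // a stale entry is dropped and nullptr is returned (slow path).
    ImportEntry* lookup(uint32_t buffer_id, const BufferShape& shape) {
        if (buffer_id >= entries_.size()) return nullptr;
        ImportEntry& entry = entries_[buffer_id];
        if (entry.valid && !same_shape(entry.shape, shape)) entry = ImportEntry{};
        return entry.valid ? &entry : nullptr;
    }

    void store(uint32_t buffer_id, const BufferShape& shape, int import_token) {
        if (buffer_id >= entries_.size()) return;
        ImportEntry& entry = entries_[buffer_id];
        entry.shape = shape;
        entry.import_token = import_token;
        entry.valid = true;
    }

    // Call on hotplug or any other display configuration change.
    void clear() { entries_.fill(ImportEntry{}); }

private:
    std::array<ImportEntry, 3> entries_{};
};
```

Comparing the full shape rather than only buffer_id guards against a resolution change that reuses the same slot indices.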

Frame Lifecycle Diagrams

Frame Ownership Transfer

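[Sequence diagram "Frame Lifecycle": Core, Plugin, DRM]

  1. Core: swapchain_buf[0] (borrow FD)
  2. Core to Plugin: RenderTarget (FD borrowed)
  3. Plugin: import DMA-BUF
  4. Plugin: wait for OUT_FENCE (OUT_FENCE signaled)
  5. Plugin: render to buffer, create IN_FENCE_FD
  6. Plugin to Core: IN_FENCE_FD
  7. Core to DRM: FB 0
  8. DRM: scanout, create OUT_FENCE_PTR
  9. DRM to Core: OUT_FENCE_PTR
  10. Core: store as pending_release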

Fence Signaling Flow

Frame N:

  1. Plugin renders to buffer N
  2. Create IN_FENCE (render_fence_fd)
  3. Submit to DRM with IN_FENCE_FD
  4. DRM waits for IN_FENCE
  5. GPU rendering completes
  6. Display scans buffer N
  7. Request OUT_FENCE_PTR
  8. Buffer leaves screen
  9. Signal OUT_FENCE

Frame N+1:

  1. Wait for OUT_FENCE before rendering
  2. Plugin renders to buffer N+1

Timeline of Frame Through Pipeline

[Timeline diagram "Frame Pipeline Timeline"; the millisecond axis values did not survive extraction]

  • Plugin: Import DMA-BUF, Submit GPU commands, Create render_fence
  • Core: Add IN_FENCE to request, Add OUT_FENCE_PTR, Atomic commit
  • DRM: Wait for IN_FENCE, Schedule scanout
  • Display: Buffer on screen, OUT_FENCE signals
  • Next Frame: Plugin receives fence, Plugin waits for fence, Start rendering next

Triple Buffer Slot State

BufferSlot tracks the state of each buffer in the triple-buffered swapchain:

struct BufferSlot {
    FrameHandle frame;                          // Frame data
    std::atomic<bool> in_use{false};           // True if display or worker is using
    uint64_t sequence_number{0};               // Frame sequence counter
    uint64_t acquire_time_ms{0};               // When we acquired this frame
    uint64_t fence_generation{0};              // Generation counter for ABA prevention
    std::atomic<int> dump_ref_count{0};        // Active dump operations count
    waxed::core::utils::UniqueFd release_fence_fd;     // OUT_FENCE (KMS completion)
    waxed::core::utils::UniqueFd pending_release_fence_fd;  // OUT_FENCE for plugin
    int64_t out_fence_storage{0};              // Persistent storage for OUT_FENCE_PTR
    uint32_t slot_index{UINT32_MAX};           // Slot index (0, 1, 2)
    uint32_t display_id{UINT32_MAX};           // Owner display ID
    FenceClosure fence_closure;                // Pre-allocated fence event data
};

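Slot States

[State diagram: three slot states and the transitions between them]

States:

  • Slot available, no fence pending
  • Slot acquired, rendering in progress
  • Buffer on screen, OUT_FENCE pending

Transitions:

  1. acquire_frame_for_display(): available to acquired
  2. submit_frame() succeeds: acquired to on screen
  3. Fence signals: on screen to available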
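
The acquire and release transitions can be sketched with the in_use flag alone. This is a simplified illustration under stated assumptions: SketchSlot, try_acquire_slot and release_slot are hypothetical names, and the real BufferSlot carries fences and sequence numbers that are ignored here.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>

// Reduced slot: only the in_use flag from BufferSlot.
struct SketchSlot {
    std::atomic<bool> in_use{false};
};

using SlotArray = std::array<SketchSlot, 3>;

// Claims the first free slot. compare_exchange makes the claim atomic,
// so two threads can never acquire the same slot.
inline std::optional<uint32_t> try_acquire_slot(SlotArray& slots) {
    for (uint32_t i = 0; i < slots.size(); ++i) {
        bool expected = false;
        if (slots[i].in_use.compare_exchange_strong(
                expected, true, std::memory_order_acq_rel)) {
            return i;
        }
    }
    return std::nullopt;  // all three buffers are busy
}

// Marks the slot available again, e.g. once its fence has signaled.
inline void release_slot(SlotArray& slots, uint32_t index) {
    slots[index].in_use.store(false, std::memory_order_release);
}
```

When try_acquire_slot returns no slot, all three buffers are in flight and the caller has to skip or delay the frame.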

Summary

The Waxed Frame Pipeline provides:

  1. Zero-copy rendering via DMA-BUF sharing between GPU and display
  2. Core-owned buffers simplifying ownership and lifetime management
  3. Explicit synchronization via IN_FENCE and OUT_FENCE for GPU-Display coordination
  4. Triple buffering for consistent frame pacing
  5. Per-display VSync enabling mixed refresh rate configurations