Waxed Display Server

Atomic KMS Output System

1. Overview

The AtomicKMSOutput class drives display output through the Linux DRM (Direct Rendering Manager) atomic KMS (Kernel Mode Setting) API. It provides zero-copy DMA-BUF import, explicit synchronization between GPU rendering and display scanout, hardware cursor support, and Variable Refresh Rate (VRR).

Key Features

  • Zero-copy DMA-BUF import: Vulkan renders directly to buffers that DRM displays without copying
  • Explicit sync: GPU and display synchronize via kernel fences, eliminating CPU blocking
  • Hardware cursor: Independent cursor plane updated at 1000Hz+ via async commits
  • VRR support: AMDGPU FreeSync/VRR for variable refresh rates
  • Triple buffering: Proper framebuffer management for smooth frame pacing

Architecture

[Sequence diagram. Participants: Vulkan GPU Render Loop, BufferSlot, AtomicKMSOutput, DRM Display. Messages, in order: render; render_fence_fd (GPU → Display); submit_frame(); out_fence_fd (Display → GPU); out_fence_fd. Note: Scanout Plane displays DMA-BUF FD.]

2. Plane Property Caching

Atomic commits require setting multiple properties on DRM objects (planes, CRTCs, connectors). Property names are string-based, and looking them up by name on every frame would be prohibitively expensive.

The PlaneProperties structure caches property IDs during initialization:

struct PlaneProperties {
    uint32_t plane_id = 0;           // Plane object ID
    uint32_t type_prop = 0;          // Plane type (primary/overlay/cursor)
    uint32_t fb_prop = 0;            // FB_ID property
    uint32_t crtc_prop = 0;          // CRTC_ID property
    uint32_t src_x_prop = 0;         // Source X position (16.16 fixed)
    uint32_t src_y_prop = 0;         // Source Y position (16.16 fixed)
    uint32_t src_w_prop = 0;         // Source width (16.16 fixed)
    uint32_t src_h_prop = 0;         // Source height (16.16 fixed)
    uint32_t dst_x_prop = 0;         // Destination X position
    uint32_t dst_y_prop = 0;         // Destination Y position
    uint32_t dst_w_prop = 0;         // Destination width
    uint32_t dst_h_prop = 0;         // Destination height
    uint32_t in_fence_fd_prop = 0;   // IN_FENCE_FD property (explicit sync)
};

Discovery Process

During init(), the system discovers planes compatible with the CRTC and caches their properties:

  1. Get CRTC index: The possible_crtcs bitmask uses CRTC index bits, not CRTC IDs
  2. Find planes: Iterate through all planes, checking possible_crtcs compatibility
  3. Check type: Use the type property to identify DRM_PLANE_TYPE_PRIMARY or DRM_PLANE_TYPE_CURSOR
  4. Cache properties: Look up each property by name and store the ID
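
Steps 1 and 2 are easy to get wrong because the bitmask is indexed by position, not by object ID. The following is a minimal sketch that runs without libdrm; the helper names crtc_index_of and plane_supports_crtc are hypothetical, and in real code the ID array would come from drmModeGetResources() and the mask from the possible_crtcs field of drmModePlane.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the position of crtc_id in the resource list,
// or -1 if it is not present. possible_crtcs is indexed by this position.
constexpr int crtc_index_of(const uint32_t* crtc_ids, std::size_t count,
                            uint32_t crtc_id) {
    for (std::size_t i = 0; i < count; ++i) {
        if (crtc_ids[i] == crtc_id) return static_cast<int>(i);
    }
    return -1;
}

// Hypothetical helper: tests the bit that corresponds to the CRTC index.
// Passing a CRTC ID here instead of an index would test the wrong bit.
constexpr bool plane_supports_crtc(uint32_t possible_crtcs, int crtc_index) {
    return crtc_index >= 0 && crtc_index < 32 &&
           (possible_crtcs & (1u << crtc_index)) != 0;
}
```

For example, with CRTC IDs {40, 42, 44}, CRTC 42 has index 1, so a plane is compatible when bit 1 of possible_crtcs is set.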

Driver Compatibility

The cache_plane_properties() function supports fallback property names for driver compatibility:

Preferred   Fallback   Purpose
DST_X       CRTC_X     Destination X position
DST_Y       CRTC_Y     Destination Y position
DST_W       CRTC_W     Destination width
DST_H       CRTC_H     Destination height
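
The fallback lookup can be sketched as a search over candidate names in priority order. This is an illustration only: PropertyTable and find_property_id are hypothetical names, and the map stands in for the name-to-ID pairs that real code would gather with drmModeObjectGetProperties() and drmModeGetProperty().

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>
#include <unordered_map>

// Stand-in for the name -> ID pairs reported by the driver for one plane.
using PropertyTable = std::unordered_map<std::string, uint32_t>;

// Hypothetical helper: returns the ID of the first candidate name that the
// driver exposes, or 0 (the "not found" value used by PlaneProperties).
inline uint32_t find_property_id(const PropertyTable& table,
                                 std::initializer_list<const char*> names) {
    for (const char* name : names) {
        auto it = table.find(name);
        if (it != table.end()) return it->second;
    }
    return 0;
}
```

A caller would then write something like dst_x_prop = find_property_id(table, {"DST_X", "CRTC_X"}) and treat a result of 0 as a missing property.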

3. CRTC Properties

The CRTCProperties structure caches CRTC-level properties:

struct CRTCProperties {
    uint32_t crtc_id = 0;            // CRTC object ID
    uint32_t mode_id_prop = 0;       // MODE_ID property (blob)
    uint32_t active_prop = 0;        // ACTIVE property
    uint32_t out_fence_ptr_prop = 0; // OUT_FENCE_PTR property (explicit sync)
    uint32_t vrr_enabled_prop = 0;   // VRR_ENABLED property (AMDGPU Freesync/VRR)
};

Property Descriptions

  • MODE_ID: Blob property containing display mode (resolution, refresh rate). Only set on first frame (modeset).
  • ACTIVE: Boolean enabling/disabling the CRTC. Set to 1 during modeset.
  • OUT_FENCE_PTR: Pointer to memory where DRM writes the release fence FD. Used for explicit sync.
  • VRR_ENABLED: Enables Variable Refresh Rate on AMDGPU. Must be set on every commit.

4. Connector Properties

The ConnectorProperties structure links the connector to the CRTC:

struct ConnectorProperties {
    uint32_t connector_id = 0;       // Connector object ID
    uint32_t crtc_id_prop = 0;       // CRTC_ID property (links connector to CRTC)
};

CRITICAL: Connector-CRTC Linkage

The CRTC_ID property on the connector is mandatory for a functional display pipeline. Without setting this property in the atomic commit, the connector is not linked to the CRTC, and the screen remains black even though the commit succeeds.

Connector → (CRTC_ID property) → CRTC → Plane → Framebuffer

5. Framebuffer Slot Management

The system does not use a framebuffer cache. Instead, each import_dma_buf() call creates a unique framebuffer ID. This is critical for proper triple buffering:

struct FramebufferSlot {
    uint32_t fb_id;              // DRM framebuffer ID
    uint32_t gem_handle;         // GEM handle (for cleanup)
    int dma_buf_fd;              // Original DMA-BUF FD (if we own it)
    uint64_t sequence_number;    // Frame sequence for debugging
};

Triple Buffering with Unique FB IDs

Triple Buffer State

  • BufferSlot A: FB_ID=100, gem_handle=50
  • BufferSlot B: FB_ID=101, gem_handle=50
  • BufferSlot C: FB_ID=102, gem_handle=50

All three slots refer to the shared GEM memory (DMA-BUF).

Why unique FB IDs?

  • Each FB ID is a unique framebuffer object in DRM
  • DRM tracks which FB is currently scanning out
  • This prevents the GPU from overwriting a buffer while DRM displays it
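
One way to decide which slot to release after a commit is to remember the slot that was committed last. The sketch below is an assumption about how such bookkeeping could look, not the project's implementation; SlotTracker is a hypothetical class, and the actual cleanup (drmModeRmFB and closing the GEM handle) is left to the caller.

```cpp
#include <cstdint>

struct FramebufferSlot {
    uint32_t fb_id = 0;            // DRM framebuffer ID
    uint32_t gem_handle = 0;       // GEM handle (for cleanup)
    int dma_buf_fd = -1;           // Original DMA-BUF FD (if we own it)
    uint64_t sequence_number = 0;  // Frame sequence for debugging
};

// Hypothetical tracker: remembers the slot that was committed last so the
// caller can release it once a newer slot has replaced it.
class SlotTracker {
public:
    // Records `next` as the current slot. Returns true and fills `released`
    // when an older slot was replaced and should now be cleaned up.
    bool commit(FramebufferSlot next, FramebufferSlot& released) {
        next.sequence_number = ++sequence_;
        const bool had_previous = has_current_;
        if (had_previous) released = current_;
        current_ = next;
        has_current_ = true;
        return had_previous;
    }

    const FramebufferSlot& current() const { return current_; }

private:
    FramebufferSlot current_{};
    uint64_t sequence_ = 0;
    bool has_current_ = false;
};
```

The first commit has nothing to release; every later commit hands back exactly one older slot.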

GEM Handle Reference Counting

The kernel’s GEM handle reference counting ensures the DMA-BUF memory stays alive until all framebuffers are destroyed:

  1. drmPrimeFDToHandle(): gem_handle += 1
  2. drmModeAddFB2WithModifiers(): uses gem_handle
  3. drmModeRmFB(): releases FB reference
  4. drmCloseBufferHandle(): gem_handle -= 1
  5. gem_handle == 0: Release DMA-BUF memory

6. DMA-BUF Import Process

The import_dma_buf() function imports a DMA-BUF file descriptor into DRM:

DMA-BUF Import Flow

  1. drmPrimeFDToHandle(fd, &gem_handle): Convert the DMA-BUF FD to a GEM handle. The same DMA-BUF yields the same gem_handle (reference counted).
  2. drmModeAddFB2WithModifiers(...): Create a framebuffer from the GEM handle, using the explicit modifier from Vulkan. Returns a unique fb_id.
  3. Return (gem_handle, fb_id).

Format and Modifiers

  • Format: DRM_FORMAT_XBGR8888 (matches VK_FORMAT_R8G8B8A8_UNORM)
  • Modifier: Explicit modifier from frame->modifier (e.g., DRM_FORMAT_MOD_LINEAR for a linear layout, or a tiling-specific modifier). DRM_FORMAT_MOD_INVALID means that no explicit modifier is known.
  • Flags: DRM_MODE_FB_MODIFIERS to enable modifier support
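
To make the format and modifier values concrete, the sketch below recomputes them without including drm_fourcc.h. The constant names are local stand-ins chosen for this example. The use_modifier_flag helper is hypothetical and reflects a common convention (skip DRM_MODE_FB_MODIFIERS when the modifier is DRM_FORMAT_MOD_INVALID), not necessarily what import_dma_buf() does.

```cpp
#include <cstdint>

// Packs four ASCII characters into a little-endian FourCC code, mirroring
// the fourcc_code() macro in drm_fourcc.h.
constexpr uint32_t fourcc(char a, char b, char c, char d) {
    return static_cast<uint32_t>(static_cast<unsigned char>(a)) |
           (static_cast<uint32_t>(static_cast<unsigned char>(b)) << 8) |
           (static_cast<uint32_t>(static_cast<unsigned char>(c)) << 16) |
           (static_cast<uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Local copies for illustration; real code should use the definitions from
// drm_fourcc.h rather than these constants.
constexpr uint32_t kFormatXBGR8888 = fourcc('X', 'B', '2', '4');
constexpr uint64_t kModLinear = 0;                        // DRM_FORMAT_MOD_LINEAR
constexpr uint64_t kModInvalid = 0x00ffffffffffffffULL;   // DRM_FORMAT_MOD_INVALID

// Hypothetical check: only request modifier support when a real modifier
// is available.
constexpr bool use_modifier_flag(uint64_t modifier) {
    return modifier != kModInvalid;
}
```

XBGR8888 stores its bytes in memory as R, G, B, X, which is why it lines up with VK_FORMAT_R8G8B8A8_UNORM when the alpha channel is ignored.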

Cleanup

After the atomic commit succeeds, the previous framebuffer is released via release_framebuffer():

drmModeRmFB(drm_fd_, old_fb_id);        // Remove framebuffer
drmCloseBufferHandle(drm_fd_, gem_handle);  // Close GEM handle

7. Explicit Synchronization

Explicit sync eliminates CPU blocking by having the GPU and display synchronize via kernel-managed fences.

Fences

  • IN_FENCE_FD: Render completion fence from Vulkan. Tells DRM “don’t show this buffer until the GPU finishes rendering.”
  • OUT_FENCE_PTR: Out-fence from DRM. It signals once this commit has taken effect on screen, at which point the previously displayed buffer is no longer being scanned out and the GPU may render into it again.

[Sequence diagram. Participants: GPU Render, IN_FENCE, DRM Display, OUT_FENCE.]

  • Frame N: GPU renders → IN_FENCE signals → DRM shows N → OUT_FENCE signals
  • Frame N+1: GPU renders → IN_FENCE signals → DRM shows N+1 → OUT_FENCE signals

AMDGPU OUT_FENCE Encoding

AMDGPU encodes the OUT_FENCE FD in a specific format:

AMDGPU out-fence format:

  • Pattern: 0xffffffff000000XX
  • Upper 32 bits: 0xffffffff
  • Lower 32 bits: actual FD
  • Extraction: fence_fd = storage & 0xFFFFFFFFULL
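
The extraction can be written as a small helper. This is a sketch under two assumptions that the text does not state: the storage is a 64-bit word pre-filled with all ones before the commit, and the result should be treated as a signed FD. One plausible reading of the pattern is that a 32-bit FD was written into the low half of such a word, which would leave the upper 32 bits untouched. The names kNoFence and extract_fence_fd are hypothetical.

```cpp
#include <cstdint>

// Assumed pre-fill value, so that "no fence written" remains detectable.
constexpr uint64_t kNoFence = ~0ULL;

// Keeps the lower 32 bits and reinterprets them as a signed FD, so that
// untouched storage decodes to -1 rather than 4294967295.
constexpr int32_t extract_fence_fd(uint64_t storage) {
    return static_cast<int32_t>(static_cast<uint32_t>(storage & 0xFFFFFFFFULL));
}
```

A caller would check for a negative result before passing the FD on, and close the FD once it is no longer needed.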

8. submit_frame() Detailed Flow

The submit_frame() method orchestrates the entire frame submission process:

  1. Import DMA-BUF: drmPrimeFDToHandle() → gem_handle, then drmModeAddFB2WithModifiers() → fb_id
  2. Allocate atomic request: drmModeAtomicAlloc()
  3. Add primary plane properties: FB_ID, CRTC_ID, SRC_X/Y/W/H (16.16 fixed), DST_X/Y/W/H (pixels), IN_FENCE_FD
  4. Add cursor plane properties: The cursor must be included in every primary commit, which prevents the jump-back artifact
  5. First frame? (!crtc_enabled_)
       • Yes (modeset required): Add CRTC properties MODE_ID, ACTIVE=1, Connector.CRTC_ID, VRR_ENABLED (if requested), and OUT_FENCE_PTR. Commit flags: DRM_MODE_ATOMIC_ALLOW_MODESET (blocking)
       • No (normal frame): Add CRTC properties VRR_ENABLED (if requested) and OUT_FENCE_PTR. Commit flags: NONBLOCK | PAGE_FLIP_EVENT
  6. Execute atomic commit: drmModeAtomicCommit() with EBUSY retry
  7. Handle OUT_FENCE: Extract the fence FD from storage, handling the AMDGPU encoding
  8. Release previous framebuffer: drmModeRmFB(old_fb_id), then drmCloseBufferHandle(old_gem_handle)
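
The choice of commit flags in the first-frame branch can be captured in one function. The sketch below uses local copies of the flag values so that it compiles without libdrm; real code should use the DRM_MODE_* macros from the libdrm headers. commit_flags_for is a hypothetical name.

```cpp
#include <cstdint>

// Local copies of the libdrm flag values, for illustration only.
constexpr uint32_t kPageFlipEvent      = 0x01;    // DRM_MODE_PAGE_FLIP_EVENT
constexpr uint32_t kAtomicNonblock     = 0x0200;  // DRM_MODE_ATOMIC_NONBLOCK
constexpr uint32_t kAtomicAllowModeset = 0x0400;  // DRM_MODE_ATOMIC_ALLOW_MODESET

// First frame (CRTC not yet enabled): blocking modeset commit.
// Later frames: non-blocking commit that requests a page-flip event.
constexpr uint32_t commit_flags_for(bool crtc_enabled) {
    return crtc_enabled ? (kAtomicNonblock | kPageFlipEvent)
                        : kAtomicAllowModeset;
}
```

Keeping the decision in one place makes it harder to request a modeset on a normal frame by accident.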

EBUSY Retry Logic

The kernel returns EBUSY when a previous commit is still processing. The code retries up to 3 times with a 5ms delay:

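constexpr int MAX_EBUSY_RETRIES = 3;
constexpr int EBUSY_RETRY_DELAY_MS = 5;

int retry_count = 0;
int ret = 0;
do {
    ret = drmModeAtomicCommit(drm_fd_, req, commit_flags, user_data);
    if (ret == -EBUSY && retry_count < MAX_EBUSY_RETRIES) {
        ++retry_count;  // Count the attempt so the loop is guaranteed to end
        usleep(EBUSY_RETRY_DELAY_MS * 1000);  // usleep() takes microseconds
        continue;
    }
    break;  // Success, a non-EBUSY error, or retries exhausted
} while (true);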

9. Hardware Cursor Integration

Hardware cursors use a dedicated cursor plane that overlays the primary plane:

Plane composition (the cursor plane overlays the primary plane):

  • Cursor Plane (64x64 RGBA): Top layer, hardware composited
  • Primary Plane (1920x1080 XBGR): Bottom layer, frame content

Cursor Plane Properties

The cursor plane uses the same property types as the primary plane, but with different semantics:

Property   Cursor Value            Description
FB_ID      cursor_fb_id_           Hardware cursor framebuffer
CRTC_ID    crtc_id_                Link to CRTC
SRC_X/Y    0 << 16                 No source cropping (usually)
SRC_W/H    cursor_size_ << 16      Full cursor size
DST_X/Y    cursor_x/y - hotspot    Screen position with hotspot offset
DST_W/H    cursor_size_            Full cursor size (hardware clips)

Hotspot Offset

The hotspot is the point within the cursor image that aligns with the mouse coordinates:

Example: a 64x64 pixel cursor with hotspot=(5,5) and the mouse at (100, 100):

  • Cursor DST_X = 100 - 5 = 95
  • Cursor DST_Y = 100 - 5 = 95
  • The cursor's pixel (5,5) aligns with the mouse position (100,100)
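
The cursor plane values, including the 16.16 fixed-point source rectangle and the hotspot offset, can be computed as follows. This is a minimal sketch; CursorPlaneValues, to_fixed_16_16, and cursor_plane_values are hypothetical names, and the result would still have to be added to an atomic request property by property.

```cpp
#include <cstdint>

// Plane source coordinates are 16.16 fixed point: integer pixels << 16.
constexpr uint32_t to_fixed_16_16(uint32_t pixels) { return pixels << 16; }

struct CursorPlaneValues {
    uint32_t src_x, src_y, src_w, src_h;  // 16.16 fixed point
    int32_t dst_x, dst_y;                 // Pixels; negative near the top/left edge
    uint32_t dst_w, dst_h;                // Pixels
};

// Subtracts the hotspot so that the hotspot pixel lands on the mouse position.
constexpr CursorPlaneValues cursor_plane_values(int32_t mouse_x, int32_t mouse_y,
                                                int32_t hotspot_x, int32_t hotspot_y,
                                                uint32_t cursor_size) {
    return CursorPlaneValues{
        0, 0, to_fixed_16_16(cursor_size), to_fixed_16_16(cursor_size),
        mouse_x - hotspot_x, mouse_y - hotspot_y,
        cursor_size, cursor_size};
}
```

With the mouse at (100, 100), a hotspot of (5, 5), and a 64-pixel cursor, this yields a destination of (95, 95) and a source size of 64 << 16 in each dimension.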

Hiding the Cursor

To hide the cursor, set both FB_ID=0 AND CRTC_ID=0:

drmModeAtomicAddProperty(req, cursor_plane_.plane_id, cursor_plane_.fb_prop, 0);
drmModeAtomicAddProperty(req, cursor_plane_.plane_id, cursor_plane_.crtc_prop, 0);

10. Async Cursor Position Updates

Cursor updates use DRM_MODE_PAGE_FLIP_ASYNC for immediate, non-blocking updates:

[Timeline chart: Frame vs Cursor Update Frequency. Rows: Primary Commits (PF0, PF1, PF2) and Async Cursor (C1 through C7).]

update_cursor_position_async()

This function bypasses the normal commit queuing logic:

  1. Validate: Check CRTC is enabled (first frame completed)
  2. Allocate request: Lightweight cursor-only atomic request
  3. Add properties: Only cursor plane DST_X/Y changed
  4. Commit with ASYNC flag: DRM_MODE_PAGE_FLIP_ASYNC
  5. No VBlank wait: Takes effect immediately on next scanline

Why Include Cursor in Primary Commits?

Even though cursor updates happen at 1000Hz via async commits, the primary 60Hz commit MUST include the latest cursor position. Otherwise, the cursor will “jump back” to the old position for one frame when the primary commit occurs.

Cursor Position Consistency

  • Primary commits (60Hz): at t=0ms, primary commit (cursor at 100,100); at t=16ms, the primary commit MUST include (200,200)
  • Async cursor (1000Hz): at t=5ms, async cursor update to (150,150); at t=8ms, async cursor update to (200,200)
  • Result: the cursor stays smooth at 1000Hz, with no jump-back artifact
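
For the primary commit to see the latest position, the async path and the frame path need a shared, consistent view of the cursor coordinates. The sketch below shows one possible approach, assumed for illustration rather than taken from the implementation: both coordinates are packed into a single atomic 64-bit word. LatestCursorPosition is a hypothetical class.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical holder for the most recent cursor position. Both coordinates
// share one 64-bit word so a reader never sees a torn (x, y) pair.
class LatestCursorPosition {
public:
    // Called from the async cursor path on every mouse move.
    void store(int32_t x, int32_t y) {
        packed_.store(pack(x, y), std::memory_order_release);
    }

    // Called from the primary commit path when building the request.
    void load(int32_t& x, int32_t& y) const {
        const uint64_t v = packed_.load(std::memory_order_acquire);
        x = static_cast<int32_t>(static_cast<uint32_t>(v >> 32));
        y = static_cast<int32_t>(static_cast<uint32_t>(v & 0xFFFFFFFFULL));
    }

private:
    static uint64_t pack(int32_t x, int32_t y) {
        return (static_cast<uint64_t>(static_cast<uint32_t>(x)) << 32) |
               static_cast<uint64_t>(static_cast<uint32_t>(y));
    }

    std::atomic<uint64_t> packed_{0};
};
```

In the timeline above, the async path would store (150,150) and then (200,200), and the primary commit at t=16ms would load (200,200).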

11. VRR Support

Variable Refresh Rate (VRR) allows the display to refresh immediately when a frame is ready, reducing latency:

if (vrr_enabled_ && crtc_props_.vrr_enabled_prop != 0) {
    drmModeAtomicAddProperty(req, crtc_id_, crtc_props_.vrr_enabled_prop, 1);
}

VRR Property Location

  • AMDGPU: VRR property is on the CRTC (search for VRR_ENABLED, vrr_enabled, or freesync)
  • VRR must be set on every commit, not just the first frame

Enabling VRR

Call set_vrr_enabled(true) before the first frame submission:

output.set_vrr_enabled(true);
output.submit_frame(...);  // First frame enables VRR

12. ASCII Diagrams

Atomic Commit Flow

[Sequence diagram. Participants: Vulkan Render Loop, AtomicKMSOutput, DRM Kernel. Note: GPU waits for IN_FENCE.]

  1. FrameHandle* frame
  2. drmPrimeFDToHandle() → gem_handle
  3. drmModeAddFB2() → fb_id
  4. drmModeAtomicAlloc()
  5. Build request (Plane properties, Cursor position, IN_FENCE_FD, CRTC properties)
  6. drmModeAtomicCommit(flags = NONBLOCK | PAGE_FLIP_EVENT)
  7. render_fence_fd (IN_FENCE)
  8. OUT_FENCE_FD (Display → GPU)
  9. Release old FB

Property Relationships

Connector (HDMI-A-1)
  |
  | CRTC_ID
  v
CRTC 42
  |-- MODE_ID (blob: 1920x1080@144Hz)
  |-- ACTIVE (1 = enabled)
  |-- VRR_ENABLED (1 = VRR on)
  |-- OUT_FENCE_PTR (pointer to fence storage)
  |
  |-- Plane 50 (Primary)
  |     |-- type (DRM_PLANE_TYPE_PRIMARY)
  |     |-- FB_ID (100 = framebuffer to display)
  |     |-- CRTC_ID (42 = which CRTC)
  |     |-- SRC_X/Y/W/H (0,0,1920<<16,1080<<16)
  |     |-- DST_X/Y/W/H (0,0,1920,1080)
  |     |-- IN_FENCE_FD (5 = wait for render)
  |
  |-- Plane 51 (Cursor)
        |-- type (DRM_PLANE_TYPE_CURSOR)
        |-- FB_ID (200 = cursor framebuffer)
        |-- CRTC_ID (42 = which CRTC)
        |-- SRC_X/Y/W/H (0,0,64<<16,64<<16)
        |-- DST_X/Y/W/H (100,100,64,64)

Triple Buffer Timeline

[Timeline chart: Triple Buffer Frame Progression. Rows: GPU Rendering (R1, R2, R3, then R1 again), DMA-BUF Buffer (B1 as FB100, B2 as FB101, B3 as FB102, then B1 as FB103), DRM Scanout (D1, D2, D3, then D1 again).]

Key points:

  • Each render creates a NEW FB ID (100, 101, 102, …)
  • Each buffer cycles: Render → Display → Render
  • GPU never overwrites buffer while DRM displays it
  • OUT_FENCE from frame N signals once frame N is on screen, which means the buffer from frame N-1 has left the screen

Cursor Update Flow

[Sequence diagram. Participants: Render Loop, AtomicKMSOutput, DRM Kernel, Mouse Input.]

Primary Frame Path (60 Hz at VBlank):

  1. submit_frame()
  2. Add cursor properties to primary request (cursor_fb_id_, cursor_x/y, hotspot offsets)
  3. drmModeAtomicCommit() (includes primary+cursor)

Async Cursor Path (1000 Hz on mouse move):

  1. move_event
  2. update_cursor_position_async()
  3. Allocate cursor-only request (Only DST_X/Y changed)
  4. drmModeAtomicCommit() (flag = PAGE_FLIP_ASYNC)
  5. commit returns immediately

Result: Cursor moves smoothly at 1000 Hz while primary frames update at 60 Hz

13. RAII Wrappers (drm_raii.h)

The drm_raii.h header provides RAII wrappers for DRM resources to prevent leaks:

// RAII wrapper for drmModeConnector
using DrmModeConnectorPtr = std::unique_ptr<drmModeConnector, DrmModeConnectorDeleter>;

// RAII wrapper for property blobs with FD management
struct PropertyBlobPtr {
    int fd{-1};
    drmModePropertyBlobRes* ptr{nullptr};
    ~PropertyBlobPtr() {
        if (ptr && fd >= 0) {
            drmModeDestroyPropertyBlob(fd, ptr->id);
        }
    }
};

Usage

DrmUniquePtr<drmModeRes> resources(drmModeGetResources(drm_fd_));
// Automatically calls drmModeFreeResources() when going out of scope
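
The definition of DrmUniquePtr is not shown here. As a rough illustration of how such wrappers can be built, the sketch below forwards a std::unique_ptr deleter to a C-style free function. FreeFunctionDeleter, FakeResource, and fake_free are hypothetical names; the fake resource exists only so the example compiles without libdrm.

```cpp
#include <memory>

// Hypothetical building block: a stateless deleter that forwards to a C-style
// free function, so the unique_ptr typically stays the size of a raw pointer.
template <typename T, void (*Free)(T*)>
struct FreeFunctionDeleter {
    void operator()(T* ptr) const noexcept {
        if (ptr) Free(ptr);
    }
};

// With libdrm available, an alias could look like this (an assumption, not
// taken from drm_raii.h):
//   using UniqueDrmModeRes =
//       std::unique_ptr<drmModeRes,
//                       FreeFunctionDeleter<drmModeRes, drmModeFreeResources>>;

// Stand-in resource that counts how often it is freed.
struct FakeResource {
    int* free_count;
};

inline void fake_free(FakeResource* r) {
    ++*r->free_count;
    delete r;
}

using FakeResourcePtr =
    std::unique_ptr<FakeResource, FreeFunctionDeleter<FakeResource, fake_free>>;
```

Because the free function is part of the type, each DRM resource gets its own alias and cannot be released with the wrong function.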

14. Error Codes

The import_dma_buf() function returns Result<> types with specific error codes:

Error Code                   Description
DRMInvalidFileDescriptor     Invalid DRM device FD
DRMResourcesFailed           Failed to get DRM resources
DRMGetCrtcFailed             Failed to get CRTC
DRMPlaneNotFound             No compatible plane found
DRMPropertyNotFound          Required property not found
DRMDmaBufImportFailed        Failed to import DMA-BUF
DRMFramebufferFailed         Failed to create framebuffer
DRMAtomicAllocFailed         Failed to allocate atomic request
DRMAtomicCommitFailed        Atomic commit failed
DRMAsyncUpdateNotSupported   Driver doesn’t support async updates