DRM Memory Management

Modern Linux systems require large amounts of graphics memory to store frame buffers, textures, vertices and other graphics-related data. Given the very dynamic nature of much of that data, managing graphics memory efficiently is crucial for the graphics stack and plays a central role in the DRM infrastructure.

The DRM core includes two memory managers, namely Translation Table Manager (TTM) and Graphics Execution Manager (GEM). TTM was the first DRM memory manager to be developed and tried to be a one-size-fits-all solution. It provides a single userspace API to accommodate the needs of all hardware, supporting both Unified Memory Architecture (UMA) devices and devices with dedicated video RAM (i.e. most discrete video cards). This resulted in a large, complex piece of code that turned out to be hard to use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM’s complexity. Its design philosophy is completely different: instead of providing a solution to every graphics memory-related problem, GEM identified common code between drivers and created a support library to share it. GEM has simpler initialization and execution requirements than TTM, but has no video RAM management capabilities and is thus limited to UMA devices.

The Translation Table Manager (TTM)

TTM is a memory manager for accelerator devices with dedicated memory.

The basic idea is that resources are grouped together in buffer objects of certain size and TTM handles lifetime, movement and CPU mappings of those objects.

TODO: Add more design background and information here.

enum ttm_caching

CPU caching and bus snooping behavior.

Constants

ttm_uncached

Most defensive option for device mappings, don’t even allow write combining.

ttm_write_combined

Don’t cache read accesses, but allow at least writes to be combined.

ttm_cached

Fully cached like normal system memory, requires that devices snoop the CPU cache on accesses.
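The ranking of the three modes can be illustrated with a small userspace sketch. The enum below mirrors the constants listed above (the declaration order is assumed); demo_pick_caching() and its policy are hypothetical and not part of TTM.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of enum ttm_caching; declaration order is assumed. */
enum ttm_caching {
	ttm_uncached,
	ttm_write_combined,
	ttm_cached
};

/*
 * Hypothetical helper (not a TTM function): pick the most permissive
 * caching mode that is still safe for a mapping.
 */
enum ttm_caching demo_pick_caching(bool device_snoops, bool wc_allowed)
{
	if (device_snoops)
		return ttm_cached;         /* device sees CPU cache contents */
	if (wc_allowed)
		return ttm_write_combined; /* reads uncached, writes combined */
	return ttm_uncached;               /* most defensive fallback */
}
```

A real driver would derive both inputs from the platform and the memory type rather than pass them as plain booleans.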

TTM device object reference

struct ttm_global

Buffer object driver global data.

Definition

struct ttm_global {
  struct page *dummy_read_page;
  struct list_head device_list;
  atomic_t bo_count;
};

Members

dummy_read_page

Pointer to a dummy page used for mapping requests of unpopulated pages. Constant after init.

device_list

List of buffer object devices. Protected by ttm_global_mutex.

bo_count

Number of buffer objects allocated by devices.

struct ttm_device

Buffer object driver device-specific data.

Definition

struct ttm_device {
  struct list_head device_list;
  struct ttm_device_funcs *funcs;
  struct ttm_resource_manager sysman;
  struct ttm_resource_manager *man_drv[TTM_NUM_MEM_TYPES];
  struct drm_vma_offset_manager *vma_manager;
  struct ttm_pool pool;
  spinlock_t lru_lock;
  struct list_head ddestroy;
  struct list_head pinned;
  struct address_space *dev_mapping;
  struct delayed_work wq;
};

Members

device_list

Our entry in the global device list. Constant after bo device init.

funcs

Function table for the device. Constant after bo device init.

sysman

Resource manager for the system domain. Access via ttm_manager_type.

man_drv

An array of resource_managers, one per resource type.

vma_manager

Address space manager for finding BOs to mmap.

pool

Page pool for the device.

lru_lock

Protection for the per manager LRU and ddestroy lists.

ddestroy

Destroyed but not yet cleaned up buffer objects.

pinned

Buffer objects which are pinned and so not on any LRU list.

dev_mapping

A pointer to the struct address_space for invalidating CPU mappings on buffer move. Protected by load/unload sync.

wq

Work queue structure for the delayed delete workqueue.

int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs, struct device *dev, struct address_space *mapping, struct drm_vma_offset_manager *vma_manager, bool use_dma_alloc, bool use_dma32)

Parameters

struct ttm_device *bdev

A pointer to a struct ttm_device to initialize.

struct ttm_device_funcs *funcs

Function table for the device.

struct device *dev

The core kernel device pointer for DMA mappings and allocations.

struct address_space *mapping

The address space to use for this bo.

struct drm_vma_offset_manager *vma_manager

A pointer to a vma manager.

bool use_dma_alloc

If coherent DMA allocation API should be used.

bool use_dma32

If we should use GFP_DMA32 for device memory allocations.

Description

Initializes a struct ttm_device.

Return

!0: Failure.

TTM resource placement reference

struct ttm_place

Definition

struct ttm_place {
  unsigned fpfn;
  unsigned lpfn;
  uint32_t mem_type;
  uint32_t flags;
};

Members

fpfn

first valid page frame number to put the object

lpfn

last valid page frame number to put the object

mem_type

One of TTM_PL_* where the resource should be allocated from.

flags

memory domain and caching flags for the object

Description

Structure indicating a possible place to put an object.

struct ttm_placement

Definition

struct ttm_placement {
  unsigned num_placement;
  const struct ttm_place  *placement;
  unsigned num_busy_placement;
  const struct ttm_place  *busy_placement;
};

Members

num_placement

number of preferred placements

placement

preferred placements

num_busy_placement

number of preferred placements when buffers must be evicted to make room

busy_placement

preferred placements when buffers must be evicted to make room

Description

Structure indicating the placement you request for an object.
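As a minimal userspace sketch of how a placement request might be expressed and checked, the following mirrors the two structures above. The demo_ names and DEMO_PL_* values are hypothetical, and the sketch assumes that lpfn is an exclusive upper bound and that lpfn == 0 means "no upper limit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory type ids standing in for the TTM_PL_* constants. */
enum { DEMO_PL_SYSTEM = 0, DEMO_PL_VRAM = 2 };

struct demo_place {		/* mirrors struct ttm_place */
	unsigned fpfn;
	unsigned lpfn;
	uint32_t mem_type;
	uint32_t flags;
};

struct demo_placement {		/* mirrors struct ttm_placement */
	unsigned num_placement;
	const struct demo_place *placement;
	unsigned num_busy_placement;
	const struct demo_place *busy_placement;
};

/* Assumption: lpfn is exclusive and lpfn == 0 means no upper limit. */
static bool demo_place_fits(const struct demo_place *p, uint32_t mem_type,
			    unsigned long start, unsigned long num_pages)
{
	if (p->mem_type != mem_type || start < p->fpfn)
		return false;
	return p->lpfn == 0 || start + num_pages <= p->lpfn;
}

/* True if the allocation satisfies any preferred or busy placement. */
static bool demo_compat(const struct demo_placement *pl, uint32_t mem_type,
			unsigned long start, unsigned long num_pages)
{
	unsigned i;

	for (i = 0; i < pl->num_placement; i++)
		if (demo_place_fits(&pl->placement[i], mem_type, start, num_pages))
			return true;
	for (i = 0; i < pl->num_busy_placement; i++)
		if (demo_place_fits(&pl->busy_placement[i], mem_type, start, num_pages))
			return true;
	return false;
}

/*
 * Example request: prefer the first 256 pages of VRAM, and also accept
 * system memory when buffers would have to be evicted.
 */
bool demo_request_accepts(uint32_t mem_type, unsigned long start,
			  unsigned long num_pages)
{
	const struct demo_place vram = { 0, 256, DEMO_PL_VRAM, 0 };
	const struct demo_place busy[] = {
		{ 0, 256, DEMO_PL_VRAM, 0 },
		{ 0, 0, DEMO_PL_SYSTEM, 0 },
	};
	const struct demo_placement request = { 1, &vram, 2, busy };

	return demo_compat(&request, mem_type, start, num_pages);
}
```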

TTM resource object reference

struct ttm_resource_manager

Definition

struct ttm_resource_manager {
  bool use_type;
  bool use_tt;
  struct ttm_device *bdev;
  uint64_t size;
  const struct ttm_resource_manager_func *func;
  spinlock_t move_lock;
  struct dma_fence *move;
  struct list_head lru[TTM_MAX_BO_PRIORITY];
  uint64_t usage;
};

Members

use_type

The memory type is enabled.

use_tt

If a TT object should be used for the backing store.

bdev

ttm device this manager belongs to

size

Size of the managed region.

func

structure pointer implementing the range manager. See above

move_lock

lock for move fence

move

The fence of the last pipelined move operation.

lru

The lru list for this memory type.

usage

How much of the resources are used, protected by the bdev->lru_lock.

Description

This structure is used to identify and manage memory types for a device.

struct ttm_bus_placement

Definition

struct ttm_bus_placement {
  void *addr;
  phys_addr_t offset;
  bool is_iomem;
  enum ttm_caching        caching;
};

Members

addr

mapped virtual address

offset

physical addr

is_iomem

Whether this is I/O memory.

caching

See enum ttm_caching

Description

Structure indicating the bus placement of an object.

struct ttm_resource

Definition

struct ttm_resource {
  unsigned long start;
  unsigned long num_pages;
  uint32_t mem_type;
  uint32_t placement;
  struct ttm_bus_placement bus;
  struct ttm_buffer_object *bo;
  struct list_head lru;
};

Members

start

Start of the allocation.

num_pages

Actual size of resource in pages.

mem_type

Resource type of the allocation.

placement

Placement flags.

bus

Placement on io bus accessible to the CPU

bo

weak reference to the BO, protected by ttm_device::lru_lock

lru

Least recently used list, see ttm_resource_manager.lru

Description

Structure indicating the placement and space resources used by a buffer object.

struct ttm_resource_cursor

Definition

struct ttm_resource_cursor {
  unsigned int priority;
};

Members

priority

the current priority

Description

Cursor to iterate over the resources in a manager.

struct ttm_lru_bulk_move_pos

Definition

struct ttm_lru_bulk_move_pos {
  struct ttm_resource *first;
  struct ttm_resource *last;
};

Members

first

first res in the bulk move range

last

last res in the bulk move range

Description

Range of resources for a lru bulk move.

struct ttm_lru_bulk_move

Definition

struct ttm_lru_bulk_move {
  struct ttm_lru_bulk_move_pos pos[TTM_NUM_MEM_TYPES][TTM_MAX_BO_PRIORITY];
};

Members

pos

first/last lru entry for resources in each domain/priority

Description

Container for the current bulk move state. Should be used with ttm_lru_bulk_move_init() and ttm_bo_set_bulk_move().

struct ttm_kmap_iter_iomap

Specialization for a struct io_mapping + struct sg_table backed struct ttm_resource.

Definition

struct ttm_kmap_iter_iomap {
  struct ttm_kmap_iter base;
  struct io_mapping *iomap;
  struct sg_table *st;
  resource_size_t start;
  struct {
    struct scatterlist *sg;
    pgoff_t i;
    pgoff_t end;
    pgoff_t offs;
  } cache;
};

Members

base

Embedded struct ttm_kmap_iter providing the usage interface.

iomap

struct io_mapping representing the underlying linear io_memory.

st

sg_table into iomap, representing the memory of the struct ttm_resource.

start

Offset that needs to be subtracted from st to make sg_dma_address(st->sgl) - start == 0 for iomap start.

cache

Scatterlist traversal cache for fast lookups.

cache.sg

Pointer to the currently cached scatterlist segment.

cache.i

First index of sg. PAGE_SIZE granularity.

cache.end

Last index + 1 of sg. PAGE_SIZE granularity.

cache.offs

First offset into iomap of sg. PAGE_SIZE granularity.

struct ttm_kmap_iter_linear_io

Iterator specialization for linear io

Definition

struct ttm_kmap_iter_linear_io {
  struct ttm_kmap_iter base;
  struct iosys_map dmap;
  bool needs_unmap;
};

Members

base

The base iterator

dmap

Points to the starting address of the region

needs_unmap

Whether we need to unmap on fini

void ttm_resource_manager_set_used(struct ttm_resource_manager *man, bool used)

Parameters

struct ttm_resource_manager *man

A memory manager object.

bool used

usage state to set.

Description

Set the manager in use flag. If disabled the manager is no longer used for object placement.

bool ttm_resource_manager_used(struct ttm_resource_manager *man)

Parameters

struct ttm_resource_manager *man

Manager to get used state for

Description

Get the in use flag for a manager.

Return

true if used, false if not.

void ttm_resource_manager_cleanup(struct ttm_resource_manager *man)

Parameters

struct ttm_resource_manager *man

A memory manager object.

Description

Cleanup the move fences from the memory manager object.

ttm_resource_manager_for_each_res

ttm_resource_manager_for_each_res (man, cursor, res)

iterate over all resources

Parameters

man

the resource manager

cursor

struct ttm_resource_cursor for the current position

res

the current resource

Description

Iterate over all the evictable resources in a resource manager.
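The iteration pattern can be modelled in userspace as follows. This is only a sketch: the demo_ types use a simplified singly linked list per priority, and DEMO_MAX_BO_PRIORITY is a stand-in value, so none of it reflects the actual TTM implementation.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_BO_PRIORITY 4	/* stand-in for TTM_MAX_BO_PRIORITY */

struct demo_resource {
	unsigned long num_pages;
	struct demo_resource *next;	/* simplified singly linked LRU */
};

struct demo_manager {
	struct demo_resource *lru[DEMO_MAX_BO_PRIORITY];
};

struct demo_cursor {
	unsigned int priority;		/* mirrors struct ttm_resource_cursor */
};

/* First resource at or after the cursor's current priority, or NULL. */
static struct demo_resource *demo_first(struct demo_manager *man,
					struct demo_cursor *cursor)
{
	for (; cursor->priority < DEMO_MAX_BO_PRIORITY; cursor->priority++)
		if (man->lru[cursor->priority])
			return man->lru[cursor->priority];
	return NULL;
}

/* Next resource on the same list, else the first of a later priority. */
static struct demo_resource *demo_next(struct demo_manager *man,
				       struct demo_cursor *cursor,
				       struct demo_resource *res)
{
	if (res->next)
		return res->next;
	cursor->priority++;
	return demo_first(man, cursor);
}

#define demo_for_each_res(man, cursor, res)				\
	for ((cursor)->priority = 0, res = demo_first(man, cursor);	\
	     res; res = demo_next(man, cursor, res))

/* Two resources at priority 0 and one at priority 2; sum their pages. */
unsigned long demo_run(void)
{
	struct demo_resource c = { 7, NULL };
	struct demo_resource b = { 5, NULL };
	struct demo_resource a = { 3, &b };
	struct demo_manager man = { { &a, NULL, &c, NULL } };
	struct demo_cursor cursor;
	struct demo_resource *res;
	unsigned long total = 0;

	demo_for_each_res(&man, &cursor, res)
		total += res->num_pages;
	return total;
}
```

The cursor only has to remember the priority level, because the position within a level is carried by the resource pointer itself.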

void ttm_lru_bulk_move_init(struct ttm_lru_bulk_move *bulk)

initialize a bulk move structure

Parameters

struct ttm_lru_bulk_move *bulk

the structure to init

Description

For now just memset the structure to zero.

void ttm_lru_bulk_move_tail(struct ttm_lru_bulk_move *bulk)

bulk move range of resources to the LRU tail.

Parameters

struct ttm_lru_bulk_move *bulk

bulk move structure

Description

Bulk move BOs to the LRU tail, only valid to use when driver makes sure that resource order never changes. Should be called with ttm_device.lru_lock held.
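The reason the resource order must not change is that a bulk move only remembers the first and last entry of a range and splices everything in between in one step. A minimal userspace sketch of such a splice, using a hypothetical circular doubly linked list rather than the kernel's list_head, might look like this:

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *prev, *next;
	int id;
};

static void demo_list_init(struct demo_node *head)
{
	head->prev = head->next = head;
}

static void demo_add_tail(struct demo_node *head, struct demo_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Splice the contiguous range [first, last] to the tail in O(1). */
static void demo_bulk_move_tail(struct demo_node *head,
				struct demo_node *first,
				struct demo_node *last)
{
	/* Unlink the whole range from its current position. */
	first->prev->next = last->next;
	last->next->prev = first->prev;
	/* Relink it just before the head sentinel, i.e. at the tail. */
	first->prev = head->prev;
	last->next = head;
	head->prev->next = first;
	head->prev = last;
}

/* Build 1,2,3,4, move [2,3] to the tail, return the order as digits. */
int demo_run(void)
{
	struct demo_node head, n[4], *it;
	int i, order = 0;

	demo_list_init(&head);
	for (i = 0; i < 4; i++) {
		n[i].id = i + 1;
		demo_add_tail(&head, &n[i]);
	}
	demo_bulk_move_tail(&head, &n[1], &n[2]);
	for (it = head.next; it != &head; it = it->next)
		order = order * 10 + it->id;
	return order;
}
```

If an entry inside the range were moved independently, first and last would no longer delimit the intended set, which is why the caller has to guarantee a stable order.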

void ttm_resource_init(struct ttm_buffer_object *bo, const struct ttm_place *place, struct ttm_resource *res)

resource object constructor

Parameters

struct ttm_buffer_object *bo

buffer object this resource is allocated for

const struct ttm_place *place

placement of the resource

struct ttm_resource *res

the resource object to initialize

Description

Initialize a new resource object. Counterpart of ttm_resource_fini().

void ttm_resource_fini(struct ttm_resource_manager *man, struct ttm_resource *res)

resource destructor

Parameters

struct ttm_resource_manager *man

the resource manager this resource belongs to

struct ttm_resource *res

the resource to clean up

Description

Should be used by resource manager backends to clean up the TTM resource objects before freeing the underlying structure. Makes sure the resource is removed from the LRU before destruction. Counterpart of ttm_resource_init().

bool ttm_resource_compat(struct ttm_resource *res, struct ttm_placement *placement)

check if resource is compatible with placement

Parameters

struct ttm_resource *res

the resource to check

struct ttm_placement *placement

the placement to check against

Description

Returns true if the placement is compatible.

void ttm_resource_manager_init(struct ttm_resource_manager *man, struct ttm_device *bdev, uint64_t size)

Parameters

struct ttm_resource_manager *man

memory manager object to init

struct ttm_device *bdev

ttm device this manager belongs to

uint64_t size

size of managed resources in arbitrary units

Description

Initialise core parts of a manager object.

uint64_t ttm_resource_manager_usage(struct ttm_resource_manager *man)

Parameters

struct ttm_resource_manager *man

A memory manager object.

Description

Return how many resources are currently used.

void ttm_resource_manager_debug(struct ttm_resource_manager *man, struct drm_printer *p)

Parameters

struct ttm_resource_manager *man

manager type to dump.

struct drm_printer *p

printer to use for debug.

struct ttm_kmap_iter *ttm_kmap_iter_iomap_init(struct ttm_kmap_iter_iomap *iter_io, struct io_mapping *iomap, struct sg_table *st, resource_size_t start)

Initialize a struct ttm_kmap_iter_iomap

Parameters

struct ttm_kmap_iter_iomap *iter_io

The struct ttm_kmap_iter_iomap to initialize.

struct io_mapping *iomap

The struct io_mapping representing the underlying linear io_memory.

struct sg_table *st

sg_table into iomap, representing the memory of the struct ttm_resource.

resource_size_t start

Offset that needs to be subtracted from st to make sg_dma_address(st->sgl) - start == 0 for iomap start.

Return

Pointer to the embedded struct ttm_kmap_iter.

void ttm_resource_manager_create_debugfs(struct ttm_resource_manager *man, struct dentry *parent, const char *name)

Create debugfs entry for specified resource manager.

Parameters

struct ttm_resource_manager *man

The TTM resource manager for which the debugfs stats file will be created

struct dentry * parent

debugfs directory in which the file will reside

const char *name

The filename to create.

Description

This function sets up a debugfs file that can be used to look at debug statistics of the specified ttm_resource_manager.

TTM TT object reference

struct ttm_tt

This is a structure holding the pages, caching- and aperture binding status for a buffer object that isn’t backed by fixed (VRAM / AGP) memory.

Definition

struct ttm_tt {
  struct page **pages;
#define TTM_TT_FLAG_SWAPPED             (1 << 0)
#define TTM_TT_FLAG_ZERO_ALLOC          (1 << 1)
#define TTM_TT_FLAG_EXTERNAL            (1 << 2)
#define TTM_TT_FLAG_EXTERNAL_MAPPABLE   (1 << 3)
#define TTM_TT_FLAG_PRIV_POPULATED      (1U << 31)
  uint32_t page_flags;
  uint32_t num_pages;
  struct sg_table *sg;
  dma_addr_t *dma_address;
  struct file *swap_storage;
  enum ttm_caching caching;
};

Members

pages

Array of pages backing the data.

page_flags

The page flags.

Supported values:

TTM_TT_FLAG_SWAPPED: Set by TTM when the pages have been unpopulated and swapped out by TTM. Calling ttm_tt_populate() will then swap the pages back in, and unset the flag. Drivers should in general never need to touch this.

TTM_TT_FLAG_ZERO_ALLOC: Set if the pages will be zeroed on allocation.

TTM_TT_FLAG_EXTERNAL: Set if the underlying pages were allocated externally, like with dma-buf or userptr. This effectively disables TTM swapping out such pages. Also important is to prevent TTM from ever directly mapping these pages.

Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable this flag.

TTM_TT_FLAG_EXTERNAL_MAPPABLE: Same behaviour as TTM_TT_FLAG_EXTERNAL, but with the reduced restriction that it is still valid to use TTM to map the pages directly. This is useful when implementing a ttm_tt backend which still allocates driver owned pages underneath(say with shmem).

Note that since this also implies TTM_TT_FLAG_EXTERNAL, the usage here should always be:

page_flags = TTM_TT_FLAG_EXTERNAL | TTM_TT_FLAG_EXTERNAL_MAPPABLE;

TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is set by TTM after ttm_tt_populate() has successfully returned, and is then unset when TTM calls ttm_tt_unpopulate().

num_pages

Number of pages in the page array.

sg

for SG objects via dma-buf.

dma_address

The DMA (bus) addresses of the pages.

swap_storage

Pointer to shmem struct file for swap storage.

caching

The current caching state of the pages, see enum ttm_caching.
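The page_flags rules described for this structure can be summarised in a short userspace sketch. The flag values are copied from the definition above; the demo_ helpers are hypothetical and only restate the documented semantics, they are not TTM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values copied from the struct ttm_tt definition. */
#define TTM_TT_FLAG_SWAPPED           (1 << 0)
#define TTM_TT_FLAG_ZERO_ALLOC        (1 << 1)
#define TTM_TT_FLAG_EXTERNAL          (1 << 2)
#define TTM_TT_FLAG_EXTERNAL_MAPPABLE (1 << 3)
#define TTM_TT_FLAG_PRIV_POPULATED    (1U << 31)

/* Hypothetical helper: flags for a backend that owns shmem-style pages. */
uint32_t demo_external_mappable_flags(void)
{
	/* EXTERNAL_MAPPABLE is always combined with EXTERNAL. */
	return TTM_TT_FLAG_EXTERNAL | TTM_TT_FLAG_EXTERNAL_MAPPABLE;
}

/* Hypothetical check: may TTM swap these pages out? */
bool demo_can_swap(uint32_t page_flags)
{
	return !(page_flags & TTM_TT_FLAG_EXTERNAL);
}

/* Hypothetical check: may TTM map these pages directly? */
bool demo_can_map(uint32_t page_flags)
{
	if (!(page_flags & TTM_TT_FLAG_EXTERNAL))
		return true;
	return (page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE) != 0;
}
```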

struct ttm_kmap_iter_tt

Specialization of a mapping iterator for a tt.

Definition

struct ttm_kmap_iter_tt {
  struct ttm_kmap_iter base;
  struct ttm_tt *tt;
  pgprot_t prot;
};

Members

base

Embedded struct ttm_kmap_iter providing the usage interface

tt

Cached struct ttm_tt.

prot

Cached page protection for mapping.

int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)

Parameters

struct ttm_buffer_object *bo

pointer to a struct ttm_buffer_object

bool zero_alloc

true if allocated pages need to be zeroed

Description

Make sure we have a TTM structure allocated for the given BO. No pages are actually allocated.

int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo, uint32_t page_flags, enum ttm_caching caching, unsigned long extra_pages)

Parameters

struct ttm_tt *ttm

The struct ttm_tt.

struct ttm_buffer_object *bo

The buffer object we create the ttm for.

uint32_t page_flags

Page flags as identified by TTM_TT_FLAG_XX flags.

enum ttm_caching caching

the desired caching state of the pages

unsigned long extra_pages

Extra pages needed for the driver.

Description

Create a struct ttm_tt to back data with system memory pages. No pages are actually allocated.

Return

0 on success, -ENOMEM if out of memory.

void ttm_tt_fini(struct ttm_tt *ttm)

Parameters

struct ttm_tt *ttm

the ttm_tt structure.

Description

Free memory of ttm_tt structure

void ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)

Parameters

struct ttm_device *bdev

the ttm_device this object belongs to

struct ttm_tt *ttm

The struct ttm_tt.

Description

Unbind, unpopulate and destroy common struct ttm_tt.

int ttm_tt_swapin(struct ttm_tt *ttm)

Parameters

struct ttm_tt *ttm

The struct ttm_tt.

Description

Swap in a previously swapped-out ttm_tt.

int ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm, struct ttm_operation_ctx *ctx)

allocate pages for a ttm

Parameters

struct ttm_device *bdev

the ttm_device this object belongs to

struct ttm_tt *ttm

Pointer to the ttm_tt structure

struct ttm_operation_ctx *ctx

operation context for populating the tt object.

Description

Calls the driver method to allocate pages for a ttm

void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)

free pages from a ttm

Parameters

struct ttm_device *bdev

the ttm_device this object belongs to

struct ttm_tt *ttm

Pointer to the ttm_tt structure

Description

Calls the driver method to free all pages from a ttm

void ttm_tt_mark_for_clear(struct ttm_tt *ttm)

Mark pages for clearing on populate.

Parameters

struct ttm_tt *ttm

Pointer to the ttm_tt structure

Description

Marks pages for clearing so that the next time the page vector is populated, the pages will be cleared.
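The interplay between initialization, populate, unpopulate and mark-for-clear can be modelled in userspace as below. This is a sketch under assumptions: the demo_ functions use malloc()/calloc() instead of real page allocation, and it assumes that marking for clear works by setting the zero-alloc flag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE           4096
#define DEMO_FLAG_ZERO_ALLOC     (1u << 1)
#define DEMO_FLAG_PRIV_POPULATED (1u << 31)

struct demo_tt {
	unsigned char **pages;
	uint32_t page_flags;
	uint32_t num_pages;
};

/* Set up the page array only; no backing pages are allocated yet. */
static int demo_tt_init(struct demo_tt *tt, uint32_t num_pages)
{
	tt->pages = calloc(num_pages, sizeof(*tt->pages));
	tt->page_flags = 0;
	tt->num_pages = num_pages;
	return tt->pages ? 0 : -1;
}

/* Assumption: marking for clear sets the zero-alloc flag. */
static void demo_tt_mark_for_clear(struct demo_tt *tt)
{
	tt->page_flags |= DEMO_FLAG_ZERO_ALLOC;
}

static void demo_tt_unpopulate(struct demo_tt *tt)
{
	uint32_t i;

	for (i = 0; i < tt->num_pages; i++) {
		free(tt->pages[i]);	/* free(NULL) is a no-op */
		tt->pages[i] = NULL;
	}
	tt->page_flags &= ~DEMO_FLAG_PRIV_POPULATED;
}

static int demo_tt_populate(struct demo_tt *tt)
{
	uint32_t i;

	if (tt->page_flags & DEMO_FLAG_PRIV_POPULATED)
		return 0;	/* already populated */
	for (i = 0; i < tt->num_pages; i++) {
		tt->pages[i] = (tt->page_flags & DEMO_FLAG_ZERO_ALLOC) ?
			calloc(1, DEMO_PAGE_SIZE) : malloc(DEMO_PAGE_SIZE);
		if (!tt->pages[i]) {
			demo_tt_unpopulate(tt);	/* roll back partial work */
			return -1;
		}
	}
	tt->page_flags |= DEMO_FLAG_PRIV_POPULATED;
	return 0;
}

/* Returns 0 if every step behaved as described, non-zero otherwise. */
int demo_run(void)
{
	struct demo_tt tt;
	int ok;

	if (demo_tt_init(&tt, 2))
		return -1;
	ok = tt.pages[0] == NULL;	/* init allocated no pages */
	demo_tt_mark_for_clear(&tt);
	if (demo_tt_populate(&tt)) {
		free(tt.pages);
		return -1;
	}
	ok = ok && (tt.page_flags & DEMO_FLAG_PRIV_POPULATED);
	ok = ok && tt.pages[1][DEMO_PAGE_SIZE - 1] == 0;	/* zeroed */
	demo_tt_unpopulate(&tt);
	ok = ok && !(tt.page_flags & DEMO_FLAG_PRIV_POPULATED);
	free(tt.pages);
	return ok ? 0 : 1;
}
```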

struct ttm_tt *ttm_agp_tt_create(struct ttm_buffer_object *bo, struct agp_bridge_data *bridge, uint32_t page_flags)

Parameters

struct ttm_buffer_object *bo

Buffer object we allocate the ttm for.

struct agp_bridge_data *bridge

The agp bridge this device is sitting on.

uint32_t page_flags

Page flags as identified by TTM_TT_FLAG_XX flags.

Description

Create a TTM backend that uses the indicated AGP bridge as an aperture for TT memory. This function uses the linux agpgart interface to bind and unbind memory backing a ttm_tt.

struct ttm_kmap_iter *ttm_kmap_iter_tt_init(struct ttm_kmap_iter_tt *iter_tt, struct ttm_tt *tt)

Initialize a struct ttm_kmap_iter_tt

Parameters

struct ttm_kmap_iter_tt *iter_tt

The struct ttm_kmap_iter_tt to initialize.

struct ttm_tt *tt

Struct ttm_tt holding page pointers of the struct ttm_resource.

Return

Pointer to the embedded struct ttm_kmap_iter.

TTM page pool reference

struct ttm_pool_type

Pool for a certain memory type

Definition

struct ttm_pool_type {
  struct ttm_pool *pool;
  unsigned int order;
  enum ttm_caching caching;
  struct list_head shrinker_list;
  spinlock_t lock;
  struct list_head pages;
};

Members

pool

the pool we belong to, might be NULL for the global ones

order

the allocation order our pages have

caching

the caching type our pages have

shrinker_list

our place on the global shrinker list

lock

protection of the page list

pages

the list of pages in the pool

struct ttm_pool

Pool for all caching and orders

Definition

struct ttm_pool {
  struct device *dev;
  bool use_dma_alloc;
  bool use_dma32;
  struct {
    struct ttm_pool_type orders[MAX_ORDER];
  } caching[TTM_NUM_CACHING_TYPES];
};

Members

dev

the device we allocate pages for

use_dma_alloc

if coherent DMA allocations should be used

use_dma32

if GFP_DMA32 should be used

caching

pools for each caching/order

int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, struct ttm_operation_ctx *ctx)

Fill a ttm_tt object

Parameters

struct ttm_pool *pool

ttm_pool to use

struct ttm_tt *tt

ttm_tt object to fill

struct ttm_operation_ctx *ctx

operation context

Description

Fill the ttm_tt object with pages and also make sure to DMA map them when necessary.

Return

0 on success, negative error code otherwise.

void ttm_pool_free(struct ttm_pool *pool, struct ttm_tt *tt)

Free the backing pages from a ttm_tt object

Parameters

struct ttm_pool *pool

Pool to give pages back to.

struct ttm_tt *tt

ttm_tt object to unpopulate

Description

Give the backing pages back to a pool or free them.
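The benefit of handing pages back to a pool rather than freeing them can be seen in a small userspace model. Everything here is hypothetical: the demo_ pool keeps one free list per order only (the real pool is also split by caching type), and a "page" is just a small heap allocation.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NUM_ORDERS 4

struct demo_page {
	struct demo_page *next;
	unsigned int order;
};

struct demo_pool {
	struct demo_page *free_list[DEMO_NUM_ORDERS];
	int fresh;	/* allocations that could not be served by the pool */
};

/* Take a page of the given order from the pool, else allocate one. */
static struct demo_page *demo_pool_get(struct demo_pool *pool,
				       unsigned int order)
{
	struct demo_page *p;

	assert(order < DEMO_NUM_ORDERS);
	p = pool->free_list[order];
	if (p) {
		pool->free_list[order] = p->next;
		return p;
	}
	p = malloc(sizeof(*p));
	if (p) {
		p->next = NULL;
		p->order = order;
		pool->fresh++;
	}
	return p;
}

/* Give a page back to the free list matching its order. */
static void demo_pool_put(struct demo_pool *pool, struct demo_page *p)
{
	p->next = pool->free_list[p->order];
	pool->free_list[p->order] = p;
}

static void demo_pool_fini(struct demo_pool *pool)
{
	unsigned int order;

	for (order = 0; order < DEMO_NUM_ORDERS; order++) {
		while (pool->free_list[order]) {
			struct demo_page *p = pool->free_list[order];

			pool->free_list[order] = p->next;
			free(p);
		}
	}
}

/* Returns the number of fresh allocations, or -1 on unexpected behaviour. */
int demo_run(void)
{
	struct demo_pool pool = { { NULL }, 0 };
	struct demo_page *a, *b, *c;
	int fresh;

	a = demo_pool_get(&pool, 2);	/* miss: fresh allocation */
	if (!a)
		return -1;
	demo_pool_put(&pool, a);
	b = demo_pool_get(&pool, 2);	/* hit: the same page is reused */
	if (b != a)
		return -1;
	demo_pool_put(&pool, b);
	c = demo_pool_get(&pool, 0);	/* other order: miss again */
	if (!c) {
		demo_pool_fini(&pool);
		return -1;
	}
	demo_pool_put(&pool, c);
	fresh = pool.fresh;
	demo_pool_fini(&pool);
	return fresh;
}
```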

int ttm_pool_debugfs(struct ttm_pool *pool, struct seq_file *m)

Debugfs dump function for a pool

Parameters

struct ttm_pool *pool

the pool to dump the information for

struct seq_file *m

seq_file to dump to

Description

Make a debugfs dump with the per pool and global information.

The Graphics Execution Manager (GEM)

The GEM design approach has resulted in a memory manager that doesn’t provide full coverage of all (or even all common) use cases in its userspace or kernel API. GEM exposes a set of standard memory-related operations to userspace and a set of helper functions to drivers, and lets drivers implement hardware-specific operations with their own private API.

The GEM userspace API is described in the GEM - the Graphics Execution Manager article on LWN. While slightly outdated, the document provides a good overview of the GEM API principles. Buffer allocation and read and write operations, described as part of the common GEM API, are currently implemented using driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing what individual buffers contain. APIs that require knowledge of buffer contents or purpose, such as buffer allocation or synchronization primitives, are thus outside of the scope of GEM and must be implemented using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

  • Memory allocation and freeing

  • Command execution

  • Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely provided by Linux’s shmem layer, which provides memory to back each object.

Device-specific operations, such as command execution, pinning, buffer read & write, mapping, and domain ownership transfers are left to driver-specific ioctls.

GEM Initialization

Drivers that use GEM must set the DRIVER_GEM bit in the struct drm_driver driver_features field. The DRM core will then automatically initialize the GEM core before calling the load operation. Behind the scenes, this will create a DRM Memory Manager object which provides an address space pool for object allocation.

In a KMS configuration, drivers need to allocate and initialize a command ring buffer following core GEM initialization if required by the hardware. UMA devices usually have what is called a “stolen” memory region, which provides space for the initial framebuffer and large, contiguous memory regions required by the device. This space is typically not managed by GEM, and must be initialized separately into its own DRM MM object.

GEM Objects Creation

GEM splits creation of GEM objects and allocation of the memory that backs them in two distinct operations.

GEM objects are represented by an instance of struct drm_gem_object. Drivers usually need to extend GEM objects with private information and thus create a driver-specific GEM object structure type that embeds an instance of struct drm_gem_object.

To create a GEM object, a driver allocates memory for an instance of its specific GEM object type and initializes the embedded struct drm_gem_object with a call to drm_gem_object_init(). The function takes a pointer to the DRM device, a pointer to the GEM object and the buffer object size in bytes.

GEM uses shmem to allocate anonymous pageable memory. drm_gem_object_init() will create an shmfs file of the requested size and store it into the struct drm_gem_object filp field. The memory is used as either main storage for the object when the graphics hardware uses system memory directly or as a backing store otherwise.

Drivers are responsible for the actual physical pages allocation by calling shmem_read_mapping_page_gfp() for each page. Note that they can decide to allocate pages when initializing the GEM object, or to delay allocation until the memory is needed (for instance when a page fault occurs as a result of a userspace memory access or when the driver needs to start a DMA transfer involving the memory).

Anonymous pageable memory allocation is not always desired, for instance when the hardware requires physically contiguous system memory as is often the case in embedded devices. Drivers can create GEM objects with no shmfs backing (called private GEM objects) by initializing them with a call to drm_gem_private_object_init() instead of drm_gem_object_init(). Storage for private GEM objects must be managed by drivers.
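The embedding pattern described in this section can be sketched in userspace as follows. The demo_ types are hypothetical stand-ins (struct demo_gem_object models only a size field), and the comment marks the point where a real driver would call drm_gem_object_init() or drm_gem_private_object_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct drm_gem_object; only the size is modelled. */
struct demo_gem_object {
	size_t size;
};

/* Driver-specific object type embedding the base object. */
struct demo_driver_bo {
	struct demo_gem_object base;
	unsigned int tiling_mode;	/* hypothetical private state */
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the driver object from a pointer to its embedded base. */
static struct demo_driver_bo *to_demo_bo(struct demo_gem_object *obj)
{
	return demo_container_of(obj, struct demo_driver_bo, base);
}

/* Two-step creation: allocate the driver type, then init the base. */
static struct demo_gem_object *demo_bo_create(size_t size,
					      unsigned int tiling)
{
	struct demo_driver_bo *bo = calloc(1, sizeof(*bo));

	if (!bo)
		return NULL;
	bo->base.size = size;	/* a driver would init the GEM object here */
	bo->tiling_mode = tiling;
	return &bo->base;
}

/* Create an object, read private state via the base pointer, free it. */
unsigned int demo_run(void)
{
	struct demo_gem_object *obj = demo_bo_create(8192, 3);
	unsigned int tiling;

	assert(obj);
	tiling = to_demo_bo(obj)->tiling_mode;
	free(to_demo_bo(obj));
	return tiling;
}
```

Core code only ever sees the base pointer; the driver converts back to its own type whenever it needs its private fields.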

GEM Objects Lifetime

All GEM objects are reference-counted by the GEM core. References can be acquired and released by calling drm_gem_object_get() and drm_gem_object_put() respectively.

When the last reference to a GEM object is released the GEM core calls the struct drm_gem_object_funcs free operation. That operation is mandatory for GEM-enabled drivers and must free the GEM object and all associated resources.

void (*free) (struct drm_gem_object *obj);

Drivers are responsible for freeing all GEM object resources. This includes the resources created by the GEM core, which need to be released with drm_gem_object_release().
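A simplified userspace model of this lifetime rule is sketched below. The demo_ names are hypothetical and the counter is a plain integer, whereas kernel code would rely on an atomic reference count; the point is only that the free operation runs exactly once, when the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_gem_object;

struct demo_gem_object_funcs {
	void (*free)(struct demo_gem_object *obj);
};

struct demo_gem_object {
	unsigned int refcount;	/* not atomic: single-threaded model only */
	const struct demo_gem_object_funcs *funcs;
};

static int demo_freed;		/* counts calls to the free operation */

static void demo_get(struct demo_gem_object *obj)
{
	assert(obj->refcount > 0);
	obj->refcount++;
}

static void demo_put(struct demo_gem_object *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0)
		obj->funcs->free(obj);	/* last reference dropped */
}

static void demo_free(struct demo_gem_object *obj)
{
	/* A driver would release the GEM core resources before this. */
	demo_freed++;
	free(obj);
}

static const struct demo_gem_object_funcs demo_funcs = { .free = demo_free };

/* Returns how often the free operation ran, or -1 if it ran too early. */
int demo_run(void)
{
	struct demo_gem_object *obj = malloc(sizeof(*obj));

	if (!obj)
		return -1;
	obj->refcount = 1;	/* initial reference from creation */
	obj->funcs = &demo_funcs;
	demo_freed = 0;

	demo_get(obj);
	demo_put(obj);
	if (demo_freed != 0)
		return -1;	/* must not be freed while referenced */
	demo_put(obj);		/* drops the initial reference */
	return demo_freed;
}
```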

GEM Objects Naming

Communication between userspace and the kernel refers to GEM objects using local handles, global names or, more recently, file descriptors. All of those are 32-bit integer values; the usual Linux kernel limits apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM object through a driver-specific ioctl, and can use that handle to refer to the GEM object in other standard or driver-specific ioctls. Closing a DRM file handle frees all its GEM handles and dereferences the associated GEM objects.

To create a handle for a GEM object drivers call drm_gem_handle_create(). The function takes a pointer to the DRM file and the GEM object and returns a locally unique handle. When the handle is no longer needed drivers delete it with a call to drm_gem_handle_delete(). Finally the GEM object associated with a handle can be retrieved by a call to drm_gem_object_lookup().

Handles don’t take ownership of GEM objects, they only take a reference to the object that will be dropped when the handle is destroyed. To avoid leaking GEM objects, drivers must make sure they drop the reference(s) they own (such as the initial reference taken at object creation time) as appropriate, without any special consideration for the handle. For example, in the particular case of combined GEM object and handle creation in the implementation of the dumb_create operation, drivers must drop the initial reference to the GEM object before returning the handle.
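The reference handling in a dumb_create-style flow can be modelled in userspace like this. The demo_ handle table is hypothetical (a fixed array with slot 0 left unused), and the object lives on the stack so that its reference count can still be inspected after it reaches zero; a real object would be freed at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_HANDLES 8

struct demo_obj {
	unsigned int refcount;
};

struct demo_file {
	struct demo_obj *handles[DEMO_MAX_HANDLES];	/* slot 0 unused */
};

/* Model of handle creation: the handle takes its own reference. */
static int demo_handle_create(struct demo_file *file, struct demo_obj *obj,
			      uint32_t *handlep)
{
	uint32_t h;

	for (h = 1; h < DEMO_MAX_HANDLES; h++) {
		if (!file->handles[h]) {
			file->handles[h] = obj;
			obj->refcount++;
			*handlep = h;
			return 0;
		}
	}
	return -1;
}

/* Model of handle deletion: drops the reference held by the handle. */
static int demo_handle_delete(struct demo_file *file, uint32_t handle)
{
	if (handle == 0 || handle >= DEMO_MAX_HANDLES || !file->handles[handle])
		return -1;
	file->handles[handle]->refcount--;
	file->handles[handle] = NULL;
	return 0;
}

/*
 * Create an object and a handle, then drop the initial reference.
 * Returns the remaining reference count, optionally after the handle
 * has been deleted again.
 */
unsigned int demo_run(int delete_handle)
{
	struct demo_file file = { { NULL } };
	struct demo_obj obj = { 1 };	/* initial reference from creation */
	uint32_t handle = 0;

	if (demo_handle_create(&file, &obj, &handle))
		return 99;		/* table full: not expected here */
	obj.refcount--;			/* the handle now keeps obj alive */
	if (delete_handle && demo_handle_delete(&file, handle))
		return 99;
	return obj.refcount;
}
```

Forgetting the decrement after handle creation would leave the count at one after the handle is deleted, which is exactly the leak the paragraph above warns about.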

GEM names are similar in purpose to handles but are not local to DRM files. They can be passed between processes to reference a GEM object globally. Names can’t be used directly to refer to objects in the DRM API; applications must convert handles to names and names to handles using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls respectively. The conversion is handled by the DRM core without any driver-specific support.

GEM also supports buffer sharing with dma-buf file descriptors through PRIME. GEM-based drivers must use the provided helper functions to implement the exporting and importing correctly. See ?. Since sharing file descriptors is inherently more secure than the easily guessable and global GEM names, it is the preferred buffer sharing mechanism. Sharing buffers through GEM names is only supported for legacy userspace. Furthermore, PRIME also allows cross-device buffer sharing since it is based on dma-bufs.

GEM Objects Mapping

Because mapping operations are fairly heavyweight GEM favours read/write-like access to buffers, implemented through driver-specific ioctls, over mapping buffers to userspace. However, when random access to the buffer is needed (to perform software rendering for instance), direct access to the object can be more efficient.

The mmap system call can’t be used directly to map GEM objects, as they don’t have their own file handle. Two alternative methods currently co-exist to map GEM objects to userspace. The first method uses a driver-specific ioctl to perform the mapping operation, calling do_mmap() under the hood. This is often considered dubious, seems to be discouraged for new GEM-enabled drivers, and will thus not be described here.

The second method uses the mmap system call on the DRM file handle.

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

DRM identifies the GEM object to be mapped by a fake offset passed through the mmap offset argument. Prior to being mapped, a GEM object must thus be associated with a fake offset. To do so, drivers must call drm_gem_create_mmap_offset() on the object.

Once allocated, the fake offset value must be passed to the application in a driver-specific way and can then be used as the mmap offset argument.
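The idea of a fake offset can be illustrated with a userspace model: each object reserves a unique range of page offsets, and the offset seen at mmap time is used purely as a lookup key. All demo_ names, the table layout and the DEMO_OFFSET_START value are hypothetical and do not describe the real offset manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT   12
#define DEMO_OFFSET_START 0x100000UL	/* hypothetical first fake page offset */
#define DEMO_MAX_OBJS     4

struct demo_obj {
	size_t size;		/* bytes, assumed page aligned */
	unsigned long pgoff;	/* fake offset in pages, 0 if none yet */
};

struct demo_offset_mgr {
	struct demo_obj *objs[DEMO_MAX_OBJS];
	unsigned int count;
	unsigned long next_pgoff;
};

/* Reserve a unique range of fake page offsets for the object. */
static int demo_create_mmap_offset(struct demo_offset_mgr *mgr,
				   struct demo_obj *obj)
{
	if (obj->pgoff)
		return 0;	/* already has an offset */
	if (mgr->count == DEMO_MAX_OBJS)
		return -1;
	obj->pgoff = mgr->next_pgoff;
	mgr->next_pgoff += obj->size >> DEMO_PAGE_SHIFT;
	mgr->objs[mgr->count++] = obj;
	return 0;
}

/* Byte offset that userspace would pass as the mmap offset argument. */
static uint64_t demo_mmap_offset(const struct demo_obj *obj)
{
	return (uint64_t)obj->pgoff << DEMO_PAGE_SHIFT;
}

/* Lookup performed at mmap time: the offset only identifies the object. */
static struct demo_obj *demo_lookup(struct demo_offset_mgr *mgr,
				    uint64_t offset)
{
	unsigned long pgoff = (unsigned long)(offset >> DEMO_PAGE_SHIFT);
	unsigned int i;

	for (i = 0; i < mgr->count; i++)
		if (mgr->objs[i]->pgoff == pgoff)
			return mgr->objs[i];
	return NULL;
}

/* Returns 0 if offsets are unique and resolve to the right object. */
int demo_run(void)
{
	struct demo_offset_mgr mgr = { { NULL }, 0, DEMO_OFFSET_START };
	struct demo_obj a = { 2 << DEMO_PAGE_SHIFT, 0 };
	struct demo_obj b = { 1 << DEMO_PAGE_SHIFT, 0 };

	if (demo_create_mmap_offset(&mgr, &a) ||
	    demo_create_mmap_offset(&mgr, &b))
		return -1;
	if (demo_mmap_offset(&b) - demo_mmap_offset(&a) != a.size)
		return -2;	/* ranges must not overlap */
	if (demo_lookup(&mgr, demo_mmap_offset(&b)) != &b)
		return -3;
	if (demo_lookup(&mgr, 0) != NULL)
		return -4;	/* offset 0 maps to no object */
	return 0;
}
```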

The GEM core provides a helper method drm_gem_mmap() to handle object mapping. The method can be set directly as the mmap file operation handler. It will look up the GEM object based on the offset value and set the VMA operations to the struct drm_driver gem_vm_ops field. Note that drm_gem_mmap() doesn’t map memory to userspace, but relies on the driver-provided fault handler to map pages individually.

To use drm_gem_mmap(), drivers must fill the struct drm_driver gem_vm_ops field with a pointer to VM operations.

The VM operations is a struct vm_operations_struct made up of several fields, the more interesting ones being:

struct vm_operations_struct {
        void (*open)(struct vm_area_struct * area);
        void (*close)(struct vm_area_struct * area);
        vm_fault_t (*fault)(struct vm_fault *vmf);
};

The open and close operations must update the GEM object reference count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper functions directly as open and close handlers.

The fault operation handler is responsible for mapping individual pages to userspace when a page fault occurs. Depending on the memory allocation scheme, drivers can allocate pages at fault time, or can decide to allocate memory for the GEM object at the time the object is created.

Drivers that want to map the GEM object upfront instead of handling page faults can implement their own mmap file operation handler.

For platforms without MMU the GEM core provides a helper method drm_gem_dma_get_unmapped_area(). The mmap() routines will call this to get a proposed address for the mapping.

To use drm_gem_dma_get_unmapped_area(), drivers must fill the struct file_operations get_unmapped_area field with a pointer to drm_gem_dma_get_unmapped_area().

More detailed information about get_unmapped_area can be found in No-MMU memory mapping support.

Memory Coherency

When mapped to the device or used in a command buffer, backing pages for an object are flushed to memory and marked write combined so as to be coherent with the GPU. Likewise, if the CPU accesses an object after the GPU has finished rendering to the object, then the object must be made coherent with the CPU’s view of memory, usually involving GPU cache flushing of various kinds. This core CPU<->GPU coherency management is provided by a device-specific ioctl, which evaluates an object’s current domain and performs any necessary flushing or synchronization to put the object into the desired coherency domain (note that the object may be busy, i.e. an active render target; in that case, setting the domain blocks the client and waits for rendering to complete before performing any necessary flushing operations).

Command Execution

Perhaps the most important GEM function for GPU devices is providing a command execution interface to clients. Client programs construct command buffers containing references to previously allocated memory objects, and then submit them to GEM. At that point, GEM takes care to bind all the objects into the GTT, execute the buffer, and provide necessary synchronization between clients accessing the same buffers. This often involves evicting some objects from the GTT and re-binding others (a fairly expensive operation), and providing relocation support which hides fixed GTT offsets from clients. Clients must take care not to submit command buffers that reference more objects than can fit in the GTT; otherwise, GEM will reject them and no rendering will occur. Similarly, if several objects in the buffer require fence registers to be allocated for correct rendering (e.g. 2D blits on pre-965 chips), care must be taken not to require more fence registers than are available to the client. Such resource management should be abstracted from the client in libdrm.

GEM Function Reference

struct drm_gem_object_funcs

GEM object functions

Definition

struct drm_gem_object_funcs {
  void (*free)(struct drm_gem_object *obj);
  int (*open)(struct drm_gem_object *obj, struct drm_file *file);
  void (*close)(struct drm_gem_object *obj, struct drm_file *file);
  void (*print_info)(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj);
  struct dma_buf *(*export)(struct drm_gem_object *obj, int flags);
  int (*pin)(struct drm_gem_object *obj);
  void (*unpin)(struct drm_gem_object *obj);
  struct sg_table *(*get_sg_table)(struct drm_gem_object *obj);
  int (*vmap)(struct drm_gem_object *obj, struct iosys_map *map);
  void (*vunmap)(struct drm_gem_object *obj, struct iosys_map *map);
  int (*mmap)(struct drm_gem_object *obj, struct vm_area_struct *vma);
  const struct vm_operations_struct *vm_ops;
};

Members

free

Destructor for drm_gem_objects.

This callback is mandatory.

open

Called upon GEM handle creation.

This callback is optional.

close

Called upon GEM handle release.

This callback is optional.

print_info

If driver subclasses struct drm_gem_object, it can implement this optional hook for printing additional driver specific info.

drm_printf_indent() should be used in the callback passing it the indent argument.

This callback is called from drm_gem_print_info().

This callback is optional.

export

Export backing buffer as a dma_buf. If this is not set drm_gem_prime_export() is used.

This callback is optional.

pin

Pin backing buffer in memory. Used by the drm_gem_map_attach() helper.

This callback is optional.

unpin

Unpin backing buffer. Used by the drm_gem_map_detach() helper.

This callback is optional.

get_sg_table

Returns a Scatter-Gather table representation of the buffer. Used when exporting a buffer by the drm_gem_map_dma_buf() helper. Releasing is done by calling dma_unmap_sg_attrs() and sg_free_table() in drm_gem_unmap_buf(), therefore these helpers and this callback here cannot be used for sg tables pointing at driver private memory ranges.

See also drm_prime_pages_to_sg().

vmap

Returns a virtual address for the buffer. Used by the drm_gem_dmabuf_vmap() helper.

This callback is optional.

vunmap

Releases the address previously returned by vmap. Used by the drm_gem_dmabuf_vunmap() helper.

This callback is optional.

mmap

Handle mmap() of the gem object, setup vma accordingly.

This callback is optional.

The callback is used by both drm_gem_mmap_obj() and drm_gem_prime_mmap(). When mmap is present vm_ops is not used, the mmap callback must set vma->vm_ops instead.

vm_ops

Virtual memory operations used with mmap.

This is optional but necessary for mmap support.

struct drm_gem_lru

A simple LRU helper

Definition

struct drm_gem_lru {
  struct mutex *lock;
  long count;
  struct list_head list;
};

Members

lock

Lock protecting movement of GEM objects between LRUs. All LRUs that the object can move between should be protected by the same lock.

count

The total number of backing pages of the GEM objects in this LRU.

list

The LRU list.

Description

A helper for tracking GEM objects in a given state, to aid in driver’s shrinker implementation. Tracks the count of pages for lockless shrinker.count_objects, and provides drm_gem_lru_scan for driver’s shrinker.scan_objects implementation.

struct drm_gem_object

GEM buffer object

Definition

struct drm_gem_object {
  struct kref refcount;
  unsigned handle_count;
  struct drm_device *dev;
  struct file *filp;
  struct drm_vma_offset_node vma_node;
  size_t size;
  int name;
  struct dma_buf *dma_buf;
  struct dma_buf_attachment *import_attach;
  struct dma_resv *resv;
  struct dma_resv _resv;
  const struct drm_gem_object_funcs *funcs;
  struct list_head lru_node;
  struct drm_gem_lru *lru;
};

Members

refcount

Reference count of this object

Please use drm_gem_object_get() to acquire and drm_gem_object_put_locked() or drm_gem_object_put() to release a reference to a GEM buffer object.

handle_count

This is the GEM file_priv handle count of this object.

Each handle also holds a reference. Note that when the handle_count drops to 0 any global names (e.g. the id in the flink namespace) will be cleared.

Protected by drm_device.object_name_lock.

dev

DRM dev this object belongs to.

filp

SHMEM file node used as backing storage for swappable buffer objects. GEM also supports driver private objects with driver-specific backing storage (contiguous DMA memory, special reserved blocks). In this case filp is NULL.

vma_node

Mapping info for this object to support mmap. Drivers are supposed to allocate the mmap offset using drm_gem_create_mmap_offset(). The offset itself can be retrieved using drm_vma_node_offset_addr().

Memory mapping itself is handled by drm_gem_mmap(), which also checks that userspace is allowed to access the object.

size

Size of the object, in bytes. Immutable over the object’s lifetime.

name

Global name for this object, starts at 1. 0 means unnamed. Access is covered by drm_device.object_name_lock. This is used by the GEM_FLINK and GEM_OPEN ioctls.

dma_buf

dma-buf associated with this GEM object.

Pointer to the dma-buf associated with this gem object (either through importing or exporting). We break the resulting reference loop when the last gem handle for this object is released.

Protected by drm_device.object_name_lock.

import_attach

dma-buf attachment backing this object.

Any foreign dma_buf imported as a gem object has this set to the attachment point for the device. This is invariant over the lifetime of a gem object.

The drm_gem_object_funcs.free callback is responsible for cleaning up the dma_buf attachment and references acquired at import time.

Note that the drm gem/prime core does not depend upon drivers setting this field any more. So for drivers where this doesn’t make sense (e.g. virtual devices or a displaylink behind a USB bus) they can simply leave it as NULL.

resv

Pointer to the reservation object associated with this GEM object.

Normally (resv == &_resv) except for imported GEM objects.

_resv

A reservation object for this GEM object.

This is unused for imported GEM objects.

funcs

Optional GEM object functions. If this is set, it will be used instead of the corresponding drm_driver GEM callbacks.

New drivers should use this.

lru_node

List node in a drm_gem_lru.

lru

The current LRU list that the GEM object is on.

Description

This structure defines the generic parts for GEM buffer objects, which are mostly around handling mmap and userspace handles.

Buffer objects are often abbreviated to BO.

DRM_GEM_FOPS

DRM_GEM_FOPS ()

Default drm GEM file operations

Parameters

Description

This macro provides a shorthand for setting the GEM file ops in the file_operations structure. If all you need are the default ops, use DEFINE_DRM_GEM_FOPS instead.

DEFINE_DRM_GEM_FOPS

DEFINE_DRM_GEM_FOPS (name)

macro to generate file operations for GEM drivers

Parameters

name

name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for GEM based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.

void drm_gem_object_get(struct drm_gem_object *obj)

acquire a GEM buffer object reference

Parameters

struct drm_gem_object *obj

GEM buffer object

Description

This function acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.

void drm_gem_object_put(struct drm_gem_object *obj)

drop a GEM buffer object reference

Parameters

struct drm_gem_object *obj

GEM buffer object

Description

This releases a reference to obj.
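The get/put contract can be illustrated with a small userspace model. This is only a sketch: struct toy_gem_object, its helpers and the toy_objects_freed counter are hypothetical stand-ins, and a plain integer replaces the kernel's kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct drm_gem_object; not the kernel structure. */
struct toy_gem_object {
	unsigned int refcount;                    /* models the kref */
	void (*free)(struct toy_gem_object *obj); /* models funcs->free */
};

int toy_objects_freed; /* counts destructor calls, for illustration */

/* Destructor: runs once, when the last reference is dropped. */
void toy_gem_object_free(struct toy_gem_object *obj)
{
	toy_objects_freed++;
	free(obj);
}

/* Allocate an object; the caller owns the initial reference. */
struct toy_gem_object *toy_gem_object_create(void)
{
	struct toy_gem_object *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;
	obj->free = toy_gem_object_free;
	return obj;
}

/* Acquire an additional reference; the caller must already hold one. */
void toy_gem_object_get(struct toy_gem_object *obj)
{
	assert(obj->refcount > 0);
	obj->refcount++;
}

/* Drop a reference; the last put invokes the free callback. */
void toy_gem_object_put(struct toy_gem_object *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0)
		obj->free(obj);
}
```

Every toy_gem_object_get() must be balanced by exactly one toy_gem_object_put(), and the object must not be touched after the caller's last put.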

int drm_gem_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated shmem-backed GEM object

Parameters

struct drm_device *dev

drm_device the object should be initialized for

struct drm_gem_object *obj

drm_gem_object to initialize

size_t size

object size

Description

Initialize an already allocated GEM object of the specified size with shmfs backing store.

void drm_gem_private_object_init(struct drm_device *dev, struct drm_gem_object *obj, size_t size)

initialize an allocated private GEM object

Parameters

struct drm_device *dev

drm_device the object should be initialized for

struct drm_gem_object *obj

drm_gem_object to initialize

size_t size

object size

Description

Initialize an already allocated GEM object of the specified size with no GEM provided backing store. Instead the caller is responsible for backing the object and handling it.

int drm_gem_handle_delete(struct drm_file *filp, u32 handle)

deletes the given file-private handle

Parameters

struct drm_file *filp

drm file-private structure to use for the handle look up

u32 handle

userspace handle to delete

Description

Removes the GEM handle from the filp lookup table which has been added with drm_gem_handle_create(). If this is the last handle also cleans up linked resources like GEM names.

int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset)

return the fake mmap offset for a gem object

Parameters

struct drm_file *file

drm file-private structure containing the gem object

struct drm_device *dev

corresponding drm_device

u32 handle

gem object handle

u64 *offset

return location for the fake mmap offset

Description

This implements the drm_driver.dumb_map_offset kms driver callback for drivers which use gem to manage their backing storage.

Return

0 on success or a negative error code on failure.

int drm_gem_handle_create(struct drm_file *file_priv, struct drm_gem_object *obj, u32 *handlep)

create a gem handle for an object

Parameters

struct drm_file *file_priv

drm file-private structure to register the handle for

struct drm_gem_object *obj

object to register

u32 *handlep

pointer to return the created handle to the caller

Description

Create a handle for this object. This adds a handle reference to the object, which includes a regular reference count. Callers will likely want to drop their own reference to the object afterwards (e.g. with drm_gem_object_put()).

Since this publishes obj to userspace it must be fully set up by this point, drivers must call this last in their buffer object creation callbacks.

void drm_gem_free_mmap_offset(struct drm_gem_object *obj)

release a fake mmap offset for an object

Parameters

struct drm_gem_object *obj

obj in question

Description

This routine frees fake offsets allocated by drm_gem_create_mmap_offset().

Note that drm_gem_object_release() already calls this function, so drivers don’t have to take care of releasing the mmap offset themselves when freeing the GEM object.

int drm_gem_create_mmap_offset_size(struct drm_gem_object *obj, size_t size)

create a fake mmap offset for an object

Parameters

struct drm_gem_object *obj

obj in question

size_t size

the virtual size

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj, in cases where the virtual size differs from the physical size (ie. drm_gem_object.size). Otherwise just use drm_gem_create_mmap_offset().

This function is idempotent and handles an already allocated mmap offset transparently. Drivers do not need to check for this case.
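The fake-offset scheme and its idempotent allocation can be modelled in userspace as below. This is a simplified sketch under stated assumptions: all toy_* names, the fixed-size table, the bump allocator and the 4096-byte page size are hypothetical, and the kernel uses its VMA offset manager rather than anything like this.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE   4096u
#define TOY_MAX_OBJECTS 16
#define TOY_FIRST_PAGE  0x100000u /* arbitrary start of the fake range */

/* Toy stand-in for a GEM object; only the mmap offset is modelled. */
struct toy_object {
	uint64_t fake_offset; /* byte offset to pass to mmap; 0 = none */
};

struct toy_offset_manager {
	struct toy_object *objs[TOY_MAX_OBJECTS];
	size_t count;
	uint64_t next_page; /* next unused page index; 0 = not started */
};

/* Reserve a fake offset covering size bytes; repeat calls are no-ops. */
int toy_create_mmap_offset_size(struct toy_offset_manager *mgr,
				struct toy_object *obj, size_t size)
{
	uint64_t pages = ((uint64_t)size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

	if (obj->fake_offset)
		return 0; /* already allocated: nothing to do */
	if (pages == 0 || mgr->count == TOY_MAX_OBJECTS)
		return -1;
	if (mgr->next_page == 0)
		mgr->next_page = TOY_FIRST_PAGE;

	obj->fake_offset = mgr->next_page * TOY_PAGE_SIZE;
	mgr->next_page += pages;
	mgr->objs[mgr->count++] = obj;
	return 0;
}

/* Find the object whose fake range starts at the given byte offset. */
struct toy_object *toy_lookup(const struct toy_offset_manager *mgr,
			      uint64_t offset)
{
	for (size_t i = 0; i < mgr->count; i++)
		if (mgr->objs[i]->fake_offset == offset)
			return mgr->objs[i];
	return NULL;
}
```

The offset carries no meaning beyond identifying the object: it is just a unique, page-aligned key that userspace echoes back in the mmap offset argument.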

int drm_gem_create_mmap_offset(struct drm_gem_object *obj)

create a fake mmap offset for an object

Parameters

struct drm_gem_object *obj

obj in question

Description

GEM memory mapping works by handing back to userspace a fake mmap offset it can use in a subsequent mmap(2) call. The DRM core code then looks up the object based on the offset and sets up the various memory mapping structures.

This routine allocates and attaches a fake offset for obj.

Drivers can call drm_gem_free_mmap_offset() before freeing obj to release the fake offset again.

struct page **drm_gem_get_pages(struct drm_gem_object *obj)

helper to allocate backing pages for a GEM object from shmem

Parameters

struct drm_gem_object *obj

obj in question

Description

This reads the page-array of the shmem-backing storage of the given gem object. An array of pages is returned. If a page is not allocated or swapped-out, this will allocate/swap-in the required pages. Note that the whole object is covered by the page-array and pinned in memory.

Use drm_gem_put_pages() to release the array and unpin all pages.

This uses the GFP-mask set on the shmem-mapping (see mapping_set_gfp_mask()). If you require other GFP-masks, you have to do those allocations yourself.

Note that you are not allowed to change gfp-zones during runtime. That is, shmem_read_mapping_page_gfp() must be called with the same gfp_zone(gfp) as set during initialization. If you have special zone constraints, set them after drm_gem_object_init() via mapping_set_gfp_mask(). shmem-core takes care to keep pages in the required zone during swap-in.

This function is only valid on objects initialized with drm_gem_object_init(), but not for those initialized with drm_gem_private_object_init() only.

void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages, bool dirty, bool accessed)

helper to free backing pages for a GEM object

Parameters

struct drm_gem_object *obj

obj in question

struct page **pages

pages to free

bool dirty

if true, pages will be marked as dirty

bool accessed

if true, the pages will be marked as accessed

int drm_gem_objects_lookup(struct drm_file *filp, void __user *bo_handles, int count, struct drm_gem_object ***objs_out)

look up GEM objects from an array of handles

Parameters

struct drm_file *filp

DRM file private data

void __user *bo_handles

user pointer to array of userspace handles

int count

size of handle array

struct drm_gem_object ***objs_out

returned pointer to array of drm_gem_object pointers

Description

Takes an array of userspace handles and returns a newly allocated array of GEM objects.

For a single handle lookup, use drm_gem_object_lookup().

Return

objs_out is filled in with GEM object pointers. Returned GEM objects need to be released with drm_gem_object_put(). Returns 0 on success, or -ENOENT on a lookup failure.

struct drm_gem_object *drm_gem_object_lookup(struct drm_file *filp, u32 handle)

look up a GEM object from its handle

Parameters

struct drm_file *filp

DRM file private data

u32 handle

userspace handle

Description

If looking up an array of handles, use drm_gem_objects_lookup().

Return

A reference to the object named by the handle if such exists on filp, NULL otherwise.

long drm_gem_dma_resv_wait(struct drm_file *filep, u32 handle, bool wait_all, unsigned long timeout)

Wait on the shared and/or exclusive fences of a GEM object’s reservation object.

Parameters

struct drm_file *filep

DRM file private data

u32 handle

userspace handle

bool wait_all

if true, wait on all fences, else wait on just exclusive fence

unsigned long timeout

timeout value in jiffies or zero to return immediately

Return

-ERESTARTSYS if interrupted, 0 if the wait timed out, or greater than 0 on success.

void drm_gem_object_release(struct drm_gem_object *obj)

release GEM buffer object resources

Parameters

struct drm_gem_object *obj

GEM buffer object

Description

This releases any structures and resources used by obj and is the inverse of drm_gem_object_init().

void drm_gem_object_free(struct kref *kref)

free a GEM object

Parameters

struct kref *kref

kref of the object to free

Description

Called after the last reference to the object has been lost.

Frees the object

void drm_gem_vm_open(struct vm_area_struct *vma)

vma->ops->open implementation for GEM

Parameters

struct vm_area_struct *vma

VM area structure

Description

This function implements the struct vm_operations_struct open() callback for GEM drivers. This must be used together with drm_gem_vm_close().

void drm_gem_vm_close(struct vm_area_struct *vma)

vma->ops->close implementation for GEM

Parameters

struct vm_area_struct *vma

VM area structure

Description

This function implements the struct vm_operations_struct close() callback for GEM drivers. This must be used together with drm_gem_vm_open().

int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size, struct vm_area_struct *vma)

memory map a GEM object

Parameters

struct drm_gem_object *obj

the GEM object to map

unsigned long obj_size

the object size to be mapped, in bytes

struct vm_area_struct *vma

VMA for the area to be mapped

Description

Set up the VMA to prepare mapping of the GEM object using the GEM object’s vm_ops. Depending on their requirements, GEM objects can either provide a fault handler in their vm_ops (in which case any accesses to the object will be trapped, to perform migration, GTT binding, surface register allocation, or performance monitoring), or mmap the buffer memory synchronously after calling drm_gem_mmap_obj.

This function is mainly intended to implement the DMABUF mmap operation, when the GEM object is not looked up based on its fake offset. To implement the DRM mmap operation, drivers should use the drm_gem_mmap() function.

drm_gem_mmap_obj() assumes the user is granted access to the buffer while drm_gem_mmap() prevents unprivileged users from mapping random objects. So callers must verify access restrictions before calling this helper.

Returns 0 on success or -EINVAL if the object size is smaller than the VMA size, or if no vm_ops are provided.

int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)

memory map routine for GEM objects

Parameters

struct file *filp

DRM file pointer

struct vm_area_struct *vma

VMA for the area to be mapped

Description

If a driver supports GEM object mapping, mmap calls on the DRM file descriptor will end up here.

Look up the GEM object based on the offset passed in (vma->vm_pgoff will contain the fake offset we created when the GTT map ioctl was called on the object) and map it with a call to drm_gem_mmap_obj().

If the caller is not granted access to the buffer object, the mmap will fail with EACCES. Please see the vma manager for more information.

int drm_gem_lock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx)

Sets up the ww context and acquires the lock on an array of GEM objects.

Parameters

struct drm_gem_object **objs

drm_gem_objects to lock

int count

Number of objects in objs

struct ww_acquire_ctx *acquire_ctx

struct ww_acquire_ctx that will be initialized as part of tracking this set of locked reservations.

Description

Once you’ve locked your reservations, you’ll want to set up space for your shared fences (if applicable), submit your job, then drm_gem_unlock_reservations().

void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock)

initialize an LRU

Parameters

struct drm_gem_lru *lru

The LRU to initialize

struct mutex *lock

The lock protecting the LRU

void drm_gem_lru_remove(struct drm_gem_object *obj)

remove object from whatever LRU it is in

Parameters

struct drm_gem_object *obj

The GEM object to remove from current LRU

Description

If the object is currently in any LRU, remove it.

void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj)

move the object to the tail of the LRU

Parameters

struct drm_gem_lru *lru

The LRU to move the object into.

struct drm_gem_object *obj

The GEM object to move into this LRU

Description

If the object is already in this LRU it will be moved to the tail. Otherwise it will be removed from whichever other LRU it is in (if any) and moved into this LRU.

unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned int nr_to_scan, unsigned long *remaining, bool (*shrink)(struct drm_gem_object *obj))

helper to implement shrinker.scan_objects

Parameters

struct drm_gem_lru *lru

The LRU to scan

unsigned int nr_to_scan

The number of pages to try to reclaim

unsigned long *remaining

The number of pages left to reclaim, should be initialized by caller

bool (*shrink)(struct drm_gem_object *obj)

Callback to try to shrink/reclaim the object.

Description

If the shrink callback succeeds, it is expected that the driver move the object out of this LRU.

If the LRU possibly contains active buffers, it is the responsibility of the shrink callback to check for this (i.e. dma_resv_test_signaled()) or, if necessary, block until the buffer becomes idle.
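How the LRU helpers fit together can be modelled with a small userspace sketch. It is a deliberately simplified illustration: all toy_* names are hypothetical, locking is omitted, the busy flag stands in for a fence check, and the scan policy shown (stop once enough pages are freed) is an assumption rather than the kernel's exact behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_lru;

/* Toy stand-in for a GEM object tracked on an LRU. */
struct toy_obj {
	struct toy_obj *prev, *next; /* neighbours within the LRU */
	struct toy_lru *lru;         /* LRU the object is on, or NULL */
	long pages;                  /* backing pages held by the object */
	bool busy;                   /* pretend the GPU still uses it */
};

struct toy_lru {
	struct toy_obj *head, *tail; /* head is least recently used */
	long count;                  /* total pages of all listed objects */
};

/* Remove the object from whatever LRU it is on, if any. */
void toy_lru_remove(struct toy_obj *obj)
{
	struct toy_lru *lru = obj->lru;

	if (!lru)
		return;
	if (obj->prev)
		obj->prev->next = obj->next;
	else
		lru->head = obj->next;
	if (obj->next)
		obj->next->prev = obj->prev;
	else
		lru->tail = obj->prev;
	lru->count -= obj->pages;
	obj->prev = obj->next = NULL;
	obj->lru = NULL;
}

/* Move the object to the tail (most recently used end) of the LRU. */
void toy_lru_move_tail(struct toy_lru *lru, struct toy_obj *obj)
{
	toy_lru_remove(obj);
	obj->prev = lru->tail;
	obj->next = NULL;
	if (lru->tail)
		lru->tail->next = obj;
	else
		lru->head = obj;
	lru->tail = obj;
	obj->lru = lru;
	lru->count += obj->pages;
}

/* Example shrink callback: reclaim idle objects and unlink them. */
bool toy_shrink(struct toy_obj *obj)
{
	if (obj->busy)
		return false;
	toy_lru_remove(obj); /* a successful shrink leaves the LRU */
	return true;
}

/*
 * Walk from the head until at least nr_to_scan pages are freed.
 * Returns the pages freed; pages of objects that could not be
 * reclaimed are added to *remaining.
 */
unsigned long toy_lru_scan(struct toy_lru *lru, unsigned long nr_to_scan,
			   unsigned long *remaining,
			   bool (*shrink)(struct toy_obj *obj))
{
	unsigned long freed = 0;
	struct toy_obj *obj = lru->head;

	while (obj && freed < nr_to_scan) {
		struct toy_obj *next = obj->next; /* shrink may unlink obj */
		long pages = obj->pages;

		if (shrink(obj))
			freed += pages;
		else
			*remaining += pages;
		obj = next;
	}
	return freed;
}
```

Keeping the page total in count is what lets a count_objects-style query answer without walking the list.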

GEM DMA Helper Functions Reference

The DRM GEM/DMA helpers are a means to provide buffer objects that are presented to the device as a contiguous chunk of memory. This is useful for devices that do not support scatter-gather DMA (either directly or by using an intimately attached IOMMU).

For devices that access the memory bus through an (external) IOMMU, the buffer objects are allocated using a traditional page-based allocator and may be scattered through physical memory. However, they are contiguous in the IOVA space and so appear contiguous to the devices using them.

For other devices, the helpers rely on CMA to provide buffer objects that are physically contiguous in memory.

For GEM callback helpers in struct drm_gem_object functions, see likewise named functions with an _object_ infix (e.g., drm_gem_dma_object_vmap() wraps drm_gem_dma_vmap()). These helpers perform the necessary type conversion.

struct drm_gem_dma_object

GEM object backed by DMA memory allocations

Definition

struct drm_gem_dma_object {
  struct drm_gem_object base;
  dma_addr_t dma_addr;
  struct sg_table *sgt;
  void *vaddr;
  bool map_noncoherent;
};

Members

base

base GEM object

dma_addr

DMA address of the backing memory

sgt

scatter/gather table for imported PRIME buffers. The table can have more than one entry but they are guaranteed to have contiguous DMA addresses.

vaddr

kernel virtual address of the backing memory

map_noncoherent

if true, the GEM object is backed by non-coherent memory
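The contiguity guarantee described for sgt can be expressed as a simple check over the mapped segments. The sketch below is a userspace illustration only: struct toy_sg_entry and toy_sg_is_contiguous() are hypothetical and do not mirror the kernel's struct scatterlist API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one scatter/gather entry after DMA mapping. */
struct toy_sg_entry {
	uint64_t dma_addr; /* bus address of the segment */
	size_t len;        /* segment length in bytes */
};

/* True if every segment starts exactly where the previous one ends. */
bool toy_sg_is_contiguous(const struct toy_sg_entry *sg, size_t nents)
{
	for (size_t i = 1; i < nents; i++)
		if (sg[i].dma_addr != sg[i - 1].dma_addr + sg[i - 1].len)
			return false;
	return true;
}
```

A table that passes this check can be described to the device by a single start address, which is what the dma_addr member provides.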

void drm_gem_dma_object_free(struct drm_gem_object *obj)

GEM object function for drm_gem_dma_free()

Parameters

struct drm_gem_object *obj

GEM object to free

Description

This function wraps drm_gem_dma_free(). Drivers that employ the DMA helpers should use it as their drm_gem_object_funcs.free handler.

void drm_gem_dma_object_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj)

Print drm_gem_dma_object info for debugfs

Parameters

struct drm_printer *p

DRM printer

unsigned int indent

Tab indentation level

const struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_dma_print_info(). Drivers that employ the DMA helpers should use this function as their drm_gem_object_funcs.print_info handler.

struct sg_table *drm_gem_dma_object_get_sg_table(struct drm_gem_object *obj)

GEM object function for drm_gem_dma_get_sg_table()

Parameters

struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_dma_get_sg_table(). Drivers that employ the DMA helpers should use it as their drm_gem_object_funcs.get_sg_table handler.

Return

A pointer to the scatter/gather table of pinned pages or NULL on failure.

int drm_gem_dma_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)

GEM object function for drm_gem_dma_mmap()

Parameters

struct drm_gem_object *obj

GEM object

struct vm_area_struct *vma

VMA for the area to be mapped

Description

This function wraps drm_gem_dma_mmap(). Drivers that employ the dma helpers should use it as their drm_gem_object_funcs.mmap handler.

Return

0 on success or a negative error code on failure.

DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE

DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE (dumb_create_func)

DMA GEM driver operations

Parameters

dumb_create_func

callback function for .dumb_create

Description

This macro provides a shortcut for setting the default GEM operations in the drm_driver structure.

This macro is a variant of DRM_GEM_DMA_DRIVER_OPS for drivers that override the default implementation of struct drm_driver.dumb_create. Use DRM_GEM_DMA_DRIVER_OPS if possible. Drivers that require a virtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE() instead.

DRM_GEM_DMA_DRIVER_OPS

DRM_GEM_DMA_DRIVER_OPS ()

DMA GEM driver operations

Parameters

Description

This macro provides a shortcut for setting the default GEM operations in the drm_driver structure.

Drivers that come with their own implementation of struct drm_driver.dumb_create should use DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE() instead. Use DRM_GEM_DMA_DRIVER_OPS if possible. Drivers that require a virtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPS_VMAP instead.

DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE

DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE (dumb_create_func)

DMA GEM driver operations ensuring a virtual address on the buffer

Parameters

dumb_create_func

callback function for .dumb_create

Description

This macro provides a shortcut for setting the default GEM operations in the drm_driver structure for drivers that need the virtual address also on imported buffers.

This macro is a variant of DRM_GEM_DMA_DRIVER_OPS_VMAP for drivers that override the default implementation of struct drm_driver.dumb_create. Use DRM_GEM_DMA_DRIVER_OPS_VMAP if possible. Drivers that do not require a virtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE() instead.

DRM_GEM_DMA_DRIVER_OPS_VMAP

DRM_GEM_DMA_DRIVER_OPS_VMAP ()

DMA GEM driver operations ensuring a virtual address on the buffer

Parameters

Description

This macro provides a shortcut for setting the default GEM operations in the drm_driver structure for drivers that need the virtual address also on imported buffers.

Drivers that come with their own implementation of struct drm_driver.dumb_create should use DRM_GEM_DMA_DRIVER_OPS_VMAP_WITH_DUMB_CREATE() instead. Use DRM_GEM_DMA_DRIVER_OPS_VMAP if possible. Drivers that do not require a virtual address on imported buffers should use DRM_GEM_DMA_DRIVER_OPS instead.

DEFINE_DRM_GEM_DMA_FOPS

DEFINE_DRM_GEM_DMA_FOPS (name)

macro to generate file operations for DMA drivers

Parameters

name

name for the generated structure

Description

This macro autogenerates a suitable struct file_operations for DMA based drivers, which can be assigned to drm_driver.fops. Note that this structure cannot be shared between drivers, because it contains a reference to the current module using THIS_MODULE.

Note that the declaration is already marked as static - if you need a non-static version of this you’re probably doing it wrong and will break the THIS_MODULE reference by accident.

struct drm_gem_dma_object *drm_gem_dma_create(struct drm_device *drm, size_t size)

allocate an object with the given size

Parameters

struct drm_device *drm

DRM device

size_t size

size of the object to allocate

Description

This function creates a DMA GEM object and allocates memory as backing store. The allocated memory will occupy a contiguous chunk of bus address space.

For devices that are directly connected to the memory bus, the allocated memory will be physically contiguous. For devices that access memory through an IOMMU, the allocated memory is not expected to be physically contiguous, because having contiguous IOVAs is sufficient to meet a device’s DMA requirements.

Return

A struct drm_gem_dma_object * on success or an ERR_PTR()-encoded negative error code on failure.

void drm_gem_dma_free(struct drm_gem_dma_object *dma_obj)

free resources associated with a DMA GEM object

Parameters

struct drm_gem_dma_object *dma_obj

DMA GEM object to free

Description

This function frees the backing memory of the DMA GEM object, cleans up the GEM object state and frees the memory used to store the object itself. If the buffer is imported and the virtual address is set, it is released.

int drm_gem_dma_dumb_create_internal(struct drm_file *file_priv, struct drm_device *drm, struct drm_mode_create_dumb *args)

create a dumb buffer object

Parameters

struct drm_file *file_priv

DRM file-private structure to create the dumb buffer for

struct drm_device *drm

DRM device

struct drm_mode_create_dumb *args

IOCTL data

Description

This aligns the pitch and size arguments to the minimum required. This is an internal helper that can be wrapped by a driver to account for hardware with more specific alignment requirements. Drivers should not use it directly as their drm_driver.dumb_create callback.

Return

0 on success or a negative error code on failure.

int drm_gem_dma_dumb_create(struct drm_file *file_priv, struct drm_device *drm, struct drm_mode_create_dumb *args)

create a dumb buffer object

Parameters

struct drm_file *file_priv

DRM file-private structure to create the dumb buffer for

struct drm_device *drm

DRM device

struct drm_mode_create_dumb *args

IOCTL data

Description

This function computes the pitch of the dumb buffer and rounds it up to an integer number of bytes per pixel. Drivers for hardware that doesn’t have any additional restrictions on the pitch can directly use this function as their drm_driver.dumb_create callback.

For hardware with additional restrictions, drivers can adjust the fields set up by userspace and pass the IOCTL data along to the drm_gem_dma_dumb_create_internal() function.

Return

0 on success or a negative error code on failure.
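
The relationship between the two dumb-buffer helpers can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the helpers' implementation: struct demo_create_dumb is a hypothetical subset of the IOCTL data, the 64-byte pitch alignment belongs to imaginary hardware, and overflow checks are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical subset of the dumb-buffer IOCTL data. */
struct demo_create_dumb {
	uint32_t width;  /* pixels */
	uint32_t height; /* pixels */
	uint32_t bpp;    /* bits per pixel */
	uint32_t pitch;  /* bytes per scanline (output) */
	uint64_t size;   /* bytes (output) */
};

/* Internal variant: raise pitch and size to the minimum, keep larger values. */
static void demo_dumb_create_internal(struct demo_create_dumb *args)
{
	uint32_t min_pitch;

	assert(args->width && args->height && args->bpp);
	min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);

	if (args->pitch < min_pitch)
		args->pitch = min_pitch;
	if (args->size < (uint64_t)args->pitch * args->height)
		args->size = (uint64_t)args->pitch * args->height;
}

/* Plain variant: compute pitch and size from width, height and bpp only. */
static void demo_dumb_create(struct demo_create_dumb *args)
{
	args->pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
	args->size = (uint64_t)args->pitch * args->height;
}

/* Driver wrapper for imaginary hardware that needs a 64-byte aligned pitch. */
static void demo_dumb_create_aligned(struct demo_create_dumb *args)
{
	uint32_t min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);

	args->pitch = (min_pitch + 63) & ~63u;
	args->size = 0; /* let the internal helper derive the size */
	demo_dumb_create_internal(args);
}
```

For a 1366x768 buffer at 16 bits per pixel, the plain variant yields a pitch of 2732 bytes, while the aligned wrapper rounds it up to 2752 bytes and sizes the buffer accordingly.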

unsigned long drm_gem_dma_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)

propose address for mapping in noMMU cases

Parameters

struct file *filp

file object

unsigned long addr

memory address

unsigned long len

buffer size

unsigned long pgoff

page offset

unsigned long flags

memory flags

Description

This function is used on noMMU platforms to propose an address mapping for a given buffer. It’s intended to be used as a direct handler for the struct file_operations.get_unmapped_area operation.

Return

mapping address on success or a negative error code on failure.

void drm_gem_dma_print_info(const struct drm_gem_dma_object *dma_obj, struct drm_printer *p, unsigned int indent)

Print drm_gem_dma_object info for debugfs

Parameters

const struct drm_gem_dma_object *dma_obj

DMA GEM object

struct drm_printer *p

DRM printer

unsigned int indent

Tab indentation level

Description

This function prints dma_addr and vaddr for use in e.g. debugfs output.

struct sg_table *drm_gem_dma_get_sg_table(struct drm_gem_dma_object *dma_obj)

provide a scatter/gather table of pinned pages for a DMA GEM object

Parameters

struct drm_gem_dma_object *dma_obj

DMA GEM object

Description

This function exports a scatter/gather table by calling the standard DMA mapping API.

Return

A pointer to the scatter/gather table of pinned pages or NULL on failure.

struct drm_gem_object *drm_gem_dma_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt)

produce a DMA GEM object from another driver’s scatter/gather table of pinned pages

Parameters

struct drm_device *dev

device to import into

struct dma_buf_attachment *attach

DMA-BUF attachment

struct sg_table *sgt

scatter/gather table of pinned pages

Description

This function imports a scatter/gather table exported via DMA-BUF by another driver. Imported buffers must be physically contiguous in memory (i.e. the scatter/gather table must contain a single entry). Drivers that use the DMA helpers should set this as their drm_driver.gem_prime_import_sg_table callback.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negative error code on failure.
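
The contiguity requirement on imported buffers can be pictured with a small userspace sketch. The text above states the requirement as a single-entry table. The sketch uses a slightly more general check that also accepts several entries whose DMA addresses are back to back. That generalization, struct demo_sg_entry and both functions are assumptions made for illustration and say nothing about how the helper is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scatter/gather entry. */
struct demo_sg_entry {
	uint64_t dma_addr;
	size_t len;
};

/* Bytes, counted from the first entry, that form one contiguous DMA range. */
static size_t demo_contiguous_size(const struct demo_sg_entry *sg, size_t nents)
{
	uint64_t expected;
	size_t size = 0;
	size_t i;

	assert(sg != NULL || nents == 0);
	if (nents == 0)
		return 0;

	expected = sg[0].dma_addr;
	for (i = 0; i < nents; i++) {
		if (sg[i].dma_addr != expected)
			break; /* gap found, the run ends here */
		expected += sg[i].len;
		size += sg[i].len;
	}
	return size;
}

/* An import is acceptable only if the contiguous run covers the buffer. */
static int demo_can_import(const struct demo_sg_entry *sg, size_t nents,
			   size_t buf_size)
{
	return demo_contiguous_size(sg, nents) >= buf_size;
}
```

A table with a gap between its entries fails the check, which corresponds to the import being rejected with an error.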

int drm_gem_dma_vmap(struct drm_gem_dma_object *dma_obj, struct iosys_map *map)

map a DMA GEM object into the kernel’s virtual address space

Parameters

struct drm_gem_dma_object *dma_obj

DMA GEM object

struct iosys_map *map

Returns the kernel virtual address of the DMA GEM object’s backing store.

Description

This function maps a buffer into the kernel’s virtual address space. Since the DMA buffers are already mapped into the kernel virtual address space this simply returns the cached virtual address.

Return

0 on success, or a negative error code otherwise.

int drm_gem_dma_mmap(struct drm_gem_dma_object *dma_obj, struct vm_area_struct *vma)

memory-map an exported DMA GEM object

Parameters

struct drm_gem_dma_object *dma_obj

DMA GEM object

struct vm_area_struct *vma

VMA for the area to be mapped

Description

This function maps a buffer into a userspace process’s address space. In addition to the usual GEM VMA setup it immediately faults in the entire object instead of using on-demand faulting.

Return

0 on success or a negative error code on failure.

struct drm_gem_object *drm_gem_dma_prime_import_sg_table_vmap(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt)

PRIME import another driver’s scatter/gather table and get the virtual address of the buffer

Parameters

struct drm_device *dev

DRM device

struct dma_buf_attachment *attach

DMA-BUF attachment

struct sg_table *sgt

Scatter/gather table of pinned pages

Description

This function imports a scatter/gather table using drm_gem_dma_prime_import_sg_table() and uses dma_buf_vmap() to get the kernel virtual address. This ensures that a DMA GEM object always has its virtual address set. This address is released when the object is freed.

This function can be used as the drm_driver.gem_prime_import_sg_table callback. The DRM_GEM_DMA_DRIVER_OPS_VMAP macro provides a shortcut to set the necessary DRM driver operations.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negative error code on failure.

GEM SHMEM Helper Function Reference

This library provides helpers for GEM objects backed by shmem buffers allocated using anonymous pageable memory.

Functions that operate on the GEM object receive struct drm_gem_shmem_object. For GEM callback helpers in struct drm_gem_object functions, see likewise named functions with an _object_ infix (e.g., drm_gem_shmem_object_vmap() wraps drm_gem_shmem_vmap()). These helpers perform the necessary type conversion.
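
The type conversion performed by the _object_ wrappers follows the usual container_of() pattern. Below is a minimal userspace sketch of that pattern. All demo_ names are hypothetical, and the purge check merely stands in for whatever typed helper a wrapper forwards to.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical base object, embedded in the specialized object. */
struct demo_gem_object {
	size_t size;
};

struct demo_shmem_object {
	int madv; /* placed first so that base sits at a non-zero offset */
	struct demo_gem_object base;
};

/* Convert a pointer to the embedded base into the containing object. */
static struct demo_shmem_object *to_demo_shmem(struct demo_gem_object *obj)
{
	assert(obj != NULL);
	return container_of(obj, struct demo_shmem_object, base);
}

/* Typed helper: operates on the specialized object. */
static int demo_shmem_is_purged(const struct demo_shmem_object *shmem)
{
	return shmem->madv < 0;
}

/* Wrapper with the _object_ infix: takes the base type and converts it. */
static int demo_shmem_object_is_purged(struct demo_gem_object *obj)
{
	return demo_shmem_is_purged(to_demo_shmem(obj));
}
```

Callback tables that only know the base type can point at the wrapper, while driver code that already holds the specialized type calls the typed helper directly.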

struct drm_gem_shmem_object

GEM object backed by shmem

Definition

struct drm_gem_shmem_object {
  struct drm_gem_object base;
  struct mutex pages_lock;
  struct page **pages;
  unsigned int pages_use_count;
  int madv;
  struct list_head madv_list;
  unsigned int pages_mark_dirty_on_put    : 1;
  unsigned int pages_mark_accessed_on_put : 1;
  struct sg_table *sgt;
  struct mutex vmap_lock;
  void *vaddr;
  unsigned int vmap_use_count;
  bool map_wc;
};

Members

base

Base GEM object

pages_lock

Protects the page table and use count

pages

Page table

pages_use_count

Reference count on the pages table. The pages are put when the count reaches zero.

madv

State for madvise

0 means the object is active/in use. A negative value means the object is purged. Positive values are driver specific and not used by the helpers.

madv_list

List entry for madvise tracking

Typically used by drivers to track purgeable objects

pages_mark_dirty_on_put

Mark pages as dirty when they are put.

pages_mark_accessed_on_put

Mark pages as accessed when they are put.

sgt

Scatter/gather table for imported PRIME buffers

vmap_lock

Protects the vmap address and use count

vaddr

Kernel virtual address of the backing memory

vmap_use_count

Reference count on the virtual address. The address is unmapped when the count reaches zero.

map_wc

map object write-combined (instead of using shmem defaults).
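
The use counts described above (pages_use_count and vmap_use_count) follow a common get/put pattern: the first user populates the resource and the last user releases it. The userspace sketch below illustrates this for a page table. It uses a hypothetical struct demo_shmem, models pages as plain pointers, and leaves out the locking that pages_lock provides.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, reduced model of an object with a counted page table. */
struct demo_shmem {
	void **pages;
	unsigned int pages_use_count;
	size_t npages;
};

/* Take a reference on the page table, allocating it for the first user. */
static int demo_get_pages(struct demo_shmem *s)
{
	if (s->pages_use_count++ > 0)
		return 0; /* already populated by an earlier user */

	s->pages = calloc(s->npages, sizeof(*s->pages));
	if (!s->pages) {
		s->pages_use_count = 0;
		return -ENOMEM;
	}
	return 0;
}

/* Drop a reference; the page table is released when the count hits zero. */
static void demo_put_pages(struct demo_shmem *s)
{
	assert(s->pages_use_count > 0);
	if (--s->pages_use_count > 0)
		return;

	free(s->pages);
	s->pages = NULL;
}
```

Every successful get must be paired with exactly one put. An unbalanced put trips the assertion in this sketch.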

void drm_gem_shmem_object_free(struct drm_gem_object *obj)

GEM object function for drm_gem_shmem_free()

Parameters

struct drm_gem_object *obj

GEM object to free

Description

This function wraps drm_gem_shmem_free(). Drivers that employ the shmem helpers should use it as their drm_gem_object_funcs.free handler.

void drm_gem_shmem_object_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *obj)

Print drm_gem_shmem_object info for debugfs

Parameters

struct drm_printer *p

DRM printer

unsigned int indent

Tab indentation level

const struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_shmem_print_info(). Drivers that employ the shmem helpers should use this function as their drm_gem_object_funcs.print_info handler.

int drm_gem_shmem_object_pin(struct drm_gem_object *obj)

GEM object function for drm_gem_shmem_pin()

Parameters

struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_shmem_pin(). Drivers that employ the shmem helpers should use it as their drm_gem_object_funcs.pin handler.

void drm_gem_shmem_object_unpin(struct drm_gem_object *obj)

GEM object function for drm_gem_shmem_unpin()

Parameters

struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_shmem_unpin(). Drivers that employ the shmem helpers should use it as their drm_gem_object_funcs.unpin handler.

struct sg_table *drm_gem_shmem_object_get_sg_table(struct drm_gem_object *obj)

GEM object function for drm_gem_shmem_get_sg_table()

Parameters

struct drm_gem_object *obj

GEM object

Description

This function wraps drm_gem_shmem_get_sg_table(). Drivers that employ the shmem helpers should use it as their drm_gem_object_funcs.get_sg_table handler.

Return

A pointer to the scatter/gather table of pinned pages or error pointer on failure.

int drm_gem_shmem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)

GEM object function for drm_gem_shmem_mmap()

Parameters

struct drm_gem_object *obj

GEM object

struct vm_area_struct *vma

VMA for the area to be mapped

Description

This function wraps drm_gem_shmem_mmap(). Drivers that employ the shmem helpers should use it as their drm_gem_object_funcs.mmap handler.

Return

0 on success or a negative error code on failure.

DRM_GEM_SHMEM_DRIVER_OPS

DRM_GEM_SHMEM_DRIVER_OPS ()

Default shmem GEM operations

Parameters

Description

This macro provides a shortcut for setting the shmem GEM operations in the drm_driver structure.

struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t size)

Allocate an object with the given size

Parameters

struct drm_device *dev

DRM device

size_t size

Size of the object to allocate

Description

This function creates a shmem GEM object.

Return

A struct drm_gem_shmem_object * on success or an ERR_PTR()-encoded negative error code on failure.

void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)

Free resources associated with a shmem GEM object

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object to free

Description

This function cleans up the GEM object state and frees the memory used to store the object itself.

int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)

Pin backing pages for a shmem GEM object

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object

Description

This function makes sure the backing pages are pinned in memory while the buffer is exported.

Return

0 on success or a negative error code on failure.

void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)

Unpin backing pages for a shmem GEM object

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object

Description

This function removes the requirement that the backing pages are pinned in memory.

int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, struct drm_mode_create_dumb *args)

Create a dumb shmem buffer object

Parameters

struct drm_file *file

DRM file structure to create the dumb buffer for

struct drm_device *dev

DRM device

struct drm_mode_create_dumb *args

IOCTL data

Description

This function computes the pitch of the dumb buffer and rounds it up to an integer number of bytes per pixel. Drivers for hardware that doesn’t have any additional restrictions on the pitch can directly use this function as their drm_driver.dumb_create callback.

For hardware with additional restrictions, drivers can adjust the fields set up by userspace before calling into this function.

Return

0 on success or a negative error code on failure.

int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct *vma)

Memory-map a shmem GEM object

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object

struct vm_area_struct *vma

VMA for the area to be mapped

Description

This function implements an augmented version of the GEM DRM file mmap operation for shmem objects.

Return

0 on success or a negative error code on failure.

void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem, struct drm_printer *p, unsigned int indent)

Print drm_gem_shmem_object info for debugfs

Parameters

const struct drm_gem_shmem_object *shmem

shmem GEM object

struct drm_printer *p

DRM printer

unsigned int indent

Tab indentation level

struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)

Provide a scatter/gather table of pinned pages for a shmem GEM object

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object

Description

This function exports a scatter/gather table suitable for PRIME usage by calling the standard DMA mapping API.

Drivers that need to acquire a scatter/gather table for objects need to call drm_gem_shmem_get_pages_sgt() instead.

Return

A pointer to the scatter/gather table of pinned pages or error pointer on failure.

struct sg_table *drm_gem_shmem_get_pages_sgt(struct drm_gem_shmem_object *shmem)

Pin pages, dma map them, and return a scatter/gather table for a shmem GEM object.

Parameters

struct drm_gem_shmem_object *shmem

shmem GEM object

Description

This function returns a scatter/gather table suitable for driver usage. If the sg table doesn’t exist, the pages are pinned and dma-mapped, and an sg table is created.

This is the main function for drivers to get at backing storage, and it hides any difference between dma-buf imported and natively allocated objects. drm_gem_shmem_get_sg_table() should not be directly called by drivers.

Return

A pointer to the scatter/gather table of pinned pages or an ERR_PTR()-encoded negative error code on failure.

struct drm_gem_object *drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt)

Produce a shmem GEM object from another driver’s scatter/gather table of pinned pages

Parameters

struct drm_device *dev

Device to import into

struct dma_buf_attachment *attach

DMA-BUF attachment

struct sg_table *sgt

Scatter/gather table of pinned pages

Description

This function imports a scatter/gather table exported via DMA-BUF by another driver. Drivers that use the shmem helpers should set this as their drm_driver.gem_prime_import_sg_table callback.

Return

A pointer to a newly created GEM object or an ERR_PTR-encoded negative error code on failure.

GEM VRAM Helper Functions Reference

This library provides struct drm_gem_vram_object (GEM VRAM), a GEM buffer object that is backed by video RAM (VRAM). It can be used for framebuffer devices with dedicated memory.

The data structure struct drm_vram_mm and its helpers implement a memory manager for simple framebuffer devices with dedicated video memory. GEM VRAM buffer objects are either placed in the video memory or remain evicted to system memory.

With the GEM interface userspace applications create, manage and destroy graphics buffers, such as an on-screen framebuffer. GEM does not provide an implementation of these interfaces. It’s up to the DRM driver to provide an implementation that suits the hardware. If the hardware device contains dedicated video memory, the DRM driver can use the VRAM helper library. Each active buffer object is stored in video RAM. Active buffer are used for drawing the current frame, typically something like the frame’s scanout buffer or the cursor image. If there’s no more space left in VRAM, inactive GEM objects can be moved to system memory.

To initialize the VRAM helper library, call drmm_vram_helper_init(). The function allocates and initializes an instance of struct drm_vram_mm in struct drm_device.vram_mm. Use DRM_GEM_VRAM_DRIVER to initialize struct drm_driver and DRM_VRAM_MM_FILE_OPERATIONS to initialize struct file_operations, as illustrated below.

struct file_operations fops = {
        .owner = THIS_MODULE,
        DRM_VRAM_MM_FILE_OPERATIONS
};
struct drm_driver drv = {
        .driver_features = DRM_ ... ,
        .fops = &fops,
        DRM_GEM_VRAM_DRIVER
};

int init_drm_driver(void)
{
        struct drm_device *dev;
        uint64_t vram_base;
        size_t vram_size;
        int ret;

        /* set up the device, VRAM base and size */
        /* ... */

        ret = drmm_vram_helper_init(dev, vram_base, vram_size);
        if (ret)
                return ret;
        return 0;
}

This creates an instance of struct drm_vram_mm, exports DRM userspace interfaces for GEM buffer management and initializes file operations to allow for accessing created GEM buffers. With this setup, the DRM driver manages an area of video RAM with VRAM MM and provides GEM VRAM objects to userspace.

You don’t have to clean up the instance of VRAM MM. drmm_vram_helper_init() is a managed interface that installs a clean-up handler to run during the DRM device’s release.

For drawing or scanout operations, the respective buffer objects have to be pinned in video RAM. Call drm_gem_vram_pin() with DRM_GEM_VRAM_PL_FLAG_VRAM or DRM_GEM_VRAM_PL_FLAG_SYSTEM to pin a buffer object in video RAM or system memory. Call drm_gem_vram_unpin() to release the pinned object afterwards.

A buffer object that is pinned in video RAM has a fixed address within that memory region. Call drm_gem_vram_offset() to retrieve this value. Typically it’s used to program the hardware’s scanout engine for framebuffers, set the cursor overlay’s image for a mouse cursor, or use it as input to the hardware’s drawing engine.

To access a buffer object’s memory from the DRM driver, call drm_gem_vram_vmap(). It maps the buffer into kernel address space and returns the memory address. Use drm_gem_vram_vunmap() to release the mapping.

struct drm_gem_vram_object

GEM object backed by VRAM

Definition

struct drm_gem_vram_object {
  struct ttm_buffer_object bo;
  struct iosys_map map;
  unsigned int vmap_use_count;
  struct ttm_placement placement;
  struct ttm_place placements[2];
};

Members

bo

TTM buffer object

map

Mapping information for bo

vmap_use_count

Reference count on the virtual address. The address is unmapped when the count reaches zero.

placement

TTM placement information. Supported placements are TTM_PL_VRAM and TTM_PL_SYSTEM

placements

TTM placement information.

Description

The type struct drm_gem_vram_object represents a GEM object that is backed by VRAM. It can be used for simple framebuffer devices with dedicated memory. The buffer object can be evicted to system memory if video memory becomes scarce.

GEM VRAM objects perform reference counting for pin and mapping operations. So a buffer object that has been pinned N times with drm_gem_vram_pin() must be unpinned N times with drm_gem_vram_unpin(). The same applies to pairs of drm_gem_vram_kmap() and drm_gem_vram_kunmap(), as well as pairs of drm_gem_vram_vmap() and drm_gem_vram_vunmap().

struct drm_gem_vram_object *drm_gem_vram_of_bo(struct ttm_buffer_object *bo)

Returns the container of type struct drm_gem_vram_object for field bo.

Parameters

struct ttm_buffer_object *bo

the VRAM buffer object

Return

The containing GEM VRAM object

struct drm_gem_vram_object *drm_gem_vram_of_gem(struct drm_gem_object *gem)

Returns the container of type struct drm_gem_vram_object for field gem.

Parameters

struct drm_gem_object *gem

the GEM object

Return

The containing GEM VRAM object

DRM_GEM_VRAM_PLANE_HELPER_FUNCS

DRM_GEM_VRAM_PLANE_HELPER_FUNCS ()

Initializes struct drm_plane_helper_funcs for VRAM handling

Parameters

Description

Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This macro initializes struct drm_plane_helper_funcs to use the respective helper functions.

DRM_GEM_VRAM_DRIVER

DRM_GEM_VRAM_DRIVER ()

default callback functions for struct drm_driver

Parameters

Description

Drivers that use VRAM MM and GEM VRAM can use this macro to initialize struct drm_driver with default functions.

struct drm_vram_mm

An instance of VRAM MM

Definition

struct drm_vram_mm {
  uint64_t vram_base;
  size_t vram_size;
  struct ttm_device bdev;
};

Members

vram_base

Base address of the managed video memory

vram_size

Size of the managed video memory in bytes

bdev

The TTM BO device.

Description

The fields struct drm_vram_mm.vram_base and struct drm_vram_mm.vram_size are managed by VRAM MM, but are available for public read access. Use the field struct drm_vram_mm.bdev to access the TTM BO device.

struct drm_vram_mm *drm_vram_mm_of_bdev(struct ttm_device *bdev)

Returns the container of type struct drm_vram_mm for field bdev.

Parameters

struct ttm_device *bdev

the TTM BO device

Return

The containing instance of struct drm_vram_mm

struct drm_gem_vram_object *drm_gem_vram_create(struct drm_device *dev, size_t size, unsigned long pg_align)

Creates a VRAM-backed GEM object

Parameters

struct drm_device *dev

the DRM device

size_t size

the buffer size in bytes

unsigned long pg_align

the buffer’s alignment in multiples of the page size

Description

GEM objects are allocated by calling struct drm_driver.gem_create_object, if set. Otherwise kzalloc() will be used. Drivers can set their own GEM object functions in struct drm_driver.gem_create_object. If no functions are set, the new GEM object will use the default functions from GEM VRAM helpers.

Return

A new instance of struct drm_gem_vram_object on success, or an ERR_PTR()-encoded error code otherwise.

void drm_gem_vram_put(struct drm_gem_vram_object *gbo)

Releases a reference to a VRAM-backed GEM object

Parameters

struct drm_gem_vram_object *gbo

the GEM VRAM object

Description

See ttm_bo_put() for more information.

s64 drm_gem_vram_offset(struct drm_gem_vram_object *gbo)

Returns a GEM VRAM object’s offset in video memory

Parameters

struct drm_gem_vram_object *gbo

the GEM VRAM object

Description

This function returns the buffer object’s offset in the device’s video memory. The buffer object has to be pinned to TTM_PL_VRAM.

Return

The buffer object’s offset in video memory on success, or a negative errno code otherwise.

int drm_gem_vram_pin(struct drm_gem_vram_object *gbo, unsigned long pl_flag)

Pins a GEM VRAM object in a region.

Parameters

struct drm_gem_vram_object *gbo

the GEM VRAM object

unsigned long pl_flag

a bitmask of possible memory regions

Description

Pinning a buffer object ensures that it is not evicted from a memory region. A pinned buffer object has to be unpinned before it can be pinned to another region. If the pl_flag argument is 0, the buffer is pinned at its current location (video RAM or system memory).

Small buffer objects, such as cursor images, can lead to memory fragmentation if they are pinned in the middle of video RAM. This is especially a problem on devices with only a small amount of video RAM. Fragmentation can prevent the primary framebuffer from fitting in, even though there’s enough memory overall. The modifier DRM_GEM_VRAM_PL_FLAG_TOPDOWN marks the buffer object to be pinned at the high end of the memory region to avoid fragmentation.

Return

0 on success, or a negative error code otherwise.
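
The pinning rules described above (counted pins, a pl_flag of 0 meaning the current location, and no change of region while pinned) can be modeled in a few lines of userspace C. This is only a sketch: struct demo_bo, the DEMO_ flags and the -EBUSY return for a conflicting request are choices made for illustration. The text itself only states that the object has to be unpinned before it can be pinned elsewhere.

```c
#include <assert.h>
#include <errno.h>

enum demo_region { DEMO_SYSTEM, DEMO_VRAM };

/* Hypothetical placement flags, one bit per region. */
#define DEMO_PL_FLAG_SYSTEM (1ul << 0)
#define DEMO_PL_FLAG_VRAM   (1ul << 1)

struct demo_bo {
	enum demo_region region;
	unsigned int pin_count;
};

static int demo_pin(struct demo_bo *bo, unsigned long pl_flag)
{
	enum demo_region wanted = bo->region; /* pl_flag == 0: stay in place */

	if (pl_flag & DEMO_PL_FLAG_VRAM)
		wanted = DEMO_VRAM;
	else if (pl_flag & DEMO_PL_FLAG_SYSTEM)
		wanted = DEMO_SYSTEM;

	/* A pinned object cannot move; it must be fully unpinned first. */
	if (bo->pin_count > 0 && wanted != bo->region)
		return -EBUSY;

	bo->region = wanted; /* models moving the buffer into the region */
	bo->pin_count++;
	return 0;
}

static void demo_unpin(struct demo_bo *bo)
{
	assert(bo->pin_count > 0);
	bo->pin_count--;
}
```

An object pinned N times stays pinned until demo_unpin() has been called N times. Only then can it be pinned into a different region.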

int drm_gem_vram_unpin(struct drm_gem_vram_object *gbo)

Unpins a GEM VRAM object

Parameters

struct drm_gem_vram_object *gbo

the GEM VRAM object

Return

0 on success, or a negative error code otherwise.

int drm_gem_vram_vmap(struct drm_gem_vram_object *gbo, struct iosys_map *map)

Pins and maps a GEM VRAM object into kernel address space

Parameters

struct drm_gem_vram_object *gbo

The GEM VRAM object to map

struct iosys_map *map

Returns the kernel virtual address of the VRAM GEM object’s backing store.

Description

The vmap function pins a GEM VRAM object to its current location, either system or video memory, and maps its buffer into kernel address space. As pinned objects cannot be relocated, you should avoid pinning objects permanently. Call drm_gem_vram_vunmap() with the returned address to unmap and unpin the GEM VRAM object.

Return

0 on success, or a negative error code otherwise.

void drm_gem_vram_vunmap(struct drm_gem_vram_object *gbo, struct iosys_map *map)

Unmaps and unpins a GEM VRAM object

Parameters

struct drm_gem_vram_object *gbo

The GEM VRAM object to unmap

struct iosys_map *map

Kernel virtual address where the VRAM GEM object was mapped

Description

A call to drm_gem_vram_vunmap() unmaps and unpins a GEM VRAM buffer. See the documentation for drm_gem_vram_vmap() for more information.

int drm_gem_vram_fill_create_dumb(struct drm_file *file, struct drm_device *dev, unsigned long pg_align, unsigned long pitch_align, struct drm_mode_create_dumb *args)

Helper for implementing struct drm_driver.dumb_create

Parameters

struct drm_file *file

the DRM file

struct drm_device *dev

the DRM device

unsigned long pg_align

the buffer’s alignment in multiples of the page size

unsigned long pitch_align

the scanline’s alignment in powers of 2

struct drm_mode_create_dumb *args

the arguments as provided to struct drm_driver.dumb_create

Description

This helper function fills struct drm_mode_create_dumb, which is used by struct drm_driver.dumb_create. Implementations of this interface should forward their arguments to this helper, plus the driver-specific parameters.

Return

0 on success, or a negative error code otherwise.

int drm_gem_vram_driver_dumb_create(struct drm_file *file, struct drm_device *dev, struct drm_mode_create_dumb *args)

Implements struct drm_driver.dumb_create

Parameters

struct drm_file *file

the DRM file

struct drm_device *dev

the DRM device

struct drm_mode_create_dumb *args

the arguments as provided to struct drm_driver.dumb_create

Description

This function requires the driver to use drm_device.vram_mm for its instance of VRAM MM.

Return

0 on success, or a negative error code otherwise.

int drm_gem_vram_plane_helper_prepare_fb(struct drm_plane *plane, struct drm_plane_state *new_state)

Parameters

struct drm_plane *plane

a DRM plane

struct drm_plane_state *new_state

the plane’s new state

Description

During plane updates, this function sets the plane’s fence and pins the GEM VRAM objects of the plane’s new framebuffer to VRAM. Call drm_gem_vram_plane_helper_cleanup_fb() to unpin them.

Return

0 on success, or a negative errno code otherwise.

void drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane, struct drm_plane_state *old_state)

Parameters

struct drm_plane *plane

a DRM plane

struct drm_plane_state *old_state

the plane’s old state

Description

During plane updates, this function unpins the GEM VRAM objects of the plane’s old framebuffer from VRAM. Complements drm_gem_vram_plane_helper_prepare_fb().

int drm_gem_vram_simple_display_pipe_prepare_fb(struct drm_simple_display_pipe *pipe, struct drm_plane_state *new_state)

Parameters

struct drm_simple_display_pipe *pipe

a simple display pipe

struct drm_plane_state *new_state

the plane’s new state

Description

During plane updates, this function pins the GEM VRAM objects of the plane’s new framebuffer to VRAM. Call drm_gem_vram_simple_display_pipe_cleanup_fb() to unpin them.

Return

0 on success, or a negative errno code otherwise.

void drm_gem_vram_simple_display_pipe_cleanup_fb(struct drm_simple_display_pipe *pipe, struct drm_plane_state *old_state)

Parameters

struct drm_simple_display_pipe *pipe

a simple display pipe

struct drm_plane_state *old_state

the plane’s old state

Description

During plane updates, this function unpins the GEM VRAM objects of the plane’s old framebuffer from VRAM. Complements drm_gem_vram_simple_display_pipe_prepare_fb().

void drm_vram_mm_debugfs_init(struct drm_minor *minor)

Register VRAM MM debugfs file.

Parameters

struct drm_minor *minor

drm minor device.

int drmm_vram_helper_init(struct drm_device *dev, uint64_t vram_base, size_t vram_size)

Initializes a device’s instance of struct drm_vram_mm

Parameters

struct drm_device *dev

the DRM device

uint64_t vram_base

the base address of the video memory

size_t vram_size

the size of the video memory in bytes

Description

Creates a new instance of struct drm_vram_mm and stores it in struct drm_device.vram_mm. The instance is auto-managed and cleaned up as part of device cleanup. Calling this function multiple times will generate an error message.

Return

0 on success, or a negative errno code otherwise.

enum drm_mode_status drm_vram_helper_mode_valid(struct drm_device *dev, const struct drm_display_mode *mode)

Tests if a display mode’s framebuffer fits into the available video memory.

Parameters

struct drm_device *dev

the DRM device

const struct drm_display_mode *mode

the mode to test

Description

This function tests if enough video memory is available for using the specified display mode. Atomic modesetting requires importing the designated framebuffer into video memory before evicting the active one. Hence, any framebuffer may consume at most half of the available VRAM. Display modes that require a larger framebuffer cannot be used, even if the CRTC does support them. Each framebuffer is assumed to have 32-bit color depth.

Note

The function can only test if the display mode is supported in general. If there are too many framebuffers pinned to video memory, a display mode may still not be usable in practice. The color depth of 32-bit fits all current use cases. A more flexible test can be added when necessary.

Return

MODE_OK if the display mode is supported, or an error code of type enum drm_mode_status otherwise.
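
The half-of-VRAM rule can be expressed as simple arithmetic. The sketch below is a userspace illustration: demo_mode_fits() is a hypothetical function, the 4096-byte page size is an assumption, and the page-granular comparison is one plausible way to apply the rule, not necessarily the one the helper uses.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE       4096u /* assumed page size */
#define DEMO_BYTES_PER_PIXEL 4u    /* 32-bit color depth, as stated above */

/* Returns 1 if a framebuffer for the mode fits into half of VRAM, else 0. */
static int demo_mode_fits(uint32_t hdisplay, uint32_t vdisplay,
			  uint64_t vram_size)
{
	uint64_t max_fbpages, fbsize, fbpages;

	assert(hdisplay > 0 && vdisplay > 0);

	max_fbpages = (vram_size / 2) / DEMO_PAGE_SIZE;
	fbsize = (uint64_t)hdisplay * vdisplay * DEMO_BYTES_PER_PIXEL;
	fbpages = (fbsize + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;

	return fbpages <= max_fbpages;
}
```

With 16 MiB of VRAM, a 1920x1080 framebuffer needs 2025 pages and fits into the 2048 pages available to it, while 2560x1440 needs 3600 pages and does not.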

GEM TTM Helper Functions Reference

This library provides helper functions for GEM objects backed by TTM.

void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *gem)

Print ttm_buffer_object info for debugfs

Parameters

struct drm_printer *p

DRM printer

unsigned int indent

Tab indentation level

const struct drm_gem_object *gem

GEM object

Description

This function can be used as drm_gem_object_funcs.print_info callback.

int drm_gem_ttm_vmap(struct drm_gem_object *gem, struct iosys_map *map)

vmap ttm_buffer_object

Parameters

struct drm_gem_object *gem

GEM object.

struct iosys_map *map

[out] returns the dma-buf mapping.

Description

Maps a GEM object with ttm_bo_vmap(). This function can be used as drm_gem_object_funcs.vmap callback.

Return

0 on success, or a negative errno code otherwise.

void drm_gem_ttm_vunmap(struct drm_gem_object *gem, struct iosys_map *map)

vunmap ttm_buffer_object

Parameters

struct drm_gem_object *gem

GEM object.

struct iosys_map *map

dma-buf mapping.

Description

Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as drm_gem_object_funcs.vunmap callback.

int drm_gem_ttm_mmap(struct drm_gem_object *gem, struct vm_area_struct *vma)

mmap ttm_buffer_object

Parameters

struct drm_gem_object *gem

GEM object.

struct vm_area_struct *vma

vm area.

Description

This function can be used as drm_gem_object_funcs.mmap callback.

int drm_gem_ttm_dumb_map_offset(struct drm_file *file, struct drm_device *dev, uint32_t handle, uint64_t *offset)

Implements struct drm_driver.dumb_map_offset

Parameters

struct drm_file *file

DRM file pointer.

struct drm_device *dev

DRM device.

uint32_t handle

GEM handle

uint64_t *offset

Returns the mapping’s memory offset on success

Description

Provides an implementation of struct drm_driver.dumb_map_offset for TTM-based GEM drivers. TTM allocates the offset internally and drm_gem_ttm_dumb_map_offset() returns it for dumb-buffer implementations.

See struct drm_driver.dumb_map_offset.

Return

0 on success, or a negative errno code otherwise.

VMA Offset Manager

The vma-manager is responsible for mapping arbitrary driver-dependent memory regions into the linear user address-space. It provides offsets to the caller which can then be used on the address_space of the drm-device. It takes care not to overlap regions, to size them appropriately and not to confuse mm-core with inconsistent fake vm_pgoff fields. Drivers shouldn’t use this for object placement in VMEM. This manager should only be used to manage mappings into linear user-space VMs.

We use drm_mm as backend to manage object allocations. But it is highly optimized for alloc/free calls, not lookups. Hence, we use an rb-tree to speed up offset lookups.

You must not use multiple offset managers on a single address_space. Otherwise, mm-core will be unable to tear down memory mappings as the VM will no longer be linear.

This offset manager works on page-based addresses. That is, every argument and return code (with the exception of drm_vma_node_offset_addr()) is given in number of pages, not number of bytes. That means, object sizes and offsets must always be page-aligned (as usual). If you want to get a valid byte-based user-space address for a given offset, please see drm_vma_node_offset_addr().
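
The distinction between page-based and byte-based values can be illustrated with a short userspace sketch. The names struct demo_offset_node, demo_node_offset_addr() and the other demo_ helpers are hypothetical, and a page shift of 12 (4096-byte pages) is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1ull << DEMO_PAGE_SHIFT)

/* Hypothetical node: start offset and size are both counted in pages. */
struct demo_offset_node {
	unsigned long start;
	unsigned long pages;
};

/* Byte-based address for the node's page-based start offset. */
static uint64_t demo_node_offset_addr(const struct demo_offset_node *node)
{
	return (uint64_t)node->start << DEMO_PAGE_SHIFT;
}

/* Byte size of the node. */
static uint64_t demo_node_size_bytes(const struct demo_offset_node *node)
{
	return (uint64_t)node->pages << DEMO_PAGE_SHIFT;
}

/* Convert a byte size into pages; the size must be page-aligned. */
static unsigned long demo_bytes_to_pages(uint64_t bytes)
{
	assert((bytes & (DEMO_PAGE_SIZE - 1)) == 0);
	return (unsigned long)(bytes >> DEMO_PAGE_SHIFT);
}
```

A node that starts at page 0x100000 corresponds to the byte address 0x100000000 in this model, which is the kind of value a userspace mmap() call would pass as its offset.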

In addition to offset management, the vma offset manager also handles access management. For every open-file context that is allowed to access a given node, you must call drm_vma_node_allow(). Otherwise, an mmap() call on this open-file with the offset of the node will fail with -EACCES. To revoke access again, use drm_vma_node_revoke(). However, the caller is responsible for destroying already existing mappings, if required.
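The ref-counted allow/revoke behaviour described above can be modelled in userspace as follows. This is only a sketch under stated assumptions: the demo_acl names are hypothetical, a fixed-size table stands in for the kernel's internal data structure, a plain pointer stands in for struct drm_file, and no locking is shown.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_TAGS 8

struct demo_acl_entry {
    const void *tag;     /* stands in for struct drm_file * */
    unsigned int count;  /* outstanding allow calls; 0 means slot unused */
};

struct demo_acl {
    struct demo_acl_entry entries[DEMO_MAX_TAGS];
};

static struct demo_acl_entry *demo_acl_find(struct demo_acl *acl, const void *tag)
{
    for (size_t i = 0; i < DEMO_MAX_TAGS; i++)
        if (acl->entries[i].count && acl->entries[i].tag == tag)
            return &acl->entries[i];
    return NULL;
}

/* Grant access; repeated calls for the same tag bump the count. */
static int demo_acl_allow(struct demo_acl *acl, const void *tag)
{
    struct demo_acl_entry *e = demo_acl_find(acl, tag);

    if (e) {
        e->count++;
        return 0;
    }
    for (size_t i = 0; i < DEMO_MAX_TAGS; i++) {
        if (!acl->entries[i].count) {
            acl->entries[i].tag = tag;
            acl->entries[i].count = 1;
            return 0;
        }
    }
    return -ENOMEM; /* table full; models the out-of-memory failure */
}

/* Drop one grant; the tag is gone once its count reaches zero. */
static void demo_acl_revoke(struct demo_acl *acl, const void *tag)
{
    struct demo_acl_entry *e = demo_acl_find(acl, tag);

    if (e)
        e->count--;
}

/* Returns 0 if access is granted, -EACCES otherwise. */
static int demo_acl_verify(struct demo_acl *acl, const void *tag)
{
    return demo_acl_find(acl, tag) ? 0 : -EACCES;
}
```

In this model, allowing the same tag twice requires two revokes before access is denied again, and revoking a tag that is not on the list does nothing.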

struct drm_vma_offset_node *drm_vma_offset_exact_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages)

Look up node by exact address

Parameters

struct drm_vma_offset_manager *mgr

Manager object

unsigned long start

Start address (page-based, not byte-based)

unsigned long pages

Size of object (page-based)

Description

Same as drm_vma_offset_lookup_locked() but does not allow any offset into the node. It only returns the exact object with the given start address.

Return

Node at exact start address start.

void drm_vma_offset_lock_lookup(struct drm_vma_offset_manager *mgr)

Lock lookup for extended private use

Parameters

struct drm_vma_offset_manager *mgr

Manager object

Description

Lock VMA manager for extended lookups. Only locked VMA function calls are allowed while holding this lock. All other contexts are blocked from VMA until the lock is released via drm_vma_offset_unlock_lookup().

Use this if you need to take a reference to the objects returned by drm_vma_offset_lookup_locked() before releasing this lock again.

This lock must not be used for anything else than extended lookups. You must not call any other VMA helpers while holding this lock.

Note

You’re in atomic-context while holding this lock!

void drm_vma_offset_unlock_lookup(struct drm_vma_offset_manager *mgr)

Unlock lookup for extended private use

Parameters

struct drm_vma_offset_manager *mgr

Manager object

Description

Release lookup-lock. See drm_vma_offset_lock_lookup() for more information.

void drm_vma_node_reset(struct drm_vma_offset_node *node)

Initialize or reset node object

Parameters

struct drm_vma_offset_node *node

Node to initialize or reset

Description

Reset a node to its initial state. This must be called before using it with any VMA offset manager.

This must not be called on an already allocated node, or you will leak memory.

unsigned long drm_vma_node_start(const struct drm_vma_offset_node *node)

Return start address for page-based addressing

Parameters

const struct drm_vma_offset_node *node

Node to inspect

Description

Return the start address of the given node. This can be used as offset into the linear VM space that is provided by the VMA offset manager. Note that this can only be used for page-based addressing. If you need a proper offset for user-space mappings, you must apply “<< PAGE_SHIFT” or use the drm_vma_node_offset_addr() helper instead.

Return

Start address of node for page-based addressing. 0 if the node does not have an offset allocated.

unsigned long drm_vma_node_size(struct drm_vma_offset_node *node)

Return size (page-based)

Parameters

struct drm_vma_offset_node *node

Node to inspect

Description

Return the size as number of pages for the given node. This is the same size that was passed to drm_vma_offset_add(). If no offset is allocated for the node, this is 0.

Return

Size of node as number of pages. 0 if the node does not have an offset allocated.

__u64 drm_vma_node_offset_addr(struct drm_vma_offset_node *node)

Return sanitized offset for user-space mmaps

Parameters

struct drm_vma_offset_node *node

Linked offset node

Description

Same as drm_vma_node_start() but returns the address as a valid offset that can be used for user-space mappings during mmap(). This must not be called on unlinked nodes.

Return

Offset of node for byte-based addressing. 0 if the node does not have an object allocated.

void drm_vma_node_unmap(struct drm_vma_offset_node *node, struct address_space *file_mapping)

Unmap offset node

Parameters

struct drm_vma_offset_node *node

Offset node

struct address_space *file_mapping

Address space to unmap node from

Description

Unmap all userspace mappings for a given offset node. The mappings must be associated with the file_mapping address-space. If no offset exists nothing is done.

This call is unlocked. The caller must guarantee that drm_vma_offset_remove() is not called on this node concurrently.

int drm_vma_node_verify_access(struct drm_vma_offset_node *node, struct drm_file *tag)

Access verification helper for TTM

Parameters

struct drm_vma_offset_node *node

Offset node

struct drm_file *tag

Tag of file to check

Description

This checks whether tag is granted access to node. It is the same as drm_vma_node_is_allowed() but suitable as drop-in helper for TTM verify_access() callbacks.

Return

0 if access is granted, -EACCES otherwise.

void drm_vma_offset_manager_init(struct drm_vma_offset_manager *mgr, unsigned long page_offset, unsigned long size)

Initialize new offset-manager

Parameters

struct drm_vma_offset_manager *mgr

Manager object

unsigned long page_offset

Offset of available memory area (page-based)

unsigned long size

Size of available address space range (page-based)

Description

Initialize a new offset-manager. The offset and area size available for the manager are given as page_offset and size. Both are interpreted as page-numbers, not bytes.

Adding/removing nodes from the manager is locked internally and protected against concurrent access. However, node allocation and destruction is left for the caller. While calling into the vma-manager, a given node must always be guaranteed to be referenced.

void drm_vma_offset_manager_destroy(struct drm_vma_offset_manager *mgr)

Destroy offset manager

Parameters

struct drm_vma_offset_manager *mgr

Manager object

Description

Destroy an object manager which was previously created via drm_vma_offset_manager_init(). The caller must remove all allocated nodes before destroying the manager. Otherwise, drm_mm will refuse to free the requested resources.

The manager must not be accessed after this function is called.

struct drm_vma_offset_node *drm_vma_offset_lookup_locked(struct drm_vma_offset_manager *mgr, unsigned long start, unsigned long pages)

Find node in offset space

Parameters

struct drm_vma_offset_manager *mgr

Manager object

unsigned long start

Start address for object (page-based)

unsigned long pages

Size of object (page-based)

Description

Find a node given a start address and object size. This returns the _best_ match for the given node. That is, start may point somewhere into a valid region and the given node will be returned, as long as the node spans the whole requested area (given the size in number of pages as pages).

Note that before lookup the vma offset manager lookup lock must be acquired with drm_vma_offset_lock_lookup(). See there for an example. This can then be used to implement weakly referenced lookups using kref_get_unless_zero().

Example

drm_vma_offset_lock_lookup(mgr);
node = drm_vma_offset_lookup_locked(mgr, start, pages);
if (node)
    kref_get_unless_zero(container_of(node, sth, entr));
drm_vma_offset_unlock_lookup(mgr);

Return

Returns NULL if no suitable node can be found. Otherwise, the best match is returned. It’s the caller’s responsibility to make sure the node doesn’t get destroyed before the caller can access it.
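The difference between the best-match lookup and the exact lookup can be illustrated with a small userspace model. The demo_ names are hypothetical, and a linear scan over an array replaces the tree the manager uses internally; only the matching rule is meant to carry over.

```c
#include <stddef.h>

struct demo_vma_node {
    unsigned long start;  /* first page of the node */
    unsigned long pages;  /* node size in pages */
};

/*
 * Best match: pick the node with the greatest start that is <= the
 * requested start, then accept it only if it spans the whole request.
 */
static const struct demo_vma_node *
demo_lookup(const struct demo_vma_node *nodes, size_t n,
            unsigned long start, unsigned long pages)
{
    const struct demo_vma_node *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (nodes[i].start <= start &&
            (!best || nodes[i].start > best->start))
            best = &nodes[i];

    if (best && best->start + best->pages < start + pages)
        best = NULL;
    return best;
}

/* Exact variant: additionally reject any offset into the node. */
static const struct demo_vma_node *
demo_exact_lookup(const struct demo_vma_node *nodes, size_t n,
                  unsigned long start, unsigned long pages)
{
    const struct demo_vma_node *node = demo_lookup(nodes, n, start, pages);

    return (node && node->start == start) ? node : NULL;
}
```

With nodes covering pages 16..19 and 32..39, a request for two pages at page 34 matches the second node in demo_lookup() but not in demo_exact_lookup().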

int drm_vma_offset_add(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node, unsigned long pages)

Add offset node to manager

Parameters

struct drm_vma_offset_manager *mgr

Manager object

struct drm_vma_offset_node *node

Node to be added

unsigned long pages

Allocation size visible to user-space (in number of pages)

Description

Add a node to the offset-manager. If the node was already added, this does nothing and returns 0. pages is the size of the object given in number of pages. After this call succeeds, you can access the offset of the node until it is removed again.

If this call fails, it is safe to retry the operation or to call drm_vma_offset_remove() anyway. However, no cleanup is required in that case.

pages is not required to be the same size as the underlying memory object that you want to map. It only limits the size that user-space can map into their address space.

Return

0 on success, negative error code on failure.

void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr, struct drm_vma_offset_node *node)

Remove offset node from manager

Parameters

struct drm_vma_offset_manager *mgr

Manager object

struct drm_vma_offset_node *node

Node to be removed

Description

Remove a node from the offset manager. If the node wasn’t added before, this does nothing. After this call returns, the offset and size will be 0 until a new offset is allocated via drm_vma_offset_add() again. Helper functions like drm_vma_node_start() and drm_vma_node_offset_addr() will return 0 if no offset is allocated.

int drm_vma_node_allow(struct drm_vma_offset_node *node, struct drm_file *tag)

Add open-file to list of allowed users

Parameters

struct drm_vma_offset_node *node

Node to modify

struct drm_file *tag

Tag of file to allow

Description

Add tag to the list of allowed open-files for this node. If tag is already on this list, the ref-count is incremented.

The list of allowed-users is preserved across drm_vma_offset_add() and drm_vma_offset_remove() calls. You may even call it if the node is currently not added to any offset-manager.

You must remove all open-files the same number of times as you added them before destroying the node. Otherwise, you will leak memory.

This is locked against concurrent access internally.

Return

0 on success, negative error code on internal failure (out-of-mem)

int drm_vma_node_allow_once(struct drm_vma_offset_node *node, struct drm_file *tag)

Add open-file to list of allowed users

Parameters

struct drm_vma_offset_node *node

Node to modify

struct drm_file *tag

Tag of file to allow

Description

Add tag to the list of allowed open-files for this node.

The list of allowed-users is preserved across drm_vma_offset_add() and drm_vma_offset_remove() calls. You may even call it if the node is currently not added to any offset-manager.

This is not ref-counted unlike drm_vma_node_allow() hence drm_vma_node_revoke() should only be called once after this.

This is locked against concurrent access internally.

Return

0 on success, negative error code on internal failure (out-of-mem)

void drm_vma_node_revoke(struct drm_vma_offset_node *node, struct drm_file *tag)

Remove open-file from list of allowed users

Parameters

struct drm_vma_offset_node *node

Node to modify

struct drm_file *tag

Tag of file to remove

Description

Decrement the ref-count of tag in the list of allowed open-files on node. If the ref-count drops to zero, remove tag from the list. You must call this once for every drm_vma_node_allow() on tag.

This is locked against concurrent access internally.

If tag is not on the list, nothing is done.

bool drm_vma_node_is_allowed(struct drm_vma_offset_node *node, struct drm_file *tag)

Check whether an open-file is granted access

Parameters

struct drm_vma_offset_node *node

Node to check

struct drm_file *tag

Tag of file to check

Description

Search the list in node to check whether tag is currently on the list of allowed open-files (see drm_vma_node_allow()).

This is locked against concurrent access internally.

Return

true if tag is on the list, false otherwise

PRIME Buffer Sharing

PRIME is the cross device buffer sharing framework in drm, originally created for the OPTIMUS range of multi-gpu platforms. To userspace PRIME buffers are dma-buf based file descriptors.

Overview and Lifetime Rules

Similar to GEM global names, PRIME file descriptors are also used to share buffer objects across processes. They offer additional security: as file descriptors must be explicitly sent over UNIX domain sockets to be shared between applications, they can’t be guessed like the globally unique GEM names.

Drivers that support the PRIME API implement the drm_driver.prime_handle_to_fd and drm_driver.prime_fd_to_handle operations. GEM based drivers must use drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle() to implement these. For GEM based drivers the actual driver interface is provided through the drm_gem_object_funcs.export and drm_driver.gem_prime_import hooks.

dma_buf_ops implementations for GEM drivers are all individually exported for drivers which need to overwrite or reimplement some of them.

Reference Counting for GEM Drivers

On the export the dma_buf holds a reference to the exported buffer object, usually a drm_gem_object. It takes this reference in the PRIME_HANDLE_TO_FD IOCTL, when it first calls drm_gem_object_funcs.export and stores the exporting GEM object in the dma_buf.priv field. This reference needs to be released when the final reference to the dma_buf itself is dropped and its dma_buf_ops.release function is called. For GEM-based drivers, the dma_buf should be exported using drm_gem_dmabuf_export() and then released by drm_gem_dmabuf_release().

Thus the chain of references always flows in one direction, avoiding loops: importing GEM object -> dma-buf -> exported GEM bo. A further complication are the lookup caches for import and export. These are required to guarantee that any given object will always have only one unique userspace handle. This is required to allow userspace to detect duplicated imports, since some GEM drivers do fail command submissions if a given buffer object is listed more than once. These import and export caches in drm_prime_file_private only retain a weak reference, which is cleaned up when the corresponding object is released.

Self-importing: If userspace is using PRIME as a replacement for flink then it will get a fd->handle request for a GEM object that it created. Drivers should detect this situation and return back the underlying object from the dma-buf private. For GEM based drivers this is handled in drm_gem_prime_import() already.

PRIME Helper Functions

Drivers can implement drm_gem_object_funcs.export and drm_driver.gem_prime_import in terms of simpler APIs by using the helper functions drm_gem_prime_export() and drm_gem_prime_import(). These functions implement dma-buf support in terms of some lower-level helpers, which are again exported for drivers to use individually:

Exporting buffers

Optional pinning of buffers is handled at dma-buf attach and detach time in drm_gem_map_attach() and drm_gem_map_detach(). Backing storage itself is handled by drm_gem_map_dma_buf() and drm_gem_unmap_dma_buf(), which relies on drm_gem_object_funcs.get_sg_table.

For kernel-internal access there’s drm_gem_dmabuf_vmap() and drm_gem_dmabuf_vunmap(). Userspace mmap support is provided by drm_gem_dmabuf_mmap().

Note that these export helpers can only be used if the underlying backing storage is fully coherent and either permanently pinned, or it is safe to pin it indefinitely.

FIXME: The underlying helper functions are named rather inconsistently.

Importing buffers

Importing dma-bufs using drm_gem_prime_import() relies on drm_driver.gem_prime_import_sg_table.

Note that, similarly to the export helpers, this permanently pins the underlying backing storage, which is ok for scanout but is not the best option for sharing lots of buffers for rendering.

PRIME Function References

struct drm_prime_file_private

per-file tracking for PRIME

Definition

struct drm_prime_file_private {
};

Members

Description

This just contains the internal struct dma_buf and handle caches for each struct drm_file used by the PRIME core code.

struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev, struct dma_buf_export_info *exp_info)

dma_buf export implementation for GEM

Parameters

struct drm_device *dev

parent device for the exported dmabuf

struct dma_buf_export_info *exp_info

the export information used by dma_buf_export()

Description

This wraps dma_buf_export() for use by generic GEM drivers that are using drm_gem_dmabuf_release(). In addition to calling dma_buf_export(), we take a reference to the drm_device and the exported drm_gem_object (stored in dma_buf_export_info.priv) which is released by drm_gem_dmabuf_release().

Returns the new dmabuf.

void drm_gem_dmabuf_release(struct dma_buf *dma_buf)

dma_buf release implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to be released

Description

Generic release function for dma_bufs exported as PRIME buffers. GEM drivers must use this in their dma_buf_ops structure as the release callback. drm_gem_dmabuf_release() should be used in conjunction with drm_gem_dmabuf_export().

int drm_gem_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle)

PRIME import function for GEM drivers

Parameters

struct drm_device *dev

dev to import the buffer into

struct drm_file *file_priv

drm file-private structure

int prime_fd

fd id of the dma-buf which should be imported

uint32_t *handle

pointer to storage for the handle of the imported buffer object

Description

This is the PRIME import function which must be used mandatorily by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual importing of GEM object from the dma-buf is done through the drm_driver.gem_prime_import driver callback.

Returns 0 on success or a negative error code on failure.

int drm_gem_prime_handle_to_fd(struct drm_device *dev, struct drm_file *file_priv, uint32_t handle, uint32_t flags, int *prime_fd)

PRIME export function for GEM drivers

Parameters

struct drm_device *dev

dev to export the buffer from

struct drm_file *file_priv

drm file-private structure

uint32_t handle

buffer handle to export

uint32_t flags

flags like DRM_CLOEXEC

int *prime_fd

pointer to storage for the fd id of the created dma-buf

Description

This is the PRIME export function which must be used mandatorily by GEM drivers to ensure correct lifetime management of the underlying GEM object. The actual exporting from GEM object to a dma-buf is done through the drm_gem_object_funcs.export callback.

int drm_gem_map_attach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach)

dma_buf attach implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to attach device to

struct dma_buf_attachment *attach

buffer attachment data

Description

Calls drm_gem_object_funcs.pin for device specific handling. This can be used as the dma_buf_ops.attach callback. Must be used together with drm_gem_map_detach().

Returns 0 on success, negative error code on failure.

void drm_gem_map_detach(struct dma_buf *dma_buf, struct dma_buf_attachment *attach)

dma_buf detach implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to detach from

struct dma_buf_attachment *attach

attachment to be detached

Description

Calls drm_gem_object_funcs.unpin for device specific handling. Cleans up dma_buf_attachment from drm_gem_map_attach(). This can be used as the dma_buf_ops.detach callback.

struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach, enum dma_data_direction dir)

map_dma_buf implementation for GEM

Parameters

struct dma_buf_attachment *attach

attachment whose scatterlist is to be returned

enum dma_data_direction dir

direction of DMA transfer

Description

Calls drm_gem_object_funcs.get_sg_table and then maps the scatterlist. This can be used as the dma_buf_ops.map_dma_buf callback. Should be used together with drm_gem_unmap_dma_buf().

Return

sg_table containing the scatterlist to be returned; returns ERR_PTR on error. May return -EINTR if it is interrupted by a signal.

void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir)

unmap_dma_buf implementation for GEM

Parameters

struct dma_buf_attachment *attach

attachment to unmap buffer from

struct sg_table *sgt

scatterlist info of the buffer to unmap

enum dma_data_direction dir

direction of DMA transfer

Description

This can be used as the dma_buf_ops.unmap_dma_buf callback.

int drm_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct iosys_map *map)

dma_buf vmap implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to be mapped

struct iosys_map *map

the virtual address of the buffer

Description

Sets up a kernel virtual mapping. This can be used as the dma_buf_ops.vmap callback. Calls into drm_gem_object_funcs.vmap for device specific handling. The kernel virtual address is returned in map.

Returns 0 on success or a negative errno code otherwise.

void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, struct iosys_map *map)

dma_buf vunmap implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to be unmapped

struct iosys_map *map

the virtual address of the buffer

Description

Releases a kernel virtual mapping. This can be used as the dma_buf_ops.vunmap callback. Calls into drm_gem_object_funcs.vunmap for device specific handling.

int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)

PRIME mmap function for GEM drivers

Parameters

struct drm_gem_object *obj

GEM object

struct vm_area_struct *vma

Virtual address range

Description

This function sets up a userspace mapping for PRIME exported buffers using the same codepath that is used for regular GEM buffer mapping on the DRM fd. The fake GEM offset is added to vma->vm_pgoff and drm_driver->fops->mmap is called to set up the mapping.

Drivers can use this as their drm_driver.gem_prime_mmap callback.

int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)

dma_buf mmap implementation for GEM

Parameters

struct dma_buf *dma_buf

buffer to be mapped

struct vm_area_struct *vma

virtual address range

Description

Provides memory mapping for the buffer. This can be used as the dma_buf_ops.mmap callback. It just forwards to drm_driver.gem_prime_mmap, which should be set to drm_gem_prime_mmap().

FIXME: There’s really no point to this wrapper, drivers which need anything else but drm_gem_prime_mmap can roll their own dma_buf_ops.mmap callback.

Returns 0 on success or a negative error code on failure.

struct sg_table *drm_prime_pages_to_sg(struct drm_device *dev, struct page **pages, unsigned int nr_pages)

converts a page array into an sg list

Parameters

struct drm_device *dev

DRM device

struct page **pages

pointer to the array of page pointers to convert

unsigned int nr_pages

length of the page vector

Description

This helper creates an sg table object from a set of pages. The driver is responsible for mapping the pages into the importer’s address space for use with dma_buf itself.

This is useful for implementing drm_gem_object_funcs.get_sg_table.

unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt)

returns the contiguous size of the buffer

Parameters

struct sg_table *sgt

sg_table describing the buffer to check

Description

This helper calculates the contiguous size in the DMA address space of the buffer described by the provided sg_table.

This is useful for implementing drm_driver.gem_prime_import_sg_table.
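The notion of contiguous size in the DMA address space can be sketched with a userspace model. This is an illustration only: struct demo_seg is a hypothetical stand-in for a mapped scatterlist entry, and the function simply sums segment lengths until the first gap in DMA addresses.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_seg {
    uint64_t dma_addr;  /* DMA address of the segment */
    unsigned int len;   /* segment length in bytes */
};

/* Sum the lengths of the leading segments that are back-to-back in DMA space. */
static unsigned long demo_contiguous_size(const struct demo_seg *segs, size_t n)
{
    unsigned long size = 0;
    uint64_t expected;

    if (n == 0)
        return 0;

    expected = segs[0].dma_addr;
    for (size_t i = 0; i < n; i++) {
        /* Stop at an empty segment or at the first address gap. */
        if (!segs[i].len || segs[i].dma_addr != expected)
            break;
        expected += segs[i].len;
        size += segs[i].len;
    }
    return size;
}
```

An importer that needs physically contiguous buffers could compare such a result against the object size and reject the import when it is smaller.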

struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, int flags)

helper library implementation of the export callback

Parameters

struct drm_gem_object *obj

GEM object to export

int flags

flags like DRM_CLOEXEC and DRM_RDWR

Description

This is the implementation of the drm_gem_object_funcs.export functions for GEM drivers using the PRIME helpers. It is used as the default in drm_gem_prime_handle_to_fd().

struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev)

core implementation of the import callback

Parameters

struct drm_device *dev

drm_device to import into

struct dma_buf *dma_buf

dma-buf object to import

struct device *attach_dev

struct device to dma_buf attach

Description

This is the core of drm_gem_prime_import(). It’s designed to be called by drivers who want to use a different device structure than drm_device.dev for attaching via dma_buf. This function calls drm_driver.gem_prime_import_sg_table internally.

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)

helper library implementation of the import callback

Parameters

struct drm_device *dev

drm_device to import into

struct dma_buf *dma_buf

dma-buf object to import

Description

This is the implementation of the gem_prime_import functions for GEM drivers using the PRIME helpers. Drivers can use this as their drm_driver.gem_prime_import implementation. It is used as the default implementation in drm_gem_prime_fd_to_handle().

Drivers must arrange to call drm_prime_gem_destroy() from their drm_gem_object_funcs.free hook when using this function.

int drm_prime_sg_to_page_array(struct sg_table *sgt, struct page **pages, int max_entries)

convert an sg table into a page array

Parameters

struct sg_table *sgt

scatter-gather table to convert

struct page **pages

array of page pointers to store the pages in

int max_entries

size of the passed-in array

Description

Exports an sg table into an array of pages.

This function is deprecated and its use is strongly discouraged. The page array is only useful for page faults and those can corrupt fields in the struct page if they are not handled by the exporting driver.

int drm_prime_sg_to_dma_addr_array(struct sg_table *sgt, dma_addr_t *addrs, int max_entries)

convert an sg table into a dma addr array

Parameters

struct sg_table *sgt

scatter-gather table to convert

dma_addr_t *addrs

array to store the dma bus address of each page

int max_entries

size of the passed-in array

Description

Exports an sg table into an array of addresses.

Drivers should use this in their drm_driver.gem_prime_import_sg_table implementation.
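What expanding an sg table into a per-page address array amounts to can be modelled in userspace. The names below are hypothetical, a 4 KiB page size is assumed, segment lengths are assumed to be multiples of the page size, and the return convention of the model (0 on success, -1 when the array is too small) is an assumption of this sketch rather than a statement about the kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

struct demo_dma_seg {
    uint64_t dma_addr;  /* DMA address of the segment */
    unsigned int len;   /* length in bytes, a multiple of DEMO_PAGE_SIZE */
};

/* Write one DMA address per page; fail if addrs cannot hold them all. */
static int demo_segs_to_addr_array(const struct demo_dma_seg *segs, size_t n,
                                   uint64_t *addrs, int max_entries)
{
    int out = 0;

    for (size_t i = 0; i < n; i++) {
        for (unsigned int off = 0; off < segs[i].len; off += DEMO_PAGE_SIZE) {
            if (out >= max_entries)
                return -1;
            addrs[out++] = segs[i].dma_addr + off;
        }
    }
    return 0;
}
```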

void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg)

helper to clean up a PRIME-imported GEM object

Parameters

struct drm_gem_object *obj

GEM object which was created from a dma-buf

struct sg_table *sg

the sg-table which was pinned at import time

Description

This is the cleanup function which GEM drivers need to call when they use drm_gem_prime_import() or drm_gem_prime_import_dev() to import dma-bufs.

DRM MM Range Allocator

Overview

drm_mm provides a simple range allocator. Drivers are free to use the resource allocator from the Linux core if it suits them; the upside of drm_mm is that it’s in the DRM core, which means that it’s easier to extend for some of the crazier special-purpose needs of GPUs.

The main data struct is drm_mm, allocations are tracked in drm_mm_node. Drivers are free to embed either of them into their own suitable datastructures. drm_mm itself will not do any memory allocations of its own, so if drivers choose not to embed nodes they need to still allocate them themselves.
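Embedding the node in a driver structure means the driver can get back from a node to its own object without any extra lookup. The following userspace sketch shows the pattern; struct demo_mm_node models only the documented fields of drm_mm_node, and struct demo_bo and demo_container_of are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct drm_mm_node; only the documented fields are modelled. */
struct demo_mm_node {
    unsigned long color;
    uint64_t start;
    uint64_t size;
};

/* Hypothetical driver buffer object that embeds its allocator node. */
struct demo_bo {
    int id;
    struct demo_mm_node node;
};

/* Same idea as the kernel's container_of(): subtract the member offset. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing buffer object from a pointer to its embedded node. */
static struct demo_bo *demo_node_to_bo(struct demo_mm_node *node)
{
    return demo_container_of(node, struct demo_bo, node);
}
```

Because the node lives inside the buffer object, allocating the object also allocates the node, which fits the rule that drm_mm never allocates memory itself.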

The range allocator also supports reservation of preallocated blocks. This is useful for taking over initial mode setting configurations from the firmware, where an object needs to be created which exactly matches the firmware’s scanout target. As long as the range is still free it can be inserted anytime after the allocator is initialized, which helps with avoiding looped dependencies in the driver load sequence.
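Reserving a preallocated block boils down to claiming an exact range if, and only if, it is still free. A minimal userspace model of that check, with hypothetical demo_ names and a fixed-size array in place of the allocator's real bookkeeping, might look like this:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_range {
    uint64_t start;
    uint64_t size;
};

struct demo_allocator {
    uint64_t start, size;        /* managed address range */
    struct demo_range used[16];  /* reserved blocks, in no particular order */
    size_t nr_used;
};

/* Claim exactly [start, start + size) or fail with -ENOSPC if it is taken. */
static int demo_reserve(struct demo_allocator *mm, uint64_t start, uint64_t size)
{
    uint64_t end = start + size;

    if (size == 0 || end < start)  /* empty range or overflow */
        return -ENOSPC;
    if (start < mm->start || end > mm->start + mm->size)
        return -ENOSPC;

    for (size_t i = 0; i < mm->nr_used; i++) {
        const struct demo_range *u = &mm->used[i];

        if (start < u->start + u->size && u->start < end)
            return -ENOSPC;  /* overlaps an existing block */
    }

    if (mm->nr_used == sizeof(mm->used) / sizeof(mm->used[0]))
        return -ENOMEM;  /* limitation of this model only */

    mm->used[mm->nr_used].start = start;
    mm->used[mm->nr_used].size = size;
    mm->nr_used++;
    return 0;
}
```

A driver taking over a firmware framebuffer would, in this model, call demo_reserve() with the firmware's scanout address and size right after initializing the allocator.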

drm_mm maintains a stack of most recently freed holes, which of all simplistic datastructures seems to be a fairly decent approach to clustering allocations and avoiding too much fragmentation. This means free space searches are O(num_holes). Given all the fancy features drm_mm supports, something better would be fairly complex, and since gfx thrashing is a fairly steep cliff this is not a real concern. Removing a node again is O(1).

drm_mm supports a few features: Alignment and range restrictions can be supplied. Furthermore every drm_mm_node has a color value (which is just an opaque unsigned long) which in conjunction with a driver callback can be used to implement sophisticated placement restrictions. The i915 DRM driver uses this to implement guard pages between incompatible caching domains in the graphics TT.

Two behaviors are supported for searching and allocating: bottom-up and top-down. The default is bottom-up. Top-down allocation can be used if the memory area has different restrictions, or just to reduce fragmentation.

Finally iteration helpers to walk all nodes and all holes are provided as are some basic allocator dumpers for debugging.

Note that this range allocator is not thread-safe, drivers need to protect modifications with their own locking. The idea behind this is that for a full memory manager additional data needs to be protected anyway, hence internal locking would be fully redundant.

LRU Scan/Eviction Support

Very often GPUs need contiguous allocations for a given object. When evicting objects to make space for a new one, it is therefore not very efficient to simply select objects from the tail of an LRU until there’s a suitable hole: especially for big objects or nodes that otherwise have special allocation constraints, there’s a good chance we evict lots of (smaller) objects unnecessarily.

The DRM range allocator supports this use-case through the scanning interfaces. First a scan operation needs to be initialized with drm_mm_scan_init() or drm_mm_scan_init_with_range(). The driver adds objects to the roster, probably by walking an LRU list, but this can be freely implemented. Eviction candidates are added using drm_mm_scan_add_block() until a suitable hole is found or there are no further evictable objects. Eviction roster metadata is tracked in struct drm_mm_scan.

The driver must walk through all objects again in exactly the reverse order to restore the allocator state. Note that while the allocator is used in the scan mode no other operation is allowed.

Finally the driver evicts all objects selected (drm_mm_scan_remove_block() reported true) in the scan, and any overlapping nodes after color adjustment (drm_mm_scan_color_evict()). Adding and removing an object is O(1), and since freeing a node is also O(1) the overall complexity is O(scanned_objects). So like the free stack which needs to be walked before a scan operation even begins this is linear in the number of objects. It doesn’t seem to hurt too badly.
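The add/remove protocol of an eviction scan can be illustrated with a heavily simplified userspace model. All names are hypothetical, the address space is an array of adjacent blocks sorted by address, and colors, alignment and range restrictions are ignored. The kernel implementation works differently internally (it temporarily unlinks nodes, which is why the reverse-order walk is mandatory there); the sketch only shows how a found hole decides which scanned blocks really have to be evicted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per block of the address space, sorted by address, no gaps. */
struct demo_block {
    uint64_t start, size;
    bool free;     /* already a hole */
    bool scanned;  /* tentatively treated as evicted by the scan */
};

struct demo_scan {
    struct demo_block *blocks;
    size_t nr;
    uint64_t want;                /* size of the hole being looked for */
    uint64_t hit_start, hit_end;  /* chosen hole, valid once one is found */
};

static bool demo_is_gap(const struct demo_block *b)
{
    return b->free || b->scanned;
}

/* Add blocks[idx] to the roster; true once a large enough hole exists. */
static bool demo_scan_add_block(struct demo_scan *scan, size_t idx)
{
    size_t lo = idx, hi = idx;
    uint64_t run_start, run_end;

    scan->blocks[idx].scanned = true;

    /* Grow the run of holes and scanned blocks around idx. */
    while (lo > 0 && demo_is_gap(&scan->blocks[lo - 1]))
        lo--;
    while (hi + 1 < scan->nr && demo_is_gap(&scan->blocks[hi + 1]))
        hi++;

    run_start = scan->blocks[lo].start;
    run_end = scan->blocks[hi].start + scan->blocks[hi].size;
    if (run_end - run_start < scan->want)
        return false;

    scan->hit_start = run_start;
    scan->hit_end = run_start + scan->want;
    return true;
}

/* Undo the tentative eviction; true if the block overlaps the chosen hole. */
static bool demo_scan_remove_block(struct demo_scan *scan, size_t idx)
{
    struct demo_block *b = &scan->blocks[idx];

    b->scanned = false;
    return b->start < scan->hit_end && b->start + b->size > scan->hit_start;
}
```

In this model a block that was scanned but lies outside the chosen hole reports false on removal and stays in place, which is the saving over evicting everything from the LRU tail.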

DRM MM Range Allocator Function References

enum drm_mm_insert_mode

control search and allocation behaviour

Constants

DRM_MM_INSERT_BEST

Search for the smallest hole (within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_LOW

Search for the lowest hole (address closest to 0, within the search range) that fits the desired node.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_HIGH

Search for the highest hole (address closest to U64_MAX, within the search range) that fits the desired node.

Allocates the node from the top of the found hole. The specified alignment for the node is applied to the base of the node (drm_mm_node.start).

DRM_MM_INSERT_EVICT

Search for the most recently evicted hole (within the search range) that fits the desired node. This is appropriate for use immediately after performing an eviction scan (see drm_mm_scan_init()) and removing the selected nodes to form a hole.

Allocates the node from the bottom of the found hole.

DRM_MM_INSERT_ONCE

Only check the first hole for suitability and report -ENOSPC immediately otherwise, rather than check every hole until a suitable one is found. Can only be used in conjunction with another search method such as DRM_MM_INSERT_HIGH or DRM_MM_INSERT_LOW.

DRM_MM_INSERT_HIGHEST

Only check the highest hole (the hole with the largest address) and insert the node at the top of the hole or report -ENOSPC if unsuitable.

Does not search all holes.

DRM_MM_INSERT_LOWEST

Only check the lowest hole (the hole with the smallest address) and insert the node at the bottom of the hole or report -ENOSPC if unsuitable.

Does not search all holes.

Description

The struct drm_mm range manager supports finding a suitable hole using a number of search trees. These trees are organised by size, by address and in most recent eviction order. This allows the user to find either the smallest hole to reuse, the lowest or highest address to reuse, or simply reuse the most recent eviction that fits. When allocating the drm_mm_node from within the hole, the drm_mm_insert_mode also dictates whether to allocate the lowest matching address or the highest.
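
The difference between the search modes can be illustrated with a small userspace model. This is only a sketch: it scans a plain array instead of the search trees, and every name in it (toy_hole, toy_mode, toy_place(), toy_insert()) is hypothetical and not part of the drm_mm API. It shows how BEST ranks holes by size while LOW and HIGH rank them by address, and how HIGH places the node at the top of the hole.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these names exist in the kernel. */
enum toy_mode { TOY_BEST, TOY_LOW, TOY_HIGH };

struct toy_hole {
	uint64_t start;	/* first free address */
	uint64_t end;	/* one past the last free address */
};

/* Round x up or down to a multiple of align (align must be non-zero). */
static uint64_t toy_align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

static uint64_t toy_align_down(uint64_t x, uint64_t align)
{
	return x / align * align;
}

/* Compute where a node would land inside one hole; false if it does not fit. */
static bool toy_place(const struct toy_hole *h, uint64_t size, uint64_t align,
		      enum toy_mode mode, uint64_t *out)
{
	uint64_t addr;

	assert(align != 0);
	if (size == 0 || h->end - h->start < size)
		return false;

	if (mode == TOY_HIGH)
		addr = toy_align_down(h->end - size, align);	/* top of hole */
	else
		addr = toy_align_up(h->start, align);		/* bottom of hole */

	if (addr < h->start || addr > h->end - size)
		return false;
	*out = addr;
	return true;
}

/* Pick a hole according to the mode and return the node's start address. */
static bool toy_insert(const struct toy_hole *holes, size_t n, uint64_t size,
		       uint64_t align, enum toy_mode mode, uint64_t *out)
{
	bool found = false;
	uint64_t best_key = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t addr, key;

		if (!toy_place(&holes[i], size, align, mode, &addr))
			continue;
		/* BEST ranks by hole size, LOW and HIGH rank by address. */
		key = (mode == TOY_BEST) ? holes[i].end - holes[i].start
					 : holes[i].start;
		if (!found || (mode == TOY_HIGH ? key > best_key : key < best_key)) {
			found = true;
			best_key = key;
			*out = addr;
		}
	}
	return found;
}
```

With holes of 16 KiB, 4 KiB and 32 KiB at increasing addresses, a 4 KiB request lands in the middle hole under TOY_BEST, in the first hole under TOY_LOW, and at the very end of the last hole under TOY_HIGH.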

struct drm_mm_node

allocated block in the DRM allocator

Definition

struct drm_mm_node {
  unsigned long color;
  u64 start;
  u64 size;
};

Members

color

Opaque driver-private tag.

start

Start address of the allocated block.

size

Size of the allocated block.

Description

This represents an allocated block in a drm_mm allocator. Except for pre-reserved nodes inserted using drm_mm_reserve_node(), the structure is entirely opaque and should only be accessed through the provided functions. Since allocation of these nodes is entirely handled by the driver, they can be embedded.

struct drm_mm

DRM allocator

Definition

struct drm_mm {
  void (*color_adjust)(const struct drm_mm_node *node,unsigned long color, u64 *start, u64 *end);
};

Members

color_adjust

Optional driver callback to further apply restrictions on a hole. The node argument points at the node containing the hole from which the block would be allocated (see drm_mm_hole_follows() and friends). The other arguments are the size of the block to be allocated. The driver can adjust the start and end as needed to e.g. insert guard pages.

Description

DRM range allocator with a few special functions and features geared towards managing GPU memory. Except for the color_adjust callback the structure is entirely opaque and should only be accessed through the provided functions and macros. This structure can be embedded into larger driver structures.
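
As an illustration of the guard-page use case mentioned for color_adjust, the following userspace sketch shrinks a hole whenever a neighbouring block carries a different color. It is a simplified model, not the kernel callback: the real callback receives the node preceding the hole, whereas this sketch is handed the neighbouring colors directly, and TOY_GUARD_SIZE and toy_color_adjust() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_GUARD_SIZE 4096ULL	/* hypothetical guard page size */

/*
 * Shrink the hole [*start, *end) so that a block tagged new_color stays
 * one guard page away from any neighbour carrying a different color.
 * A hole too small to hold the guard collapses to an empty range.
 */
static void toy_color_adjust(unsigned long prev_color, unsigned long next_color,
			     unsigned long new_color,
			     uint64_t *start, uint64_t *end)
{
	assert(*end >= *start);

	if (prev_color != new_color)
		*start = (*end - *start >= TOY_GUARD_SIZE) ?
			 *start + TOY_GUARD_SIZE : *end;
	if (next_color != new_color)
		*end = (*end - *start >= TOY_GUARD_SIZE) ?
		       *end - TOY_GUARD_SIZE : *start;
}
```

The allocator would then try to fit the new block into the adjusted range only, so differently colored neighbours never end up directly adjacent.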

struct drm_mm_scan

DRM allocator eviction roster data

Definition

struct drm_mm_scan {
};

Members

Description

This structure tracks data needed for the eviction roster set up using drm_mm_scan_init(), and used with drm_mm_scan_add_block() and drm_mm_scan_remove_block(). The structure is entirely opaque and should only be accessed through the provided functions and macros. It is meant to be allocated temporarily by the driver on the stack.

bool drm_mm_node_allocated(const struct drm_mm_node *node)

checks whether a node is allocated

Parameters

const struct drm_mm_node *node

drm_mm_node to check

Description

Drivers are required to clear a node prior to using it with the drm_mm range manager.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the node is allocated.

bool drm_mm_initialized(const struct drm_mm *mm)

checks whether an allocator is initialized

Parameters

const struct drm_mm *mm

drm_mm to check

Description

Drivers should clear the struct drm_mm prior to initialisation if they want to use this function.

Drivers should use this helper for proper encapsulation of drm_mm internals.

Return

True if the mm is initialized.

bool drm_mm_hole_follows(const struct drm_mm_node *node)

checks whether a hole follows this node

Parameters

const struct drm_mm_node *node

drm_mm_node to check

Description

Holes are embedded into the drm_mm using the tail of a drm_mm_node. If you wish to know whether a hole follows this particular node, query this function. See also drm_mm_hole_node_start() and drm_mm_hole_node_end().

Return

True if a hole follows the node.

u64 drm_mm_hole_node_start(const struct drm_mm_node *hole_node)

computes the start of the hole following node

Parameters

const struct drm_mm_node *hole_node

drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

Start of the subsequent hole.

u64 drm_mm_hole_node_end(const struct drm_mm_node *hole_node)

computes the end of the hole following node

Parameters

const struct drm_mm_node *hole_node

drm_mm_node which implicitly tracks the following hole

Description

This is useful for driver-specific debug dumpers. Otherwise drivers should not inspect holes themselves. Drivers must check first whether a hole indeed follows by looking at drm_mm_hole_follows().

Return

End of the subsequent hole.

drm_mm_nodes

drm_mm_nodes (mm)

list of nodes under the drm_mm range manager

Parameters

mm

the struct drm_mm range manager

Description

As the drm_mm range manager hides its node_list deep within its structure, extracting it looks painful and repetitive. This is not expected to be used outside of the drm_mm_for_each_node() macros and similar internal functions.

Return

The node list, may be empty.

drm_mm_for_each_node

drm_mm_for_each_node (entry, mm)

iterator to walk over all allocated nodes

Parameters

entry

struct drm_mm_node to assign to in each iteration step

mm

drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements.

drm_mm_for_each_node_safe

drm_mm_for_each_node_safe (entry, next, mm)

iterator to walk over all allocated nodes

Parameters

entry

struct drm_mm_node to assign to in each iteration step

next

struct drm_mm_node to store the next step

mm

drm_mm allocator to walk

Description

This iterator walks over all nodes in the range allocator. It is implemented with list_for_each_safe(), so it is safe against removal of elements.

drm_mm_for_each_hole

drm_mm_for_each_hole (pos, mm, hole_start, hole_end)

iterator to walk over all holes

Parameters

pos

drm_mm_node used internally to track progress

mm

drm_mm allocator to walk

hole_start

ulong variable to assign the hole start to on each iteration

hole_end

ulong variable to assign the hole end to on each iteration

Description

This iterator walks over all holes in the range allocator. It is implemented with list_for_each(), so it is not safe against removal of elements. pos is used internally and will not reflect a real drm_mm_node for the very first hole. Hence users of this iterator may not access it.

Implementation Note: We need to inline list_for_each_entry in order to be able to set hole_start and hole_end on each iteration while keeping the macro sane.

int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)

search for space and insert node

Parameters

struct drm_mm *mm

drm_mm to allocate from

struct drm_mm_node *node

preallocated node to insert

u64 size

size of the allocation

u64 alignment

alignment of the allocation

unsigned long color

opaque tag value to use for this node

enum drm_mm_insert_mode mode

fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_insert_node_in_range() with no range restrictions applied.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node, u64 size)

search for space and insert node

Parameters

struct drm_mm *mm

drm_mm to allocate from

struct drm_mm_node *node

preallocated node to insert

u64 size

size of the allocation

Description

This is a simplified version of drm_mm_insert_node_generic() with color set to 0.

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

bool drm_mm_clean(const struct drm_mm *mm)

checks whether an allocator is clean

Parameters

const struct drm_mm *mm

drm_mm allocator to check

Return

True if the allocator is completely free, false if there’s still a node allocated in it.

drm_mm_for_each_node_in_range

drm_mm_for_each_node_in_range (node__, mm__, start__, end__)

iterator to walk over a range of allocated nodes

Parameters

node__

drm_mm_node structure to assign to in each iteration step

mm__

drm_mm allocator to walk

start__

starting offset, the first node will overlap this

end__

ending offset, the last node will start before this (but may overlap)

Description

This iterator walks over all nodes in the range allocator that lie between start and end. It is implemented similarly to list_for_each(), but using the internal interval tree to accelerate the search for the starting node, and so not safe against removal of elements. It assumes that end is within (or is the upper limit of) the drm_mm allocator. If [start, end] are beyond the range of the drm_mm, the iterator may walk over the special _unallocated_ drm_mm.head_node, and may even continue indefinitely.

void drm_mm_scan_init(struct drm_mm_scan *scan, struct drm_mm *mm, u64 size, u64 alignment, unsigned long color, enum drm_mm_insert_mode mode)

initialize lru scanning

Parameters

struct drm_mm_scan *scan

scan state

struct drm_mm *mm

drm_mm to scan

u64 size

size of the allocation

u64 alignment

alignment of the allocation

unsigned long color

opaque tag value to use for the allocation

enum drm_mm_insert_mode mode

fine-tune the allocation search and placement

Description

This is a simplified version of drm_mm_scan_init_with_range() with no range restrictions applied.

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)

insert a pre-initialized node

Parameters

struct drm_mm *mm

drm_mm allocator to insert node into

struct drm_mm_node *node

drm_mm_node to insert

Description

This function inserts an already set-up drm_mm_node into the allocator, meaning that start, size and color must be set by the caller. All other fields must be cleared to 0. This is useful to initialize the allocator with preallocated objects which must be set up before the range allocator can be set up, e.g. when taking over a firmware framebuffer.

Return

0 on success, -ENOSPC if there’s no hole where node is.

int drm_mm_insert_node_in_range(struct drm_mm *const mm, struct drm_mm_node *const node, u64 size, u64 alignment, unsigned long color, u64 range_start, u64 range_end, enum drm_mm_insert_mode mode)

ranged search for space and insert node

Parameters

struct drm_mm * const mm

drm_mm to allocate from

struct drm_mm_node * const node

preallocated node to insert

u64 size

size of the allocation

u64 alignment

alignment of the allocation

unsigned long color

opaque tag value to use for this node

u64 range_start

start of the allowed range for this node

u64 range_end

end of the allowed range for this node

enum drm_mm_insert_mode mode

fine-tune the allocation search and placement

Description

The preallocated node must be cleared to 0.

Return

0 on success, -ENOSPC if there’s no suitable hole.

void drm_mm_remove_node(struct drm_mm_node *node)

Remove a memory node from the allocator.

Parameters

struct drm_mm_node *node

drm_mm_node to remove

Description

This just removes a node from its drm_mm allocator. The node does not need to be cleared again before it can be re-inserted into this or any other drm_mm allocator. It is a bug to call this function on an unallocated node.

void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new)

move an allocation from old to new

Parameters

struct drm_mm_node *old

drm_mm_node to remove from the allocator

struct drm_mm_node *new

drm_mm_node which should inherit old’s allocation

Description

This is useful for when drivers embed the drm_mm_node structure and hence can’t move allocations by reassigning pointers. It’s a combination of remove and insert with the guarantee that the allocation start will match.

void drm_mm_scan_init_with_range(struct drm_mm_scan *scan, struct drm_mm *mm, u64 size, u64 alignment, unsigned long color, u64 start, u64 end, enum drm_mm_insert_mode mode)

initialize range-restricted lru scanning

Parameters

struct drm_mm_scan *scan

scan state

struct drm_mm *mm

drm_mm to scan

u64 size

size of the allocation

u64 alignment

alignment of the allocation

unsigned long color

opaque tag value to use for the allocation

u64 start

start of the allowed range for the allocation

u64 end

end of the allowed range for the allocation

enum drm_mm_insert_mode mode

fine-tune the allocation search and placement

Description

This simply sets up the scanning routines with the parameters for the desired hole.

Warning: As long as the scan list is non-empty, no other operations than adding/removing nodes to/from the scan list are allowed.

bool drm_mm_scan_add_block(struct drm_mm_scan *scan, struct drm_mm_node *node)

add a node to the scan list

Parameters

struct drm_mm_scan *scan

the active drm_mm scanner

struct drm_mm_node *node

drm_mm_node to add

Description

Add a node to the scan list that might be freed to make space for the desired hole.

Return

True if a hole has been found, false otherwise.

bool drm_mm_scan_remove_block(struct drm_mm_scan *scan, struct drm_mm_node *node)

remove a node from the scan list

Parameters

struct drm_mm_scan *scan

the active drm_mm scanner

struct drm_mm_node *node

drm_mm_node to remove

Description

Nodes must be removed in exactly the reverse order from the scan list as they have been added (e.g. using list_add() as they are added and then list_for_each() over that eviction list to remove), otherwise the internal state of the memory manager will be corrupted.

When the scan list is empty, the selected memory nodes can be freed. An immediately following drm_mm_insert_node_in_range() or one of the simpler versions of that function with DRM_MM_INSERT_EVICT will then return the just freed block (because it is the most recently evicted hole).

Return

True if this block should be evicted, false otherwise. Will always return false when no hole has been found.
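
The reverse-order requirement falls out naturally when the driver prepends each scanned object to its own list, as the list_add() hint above suggests. The following userspace sketch models only that bookkeeping, with a hand-rolled singly linked list; toy_obj, toy_scan_add() and toy_scan_remove_all() are hypothetical names and no drm_mm function is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver object; "next" links the driver's eviction list. */
struct toy_obj {
	int id;
	struct toy_obj *next;
};

/* Prepend, as list_add() does, so the newest entry sits at the head. */
static void toy_scan_add(struct toy_obj **head, struct toy_obj *obj)
{
	obj->next = *head;
	*head = obj;
}

/* Walk from the head and record the visiting order; returns the count. */
static size_t toy_scan_remove_all(struct toy_obj **head, int *order, size_t cap)
{
	size_t n = 0;

	while (*head && n < cap) {
		struct toy_obj *obj = *head;

		*head = obj->next;
		obj->next = NULL;
		order[n++] = obj->id;
	}
	return n;
}
```

Objects added as 1, 2, 3 are visited as 3, 2, 1, which is the order in which a driver would have to hand them back to drm_mm_scan_remove_block().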

struct drm_mm_node *drm_mm_scan_color_evict(struct drm_mm_scan *scan)

evict overlapping nodes on either side of hole

Parameters

struct drm_mm_scan *scan

drm_mm scan with target hole

Description

After completing an eviction scan and removing the selected nodes, we may need to remove a few more nodes from either side of the target hole if mm.color_adjust is being used.

Return

A node to evict, or NULL if there are no overlapping nodes.

void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)

initialize a drm-mm allocator

Parameters

struct drm_mm *mm

the drm_mm structure to initialize

u64 start

start of the range managed by mm

u64 size

size of the range managed by mm

Description

Note that mm must be cleared to 0 before calling this function.

void drm_mm_takedown(struct drm_mm *mm)

clean up a drm_mm allocator

Parameters

struct drm_mm *mm

drm_mm allocator to clean up

Description

Note that it is a bug to call this function on an allocator which is not clean.

void drm_mm_print(const struct drm_mm *mm, struct drm_printer *p)

print allocator state

Parameters

const struct drm_mm *mm

drm_mm allocator to print

struct drm_printer *p

DRM printer to use

DRM Buddy Allocator

DRM Buddy Function References

int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)

init memory manager

Parameters

struct drm_buddy *mm

DRM buddy manager to initialize

u64 size

size in bytes to manage

u64 chunk_size

minimum page size in bytes for our allocations

Description

Initializes the memory manager and its resources.

Return

0 on success, error code on failure.

void drm_buddy_fini(struct drm_buddy *mm)

tear down the memory manager

Parameters

struct drm_buddy *mm

DRM buddy manager to free

Description

Cleans up the memory manager resources and the freelist.

struct drm_buddy_block *drm_get_buddy(struct drm_buddy_block *block)

get buddy address

Parameters

struct drm_buddy_block *block

DRM buddy block

Description

Returns the corresponding buddy block for block, or NULL if this is a root block and can’t be merged further. Requires some kind of locking to protect against any concurrent allocate and free operations.
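
For readers unfamiliar with buddy allocation, the relationship between a block and its buddy can be sketched with plain offset arithmetic. This is the classic textbook formulation and an assumption on our part; the text above does not say how drm_get_buddy() locates the buddy internally. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
static bool toy_is_pow2(uint64_t x)
{
	return x && !(x & (x - 1));
}

/*
 * Offset of the buddy of the block at "offset" with size "block_size".
 * The two halves of a split differ only in the bit equal to block_size.
 */
static uint64_t toy_buddy_offset(uint64_t offset, uint64_t block_size)
{
	assert(toy_is_pow2(block_size));
	assert((offset & (block_size - 1)) == 0);	/* naturally aligned */
	return offset ^ block_size;
}

/* Offset of the parent block that the two buddies merge into. */
static uint64_t toy_parent_offset(uint64_t offset, uint64_t block_size)
{
	assert(toy_is_pow2(block_size));
	return offset & ~block_size;
}
```

For example, the 8 KiB block at offset 0x6000 has its buddy at 0x4000, and both merge into the 16 KiB parent at 0x4000.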

void drm_buddy_free_block(struct drm_buddy *mm, struct drm_buddy_block *block)

free a block

Parameters

struct drm_buddy *mm

DRM buddy manager

struct drm_buddy_block *block

block to be freed

void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects)

free blocks

Parameters

struct drm_buddy *mm

DRM buddy manager

struct list_head *objects

input list head to free blocks

int drm_buddy_block_trim(struct drm_buddy *mm, u64 new_size, struct list_head *blocks)

free unused pages

Parameters

struct drm_buddy *mm

DRM buddy manager

u64 new_size

original size requested

struct list_head *blocks

Input and output list of allocated blocks. MUST contain a single block as input to be trimmed. On success will contain the newly allocated blocks making up the new_size. Blocks always appear in ascending order.

Description

For contiguous allocations, the size is rounded up to the nearest power of two. Drivers consume only the actual size, so the remaining portion is unused and can optionally be freed with this function.

Return

0 on success, error code on failure.
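
The arithmetic behind trimming can be sketched in userspace as follows. This is a model under stated assumptions, not the kernel implementation: it assumes chunk_size is a power of two, that new_size is a multiple of it, and that the kept range starts at the base of the original block. All helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= x (x must be in 1..2^63). */
static uint64_t toy_roundup_pow2(uint64_t x)
{
	uint64_t p = 1;

	assert(x != 0 && x <= (UINT64_C(1) << 63));
	while (p < x)
		p <<= 1;
	return p;
}

/* Bytes of a rounded-up contiguous allocation that trimming can release. */
static uint64_t toy_trimmable(uint64_t new_size)
{
	return toy_roundup_pow2(new_size) - new_size;
}

/*
 * Sizes of the power-of-two blocks that remain after trimming, listed in
 * ascending address order. Returns the number of entries written.
 */
static size_t toy_kept_blocks(uint64_t new_size, uint64_t chunk_size,
			      uint64_t *sizes, size_t cap)
{
	size_t n = 0;

	assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
	assert(new_size % chunk_size == 0);

	/* Each set bit of new_size corresponds to one kept block. */
	for (uint64_t bit = UINT64_C(1) << 63; bit >= chunk_size; bit >>= 1)
		if ((new_size & bit) && n < cap)
			sizes[n++] = bit;
	return n;
}
```

A 20 KiB request is rounded up to 32 KiB; trimming can release 12 KiB and leaves a 16 KiB block followed by a 4 KiB block.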

int drm_buddy_alloc_blocks(struct drm_buddy *mm, u64 start, u64 end, u64 size, u64 min_page_size, struct list_head *blocks, unsigned long flags)

allocate power-of-two blocks

Parameters

struct drm_buddy *mm

DRM buddy manager to allocate from

u64 start

start of the allowed range for this block

u64 end

end of the allowed range for this block

u64 size

size of the allocation

u64 min_page_size

alignment of the allocation

struct list_head *blocks

output list head to add allocated blocks

unsigned long flags

DRM_BUDDY_*_ALLOCATION flags

Description

When range limitations apply, alloc_range_bias() is called; it traverses the tree and returns the desired block.

When no range restrictions are enforced, alloc_from_freelist() is called; it picks the block from the freelist.

Return

0 on success, error code on failure.

void drm_buddy_block_print(struct drm_buddy *mm, struct drm_buddy_block *block, struct drm_printer *p)

print block information

Parameters

struct drm_buddy *mm

DRM buddy manager

struct drm_buddy_block *block

DRM buddy block

struct drm_printer *p

DRM printer to use

void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p)

print allocator state

Parameters

struct drm_buddy *mm

DRM buddy manager

struct drm_printer *p

DRM printer to use

DRM Cache Handling and Fast WC memcpy()

void drm_clflush_pages(struct page *pages[], unsigned long num_pages)

Flush dcache lines of a set of pages.

Parameters

struct page *pages[]

List of pages to be flushed.

unsigned long num_pages

Number of pages in the array.

Description

Flush every data cache line entry that points to an address belonging to a page in the array.

void drm_clflush_sg(struct sg_table *st)

Flush dcache lines pointing to a scatter-gather list.

Parameters

struct sg_table *st

struct sg_table.

Description

Flush every data cache line entry that points to an address in the sg.

void drm_clflush_virt_range(void *addr, unsigned long length)

Flush dcache lines of a region

Parameters

void *addr

Initial kernel memory address.

unsigned long length

Region size.

Description

Flush every data cache line entry that points to an address in the region requested.

void drm_memcpy_from_wc(struct iosys_map *dst, const struct iosys_map *src, unsigned long len)

Perform the fastest available memcpy from a source that may be WC.

Parameters

struct iosys_map *dst

The destination pointer

const struct iosys_map *src

The source pointer

unsigned long len

The size of the area to transfer in bytes

Description

Tries an arch optimized memcpy for prefetching reading out of a WC region, and if no such beast is available, falls back to a normal memcpy.

DRM Sync Objects

DRM synchronisation objects (syncobj, see struct drm_syncobj) provide a container for a synchronization primitive which can be used by userspace to explicitly synchronize GPU commands, can be shared between userspace processes, and can be shared between different DRM drivers. Their primary use-case is to implement Vulkan fences and semaphores. The syncobj userspace API provides ioctls for several operations:

  • Creation and destruction of syncobjs

  • Import and export of syncobjs to/from a syncobj file descriptor

  • Import and export a syncobj’s underlying fence to/from a sync file

  • Reset a syncobj (set its fence to NULL)

  • Signal a syncobj (set a trivially signaled fence)

  • Wait for a syncobj’s fence to appear and be signaled

The syncobj userspace API also provides operations to manipulate a syncobj in terms of a timeline of struct dma_fence_chain rather than a single struct dma_fence, through the following operations:

  • Signal a given point on the timeline

  • Wait for a given point to appear and/or be signaled

  • Import and export from/to a given point of a timeline

At its core, a syncobj is simply a wrapper around a pointer to a struct dma_fence which may be NULL. When a syncobj is first created, its pointer is either NULL or a pointer to an already signaled fence depending on whether the DRM_SYNCOBJ_CREATE_SIGNALED flag is passed to DRM_IOCTL_SYNCOBJ_CREATE.

If the syncobj is considered as a binary (its state is either signaled or unsignaled) primitive, when GPU work is enqueued in a DRM driver to signal the syncobj, the syncobj’s fence is replaced with a fence which will be signaled by the completion of that work. If the syncobj is considered as a timeline primitive, when GPU work is enqueued in a DRM driver to signal a given point of the syncobj, a new struct dma_fence_chain is created, pointing to the DRM driver’s fence and also pointing to the previous fence that was in the syncobj. The new struct dma_fence_chain fence replaces the syncobj’s fence and will be signaled by completion of the DRM driver’s work and also any work associated with the fence previously in the syncobj.

When GPU work which waits on a syncobj is enqueued in a DRM driver, at the time the work is enqueued, it waits on the syncobj’s fence before submitting the work to hardware. That fence is either:

  • The syncobj’s current fence if the syncobj is considered as a binary primitive.

  • The struct dma_fence associated with a given point if the syncobj is considered as a timeline primitive.

If the syncobj’s fence is NULL or not present in the syncobj’s timeline, the enqueue operation is expected to fail.

With binary syncobj, all manipulation of the syncobj’s fence happens in terms of the current fence at the time the ioctl is called by userspace, regardless of whether that operation is an immediate host-side operation (signal or reset) or an operation which is enqueued in some driver queue. DRM_IOCTL_SYNCOBJ_RESET and DRM_IOCTL_SYNCOBJ_SIGNAL can be used to manipulate a syncobj from the host by resetting its pointer to NULL or setting its pointer to a fence which is already signaled.

With a timeline syncobj, all manipulation of the syncobj’s fence happens in terms of a u64 value referring to a point in the timeline. See dma_fence_chain_find_seqno() to see how a given point is found in the timeline.

Note that applications should be careful to always use the timeline set of ioctl() when dealing with a syncobj considered as a timeline. Using the binary set of ioctl() with a syncobj considered as a timeline could result in incorrect synchronization. The use of binary syncobj is supported through the timeline set of ioctl() by using a point value of 0; this will reproduce the behavior of the binary set of ioctl() (for example, replace the syncobj’s fence when signaling).
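
How a u64 point maps onto a timeline can be modelled in userspace with a sorted array. This is a simplified sketch and not the kernel's dma_fence_chain_find_seqno(): it assumes that a request is served by the first entry whose point is greater than or equal to the requested value, that a point beyond the newest entry is an error, and that point 0 selects the newest entry. toy_find_point() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * points[] holds the timeline points in increasing order, oldest first.
 * Returns the index of the entry that covers "point", or -EINVAL if the
 * timeline has not reached that point yet. Point 0 selects the newest
 * entry, mirroring the binary behaviour described above.
 */
static int toy_find_point(const uint64_t *points, size_t n, uint64_t point)
{
	if (n == 0)
		return -EINVAL;
	if (point == 0)
		return (int)(n - 1);
	if (points[n - 1] < point)
		return -EINVAL;

	for (size_t i = 0; i < n; i++)
		if (points[i] >= point)
			return (int)i;

	return -EINVAL;	/* not reached: the newest point is >= point */
}
```

On a timeline holding points 2, 5 and 9, a request for point 4 resolves to the entry for point 5, while a request for point 10 fails because nothing has been submitted for it yet.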

Host-side wait on syncobjs

DRM_IOCTL_SYNCOBJ_WAIT takes an array of syncobj handles and does a host-side wait on all of the syncobj fences simultaneously. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL is set, the wait ioctl will wait on all of the syncobj fences to be signaled before it returns. Otherwise, it returns once at least one syncobj fence has been signaled and the index of a signaled fence is written back to the client.

Unlike the enqueued GPU work dependencies which fail if they see a NULL fence in a syncobj, if DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is set, the host-side wait will first wait for the syncobj to receive a non-NULL fence and then wait on that fence. If DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT is not set and any one of the syncobjs in the array has a NULL fence, -EINVAL will be returned. Assuming the syncobj starts off with a NULL fence, this allows a client to do a host wait in one thread (or process) which waits on GPU work submitted in another thread (or process) without having to manually synchronize between the two. This requirement is inherited from the Vulkan fence API.
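
The interaction of the two wait flags can be summarised with a small decision function. This is a userspace model of a single evaluation of the wait condition, not the ioctl itself: the enum, the boolean parameters standing in for the flags, and the -EAGAIN "keep waiting" result are all choices made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_fence { TOY_FENCE_NULL, TOY_FENCE_PENDING, TOY_FENCE_SIGNALED };

/*
 * One evaluation of the wait condition over n syncobjs.
 * Returns 0 when the wait is satisfied, -EINVAL for a NULL fence without
 * wait_for_submit (or an empty array, a choice of this sketch), and
 * -EAGAIN when the caller would have to keep waiting.
 * On success *first receives the index of a signaled fence.
 */
static int toy_wait_check(const enum toy_fence *f, size_t n, bool wait_all,
			  bool wait_for_submit, size_t *first)
{
	size_t signaled = 0;

	if (n == 0)
		return -EINVAL;

	for (size_t i = 0; i < n; i++)
		if (f[i] == TOY_FENCE_NULL && !wait_for_submit)
			return -EINVAL;

	for (size_t i = 0; i < n; i++) {
		if (f[i] != TOY_FENCE_SIGNALED)
			continue;
		if (signaled++ == 0)
			*first = i;
	}

	if (wait_all ? signaled == n : signaled > 0)
		return 0;
	return -EAGAIN;
}
```

With wait_for_submit set, a NULL fence is treated like a pending one, which is what lets one thread wait on work that another thread has not submitted yet.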

Similarly, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT takes an array of syncobj handles as well as an array of u64 points and does a host-side wait on all of the syncobj fences at the given points simultaneously.

DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT also adds the ability to wait for a given fence to materialize on the timeline without waiting for the fence to be signaled by using the DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE flag. This requirement is inherited from the wait-before-signal behavior required by the Vulkan timeline semaphore API.

Import/export of syncobjs

DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE and DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD provide two mechanisms for import/export of syncobjs.

The first lets the client import or export an entire syncobj to a file descriptor. These fd’s are opaque and have no other use case, except passing the syncobj between processes. All exported file descriptors and any syncobj handles created as a result of importing those file descriptors own a reference to the same underlying struct drm_syncobj and the syncobj can be used persistently across all the processes with which it is shared. The syncobj is freed only once the last reference is dropped. Unlike dma-buf, importing a syncobj creates a new handle (with its own reference) for every import instead of de-duplicating. The primary use-case of this persistent import/export is for shared Vulkan fences and semaphores.

The second import/export mechanism, which is indicated by DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE or DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE, lets the client import/export the syncobj’s current fence from/to a sync_file. When a syncobj is exported to a sync file, that sync file wraps the syncobj’s fence at the time of export and any later signal or reset operations on the syncobj will not affect the exported sync file. When a sync file is imported into a syncobj, the syncobj’s fence is set to the fence wrapped by that sync file. Because sync files are immutable, resetting or signaling the syncobj will not affect any sync files whose fences have been imported into the syncobj.

Import/export of timeline points in timeline syncobjs

DRM_IOCTL_SYNCOBJ_TRANSFER provides a mechanism to transfer a struct dma_fence_chain of a syncobj at a given u64 point to another u64 point into another syncobj.

Note that if you want to transfer a struct dma_fence_chain from a given point on a timeline syncobj from/into a binary syncobj, you can use the point 0 to mean take/replace the fence in the syncobj.

struct drm_syncobj

sync object.

Definition

struct drm_syncobj {
  struct kref refcount;
  struct dma_fence __rcu *fence;
  struct list_head cb_list;
  spinlock_t lock;
  struct file *file;
};

Members

refcount

Reference count of this object.

fence

NULL or a pointer to the fence bound to this object.

This field should not be used directly. Use drm_syncobj_fence_get() and drm_syncobj_replace_fence() instead.

cb_list

List of callbacks to call when the fence gets replaced.

lock

Protects cb_list and write-locks fence.

file

A file backing for this syncobj.

Description

This structure defines a generic sync object which wraps a dma_fence.

void drm_syncobj_get(struct drm_syncobj *obj)

acquire a syncobj reference

Parameters

struct drm_syncobj *obj

sync object

Description

This acquires an additional reference to obj. It is illegal to call this without already holding a reference. No locks required.

void drm_syncobj_put(struct drm_syncobj *obj)

release a reference to a sync object.

Parameters

struct drm_syncobj *obj

sync object.

struct dma_fence *drm_syncobj_fence_get(struct drm_syncobj *syncobj)

get a reference to a fence in a sync object

Parameters

struct drm_syncobj *syncobj

sync object.

Description

This acquires an additional reference to drm_syncobj.fence contained in syncobj, if not NULL. It is illegal to call this without already holding a reference. No locks required.

Return

Either the fence of syncobj or NULL if there’s none.

struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private, u32 handle)

lookup and reference a sync object.

Parameters

struct drm_file *file_private

drm file private pointer

u32 handle

sync object handle to lookup.

Description

Returns a reference to the syncobj pointed to by handle or NULL. The reference must be released by calling drm_syncobj_put().

void drm_syncobj_add_point(struct drm_syncobj *syncobj, struct dma_fence_chain *chain, struct dma_fence *fence, uint64_t point)

add new timeline point to the syncobj

Parameters

struct drm_syncobj *syncobj

sync object to add timeline point to

struct dma_fence_chain *chain

chain node to use to add the point

struct dma_fence *fence

fence to encapsulate in the chain node

uint64_t point

sequence number to use for the point

Description

Add the chain node as new timeline point to the syncobj.

void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, struct dma_fence *fence)

replace fence in a sync object.

Parameters

struct drm_syncobj *syncobj

Sync object to replace fence in

struct dma_fence *fence

fence to install in the sync object.

Description

This replaces the fence on a sync object.

int drm_syncobj_find_fence(struct drm_file *file_private, u32 handle, u64 point, u64 flags, struct dma_fence **fence)

lookup and reference the fence in a sync object

Parameters

struct drm_file *file_private

drm file private pointer

u32 handle

sync object handle to lookup.

u64 point

timeline point

u64 flags

DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT or not

struct dma_fence **fence

out parameter for the fence

Description

This is just a convenience function that combines drm_syncobj_find() and drm_syncobj_fence_get().

Returns 0 on success or a negative error value on failure. On success fence contains a reference to the fence, which must be released by calling dma_fence_put().

void drm_syncobj_free(struct kref *kref)

free a sync object.

Parameters

struct kref *kref

kref to free.

Description

Only to be called from kref_put in drm_syncobj_put.

int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags, struct dma_fence *fence)

create a new syncobj

Parameters

struct drm_syncobj **out_syncobj

returned syncobj

uint32_t flags

DRM_SYNCOBJ_* flags

struct dma_fence *fence

if non-NULL, the syncobj will represent this fence

Description

This is the first function to create a sync object. After creating, drivers probably want to make it available to userspace, either through drm_syncobj_get_handle() or drm_syncobj_get_fd().

Returns 0 on success or a negative error value on failure.

int drm_syncobj_get_handle(struct drm_file *file_private, struct drm_syncobj *syncobj, u32 *handle)

get a handle from a syncobj

Parameters

struct drm_file *file_private

drm file private pointer

struct drm_syncobj *syncobj

Sync object to export

u32 *handle

out parameter with the new handle

Description

Exports a sync object created with drm_syncobj_create() as a handle on file_private to userspace.

Returns 0 on success or a negative error value on failure.

int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd)

get a file descriptor from a syncobj

Parameters

struct drm_syncobj *syncobj

Sync object to export

int *p_fd

out parameter with the new file descriptor

Description

Exports a sync object created with drm_syncobj_create() as a file descriptor.

Returns 0 on success or a negative error value on failure.

signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)

calculate jiffies timeout from absolute value

Parameters

int64_t timeout_nsec

timeout nsec component in ns, 0 for poll

Description

Calculate the relative timeout in jiffies from an absolute time given in nanoseconds.
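
As a rough illustration of this kind of conversion, the following userspace sketch turns an absolute deadline into a relative tick count. It is not the kernel implementation: MODEL_HZ, the explicit now_nsec argument and the choice to add one extra tick are assumptions of the model.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_HZ 250                      /* assumed tick rate, ticks per second */
#define MODEL_NSEC_PER_SEC 1000000000LL

/*
 * Converts an absolute deadline (ns) into a relative tick count, given the
 * current time on the same clock. Assumes now_nsec >= 0.
 * Returns 0 for "poll" (timeout_nsec == 0) or an already expired deadline.
 */
static int64_t model_abs_to_ticks(int64_t timeout_nsec, int64_t now_nsec)
{
    const int64_t ns_per_tick = MODEL_NSEC_PER_SEC / MODEL_HZ;
    int64_t delta_ns;

    if (timeout_nsec == 0)
        return 0;                         /* 0 means poll: do not wait */
    if (timeout_nsec <= now_nsec)
        return 0;                         /* deadline already passed */

    delta_ns = timeout_nsec - now_nsec;
    /* Round down to whole ticks, then add one so the wait never ends early. */
    return delta_ns / ns_per_tick + 1;
}
```

With the assumed 250 ticks per second, a deadline one second in the future maps to 251 ticks in this model.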

GPU Scheduler

Overview

The GPU scheduler provides entities which allow userspace to push jobs into software queues which are then scheduled on a hardware run queue. The software queues have a priority among them. The scheduler selects the entities from the run queue using a FIFO. The scheduler provides dependency handling features among jobs. The driver is supposed to provide callback functions to the scheduler for backend operations, such as submitting a job to the hardware run queue or returning the dependencies of a job.

The organisation of the scheduler is the following:

  1. Each hw run queue has one scheduler

  2. Each scheduler has multiple run queues with different priorities (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)

  3. Each scheduler run queue has a queue of entities to schedule

  4. Entities themselves maintain a queue of jobs that will be scheduled on the hardware.

The jobs in an entity are always scheduled in the order that they were pushed.
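
The organisation above can be sketched as a small userspace model. All model_* names and sizes are hypothetical, and rotating through the entities of a run queue is an assumption of the model rather than a statement about the real selection policy. Job ids must be non-negative because -1 is used as the "nothing to run" value.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_JOBS     8
#define MODEL_MAX_ENTITIES 4
#define MODEL_NUM_PRIO     2   /* index 0 is the highest priority in this model */

/* An entity keeps its own FIFO of job ids. */
struct model_entity {
    int jobs[MODEL_MAX_JOBS];
    size_t head, tail;         /* pop at head, push at tail */
};

/* A run queue holds the entities of one priority level. */
struct model_rq {
    struct model_entity *entities[MODEL_MAX_ENTITIES];
    size_t count;
    size_t next;               /* entity to look at first on the next pick */
};

/* One scheduler per hardware run queue, one run queue per priority. */
struct model_sched {
    struct model_rq rq[MODEL_NUM_PRIO];
};

static int model_push_job(struct model_entity *e, int job_id)
{
    if (e->tail == MODEL_MAX_JOBS)
        return -1;             /* model queue is full */
    e->jobs[e->tail++] = job_id;
    return 0;
}

static int model_rq_add_entity(struct model_rq *rq, struct model_entity *e)
{
    if (rq->count == MODEL_MAX_ENTITIES)
        return -1;
    rq->entities[rq->count++] = e;
    return 0;
}

/* Returns the next job id for the hardware, or -1 if every queue is empty. */
static int model_pick_job(struct model_sched *s)
{
    for (size_t p = 0; p < MODEL_NUM_PRIO; p++) {
        struct model_rq *rq = &s->rq[p];

        for (size_t i = 0; i < rq->count; i++) {
            size_t idx = (rq->next + i) % rq->count;
            struct model_entity *e = rq->entities[idx];

            if (e->head < e->tail) {
                rq->next = (idx + 1) % rq->count; /* rotate to the next entity */
                return e->jobs[e->head++];        /* oldest job of the entity first */
            }
        }
    }
    return -1;
}
```

Higher-priority run queues are drained first, and within one entity the jobs always leave in the order they were pushed.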

Scheduler Function References

DRM_SCHED_FENCE_DONT_PIPELINE

DRM_SCHED_FENCE_DONT_PIPELINE ()

Prevent dependency pipelining

Description

Setting this flag on a scheduler fence prevents pipelining of jobs depending on this fence. In other words, we always insert a full CPU round trip before dependent jobs are pushed to the hw queue.

struct drm_sched_entity

A wrapper around a job queue (typically attached to the DRM file_priv).

Definition

struct drm_sched_entity {
  struct list_head                list;
  struct drm_sched_rq             *rq;
  struct drm_gpu_scheduler        **sched_list;
  unsigned int                    num_sched_list;
  enum drm_sched_priority         priority;
  spinlock_t rq_lock;
  struct spsc_queue               job_queue;
  atomic_t fence_seq;
  uint64_t fence_context;
  struct dma_fence                *dependency;
  struct dma_fence_cb             cb;
  atomic_t *guilty;
  struct dma_fence                *last_scheduled;
  struct task_struct              *last_user;
  bool stopped;
  struct completion               entity_idle;
};

Members

list

Used to append this struct to the list of entities in the runqueue rq under drm_sched_rq.entities.

Protected by drm_sched_rq.lock of rq.

rq

Runqueue on which this entity is currently scheduled.

FIXME: Locking is very unclear for this. Writers are protected by rq_lock, but readers are generally lockless and seem to just race with not even a READ_ONCE.

sched_list

A list of schedulers (struct drm_gpu_scheduler). Jobs from this entity can be scheduled on any scheduler on this list.

This can be modified by calling drm_sched_entity_modify_sched(). Locking is entirely up to the driver, see the above function for more details.

This will be set to NULL if num_sched_list equals 1 and rq has been set already.

FIXME: This means priority changes through drm_sched_entity_set_priority() will be lost henceforth in this case.

num_sched_list

Number of drm_gpu_schedulers in the sched_list.

priority

Priority of the entity. This can be modified by calling drm_sched_entity_set_priority(). Protected by rq_lock.

rq_lock

Lock to modify the runqueue to which this entity belongs.

job_queue

the list of jobs of this entity.

fence_seq

A linearly increasing seqno incremented with each new drm_sched_fence which is part of the entity.

FIXME: Callers of drm_sched_job_arm() need to ensure correct locking, this doesn’t need to be atomic.

fence_context

A unique context for all the fences which belong to this entity. The drm_sched_fence.scheduled uses the fence_context but drm_sched_fence.finished uses fence_context + 1.

dependency

The dependency fence of the job which is on the top of the job queue.

cb

Callback for the dependency fence above.

guilty

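Points to the entity’s guilty flag, an atomic_t that is set to 1 when a job on this queue is found to be guilty of causing a timeout.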

last_scheduled

Points to the finished fence of the last scheduled job. Only written by the scheduler thread, can be accessed locklessly from drm_sched_job_arm() iff the queue is empty.

last_user

last group leader pushing a job into the entity.

stopped

Marks the entity as removed from rq and destined for termination. This is set by calling drm_sched_entity_flush() and by drm_sched_fini().

entity_idle

Signals when entity is not in use, used to sequence entity cleanup in drm_sched_entity_fini().

Description

Entities will emit jobs in order to their corresponding hardware ring, and the scheduler will alternate between entities based on scheduling policy.

struct drm_sched_rq

queue of entities to be scheduled.

Definition

struct drm_sched_rq {
  spinlock_t lock;
  struct drm_gpu_scheduler        *sched;
  struct list_head                entities;
  struct drm_sched_entity         *current_entity;
};

Members

lock

to modify the entities list.

sched

the scheduler to which this rq belongs.

entities

list of the entities to be scheduled.

current_entity

the entity which is to be scheduled.

Description

Run queue is a set of entities scheduling command submissions for one specific ring. It implements the scheduling policy that selects the next entity to emit commands from.

struct drm_sched_fence

fences corresponding to the scheduling of a job.

Definition

struct drm_sched_fence {
  struct dma_fence                scheduled;
  struct dma_fence                finished;
  struct dma_fence                *parent;
  struct drm_gpu_scheduler        *sched;
  spinlock_t lock;
  void *owner;
};

Members

scheduled

this fence is what will be signaled by the scheduler when the job is scheduled.

finished

this fence is what will be signaled by the scheduler when the job is completed.

When setting up an out fence for the job, you should use this, since it’s available immediately upon drm_sched_job_init(), and the fence returned by the driver from run_job() won’t be created until the dependencies have resolved.

parent

the fence returned by drm_sched_backend_ops.run_job when scheduling the job on hardware. We signal the drm_sched_fence.finished fence once parent is signalled.

sched

the scheduler instance to which the job owning this fence belongs.

lock

the lock used by the scheduled and the finished fences.

owner

job owner for debugging
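
The relationship between the scheduled, finished and parent fences can be sketched in userspace as follows. The model_* types are hypothetical and reduce a fence to a context number and a flag; the context numbering follows the fence_context description of struct drm_sched_entity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence: a context number and a signaled flag. */
struct model_fence {
    uint64_t context;
    bool signaled;
};

/* Models the scheduled/finished pair plus the hardware ("parent") fence. */
struct model_sched_fence {
    struct model_fence scheduled;
    struct model_fence finished;
    struct model_fence *parent;  /* NULL until the job reaches the hardware */
};

/* Both fences exist as soon as the job is set up; neither is signaled yet. */
static void model_sched_fence_init(struct model_sched_fence *f,
                                   uint64_t entity_context)
{
    f->scheduled.context = entity_context;
    f->scheduled.signaled = false;
    f->finished.context = entity_context + 1;
    f->finished.signaled = false;
    f->parent = NULL;
}

/* The job was handed to the hardware, which returned hw_fence. */
static void model_job_scheduled(struct model_sched_fence *f,
                                struct model_fence *hw_fence)
{
    f->parent = hw_fence;
    f->scheduled.signaled = true;
}

/* Propagates completion: finished only follows a signaled parent fence. */
static void model_parent_signaled(struct model_sched_fence *f)
{
    if (f->parent && f->parent->signaled)
        f->finished.signaled = true;
}
```

The point of the model is that the finished fence is usable as an out fence from the start, long before a parent fence exists.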

struct drm_sched_job

A job to be run by an entity.

Definition

struct drm_sched_job {
  struct spsc_node                queue_node;
  struct list_head                list;
  struct drm_gpu_scheduler        *sched;
  struct drm_sched_fence          *s_fence;
  union {
    struct dma_fence_cb             finish_cb;
    struct work_struct              work;
  };
  uint64_t id;
  atomic_t karma;
  enum drm_sched_priority         s_priority;
  struct drm_sched_entity         *entity;
  struct dma_fence_cb             cb;
  struct xarray                   dependencies;
  unsigned long                   last_dependency;
};

Members

queue_node

used to append this struct to the queue of jobs in an entity.

list

a job participates in the “pending” and “done” lists.

sched

the scheduler instance on which this job is scheduled.

s_fence

contains the fences for the scheduling of the job.

{unnamed_union}

anonymous

finish_cb

the callback for the finished fence.

work

Helper to reschedule job kill to a different context.

id

a unique id assigned to each job scheduled on the scheduler.

karma

increment on every hang caused by this job. If this exceeds the hang limit of the scheduler then the job is marked guilty and will not be scheduled further.

s_priority

the priority of the job.

entity

the entity to which this job belongs.

cb

the callback for the parent fence in s_fence.

dependencies

Contains the dependencies as struct dma_fence for this job, see drm_sched_job_add_dependency() and drm_sched_job_add_implicit_dependencies().

last_dependency

tracks dependencies as they signal

Description

A job is created by the driver using drm_sched_job_init(), and should call drm_sched_entity_push_job() once it wants the scheduler to schedule the job.

struct drm_sched_backend_ops

Define the backend operations called by the scheduler

Definition

struct drm_sched_backend_ops {
  struct dma_fence *(*dependency)(struct drm_sched_job *sched_job, struct drm_sched_entity *s_entity);
  struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
  enum drm_gpu_sched_stat (*timedout_job)(struct drm_sched_job *sched_job);
  void (*free_job)(struct drm_sched_job *sched_job);
};

Members

dependency

Called when the scheduler is considering scheduling this job next, to get another struct dma_fence for this job to block on. Once it returns NULL, run_job() may be called.

If a driver exclusively uses drm_sched_job_add_dependency() and drm_sched_job_add_implicit_dependencies() this can be omitted and left as NULL.

run_job

Called to execute the job once all of the dependencies have been resolved. This may be called multiple times, if timedout_job() has happened and drm_sched_job_recovery() decides to try it again.

timedout_job

Called when a job has taken too long to execute, to trigger GPU recovery.

This method is called in a workqueue context.

Drivers typically issue a reset to recover from GPU hangs, and this procedure usually follows the following workflow:

  1. Stop the scheduler using drm_sched_stop(). This will park the scheduler thread and cancel the timeout work, guaranteeing that nothing is queued while we reset the hardware queue.

  2. Try to gracefully stop non-faulty jobs (optional)

  3. Issue a GPU reset (driver-specific)

  4. Re-submit jobs using drm_sched_resubmit_jobs()

  5. Restart the scheduler using drm_sched_start(). At that point, new jobs can be queued, and the scheduler thread is unblocked

Note that some GPUs have distinct hardware queues but need to reset the GPU globally, which requires extra synchronization between the timeout handlers of the different drm_gpu_scheduler instances. One way to achieve this synchronization is to create an ordered workqueue (using alloc_ordered_workqueue()) at the driver level, and pass this queue to drm_sched_init(), to guarantee that timeout handlers are executed sequentially. The above workflow needs to be slightly adjusted in that case:

  1. Stop all schedulers impacted by the reset using drm_sched_stop()

  2. Try to gracefully stop non-faulty jobs on all queues impacted by the reset (optional)

  3. Issue a GPU reset on all faulty queues (driver-specific)

  4. Re-submit jobs on all schedulers impacted by the reset using drm_sched_resubmit_jobs()

  5. Restart all schedulers that were stopped in step #1 using drm_sched_start()

Return DRM_GPU_SCHED_STAT_NOMINAL, when all is normal, and the underlying driver has started or completed recovery.

Return DRM_GPU_SCHED_STAT_ENODEV, if the device is no longer available, i.e. has been unplugged.

free_job

Called once the job’s finished fence has been signaled and it’s time to clean it up.

Description

These functions should be implemented in the driver side.
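
The single-queue recovery workflow described for timedout_job can be sketched as a userspace model. Nothing here is the kernel API: the model_* helpers are hypothetical, and the reset is reduced to emptying a counter, since the real reset is driver-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 4

/* Hypothetical userspace model of one scheduler and its pending jobs. */
struct model_sched {
    bool running;                   /* scheduler thread is not parked */
    int pending[MODEL_MAX_PENDING]; /* job ids submitted but not completed */
    size_t num_pending;
    size_t hw_count;                /* jobs currently on the hardware queue */
    unsigned int resets;            /* number of hardware resets issued */
};

static void model_stop(struct model_sched *s)  { s->running = false; }
static void model_start(struct model_sched *s) { s->running = true; }

/* Driver-specific in reality; here a reset simply empties the hardware queue. */
static void model_hw_reset(struct model_sched *s)
{
    s->hw_count = 0;
    s->resets++;
}

/* Puts every pending job back on the hardware queue. */
static void model_resubmit(struct model_sched *s)
{
    s->hw_count = s->num_pending;
}

/* Follows the five steps in order; returns false if already stopped. */
static bool model_timedout_job(struct model_sched *s)
{
    if (!s->running)
        return false;
    model_stop(s);      /* step 1: nothing new is queued from here on */
    /* step 2 (optional): gracefully stop non-faulty jobs, omitted here */
    model_hw_reset(s);  /* step 3 */
    model_resubmit(s);  /* step 4 */
    model_start(s);     /* step 5: new jobs may be queued again */
    return true;
}
```

For the global-reset variant, the same sequence would be applied to every affected scheduler at each step before moving on to the next step.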

struct drm_gpu_scheduler

scheduler instance-specific data

Definition

struct drm_gpu_scheduler {
  const struct drm_sched_backend_ops      *ops;
  uint32_t hw_submission_limit;
  long timeout;
  const char                      *name;
  struct drm_sched_rq             sched_rq[DRM_SCHED_PRIORITY_COUNT];
  wait_queue_head_t wake_up_worker;
  wait_queue_head_t job_scheduled;
  atomic_t hw_rq_count;
  atomic64_t job_id_count;
  struct workqueue_struct         *timeout_wq;
  struct delayed_work             work_tdr;
  struct task_struct              *thread;
  struct list_head                pending_list;
  spinlock_t job_list_lock;
  int hang_limit;
  atomic_t *score;
  atomic_t _score;
  bool ready;
  bool free_guilty;
  struct device                   *dev;
};

Members

ops

backend operations provided by the driver.

hw_submission_limit

the max size of the hardware queue.

timeout

the time after which a job is removed from the scheduler.

name

name of the ring for which this scheduler is being used.

sched_rq

priority wise array of run queues.

wake_up_worker

the wait queue on which the scheduler sleeps until a job is ready to be scheduled.

job_scheduled

once drm_sched_entity_do_release is called the scheduler waits on this wait queue until all the scheduled jobs are finished.

hw_rq_count

the number of jobs currently in the hardware queue.

job_id_count

used to assign a unique id to each job.

timeout_wq

workqueue used to queue work_tdr

work_tdr

schedules a delayed call to drm_sched_job_timedout after the timeout interval is over.

thread

the kthread on which the scheduler runs.

pending_list

the list of jobs which are currently in the job queue.

job_list_lock

lock to protect the pending_list.

hang_limit

once the number of hangs caused by a job crosses this limit, the job is marked guilty and will no longer be considered for scheduling.

score

score to help the load balancer pick an idle sched

_score

score used when the driver doesn’t provide one

ready

marks if the underlying HW is ready to work

free_guilty

A hint to the timeout handler to free the guilty job.

dev

system struct device

Description

One scheduler is implemented for each hardware ring.
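
The interplay between a job's karma and the scheduler's hang_limit can be sketched as follows. This is a userspace model with hypothetical names; it only encodes the rule stated in the member descriptions, namely that a job becomes guilty once its karma exceeds the hang limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-job hang accounting. */
struct model_job {
    int karma;     /* number of hangs attributed to this job */
    bool guilty;   /* once set, the job is no longer scheduled */
};

/* Records one hang; marks the job guilty once karma exceeds hang_limit. */
static bool model_increase_karma(struct model_job *job, int hang_limit)
{
    job->karma++;
    if (job->karma > hang_limit)
        job->guilty = true;
    return job->guilty;
}
```

With a hang limit of 2, the job in this model survives two hangs and is marked guilty on the third.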

bool drm_sched_dependency_optimized(struct dma_fence *fence, struct drm_sched_entity *entity)

test if the dependency can be optimized

Parameters

struct dma_fence *fence

the dependency fence

struct drm_sched_entity *entity

the entity which depends on the above fence

Description

Returns true if the dependency can be optimized and false otherwise

void drm_sched_fault(struct drm_gpu_scheduler *sched)

immediately start timeout handler

Parameters

struct drm_gpu_scheduler *sched

scheduler where the timeout handling should be started.

Description

Start timeout handling immediately when the driver detects a hardware fault.

unsigned long drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched)

Suspend scheduler job timeout

Parameters

struct drm_gpu_scheduler *sched

scheduler instance for which to suspend the timeout

Description

Suspend the delayed work timeout for the scheduler. This is done by modifying the delayed work timeout to an arbitrarily large value, MAX_SCHEDULE_TIMEOUT in this case.

Returns the timeout remaining

void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched, unsigned long remaining)

Resume scheduler job timeout

Parameters

struct drm_gpu_scheduler *sched

scheduler instance for which to resume the timeout

unsigned long remaining

remaining timeout

Description

Resume the delayed work timeout for the scheduler.
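
The suspend/resume pair can be pictured with a userspace model of a timer that stores an absolute expiry in ticks. The model_timer type and MODEL_MAX_TIMEOUT are assumptions made for illustration; the real functions operate on the scheduler's delayed work.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_MAX_TIMEOUT LONG_MAX  /* stand-in for MAX_SCHEDULE_TIMEOUT */

/* Hypothetical model of the delayed timeout work: an absolute expiry in ticks. */
struct model_timer {
    unsigned long now;      /* current time in ticks */
    unsigned long expires;  /* tick at which the timeout handler would run */
};

/* Pushes the expiry far into the future and returns the ticks that were left. */
static unsigned long model_suspend_timeout(struct model_timer *t)
{
    unsigned long remaining = 0;

    if (t->expires > t->now)
        remaining = t->expires - t->now;
    t->expires = t->now + MODEL_MAX_TIMEOUT;
    return remaining;
}

/* Re-arms the timeout so that it fires after the previously remaining ticks. */
static void model_resume_timeout(struct model_timer *t, unsigned long remaining)
{
    t->expires = t->now + remaining;
}
```

A caller would keep the value returned by the suspend step and pass it unchanged to the resume step, so that time spent suspended does not count against the job.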

void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)

stop the scheduler

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

struct drm_sched_job *bad

job which caused the time out

Description

Stops the scheduler and also removes and frees all completed jobs.

Note

The bad job will not be freed as it might be used later, so it is the caller’s responsibility to release it manually if it is not part of the pending list any more.

void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)

recover jobs after a reset

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

bool full_recovery

proceed with complete sched restart

void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)

helper to relaunch jobs from the pending list

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler *sched, int max)

helper to relaunch a certain number of jobs from the mirror ring list

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

int max

number of jobs to relaunch

int drm_sched_job_init(struct drm_sched_job *job, struct drm_sched_entity *entity, void *owner)

init a scheduler job

Parameters

struct drm_sched_job *job

scheduler job to init

struct drm_sched_entity *entity

scheduler entity to use

void *owner

job owner for debugging

Description

Refer to drm_sched_entity_push_job() documentation for locking considerations.

Drivers must make sure to call drm_sched_job_cleanup() if this function returns successfully, even when job is aborted before drm_sched_job_arm() is called.

WARNING: amdgpu abuses drm_sched.ready to signal when the hardware has died, which can mean that there’s no valid runqueue for an entity. This function returns -ENOENT in this case (which probably should be -EIO as a more meaningful return value).

Returns 0 for success, negative error code otherwise.

void drm_sched_job_arm(struct drm_sched_job *job)

arm a scheduler job for execution

Parameters

struct drm_sched_job *job

scheduler job to arm

Description

This arms a scheduler job for execution. Specifically it initializes the drm_sched_job.s_fence of job, so that it can be attached to struct dma_resv or other places that need to track the completion of this job.

Refer to drm_sched_entity_push_job() documentation for locking considerations.

This can only be called if drm_sched_job_init() succeeded.

int drm_sched_job_add_dependency(struct drm_sched_job *job, struct dma_fence *fence)

adds the fence as a job dependency

Parameters

struct drm_sched_job *job

scheduler job to add the dependencies to

struct dma_fence *fence

the dma_fence to add to the list of dependencies.

Description

Note that fence is consumed in both the success and error cases.

Return

0 on success, or an error on failing to expand the array.
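
The statement that the fence is consumed in both the success and the error case is easy to get wrong, so here is a userspace sketch of that ownership rule. The model_* names, the fixed-size array and the -ENOMEM error code are assumptions of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_DEPS 2   /* small on purpose so the error path is easy to hit */

/* Hypothetical stand-in for a refcounted fence. */
struct model_fence {
    int refcount;
};

struct model_job {
    struct model_fence *deps[MODEL_MAX_DEPS];
    size_t num_deps;
};

static void model_fence_put(struct model_fence *f)
{
    f->refcount--;
}

/*
 * Takes over the caller's reference to fence. On success the job keeps it;
 * on failure it is dropped here, so the caller must not put it again.
 */
static int model_add_dependency(struct model_job *job, struct model_fence *fence)
{
    if (!fence)
        return 0;                 /* nothing to depend on */
    if (job->num_deps == MODEL_MAX_DEPS) {
        model_fence_put(fence);   /* consumed on the error path too */
        return -ENOMEM;
    }
    job->deps[job->num_deps++] = fence;
    return 0;
}
```

Either way the caller no longer owns its reference after the call, which keeps error handling at the call site free of extra put calls.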

int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job, struct drm_gem_object *obj, bool write)

adds implicit dependencies as job dependencies

Parameters

struct drm_sched_job *job

scheduler job to add the dependencies to

struct drm_gem_object *obj

the gem object to add new dependencies from.

bool write

whether the job might write the object (so we need to depend on shared fences in the reservation object).

Description

This should be called after drm_gem_lock_reservations() on your array of GEM objects used in the job but before updating the reservations with your own fences.

Return

0 on success, or an error on failing to expand the array.

void drm_sched_job_cleanup(struct drm_sched_job *job)

clean up scheduler job resources

Parameters

struct drm_sched_job *job

scheduler job to clean up

Description

Cleans up the resources allocated with drm_sched_job_init().

Drivers should call this from their error unwind code if job is aborted before drm_sched_job_arm() is called.

After that point of no return job is committed to be executed by the scheduler, and this function should be called from the drm_sched_backend_ops.free_job callback.
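
The rules about when cleanup may happen can be summarised as a small state machine. This is a userspace sketch with hypothetical names; in particular, splitting cleanup into two model functions is only a way to make the two permitted call sites explicit.

```c
#include <assert.h>
#include <stdbool.h>

enum model_job_state {
    MODEL_JOB_NEW,          /* allocated, init not yet called */
    MODEL_JOB_INITIALIZED,  /* init succeeded, still abortable */
    MODEL_JOB_ARMED,        /* point of no return */
    MODEL_JOB_PUSHED,       /* handed to the entity's job queue */
    MODEL_JOB_CLEANED,
};

struct model_job {
    enum model_job_state state;
};

static bool model_job_init(struct model_job *j)
{
    if (j->state != MODEL_JOB_NEW)
        return false;
    j->state = MODEL_JOB_INITIALIZED;
    return true;
}

/* Arming is only valid after a successful init. */
static bool model_job_arm(struct model_job *j)
{
    if (j->state != MODEL_JOB_INITIALIZED)
        return false;
    j->state = MODEL_JOB_ARMED;
    return true;
}

/* Pushing is only valid for an armed job. */
static bool model_job_push(struct model_job *j)
{
    if (j->state != MODEL_JOB_ARMED)
        return false;
    j->state = MODEL_JOB_PUSHED;
    return true;
}

/* Cleanup from driver error unwinding: only before the job is armed. */
static bool model_job_cleanup_on_error(struct model_job *j)
{
    if (j->state != MODEL_JOB_INITIALIZED)
        return false;
    j->state = MODEL_JOB_CLEANED;
    return true;
}

/* Cleanup from the free_job callback: only once the job was pushed. */
static bool model_job_cleanup_from_free_job(struct model_job *j)
{
    if (j->state != MODEL_JOB_PUSHED)
        return false;
    j->state = MODEL_JOB_CLEANED;
    return true;
}
```

An armed job cannot be unwound by the driver in this model; it has to be pushed and is cleaned up later from the free_job path.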

struct drm_gpu_scheduler *drm_sched_pick_best(struct drm_gpu_scheduler **sched_list, unsigned int num_sched_list)

Get a drm sched from a sched_list with the least load

Parameters

struct drm_gpu_scheduler **sched_list

list of drm_gpu_schedulers

unsigned int num_sched_list

number of drm_gpu_schedulers in the sched_list

Description

Returns a pointer to the sched with the least load, or NULL if none of the drm_gpu_schedulers are ready.
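
A least-load selection of this kind can be sketched in userspace as shown below. The model_sched type is hypothetical and treats a higher score as more load, which is an assumption of the model; schedulers that are not ready are skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a scheduler as seen by the load balancer. */
struct model_sched {
    bool ready;
    unsigned int score;  /* higher means more load in this model */
};

/* Returns the ready scheduler with the lowest score, or NULL if none is ready. */
static struct model_sched *model_pick_best(struct model_sched **list, size_t n)
{
    struct model_sched *best = NULL;

    for (size_t i = 0; i < n; i++) {
        struct model_sched *s = list[i];

        if (!s->ready)
            continue;                    /* never pick a scheduler that is not ready */
        if (!best || s->score < best->score)
            best = s;
    }
    return best;
}
```

Ties are resolved in favour of the earlier list entry in this sketch, because only a strictly lower score replaces the current candidate.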

int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_backend_ops *ops, unsigned hw_submission, unsigned hang_limit, long timeout, struct workqueue_struct *timeout_wq, atomic_t *score, const char *name, struct device *dev)

Init a gpu scheduler instance

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

const struct drm_sched_backend_ops *ops

backend operations for this scheduler

unsigned hw_submission

number of hw submissions that can be in flight

unsigned hang_limit

number of times to allow a job to hang before dropping it

long timeout

timeout value in jiffies for the scheduler

struct workqueue_struct *timeout_wq

workqueue to use for timeout work. If NULL, the system_wq is used

atomic_t *score

optional score atomic shared with other schedulers

const char *name

name used for debugging

struct device *dev

target struct device

Description

Return 0 on success, otherwise error code.

void drm_sched_fini(struct drm_gpu_scheduler *sched)

Destroy a gpu scheduler

Parameters

struct drm_gpu_scheduler *sched

scheduler instance

Description

Tears down and cleans up the scheduler.

void drm_sched_increase_karma_ext(struct drm_sched_job *bad, int type)

Update sched_entity guilty flag

Parameters

struct drm_sched_job *bad

The job guilty of time out

int type

type for increase/reset karma

int drm_sched_entity_init(struct drm_sched_entity *entity, enum drm_sched_priority priority, struct drm_gpu_scheduler **sched_list, unsigned int num_sched_list, atomic_t *guilty)

Init a context entity used by the scheduler when submitting to a HW ring.

Parameters

struct drm_sched_entity *entity

scheduler entity to init

enum drm_sched_priority priority

priority of the entity

struct drm_gpu_scheduler **sched_list

the list of drm scheds on which jobs from this entity can be submitted

unsigned int num_sched_list

number of drm sched in sched_list

atomic_t *guilty

atomic_t set to 1 when a job on this queue is found to be guilty causing a timeout

Description

Note that the sched_list must have at least one element to schedule the entity.

For changing priority later on at runtime see drm_sched_entity_set_priority(). For changing the set of schedulers sched_list at runtime see drm_sched_entity_modify_sched().

An entity is cleaned up by calling drm_sched_entity_fini(). See also drm_sched_entity_destroy().

Returns 0 on success or a negative error code on failure.

void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, struct drm_gpu_scheduler **sched_list, unsigned int num_sched_list)

Modify sched of an entity

Parameters

struct drm_sched_entity *entity

scheduler entity to modify

struct drm_gpu_scheduler **sched_list

the list of new drm scheds which will replace existing entity->sched_list

unsigned int num_sched_list

number of drm sched in sched_list

Description

Note that this must be called under the same common lock for entity as drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to guarantee through some other means that this is never called while new jobs can be pushed to entity.

long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)

Flush a context entity

Parameters

struct drm_sched_entity *entity

scheduler entity

long timeout

time to wait, in jiffies, for the job queue to become empty.

Description

Entity teardown is split into two functions. This first one does the waiting, removes the entity from the runqueue and returns an error when the process was killed.

Returns the remaining time in jiffies left from the input timeout

void drm_sched_entity_fini(struct drm_sched_entity *entity)

Destroy a context entity

Parameters

struct drm_sched_entity *entity

scheduler entity

Description

Cleans up entity, which has been initialized by drm_sched_entity_init().

If there are potentially jobs still in flight or getting newly queued, drm_sched_entity_flush() must be called first. This function then goes over the entity and signals all jobs with an error code if the process was killed.

void drm_sched_entity_destroy(struct drm_sched_entity *entity)

Destroy a context entity

Parameters

struct drm_sched_entity *entity

scheduler entity

Description

Calls drm_sched_entity_flush() and drm_sched_entity_fini() as a convenience wrapper.

void drm_sched_entity_set_priority(struct drm_sched_entity *entity, enum drm_sched_priority priority)

Sets priority of the entity

Parameters

struct drm_sched_entity *entity

scheduler entity

enum drm_sched_priority priority

scheduler priority

Description

Update the priority of the runqueues used for the entity.

void drm_sched_entity_push_job(struct drm_sched_job *sched_job)

Submit a job to the entity’s job queue

Parameters

struct drm_sched_job *sched_job

job to submit

Note

To guarantee that the order of insertion to queue matches the job’s fence sequence number this function should be called with drm_sched_job_arm() under common lock for the struct drm_sched_entity that was set up for sched_job in drm_sched_job_init().

Description

This function does not return a value.
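
The locking note above can be illustrated with a userspace sketch in which arming (taking the next sequence number) and pushing (enqueueing) happen in one critical section. The model_* names and the use of a pthread mutex as the "common lock" are assumptions; a real driver chooses its own lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_QUEUE_LEN 8

/* Hypothetical model of an entity with a sequence counter and a job queue. */
struct model_entity {
    pthread_mutex_t lock;               /* the "common lock" chosen by the driver */
    unsigned int fence_seq;             /* last sequence number handed out */
    unsigned int queue[MODEL_QUEUE_LEN];
    size_t queued;
};

struct model_job {
    struct model_entity *entity;
    unsigned int seqno;
};

/*
 * Arms and pushes in one critical section, so the order of insertion into
 * the queue always matches the order of the sequence numbers.
 */
static int model_submit(struct model_job *job)
{
    struct model_entity *e = job->entity;
    int ret = -1;

    pthread_mutex_lock(&e->lock);
    if (e->queued < MODEL_QUEUE_LEN) {
        job->seqno = ++e->fence_seq;        /* "arm": take the next seqno */
        e->queue[e->queued++] = job->seqno; /* "push": enqueue in the same order */
        ret = 0;
    }
    pthread_mutex_unlock(&e->lock);
    return ret;
}
```

If the two steps were performed under separate lock acquisitions, another submitter could arm in between and its job could end up queued out of sequence.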