|  | ===================== | 
|  | DRM Memory Management | 
|  | ===================== | 
|  |  | 
Modern Linux systems require large amounts of graphics memory to store
frame buffers, textures, vertices and other graphics-related data. Given
the very dynamic nature of much of that data, managing graphics memory
|  | efficiently is thus crucial for the graphics stack and plays a central | 
|  | role in the DRM infrastructure. | 
|  |  | 
|  | The DRM core includes two memory managers, namely Translation Table Maps | 
|  | (TTM) and Graphics Execution Manager (GEM). TTM was the first DRM memory | 
manager to be developed and tried to be a one-size-fits-all
solution. It provides a single userspace API to accommodate the needs of
all hardware, supporting both Unified Memory Architecture (UMA) devices
|  | and devices with dedicated video RAM (i.e. most discrete video cards). | 
|  | This resulted in a large, complex piece of code that turned out to be | 
|  | hard to use for driver development. | 
|  |  | 
|  | GEM started as an Intel-sponsored project in reaction to TTM's | 
|  | complexity. Its design philosophy is completely different: instead of | 
providing a solution to every graphics memory-related problem, GEM
|  | identified common code between drivers and created a support library to | 
|  | share it. GEM has simpler initialization and execution requirements than | 
|  | TTM, but has no video RAM management capabilities and is thus limited to | 
|  | UMA devices. | 
|  |  | 
|  | The Translation Table Manager (TTM) | 
|  | =================================== | 
|  |  | 
|  | TTM design background and information belongs here. | 
|  |  | 
|  | TTM initialization | 
|  | ------------------ | 
|  |  | 
.. warning::
   This section is outdated.
|  |  | 
|  | Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver | 
|  | <ttm_bo_driver>` structure to ttm_bo_device_init, together with an | 
|  | initialized global reference to the memory manager.  The ttm_bo_driver | 
|  | structure contains several fields with function pointers for | 
|  | initializing the TTM, allocating and freeing memory, waiting for command | 
|  | completion and fence synchronization, and memory migration. | 
|  |  | 
|  | The :c:type:`struct drm_global_reference <drm_global_reference>` is made | 
|  | up of several fields: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
   struct drm_global_reference {
           enum ttm_global_types global_type;
           size_t size;
           void *object;
           int (*init) (struct drm_global_reference *);
           void (*release) (struct drm_global_reference *);
   };
|  |  | 
|  |  | 
|  | There should be one global reference structure for your memory manager | 
|  | as a whole, and there will be others for each object created by the | 
|  | memory manager at runtime. Your global TTM should have a type of | 
|  | TTM_GLOBAL_TTM_MEM. The size field for the global object should be | 
|  | sizeof(struct ttm_mem_global), and the init and release hooks should | 
|  | point at your driver-specific init and release routines, which probably | 
|  | eventually call ttm_mem_global_init and ttm_mem_global_release, | 
|  | respectively. | 
|  |  | 
|  | Once your global TTM accounting structure is set up and initialized by | 
|  | calling ttm_global_item_ref() on it, you need to create a buffer | 
|  | object TTM to provide a pool for buffer object allocation by clients and | 
|  | the kernel itself. The type of this object should be | 
|  | TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct | 
|  | ttm_bo_global). Again, driver-specific init and release functions may | 
|  | be provided, likely eventually calling ttm_bo_global_ref_init() and | 
|  | ttm_bo_global_ref_release(), respectively. Also, like the previous | 
|  | object, ttm_global_item_ref() is used to create an initial reference | 
|  | count for the TTM, which will call your initialization function. | 
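
A hedged sketch of this setup, using the legacy interface described
above, might look as follows. The my_* names are hypothetical and the
exact API has varied across kernel versions; see radeon_ttm.c for a
real driver's implementation.

.. code-block:: c

   static int my_ttm_mem_global_init(struct drm_global_reference *ref)
   {
           /* Forward to the core accounting initializer. */
           return ttm_mem_global_init(ref->object);
   }

   static void my_ttm_mem_global_release(struct drm_global_reference *ref)
   {
           ttm_mem_global_release(ref->object);
   }

   static int my_ttm_global_init(struct my_driver_private *priv)
   {
           struct drm_global_reference *ref = &priv->mem_global_ref;

           ref->global_type = TTM_GLOBAL_TTM_MEM;
           ref->size = sizeof(struct ttm_mem_global);
           ref->init = &my_ttm_mem_global_init;
           ref->release = &my_ttm_mem_global_release;

           /* Takes the initial reference and calls the init hook. */
           return ttm_global_item_ref(ref);
   }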
|  |  | 
|  | See the radeon_ttm.c file for an example of usage. | 
|  |  | 
|  | The Graphics Execution Manager (GEM) | 
|  | ==================================== | 
|  |  | 
|  | The GEM design approach has resulted in a memory manager that doesn't | 
|  | provide full coverage of all (or even all common) use cases in its | 
|  | userspace or kernel API. GEM exposes a set of standard memory-related | 
|  | operations to userspace and a set of helper functions to drivers, and | 
lets drivers implement hardware-specific operations with their own
|  | private API. | 
|  |  | 
|  | The GEM userspace API is described in the `GEM - the Graphics Execution | 
|  | Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While | 
|  | slightly outdated, the document provides a good overview of the GEM API | 
|  | principles. Buffer allocation and read and write operations, described | 
|  | as part of the common GEM API, are currently implemented using | 
|  | driver-specific ioctls. | 
|  |  | 
|  | GEM is data-agnostic. It manages abstract buffer objects without knowing | 
|  | what individual buffers contain. APIs that require knowledge of buffer | 
|  | contents or purpose, such as buffer allocation or synchronization | 
|  | primitives, are thus outside of the scope of GEM and must be implemented | 
|  | using driver-specific ioctls. | 
|  |  | 
|  | On a fundamental level, GEM involves several operations: | 
|  |  | 
|  | -  Memory allocation and freeing | 
|  | -  Command execution | 
|  | -  Aperture management at command execution time | 
|  |  | 
|  | Buffer object allocation is relatively straightforward and largely | 
|  | provided by Linux's shmem layer, which provides memory to back each | 
|  | object. | 
|  |  | 
|  | Device-specific operations, such as command execution, pinning, buffer | 
|  | read & write, mapping, and domain ownership transfers are left to | 
|  | driver-specific ioctls. | 
|  |  | 
|  | GEM Initialization | 
|  | ------------------ | 
|  |  | 
Drivers that use GEM must set the DRIVER_GEM bit in the
:c:type:`struct drm_driver <drm_driver>` driver_features
field. The DRM core will then automatically initialize the GEM core
before calling the load operation. Behind the scenes, this will create a
|  | DRM Memory Manager object which provides an address space pool for | 
|  | object allocation. | 
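
As a minimal illustration (my_driver and the values shown are
hypothetical placeholders):

.. code-block:: c

   static struct drm_driver my_driver = {
           /* Tell the DRM core to initialize GEM for this driver. */
           .driver_features = DRIVER_GEM | DRIVER_MODESET,
           /* ... other operations and fields ... */
           .name = "my-driver",
   };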
|  |  | 
|  | In a KMS configuration, drivers need to allocate and initialize a | 
|  | command ring buffer following core GEM initialization if required by the | 
|  | hardware. UMA devices usually have what is called a "stolen" memory | 
|  | region, which provides space for the initial framebuffer and large, | 
|  | contiguous memory regions required by the device. This space is | 
|  | typically not managed by GEM, and must be initialized separately into | 
|  | its own DRM MM object. | 
|  |  | 
|  | GEM Objects Creation | 
|  | -------------------- | 
|  |  | 
GEM splits creation of GEM objects and allocation of the memory that
backs them into two distinct operations.
|  |  | 
GEM objects are represented by an instance of :c:type:`struct
drm_gem_object <drm_gem_object>`. Drivers usually need to
extend GEM objects with private information and thus create a
driver-specific GEM object structure type that embeds an instance of
:c:type:`struct drm_gem_object <drm_gem_object>`.
|  |  | 
|  | To create a GEM object, a driver allocates memory for an instance of its | 
specific GEM object type and initializes the embedded
:c:type:`struct drm_gem_object <drm_gem_object>` with a call
|  | to drm_gem_object_init(). The function takes a pointer | 
|  | to the DRM device, a pointer to the GEM object and the buffer object | 
|  | size in bytes. | 
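
A minimal sketch of such a creation function, assuming a hypothetical
my_gem_object type embedding the base object:

.. code-block:: c

   struct my_gem_object {
           struct drm_gem_object base;
           /* driver-private fields would follow */
   };

   static struct my_gem_object *my_gem_create(struct drm_device *dev,
                                              size_t size)
   {
           struct my_gem_object *obj;
           int ret;

           obj = kzalloc(sizeof(*obj), GFP_KERNEL);
           if (!obj)
                   return ERR_PTR(-ENOMEM);

           /* Creates the shmfs backing file and initializes the object. */
           ret = drm_gem_object_init(dev, &obj->base, PAGE_ALIGN(size));
           if (ret) {
                   kfree(obj);
                   return ERR_PTR(ret);
           }

           return obj;
   }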
|  |  | 
|  | GEM uses shmem to allocate anonymous pageable memory. | 
|  | drm_gem_object_init() will create an shmfs file of the | 
requested size and store it in the :c:type:`struct
drm_gem_object <drm_gem_object>` filp field. The memory is
|  | used as either main storage for the object when the graphics hardware | 
|  | uses system memory directly or as a backing store otherwise. | 
|  |  | 
Drivers are responsible for the actual allocation of physical pages by
|  | calling shmem_read_mapping_page_gfp() for each page. | 
|  | Note that they can decide to allocate pages when initializing the GEM | 
|  | object, or to delay allocation until the memory is needed (for instance | 
|  | when a page fault occurs as a result of a userspace memory access or | 
|  | when the driver needs to start a DMA transfer involving the memory). | 
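
As a sketch, delayed allocation could look like the loop below;
obj->pages is hypothetical driver storage and the error path is
abbreviated. The drm_gem_get_pages() helper implements a complete
version of this pattern.

.. code-block:: c

   static int my_gem_get_pages(struct my_gem_object *obj)
   {
           struct address_space *mapping = obj->base.filp->f_mapping;
           int i, npages = obj->base.size >> PAGE_SHIFT;

           for (i = 0; i < npages; i++) {
                   struct page *page;

                   page = shmem_read_mapping_page_gfp(mapping, i,
                                                      GFP_HIGHUSER);
                   if (IS_ERR(page))
                           return PTR_ERR(page); /* real code must unwind */

                   obj->pages[i] = page;
           }

           return 0;
   }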
|  |  | 
|  | Anonymous pageable memory allocation is not always desired, for instance | 
|  | when the hardware requires physically contiguous system memory as is | 
|  | often the case in embedded devices. Drivers can create GEM objects with | 
|  | no shmfs backing (called private GEM objects) by initializing them with a call | 
|  | to drm_gem_private_object_init() instead of drm_gem_object_init(). Storage for | 
|  | private GEM objects must be managed by drivers. | 
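
A hedged sketch of private object creation backed by physically
contiguous DMA memory; the vaddr and dma_addr fields of the
hypothetical my_gem_object are assumptions:

.. code-block:: c

   static struct my_gem_object *my_gem_create_private(struct drm_device *dev,
                                                      size_t size)
   {
           struct my_gem_object *obj;

           obj = kzalloc(sizeof(*obj), GFP_KERNEL);
           if (!obj)
                   return ERR_PTR(-ENOMEM);

           size = PAGE_ALIGN(size);
           /* No shmfs file is created for private objects. */
           drm_gem_private_object_init(dev, &obj->base, size);

           obj->vaddr = dma_alloc_coherent(dev->dev, size, &obj->dma_addr,
                                           GFP_KERNEL);
           if (!obj->vaddr) {
                   drm_gem_object_release(&obj->base);
                   kfree(obj);
                   return ERR_PTR(-ENOMEM);
           }

           return obj;
   }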
|  |  | 
|  | GEM Objects Lifetime | 
|  | -------------------- | 
|  |  | 
|  | All GEM objects are reference-counted by the GEM core. References can be | 
acquired and released by calling drm_gem_object_get() and drm_gem_object_put()
|  | respectively. | 
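
For example, a driver keeping an object alive across an asynchronous
operation might do the following (a sketch; obj is any initialized GEM
object):

.. code-block:: c

   drm_gem_object_get(obj);   /* acquire an additional reference */
   /* ... use the object, possibly from another context ... */
   drm_gem_object_put(obj);   /* drop the reference when done */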
|  |  | 
|  | When the last reference to a GEM object is released the GEM core calls | 
|  | the :c:type:`struct drm_driver <drm_driver>` gem_free_object_unlocked | 
|  | operation. That operation is mandatory for GEM-enabled drivers and must | 
|  | free the GEM object and all associated resources. | 
|  |  | 
Drivers are responsible for freeing all GEM object resources. This
includes the resources created by the GEM core, which need to be
released with drm_gem_object_release().
|  |  | 
|  | GEM Objects Naming | 
|  | ------------------ | 
|  |  | 
|  | Communication between userspace and the kernel refers to GEM objects | 
|  | using local handles, global names or, more recently, file descriptors. | 
All of those are 32-bit integer values; the usual Linux kernel limits
on the number of open files apply to the file descriptors.
|  |  | 
|  | GEM handles are local to a DRM file. Applications get a handle to a GEM | 
|  | object through a driver-specific ioctl, and can use that handle to refer | 
|  | to the GEM object in other standard or driver-specific ioctls. Closing a | 
|  | DRM file handle frees all its GEM handles and dereferences the | 
|  | associated GEM objects. | 
|  |  | 
|  | To create a handle for a GEM object drivers call drm_gem_handle_create(). The | 
|  | function takes a pointer to the DRM file and the GEM object and returns a | 
|  | locally unique handle.  When the handle is no longer needed drivers delete it | 
|  | with a call to drm_gem_handle_delete(). Finally the GEM object associated with a | 
|  | handle can be retrieved by a call to drm_gem_object_lookup(). | 
|  |  | 
|  | Handles don't take ownership of GEM objects, they only take a reference | 
|  | to the object that will be dropped when the handle is destroyed. To | 
|  | avoid leaking GEM objects, drivers must make sure they drop the | 
|  | reference(s) they own (such as the initial reference taken at object | 
|  | creation time) as appropriate, without any special consideration for the | 
|  | handle. For example, in the particular case of combined GEM object and | 
|  | handle creation in the implementation of the dumb_create operation, | 
|  | drivers must drop the initial reference to the GEM object before | 
|  | returning the handle. | 
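
A hedged sketch of such a combined creation path, reusing the
hypothetical my_gem_create() helper shown earlier:

.. code-block:: c

   static int my_dumb_create(struct drm_file *file, struct drm_device *dev,
                             struct drm_mode_create_dumb *args)
   {
           struct my_gem_object *obj;
           int ret;

           args->pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
           args->size = (u64)args->pitch * args->height;

           obj = my_gem_create(dev, args->size);
           if (IS_ERR(obj))
                   return PTR_ERR(obj);

           ret = drm_gem_handle_create(file, &obj->base, &args->handle);
           /* Drop the initial reference; the handle keeps its own. */
           drm_gem_object_put(&obj->base);

           return ret;
   }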
|  |  | 
|  | GEM names are similar in purpose to handles but are not local to DRM | 
|  | files. They can be passed between processes to reference a GEM object | 
|  | globally. Names can't be used directly to refer to objects in the DRM | 
|  | API, applications must convert handles to names and names to handles | 
|  | using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls | 
|  | respectively. The conversion is handled by the DRM core without any | 
|  | driver-specific support. | 
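
From userspace, the conversion looks roughly like this sketch, using
the libdrm drmIoctl() wrapper with error handling omitted; fd and fd2
are two open DRM file descriptors and handle an existing GEM handle on
fd:

.. code-block:: c

   #include <xf86drm.h>

   struct drm_gem_flink flink = { .handle = handle };
   drmIoctl(fd, DRM_IOCTL_GEM_FLINK, &flink);
   /* flink.name is now a global name other processes can use. */

   struct drm_gem_open args = { .name = flink.name };
   drmIoctl(fd2, DRM_IOCTL_GEM_OPEN, &args);
   /* args.handle is a new handle local to fd2. */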
|  |  | 
GEM also supports buffer sharing with dma-buf file descriptors through
PRIME. GEM-based drivers must use the provided helper functions to
implement the exporting and importing correctly. See
:ref:`prime_buffer_sharing`. Since sharing file descriptors is
inherently more secure than the easily guessable and global GEM names,
it is the preferred buffer sharing mechanism. Sharing buffers through
GEM names is only supported for legacy userspace. Furthermore, PRIME
also allows cross-device buffer sharing since it is based on dma-bufs.
|  |  | 
|  | GEM Objects Mapping | 
|  | ------------------- | 
|  |  | 
|  | Because mapping operations are fairly heavyweight GEM favours | 
|  | read/write-like access to buffers, implemented through driver-specific | 
|  | ioctls, over mapping buffers to userspace. However, when random access | 
|  | to the buffer is needed (to perform software rendering for instance), | 
|  | direct access to the object can be more efficient. | 
|  |  | 
|  | The mmap system call can't be used directly to map GEM objects, as they | 
|  | don't have their own file handle. Two alternative methods currently | 
co-exist to map GEM objects to userspace. The first method uses a
driver-specific ioctl to perform the mapping operation, calling
do_mmap() under the hood. This approach is considered dubious, is
discouraged for new GEM-enabled drivers, and will thus not be described
here.
|  |  | 
The second method uses the mmap system call on the DRM file handle.

.. code-block:: c

   void *mmap(void *addr, size_t length, int prot, int flags, int fd,
              off_t offset);

DRM identifies the GEM object to be mapped by a fake offset passed
through the mmap offset argument. Prior to being mapped, a GEM object
must thus be associated with a fake offset. To do so, drivers must call
drm_gem_create_mmap_offset() on the object.
|  |  | 
|  | Once allocated, the fake offset value must be passed to the application | 
|  | in a driver-specific way and can then be used as the mmap offset | 
|  | argument. | 
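
A minimal sketch of handing the fake offset to userspace, here in the
form of a dumb buffer map_offset implementation (error handling
abbreviated):

.. code-block:: c

   static int my_dumb_map_offset(struct drm_file *file,
                                 struct drm_device *dev,
                                 u32 handle, u64 *offset)
   {
           struct drm_gem_object *gem;
           int ret;

           gem = drm_gem_object_lookup(file, handle);
           if (!gem)
                   return -ENOENT;

           /* Associate a fake offset with the object if not done yet. */
           ret = drm_gem_create_mmap_offset(gem);
           if (!ret)
                   *offset = drm_vma_node_offset_addr(&gem->vma_node);

           drm_gem_object_put(gem);
           return ret;
   }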
|  |  | 
|  | The GEM core provides a helper method drm_gem_mmap() to | 
|  | handle object mapping. The method can be set directly as the mmap file | 
|  | operation handler. It will look up the GEM object based on the offset | 
|  | value and set the VMA operations to the :c:type:`struct drm_driver | 
|  | <drm_driver>` gem_vm_ops field. Note that drm_gem_mmap() doesn't map memory to | 
|  | userspace, but relies on the driver-provided fault handler to map pages | 
|  | individually. | 
|  |  | 
To use drm_gem_mmap(), drivers must fill the :c:type:`struct drm_driver
<drm_driver>` gem_vm_ops field with a pointer to VM operations.
|  |  | 
The VM operations are described by a :c:type:`struct vm_operations_struct
<vm_operations_struct>` made up of several fields, the most interesting
ones being:
|  |  | 
|  | .. code-block:: c | 
|  |  | 
   struct vm_operations_struct {
           void (*open)(struct vm_area_struct * area);
           void (*close)(struct vm_area_struct * area);
           vm_fault_t (*fault)(struct vm_fault *vmf);
   };
|  |  | 
|  |  | 
|  | The open and close operations must update the GEM object reference | 
|  | count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper | 
|  | functions directly as open and close handlers. | 
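
Putting the pieces together, the mmap plumbing for a GEM driver might
look as follows; my_gem_fault is a hypothetical fault handler like the
one sketched after the next paragraph:

.. code-block:: c

   static const struct vm_operations_struct my_gem_vm_ops = {
           .open = drm_gem_vm_open,    /* takes a GEM object reference */
           .close = drm_gem_vm_close,  /* drops it again */
           .fault = my_gem_fault,
   };

   static const struct file_operations my_fops = {
           .owner = THIS_MODULE,
           .open = drm_open,
           .release = drm_release,
           .unlocked_ioctl = drm_ioctl,
           .mmap = drm_gem_mmap,       /* looks up objects by fake offset */
   };

   static struct drm_driver my_driver = {
           .driver_features = DRIVER_GEM | DRIVER_MODESET,
           .gem_vm_ops = &my_gem_vm_ops,
           .fops = &my_fops,
   };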
|  |  | 
|  | The fault operation handler is responsible for mapping individual pages | 
|  | to userspace when a page fault occurs. Depending on the memory | 
|  | allocation scheme, drivers can allocate pages at fault time, or can | 
|  | decide to allocate memory for the GEM object at the time the object is | 
|  | created. | 
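
A hedged sketch of a fault handler for an object whose pages were
allocated at creation time; the hypothetical obj->pages array comes
from the earlier examples, and locking against concurrent teardown is
omitted:

.. code-block:: c

   static vm_fault_t my_gem_fault(struct vm_fault *vmf)
   {
           struct vm_area_struct *vma = vmf->vma;
           struct drm_gem_object *gem = vma->vm_private_data;
           struct my_gem_object *obj =
                   container_of(gem, struct my_gem_object, base);
           pgoff_t pgoff = (vmf->address - vma->vm_start) >> PAGE_SHIFT;

           if (pgoff >= gem->size >> PAGE_SHIFT)
                   return VM_FAULT_SIGBUS;

           /* Map the single faulting page into the userspace VMA. */
           return vmf_insert_pfn(vma, vmf->address,
                                 page_to_pfn(obj->pages[pgoff]));
   }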
|  |  | 
|  | Drivers that want to map the GEM object upfront instead of handling page | 
|  | faults can implement their own mmap file operation handler. | 
|  |  | 
|  | For platforms without MMU the GEM core provides a helper method | 
|  | drm_gem_cma_get_unmapped_area(). The mmap() routines will call this to get a | 
|  | proposed address for the mapping. | 
|  |  | 
To use drm_gem_cma_get_unmapped_area(), drivers must fill the
:c:type:`struct file_operations <file_operations>` get_unmapped_area
field with a pointer to drm_gem_cma_get_unmapped_area().
|  |  | 
More detailed information about get_unmapped_area can be found in
Documentation/admin-guide/mm/nommu-mmap.rst.
|  |  | 
|  | Memory Coherency | 
|  | ---------------- | 
|  |  | 
|  | When mapped to the device or used in a command buffer, backing pages for | 
|  | an object are flushed to memory and marked write combined so as to be | 
|  | coherent with the GPU. Likewise, if the CPU accesses an object after the | 
|  | GPU has finished rendering to the object, then the object must be made | 
|  | coherent with the CPU's view of memory, usually involving GPU cache | 
|  | flushing of various kinds. This core CPU<->GPU coherency management is | 
|  | provided by a device-specific ioctl, which evaluates an object's current | 
|  | domain and performs any necessary flushing or synchronization to put the | 
|  | object into the desired coherency domain (note that the object may be | 
|  | busy, i.e. an active render target; in that case, setting the domain | 
|  | blocks the client and waits for rendering to complete before performing | 
|  | any necessary flushing operations). | 
|  |  | 
|  | Command Execution | 
|  | ----------------- | 
|  |  | 
|  | Perhaps the most important GEM function for GPU devices is providing a | 
|  | command execution interface to clients. Client programs construct | 
|  | command buffers containing references to previously allocated memory | 
|  | objects, and then submit them to GEM. At that point, GEM takes care to | 
|  | bind all the objects into the GTT, execute the buffer, and provide | 
|  | necessary synchronization between clients accessing the same buffers. | 
|  | This often involves evicting some objects from the GTT and re-binding | 
|  | others (a fairly expensive operation), and providing relocation support | 
|  | which hides fixed GTT offsets from clients. Clients must take care not | 
|  | to submit command buffers that reference more objects than can fit in | 
|  | the GTT; otherwise, GEM will reject them and no rendering will occur. | 
|  | Similarly, if several objects in the buffer require fence registers to | 
|  | be allocated for correct rendering (e.g. 2D blits on pre-965 chips), | 
|  | care must be taken not to require more fence registers than are | 
|  | available to the client. Such resource management should be abstracted | 
|  | from the client in libdrm. | 
|  |  | 
|  | GEM Function Reference | 
|  | ---------------------- | 
|  |  | 
.. kernel-doc:: include/drm/drm_gem.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem.c
   :export:
|  |  | 
|  | GEM CMA Helper Functions Reference | 
|  | ---------------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :doc: cma helpers

.. kernel-doc:: include/drm/drm_gem_cma_helper.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :export:
|  |  | 
|  | GEM SHMEM Helper Function Reference | 
|  | ----------------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_gem_shmem_helper.c
   :doc: overview

.. kernel-doc:: include/drm/drm_gem_shmem_helper.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem_shmem_helper.c
   :export:
|  |  | 
|  | GEM VRAM Helper Functions Reference | 
|  | ----------------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_gem_vram_helper.c
   :doc: overview

.. kernel-doc:: include/drm/drm_gem_vram_helper.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_gem_vram_helper.c
   :export:
|  |  | 
|  | GEM TTM Helper Functions Reference | 
|  | ----------------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_gem_ttm_helper.c
   :doc: overview

.. kernel-doc:: drivers/gpu/drm/drm_gem_ttm_helper.c
   :export:
|  |  | 
|  | VMA Offset Manager | 
|  | ================== | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :doc: vma offset manager

.. kernel-doc:: include/drm/drm_vma_manager.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :export:
|  |  | 
|  | .. _prime_buffer_sharing: | 
|  |  | 
|  | PRIME Buffer Sharing | 
|  | ==================== | 
|  |  | 
PRIME is the cross-device buffer sharing framework in DRM, originally
created for the OPTIMUS range of multi-GPU platforms. To userspace,
PRIME buffers are dma-buf based file descriptors.
|  |  | 
|  | Overview and Lifetime Rules | 
|  | --------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :doc: overview and lifetime rules
|  |  | 
|  | PRIME Helper Functions | 
|  | ---------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :doc: PRIME Helpers
|  |  | 
|  | PRIME Function References | 
|  | ------------------------- | 
|  |  | 
.. kernel-doc:: include/drm/drm_prime.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :export:
|  |  | 
|  | DRM MM Range Allocator | 
|  | ====================== | 
|  |  | 
|  | Overview | 
|  | -------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: Overview
|  |  | 
|  | LRU Scan/Eviction Support | 
|  | ------------------------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: lru scan roster
|  |  | 
|  | DRM MM Range Allocator Function References | 
|  | ------------------------------------------ | 
|  |  | 
.. kernel-doc:: include/drm/drm_mm.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :export:
|  |  | 
|  | DRM Cache Handling | 
|  | ================== | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_cache.c
   :export:
|  |  | 
|  | DRM Sync Objects | 
|  | =========================== | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
   :doc: Overview

.. kernel-doc:: include/drm/drm_syncobj.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
   :export:
|  |  | 
|  | GPU Scheduler | 
|  | ============= | 
|  |  | 
|  | Overview | 
|  | -------- | 
|  |  | 
.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
   :doc: Overview
|  |  | 
|  | Scheduler Function References | 
|  | ----------------------------- | 
|  |  | 
.. kernel-doc:: include/drm/gpu_scheduler.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
   :export: