src/devices/graphics/implement-vsync.jd - platform/docs/source.android.com - Git at Google

 page.title=Implementing VSYNC
 @jd:body

 <!--
     Copyright 2016 The Android Open Source Project

     Licensed under the Apache License, Version 2.0 (the "License");
     you may not use this file except in compliance with the License.
     You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
 -->

 <div id="qv-wrapper">
   <div id="qv">
     <h2>In this document</h2>
     <ol id="auto-toc">
     </ol>
   </div>
 </div>


 <p>VSYNC synchronizes certain events to the refresh cycle of the display.
 Applications always start drawing on a VSYNC boundary, and SurfaceFlinger
 always composites on a VSYNC boundary. This eliminates stutters and improves
 visual performance of graphics.</p>

 <p>The Hardware Composer (HWC) has a function pointer indicating the function
 to implement for VSYNC:</p>

 <pre class=prettyprint> int (waitForVsync*) (int64_t *timestamp) </pre>

 <p>This function blocks until a VSYNC occurs and returns the timestamp of the
 actual VSYNC. A message must be sent every time VSYNC occurs. A client can
 receive a VSYNC timestamp once at specified intervals or continuously at
 intervals of 1. You must implement VSYNC with a maximum 1 ms lag (0.5 ms or less
 is recommended); timestamps returned must be extremely accurate.</p>

 <h2 id=explicit_synchronization>Explicit synchronization</h2>

 <p>Explicit synchronization is required and provides a mechanism for Gralloc
 buffers to be acquired and released in a synchronized way. Explicit
 synchronization allows producers and consumers of graphics buffers to signal
 when they are done with a buffer. This allows Android to asynchronously queue
 buffers to be read or written with the certainty that another consumer or
 producer does not currently need them. For details, see
 <a href="{@docRoot}devices/graphics/index.html#synchronization_framework">Synchronization
 framework</a>.</p>

 <p>The benefits of explicit synchronization include less behavior variation
 between devices, better debugging support, and improved testing metrics. For
 instance, the sync framework output readily identifies problem areas and root
 causes, and centralized SurfaceFlinger presentation timestamps show when events
 occur in the normal flow of the system.</p>

 <p>This communication is facilitated by the use of synchronization fences,
 which are required when requesting a buffer for consuming or producing. The
 synchronization framework consists of three main building blocks:
 <code>sync_timeline</code>, <code>sync_pt</code>, and <code>sync_fence</code>.</p>

 <h3 id=sync_timeline>sync_timeline</h3>

 <p>A <code>sync_timeline</code> is a monotonically increasing timeline that
 should be implemented for each driver instance, such as a GL context, display
 controller, or 2D blitter. This is essentially a counter of jobs submitted to
 the kernel for a particular piece of hardware. It provides guarantees about the
 order of operations and allows hardware-specific implementations.</p>

 <p>The sync_timeline is offered as a CPU-only reference implementation called
 <code>sw_sync</code> (software sync). If possible, use this instead of a
 <code>sync_timeline</code> to save resources and avoid complexity. If you’re not
 employing a hardware resource, <code>sw_sync</code> should be sufficient.</p>

 <p>If you must implement a <code>sync_timeline</code>, use the
 <code>sw_sync</code> driver as a starting point. Follow these guidelines:</p>

 <ul>
 <li>Provide useful names for all drivers, timelines, and fences. This simplifies
 debugging.</li>
 <li>Implement <code>timeline_value_str</code> and <code>pt_value_str</code>
 operators in your timelines to make debugging output more readable.</li>
 <li>If you want your userspace libraries (such as the GL library) to have access
 to the private data of your timelines, implement the fill driver_data operator.
 This lets you get information about the immutable sync_fence and
 <code>sync_pts</code> so you can build command lines based upon them.</li>
 </ul>

 <p>When implementing a <code>sync_timeline</code>, <strong>do not</strong>:</p>

 <ul>
 <li>Base it on any real view of time, such as when a wall clock or other piece
 of work might finish. It is better to create an abstract timeline that you can
 control.</li>
 <li>Allow userspace to explicitly create or signal a fence. This can result in
 one piece of the user pipeline creating a denial-of-service attack that halts
 all functionality. This is because the userspace cannot make promises on behalf
 of the kernel.</li>
 <li>Access <code>sync_timeline</code>, <code>sync_pt</code>, or
 <code>sync_fence</code> elements explicitly, as the API should provide all
 required functions.</li>
 </ul>

 <h3 id=sync_pt>sync_pt</h3>

 <p>A <code>sync_pt</code> is a single value or point on a sync_timeline. A point
 has three states: active, signaled, and error. Points start in the active state
 and transition to the signaled or error states. For instance, when a buffer is
 no longer needed by an image consumer, this sync_point is signaled so image
 producers know it is okay to write into the buffer again.</p>

 <h3 id=sync_fence>sync_fence</h3>

 <p>A <code>sync_fence</code> is a collection of <code>sync_pts</code> that often
 have different <code>sync_timeline</code> parents (such as for the display
 controller and GPU). These are the main primitives over which drivers and
 userspace communicate their dependencies. A fence is a promise from the kernel
 given upon accepting work that has been queued and assures completion in a
 finite amount of time.</p>

 <p>This allows multiple consumers or producers to signal they are using a
 buffer and to allow this information to be communicated with one function
 parameter. Fences are backed by a file descriptor and can be passed from
 kernel-space to user-space. For instance, a fence can contain two
 <code>sync_points</code> that signify when two separate image consumers are done
 reading a buffer. When the fence is signaled, the image producers know both
 consumers are done consuming.</p>

 <p>Fences, like <code>sync_pts</code>, start active and then change state based
 upon the state of their points. If all <code>sync_pts</code> become signaled,
 the <code>sync_fence</code> becomes signaled. If one <code>sync_pt</code> falls
 into an error state, the entire sync_fence has an error state.</p>

 <p>Membership in the <code>sync_fence</code> is immutable after the fence is
 created. As a <code>sync_pt</code> can be in only one fence, it is included as a
 copy. Even if two points have the same value, there will be two copies of the
 <code>sync_pt</code> in the fence. To get more than one point in a fence, a
 merge operation is conducted where points from two distinct fences are added to
 a third fence. If one of those points was signaled in the originating fence and
 the other was not, the third fence will also not be in a signaled state.</p>

 <p>To implement explicit synchronization, provide the following:</p>

 <ul>
 <li>A kernel-space driver that implements a synchronization timeline for a
 particular piece of hardware. Drivers that need to be fence-aware are generally
 anything that accesses or communicates with the Hardware Composer. Key files
 include:
 <ul>
 <li>Core implementation:
 <ul>
  <li><code>kernel/common/include/linux/sync.h</code></li>
  <li><code>kernel/common/drivers/base/sync.c</code></li>
 </ul></li>
 <li><code>sw_sync</code>:
 <ul>
  <li><code>kernel/common/include/linux/sw_sync.h</code></li>
  <li><code>kernel/common/drivers/base/sw_sync.c</code></li>
 </ul></li>
 <li>Documentation at <code>kernel/common//Documentation/sync.txt</code>.</li>
 <li>Library to communicate with the kernel-space in
  <code>platform/system/core/libsync</code>.</li>
 </ul></li>
 <li>A Hardware Composer HAL module (v1.3 or higher) that supports the new
 synchronization functionality. You must provide the appropriate synchronization
 fences as parameters to the <code>set()</code> and <code>prepare()</code>
 functions in the HAL.</li>
 <li>Two fence-related GL extensions (<code>EGL_ANDROID_native_fence_sync</code>
 and <code>EGL_ANDROID_wait_sync</code>) and fence support in your graphics
 drivers.</li>
 </ul>

 <p>For example, to use the API supporting the synchronization function, you
 might develop a display driver that has a display buffer function. Before the
 synchronization framework existed, this function would receive dma-bufs, put
 those buffers on the display, and block while the buffer is visible. For
 example:</p>

 <pre class=prettyprint>/*
  * assumes buf is ready to be displayed.  returns when buffer is no longer on
  * screen.
  */
 void display_buffer(struct dma_buf *buf);
 </pre>

 <p>With the synchronization framework, the API call is slightly more complex.
 While putting a buffer on display, you associate it with a fence that says when
 the buffer will be ready. You can queue up the work and initiate after the fence
 clears.</p>

 <p>In this manner, you are not blocking anything. You immediately return your
 own fence, which is a guarantee of when the buffer will be off of the display.
 As you queue up buffers, the kernel will list dependencies with the
 synchronization framework:</p>

 <pre class=prettyprint>/*
  * will display buf when fence is signaled.  returns immediately with a fence
  * that will signal when buf is no longer displayed.
  */
 struct sync_fence* display_buffer(struct dma_buf *buf, struct sync_fence
 *fence);
 </pre>


 <h2 id=sync_integration>Sync integration</h2>
 <p>This section explains how to integrate the low-level sync framework with
 different parts of the Android framework and the drivers that must communicate
 with one another.</p>

 <h3 id=integration_conventions>Integration conventions</h3>

 <p>The Android HAL interfaces for graphics follow consistent conventions so
 when file descriptors are passed across a HAL interface, ownership of the file
 descriptor is always transferred. This means:</p>

 <ul>
 <li>If you receive a fence file descriptor from the sync framework, you must
 close it.</li>
 <li>If you return a fence file descriptor to the sync framework, the framework
 will close it.</li>
 <li>To continue using the fence file descriptor, you must duplicate the
 descriptor.</li>
 </ul>

 <p>Every time a fence passes through BufferQueue (such as for a window that
 passes a fence to BufferQueue saying when its new contents will be ready) the
 fence object is renamed. Since kernel fence support allows fences to have
 strings for names, the sync framework uses the window name and buffer index
 that is being queued to name the fence (i.e., <code>SurfaceView:0</code>). This
 is helpful in debugging to identify the source of a deadlock as the names appear
 in the output of <code>/d/sync</code> and bug reports.</p>

 <h3 id=anativewindow_integration>ANativeWindow integration</h3>

 <p>ANativeWindow is fence aware and <code>dequeueBuffer</code>,
 <code>queueBuffer</code>, and <code>cancelBuffer</code> have fence parameters.
 </p>

 <h3 id=opengl_es_integration>OpenGL ES integration</h3>

 <p>OpenGL ES sync integration relies upon two EGL extensions:</p>

 <ul>
 <li><code>EGL_ANDROID_native_fence_sync</code>. Provides a way to either
 wrap or create native Android fence file descriptors in EGLSyncKHR objects.</li>
 <li><code>EGL_ANDROID_wait_sync</code>. Allows GPU-side stalls rather than in
 CPU, making the GPU wait for an EGLSyncKHR. This is essentially the same as the
 <code>EGL_KHR_wait_sync</code> extension (refer to that specification for
 details).</li>
 </ul>

 <p>These extensions can be used independently and are controlled by a compile
 flag in libgui. To use them, first implement the
 <code>EGL_ANDROID_native_fence_sync</code> extension along with the associated
 kernel support. Next, add a ANativeWindow support for fences to your driver then
 turn on support in libgui to make use of the
 <code>EGL_ANDROID_native_fence_sync</code> extension.</p>

 <p>In a second pass, enable the <code>EGL_ANDROID_wait_sync</code>
 extension in your driver and turn it on separately. The
 <code>EGL_ANDROID_native_fence_sync</code> extension consists of a distinct
 native fence EGLSync object type so extensions that apply to existing EGLSync
 object types don’t necessarily apply to <code>EGL_ANDROID_native_fence</code>
 objects to avoid unwanted interactions.</p>

 <p>The EGL_ANDROID_native_fence_sync extension employs a corresponding native
 fence file descriptor attribute that can be set only at creation time and
 cannot be directly queried onward from an existing sync object. This attribute
 can be set to one of two modes:</p>

 <ul>
 <li><em>A valid fence file descriptor</em>. Wraps an existing native Android
 fence file descriptor in an EGLSyncKHR object.</li>
 <li><em>-1</em>. Creates a native Android fence file descriptor from an
 EGLSyncKHR object.</li>
 </ul>

 <p>The DupNativeFenceFD function call is used to extract the EGLSyncKHR object
 from the native Android fence file descriptor. This has the same result as
 querying the attribute that was set but adheres to the convention that the
 recipient closes the fence (hence the duplicate operation). Finally, destroying
 the EGLSync object should close the internal fence attribute.</p>

 <h3 id=hardware_composer_integration>Hardware Composer integration</h3>

 <p>The Hardware Composer handles three types of sync fences:</p>

 <ul>
 <li><em>Acquire fence</em>. One per layer, set before calling
 <code>HWC::set</code>. It signals when Hardware Composer may read the buffer.</li>
 <li><em>Release fence</em>. One per layer, filled in by the driver in
 <code>HWC::set</code>. It signals when Hardware Composer is done reading the
 buffer so the framework can start using that buffer again for that particular
 layer.</li>
 <li><em>Retire fence</em>. One per the entire frame, filled in by the driver
 each time <code>HWC::set</code> is called. This covers all layers for the set
 operation and signals to the framework when all effects of this set operation
 have completed. The retire fence signals when the next set operation takes place
 on the screen.</li>
 </ul>

 <p>The retire fence can be used to determine how long each frame appears on the
 screen. This is useful in identifying the location and source of delays, such
 as a stuttering animation.</p>

 <h2 id=vsync_offset>VSYNC offset</h2>

 <p>Application and SurfaceFlinger render loops should be synchronized to the
 hardware VSYNC. On a VSYNC event, the display begins showing frame N while
 SurfaceFlinger begins compositing windows for frame N+1. The app handles
 pending input and generates frame N+2.</p>

 <p>Synchronizing with VSYNC delivers consistent latency. It reduces errors in
 apps and SurfaceFlinger and the drifting of displays in and out of phase with
 each other. This, however, does assume application and SurfaceFlinger per-frame
 times don’t vary widely. Nevertheless, the latency is at least two frames.</p>

 <p>To remedy this, you can employ VSYNC offsets to reduce the input-to-display
 latency by making application and composition signal relative to hardware
 VSYNC. This is possible because application plus composition usually takes less
 than 33 ms.</p>

 <p>The result of VSYNC offset is three signals with same period, offset
 phase:</p>

 <ul>
 <li><code>HW_VSYNC_0</code>. Display begins showing next frame.</li>
 <li><code>VSYNC</code>. App reads input and generates next frame.</li>
 <li><code>SF VSYNC</code>. SurfaceFlinger begins compositing for next frame.</li>
 </ul>

 <p>With VSYNC offset, SurfaceFlinger receives the buffer and composites the
 frame, while the application processes the input and renders the frame, all
 within a single frame of time.</p>

 <p class="note"><strong>Note:</strong> VSYNC offsets reduce the time available
 for app and composition and therefore provide a greater chance for error.</p>

 <h3 id=dispsync>DispSync</h3>

 <p>DispSync maintains a model of the periodic hardware-based VSYNC events of a
 display and uses that model to execute periodic callbacks at specific phase
 offsets from the hardware VSYNC events.</p>

 <p>DispSync is essentially a software phase lock loop (PLL) that generates the
 VSYNC and SF VSYNC signals used by Choreographer and SurfaceFlinger, even if
 not offset from hardware VSYNC.</p>

 <img src="images/dispsync.png" alt="DispSync flow">

 <p class="img-caption"><strong>Figure 1.</strong> DispSync flow</p>

 <p>DispSync has the following qualities:</p>

 <ul>
 <li><em>Reference</em>. HW_VSYNC_0.</li>
 <li><em>Output</em>. VSYNC and SF VSYNC.</li>
 <li><em>Feedback</em>. Retire fence signal timestamps from Hardware Composer.
 </li>
 </ul>

 <h3 id=vsync_retire_offset>VSYNC/Retire offset</h3>

 <p>The signal timestamp of retire fences must match HW VSYNC even on devices
 that don’t use the offset phase. Otherwise, errors appear to have greater
 severity than reality. Smart panels often have a delta: Retire fence is the end
 of direct memory access (DMA) to display memory, but the actual display switch
 and HW VSYNC is some time later.</p>

 <p><code>PRESENT_TIME_OFFSET_FROM_VSYNC_NS</code> is set in the device’s
 BoardConfig.mk make file. It is based upon the display controller and panel
 characteristics. Time from retire fence timestamp to HW VSYNC signal is
 measured in nanoseconds.</p>

 <h3 id=vsync_and_sf_vsync_offsets>VSYNC and SF_VSYNC offsets</h3>

 <p>The <code>VSYNC_EVENT_PHASE_OFFSET_NS</code> and
 <code>SF_VSYNC_EVENT_PHASE_OFFSET_NS</code> are set conservatively based on
 high-load use cases, such as partial GPU composition during window transition
 or Chrome scrolling through a webpage containing animations. These offsets
 allow for long application render time and long GPU composition time.</p>

 <p>More than a millisecond or two of latency is noticeable. We recommend
 integrating thorough automated error testing to minimize latency without
 significantly increasing error counts.</p>

 <p class="note"><strong>Note:</strong> Theses offsets are also configured in the
 device’s BoardConfig.mk file. Both settings are offset in nanoseconds after
 HW_VSYNC_0, default to zero (if not set), and can be negative.</p>
	page.title=Implementing VSYNC
	@jd:body

	<!--
	Copyright 2016 The Android Open Source Project

	Licensed under the Apache License, Version 2.0 (the "License");
	you may not use this file except in compliance with the License.
	You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->

	<div id="qv-wrapper">
	<div id="qv">
	<h2>In this document</h2>
	<ol id="auto-toc">
	</ol>
	</div>
	</div>


	<p>VSYNC synchronizes certain events to the refresh cycle of the display.
	Applications always start drawing on a VSYNC boundary, and SurfaceFlinger
	always composites on a VSYNC boundary. This eliminates stutters and improves
	visual performance of graphics.</p>

	<p>The Hardware Composer (HWC) has a function pointer indicating the function
	to implement for VSYNC:</p>

	<pre class=prettyprint> int (waitForVsync) (int64_t timestamp) </pre>

	<p>This function blocks until a VSYNC occurs and returns the timestamp of the
	actual VSYNC. A message must be sent every time VSYNC occurs. A client can
	receive a VSYNC timestamp once at specified intervals or continuously at
	intervals of 1. You must implement VSYNC with a maximum 1 ms lag (0.5 ms or less
	is recommended); timestamps returned must be extremely accurate.</p>

	<h2 id=explicit_synchronization>Explicit synchronization</h2>

	<p>Explicit synchronization is required and provides a mechanism for Gralloc
	buffers to be acquired and released in a synchronized way. Explicit
	synchronization allows producers and consumers of graphics buffers to signal
	when they are done with a buffer. This allows Android to asynchronously queue
	buffers to be read or written with the certainty that another consumer or
	producer does not currently need them. For details, see
	<a href="{@docRoot}devices/graphics/index.html#synchronization_framework">Synchronization
	framework</a>.</p>

	<p>The benefits of explicit synchronization include less behavior variation
	between devices, better debugging support, and improved testing metrics. For
	instance, the sync framework output readily identifies problem areas and root
	causes, and centralized SurfaceFlinger presentation timestamps show when events
	occur in the normal flow of the system.</p>

	<p>This communication is facilitated by the use of synchronization fences,
	which are required when requesting a buffer for consuming or producing. The
	synchronization framework consists of three main building blocks:
	<code>sync_timeline</code>, <code>sync_pt</code>, and <code>sync_fence</code>.</p>

	<h3 id=sync_timeline>sync_timeline</h3>

	<p>A <code>sync_timeline</code> is a monotonically increasing timeline that
	should be implemented for each driver instance, such as a GL context, display
	controller, or 2D blitter. This is essentially a counter of jobs submitted to
	the kernel for a particular piece of hardware. It provides guarantees about the
	order of operations and allows hardware-specific implementations.</p>

	<p>The sync_timeline is offered as a CPU-only reference implementation called
	<code>sw_sync</code> (software sync). If possible, use this instead of a
	<code>sync_timeline</code> to save resources and avoid complexity. If you’re not
	employing a hardware resource, <code>sw_sync</code> should be sufficient.</p>

	<p>If you must implement a <code>sync_timeline</code>, use the
	<code>sw_sync</code> driver as a starting point. Follow these guidelines:</p>

	<ul>
	<li>Provide useful names for all drivers, timelines, and fences. This simplifies
	debugging.</li>
	<li>Implement <code>timeline_value_str</code> and <code>pt_value_str</code>
	operators in your timelines to make debugging output more readable.</li>
	<li>If you want your userspace libraries (such as the GL library) to have access
	to the private data of your timelines, implement the fill driver_data operator.
	This lets you get information about the immutable sync_fence and
	<code>sync_pts</code> so you can build command lines based upon them.</li>
	</ul>

	<p>When implementing a <code>sync_timeline</code>, <strong>do not</strong>:</p>

	<ul>
	<li>Base it on any real view of time, such as when a wall clock or other piece
	of work might finish. It is better to create an abstract timeline that you can
	control.</li>
	<li>Allow userspace to explicitly create or signal a fence. This can result in
	one piece of the user pipeline creating a denial-of-service attack that halts
	all functionality. This is because the userspace cannot make promises on behalf
	of the kernel.</li>
	<li>Access <code>sync_timeline</code>, <code>sync_pt</code>, or
	<code>sync_fence</code> elements explicitly, as the API should provide all
	required functions.</li>
	</ul>

	<h3 id=sync_pt>sync_pt</h3>

	<p>A <code>sync_pt</code> is a single value or point on a sync_timeline. A point
	has three states: active, signaled, and error. Points start in the active state
	and transition to the signaled or error states. For instance, when a buffer is
	no longer needed by an image consumer, this sync_point is signaled so image
	producers know it is okay to write into the buffer again.</p>

	<h3 id=sync_fence>sync_fence</h3>

	<p>A <code>sync_fence</code> is a collection of <code>sync_pts</code> that often
	have different <code>sync_timeline</code> parents (such as for the display
	controller and GPU). These are the main primitives over which drivers and
	userspace communicate their dependencies. A fence is a promise from the kernel
	given upon accepting work that has been queued and assures completion in a
	finite amount of time.</p>

	<p>This allows multiple consumers or producers to signal they are using a
	buffer and to allow this information to be communicated with one function
	parameter. Fences are backed by a file descriptor and can be passed from
	kernel-space to user-space. For instance, a fence can contain two
	<code>sync_points</code> that signify when two separate image consumers are done
	reading a buffer. When the fence is signaled, the image producers know both
	consumers are done consuming.</p>

	<p>Fences, like <code>sync_pts</code>, start active and then change state based
	upon the state of their points. If all <code>sync_pts</code> become signaled,
	the <code>sync_fence</code> becomes signaled. If one <code>sync_pt</code> falls
	into an error state, the entire sync_fence has an error state.</p>

	<p>Membership in the <code>sync_fence</code> is immutable after the fence is
	created. As a <code>sync_pt</code> can be in only one fence, it is included as a
	copy. Even if two points have the same value, there will be two copies of the
	<code>sync_pt</code> in the fence. To get more than one point in a fence, a
	merge operation is conducted where points from two distinct fences are added to
	a third fence. If one of those points was signaled in the originating fence and
	the other was not, the third fence will also not be in a signaled state.</p>

	<p>To implement explicit synchronization, provide the following:</p>

	<ul>
	<li>A kernel-space driver that implements a synchronization timeline for a
	particular piece of hardware. Drivers that need to be fence-aware are generally
	anything that accesses or communicates with the Hardware Composer. Key files
	include:
	<ul>
	<li>Core implementation:
	<ul>
	<li><code>kernel/common/include/linux/sync.h</code></li>
	<li><code>kernel/common/drivers/base/sync.c</code></li>
	</ul></li>
	<li><code>sw_sync</code>:
	<ul>
	<li><code>kernel/common/include/linux/sw_sync.h</code></li>
	<li><code>kernel/common/drivers/base/sw_sync.c</code></li>
	</ul></li>
	<li>Documentation at <code>kernel/common//Documentation/sync.txt</code>.</li>
	<li>Library to communicate with the kernel-space in
	<code>platform/system/core/libsync</code>.</li>
	</ul></li>
	<li>A Hardware Composer HAL module (v1.3 or higher) that supports the new
	synchronization functionality. You must provide the appropriate synchronization
	fences as parameters to the <code>set()</code> and <code>prepare()</code>
	functions in the HAL.</li>
	<li>Two fence-related GL extensions (<code>EGL_ANDROID_native_fence_sync</code>
	and <code>EGL_ANDROID_wait_sync</code>) and fence support in your graphics
	drivers.</li>
	</ul>

	<p>For example, to use the API supporting the synchronization function, you
	might develop a display driver that has a display buffer function. Before the
	synchronization framework existed, this function would receive dma-bufs, put
	those buffers on the display, and block while the buffer is visible. For
	example:</p>

	<pre class=prettyprint>/*
	* assumes buf is ready to be displayed. returns when buffer is no longer on
	* screen.
	*/
	void display_buffer(struct dma_buf *buf);
	</pre>

	<p>With the synchronization framework, the API call is slightly more complex.
	While putting a buffer on display, you associate it with a fence that says when
	the buffer will be ready. You can queue up the work and initiate after the fence
	clears.</p>

	<p>In this manner, you are not blocking anything. You immediately return your
	own fence, which is a guarantee of when the buffer will be off of the display.
	As you queue up buffers, the kernel will list dependencies with the
	synchronization framework:</p>

	<pre class=prettyprint>/*
	* will display buf when fence is signaled. returns immediately with a fence
	* that will signal when buf is no longer displayed.
	*/
	struct sync_fence* display_buffer(struct dma_buf *buf, struct sync_fence
	*fence);
	</pre>


	<h2 id=sync_integration>Sync integration</h2>
	<p>This section explains how to integrate the low-level sync framework with
	different parts of the Android framework and the drivers that must communicate
	with one another.</p>

	<h3 id=integration_conventions>Integration conventions</h3>

	<p>The Android HAL interfaces for graphics follow consistent conventions so
	when file descriptors are passed across a HAL interface, ownership of the file
	descriptor is always transferred. This means:</p>

	<ul>
	<li>If you receive a fence file descriptor from the sync framework, you must
	close it.</li>
	<li>If you return a fence file descriptor to the sync framework, the framework
	will close it.</li>
	<li>To continue using the fence file descriptor, you must duplicate the
	descriptor.</li>
	</ul>

	<p>Every time a fence passes through BufferQueue (such as for a window that
	passes a fence to BufferQueue saying when its new contents will be ready) the
	fence object is renamed. Since kernel fence support allows fences to have
	strings for names, the sync framework uses the window name and buffer index
	that is being queued to name the fence (i.e., <code>SurfaceView:0</code>). This
	is helpful in debugging to identify the source of a deadlock as the names appear
	in the output of <code>/d/sync</code> and bug reports.</p>

	<h3 id=anativewindow_integration>ANativeWindow integration</h3>

	<p>ANativeWindow is fence aware and <code>dequeueBuffer</code>,
	<code>queueBuffer</code>, and <code>cancelBuffer</code> have fence parameters.
	</p>

	<h3 id=opengl_es_integration>OpenGL ES integration</h3>

	<p>OpenGL ES sync integration relies upon two EGL extensions:</p>

	<ul>
	<li><code>EGL_ANDROID_native_fence_sync</code>. Provides a way to either
	wrap or create native Android fence file descriptors in EGLSyncKHR objects.</li>
	<li><code>EGL_ANDROID_wait_sync</code>. Allows GPU-side stalls rather than in
	CPU, making the GPU wait for an EGLSyncKHR. This is essentially the same as the
	<code>EGL_KHR_wait_sync</code> extension (refer to that specification for
	details).</li>
	</ul>

	<p>These extensions can be used independently and are controlled by a compile
	flag in libgui. To use them, first implement the
	<code>EGL_ANDROID_native_fence_sync</code> extension along with the associated
	kernel support. Next, add a ANativeWindow support for fences to your driver then
	turn on support in libgui to make use of the
	<code>EGL_ANDROID_native_fence_sync</code> extension.</p>

	<p>In a second pass, enable the <code>EGL_ANDROID_wait_sync</code>
	extension in your driver and turn it on separately. The
	<code>EGL_ANDROID_native_fence_sync</code> extension consists of a distinct
	native fence EGLSync object type so extensions that apply to existing EGLSync
	object types don’t necessarily apply to <code>EGL_ANDROID_native_fence</code>
	objects to avoid unwanted interactions.</p>

	<p>The EGL_ANDROID_native_fence_sync extension employs a corresponding native
	fence file descriptor attribute that can be set only at creation time and
	cannot be directly queried onward from an existing sync object. This attribute
	can be set to one of two modes:</p>

	<ul>
	<li><em>A valid fence file descriptor</em>. Wraps an existing native Android
	fence file descriptor in an EGLSyncKHR object.</li>
	<li><em>-1</em>. Creates a native Android fence file descriptor from an
	EGLSyncKHR object.</li>
	</ul>

	<p>The DupNativeFenceFD function call is used to extract the EGLSyncKHR object
	from the native Android fence file descriptor. This has the same result as
	querying the attribute that was set but adheres to the convention that the
	recipient closes the fence (hence the duplicate operation). Finally, destroying
	the EGLSync object should close the internal fence attribute.</p>

	<h3 id=hardware_composer_integration>Hardware Composer integration</h3>

	<p>The Hardware Composer handles three types of sync fences:</p>

	<ul>
	<li><em>Acquire fence</em>. One per layer, set before calling
	<code>HWC::set</code>. It signals when Hardware Composer may read the buffer.</li>
	<li><em>Release fence</em>. One per layer, filled in by the driver in
	<code>HWC::set</code>. It signals when Hardware Composer is done reading the
	buffer so the framework can start using that buffer again for that particular
	layer.</li>
	<li><em>Retire fence</em>. One per the entire frame, filled in by the driver
	each time <code>HWC::set</code> is called. This covers all layers for the set
	operation and signals to the framework when all effects of this set operation
	have completed. The retire fence signals when the next set operation takes place
	on the screen.</li>
	</ul>

	<p>The retire fence can be used to determine how long each frame appears on the
	screen. This is useful in identifying the location and source of delays, such
	as a stuttering animation.</p>

	<h2 id=vsync_offset>VSYNC offset</h2>

	<p>Application and SurfaceFlinger render loops should be synchronized to the
	hardware VSYNC. On a VSYNC event, the display begins showing frame N while
	SurfaceFlinger begins compositing windows for frame N+1. The app handles
	pending input and generates frame N+2.</p>

	<p>Synchronizing with VSYNC delivers consistent latency. It reduces errors in
	apps and SurfaceFlinger and the drifting of displays in and out of phase with
	each other. This, however, does assume application and SurfaceFlinger per-frame
	times don’t vary widely. Nevertheless, the latency is at least two frames.</p>

	<p>To remedy this, you can employ VSYNC offsets to reduce the input-to-display
	latency by making application and composition signal relative to hardware
	VSYNC. This is possible because application plus composition usually takes less
	than 33 ms.</p>

	<p>The result of VSYNC offset is three signals with same period, offset
	phase:</p>

	<ul>
	<li><code>HW_VSYNC_0</code>. Display begins showing next frame.</li>
	<li><code>VSYNC</code>. App reads input and generates next frame.</li>
	<li><code>SF VSYNC</code>. SurfaceFlinger begins compositing for next frame.</li>
	</ul>

	<p>With VSYNC offset, SurfaceFlinger receives the buffer and composites the
	frame, while the application processes the input and renders the frame, all
	within a single frame of time.</p>

	<p class="note"><strong>Note:</strong> VSYNC offsets reduce the time available
	for app and composition and therefore provide a greater chance for error.</p>

	<h3 id=dispsync>DispSync</h3>

	<p>DispSync maintains a model of the periodic hardware-based VSYNC events of a
	display and uses that model to execute periodic callbacks at specific phase
	offsets from the hardware VSYNC events.</p>

	<p>DispSync is essentially a software phase lock loop (PLL) that generates the
	VSYNC and SF VSYNC signals used by Choreographer and SurfaceFlinger, even if
	not offset from hardware VSYNC.</p>

	<img src="images/dispsync.png" alt="DispSync flow">

	<p class="img-caption"><strong>Figure 1.</strong> DispSync flow</p>

	<p>DispSync has the following qualities:</p>

	<ul>
	<li><em>Reference</em>. HW_VSYNC_0.</li>
	<li><em>Output</em>. VSYNC and SF VSYNC.</li>
	<li><em>Feedback</em>. Retire fence signal timestamps from Hardware Composer.
	</li>
	</ul>

	<h3 id=vsync_retire_offset>VSYNC/Retire offset</h3>

	<p>The signal timestamp of retire fences must match HW VSYNC even on devices
	that don’t use the offset phase. Otherwise, errors appear to have greater
	severity than reality. Smart panels often have a delta: Retire fence is the end
	of direct memory access (DMA) to display memory, but the actual display switch
	and HW VSYNC is some time later.</p>

	<p><code>PRESENT_TIME_OFFSET_FROM_VSYNC_NS</code> is set in the device’s
	BoardConfig.mk make file. It is based upon the display controller and panel
	characteristics. Time from retire fence timestamp to HW VSYNC signal is
	measured in nanoseconds.</p>

	<h3 id=vsync_and_sf_vsync_offsets>VSYNC and SF_VSYNC offsets</h3>

	<p>The <code>VSYNC_EVENT_PHASE_OFFSET_NS</code> and
	<code>SF_VSYNC_EVENT_PHASE_OFFSET_NS</code> are set conservatively based on
	high-load use cases, such as partial GPU composition during window transition
	or Chrome scrolling through a webpage containing animations. These offsets
	allow for long application render time and long GPU composition time.</p>

	<p>More than a millisecond or two of latency is noticeable. We recommend
	integrating thorough automated error testing to minimize latency without
	significantly increasing error counts.</p>

	<p class="note"><strong>Note:</strong> Theses offsets are also configured in the
	device’s BoardConfig.mk file. Both settings are offset in nanoseconds after
	HW_VSYNC_0, default to zero (if not set), and can be negative.</p>