OpenGL: Advanced Geometry Management – Synchronizing When OpenGL Begins to Draw

In an advanced application, OpenGL’s order of operation and the pipeline nature of the system may be important. Examples of such applications are those with multiple contexts and multiple threads, or those sharing data between OpenGL and other APIs such as OpenCL. In some cases, it may be necessary to determine whether commands sent to OpenGL have finished yet and whether the results of those commands are ready. OpenGL includes two commands to force it to start working on commands or to finish working on commands that have been issued so far. These are

glFlush();
 

and

glFinish();
 

There are subtle differences between the two. The first, glFlush, ensures that any commands issued so far are at least placed into the start of the OpenGL pipeline and that they will eventually be executed. glFinish, on the other hand, ensures that all commands issued have been fully executed and that the OpenGL pipeline is empty. The problem is that glFlush tells you nothing about the execution status of the commands issued, only that they will eventually be executed. And while glFinish does ensure that all of your OpenGL commands have been processed, it empties the OpenGL pipeline, causing a bubble and reducing performance, sometimes drastically.

Sometimes it may be necessary to know whether OpenGL has finished executing commands up to some point. This is especially useful when you are sharing data between two contexts or between OpenGL and OpenCL, for example. This type of synchronization is managed by what are known as sync objects. Like any other OpenGL object, they must be created before they are used and destroyed when they are no longer needed. Sync objects have two possible states: signaled and unsignaled. They start out in the unsignaled state, and when some particular event occurs, they move to the signaled state. The event that triggers their transition from unsignaled to signaled depends on their type. The type of sync object we are interested in is called a fence sync, and one can be created by calling

GLsync sync = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
 

The first parameter is a token specifying the event we’re going to wait for. In this case, GL_SYNC_GPU_COMMANDS_COMPLETE says that we want the GPU to have processed all commands in the pipeline before setting the state of the sync object to signaled. The second parameter is a flags field and is zero here because no flags are relevant for this type of sync object. The glFenceSync function returns a new GLsync object. As soon as the fence sync is created, it enters (in the unsignaled state) the OpenGL pipeline and is processed along with all the other commands without stalling OpenGL or consuming significant resources. When it reaches the end of the pipeline, it is “executed” like any other command, and this sets its state to signaled. Because of the in-order nature of OpenGL, this tells us that any OpenGL commands issued before the call to glFenceSync have completed, even though commands issued after the glFenceSync may not have reached the end of the pipeline yet.
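The in-order argument above can be made concrete with a toy model. The sketch below is not OpenGL code: `ToyPipeline`, `ToyFence`, and the `toy_*` functions are hypothetical names invented for illustration, and the model assumes only what the text states, namely that commands complete strictly in the order they were issued and that the fence is itself a command in that stream.

```c
#include <stdbool.h>

/* Toy model, not OpenGL: commands complete strictly in issue order. */
typedef struct {
    unsigned long issued;     /* commands submitted so far */
    unsigned long completed;  /* commands that have reached the end of the pipeline */
} ToyPipeline;

typedef struct {
    unsigned long position;   /* the fence's own place in the command stream */
} ToyFence;

/* Submit one ordinary command. */
void toy_issue(ToyPipeline *p)
{
    p->issued++;
}

/* Submit a fence. It travels down the pipeline like any other command. */
ToyFence toy_fence(ToyPipeline *p)
{
    ToyFence f;
    p->issued++;
    f.position = p->issued;
    return f;
}

/* Let the pipeline complete up to n pending commands, in order. */
void toy_retire(ToyPipeline *p, unsigned long n)
{
    unsigned long pending = p->issued - p->completed;
    p->completed += (n < pending) ? n : pending;
}

/* The fence is signaled once it has reached the end of the pipeline,
   which implies that every command issued before it has completed too. */
bool toy_fence_signaled(const ToyPipeline *p, ToyFence f)
{
    return p->completed >= f.position;
}
```

In this model a fence can report signaled while `completed` is still less than `issued`, which mirrors the point that commands issued after glFenceSync may still be in flight.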

Once the sync object has been created (and has therefore entered the OpenGL pipeline), we can query its state to find out if it’s reached the end of the pipeline yet, and we can ask OpenGL to wait for it to become signaled before returning to the application.

To determine whether the sync object has become signaled yet, call

glGetSynciv(sync, GL_SYNC_STATUS, 1, NULL, &result);
 

When glGetSynciv returns, result (which is a GLint) will contain GL_SIGNALED if the sync object was in the signaled state and GL_UNSIGNALED otherwise. This allows the application to poll the state of the sync object and use this information to potentially do some useful work while the GPU is busy with previous commands. For example, consider the code in Listing 1.

Listing 1. Working While Waiting for a Sync Object
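GLint result = GL_UNSIGNALED;
/* Query the fence once; bufSize counts GLint values, so 1 suffices here. */
glGetSynciv(sync, GL_SYNC_STATUS, 1, NULL, &result);
while (result != GL_SIGNALED) {
    /* The GPU is still busy, so use the time on the CPU. */
    DoSomeUsefulWork();
    /* Poll again to see whether the fence has been reached. */
    glGetSynciv(sync, GL_SYNC_STATUS, 1, NULL, &result);
}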


This code loops, doing a small amount of useful work on each iteration until the sync object becomes signaled. If the application were to create a sync object at the start of each frame, the application could wait for the sync object from two frames ago and do a variable amount of work depending on how long it takes the GPU to process the commands for that frame. This allows an application to balance the amount of work done by the CPU (such as the number of sound effects to mix together or the number of iterations of a physics simulation to run, for example) with the speed of the GPU.
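One way to keep track of "the sync object from two frames ago" is a small ring of fence slots indexed by frame number. The following is only a sketch of the bookkeeping, with no OpenGL calls in it; `FRAMES_IN_FLIGHT`, `fence_slot_for_frame`, and `fence_slot_to_wait_on` are hypothetical names, and the choice of three slots is an assumption made so that the slot being waited on is never the slot being written.

```c
enum { FRAMES_IN_FLIGHT = 3 };  /* the current frame plus the two behind it */

/* Slot in which to store the fence created at the start of `frame`. */
unsigned int fence_slot_for_frame(unsigned long frame)
{
    return (unsigned int)(frame % FRAMES_IN_FLIGHT);
}

/* Slot holding the fence from two frames ago,
   or -1 if no such frame exists yet (frames 0 and 1). */
int fence_slot_to_wait_on(unsigned long frame)
{
    if (frame < 2)
        return -1;
    return (int)((frame - 2) % FRAMES_IN_FLIGHT);
}
```

At the start of frame N, an application might poll the fence in the slot returned by `fence_slot_to_wait_on(N)` using the loop in Listing 1, delete it once it is signaled, and then store the fence for frame N in the slot returned by `fence_slot_for_frame(N)`.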

To actually cause OpenGL to wait for a sync object to become signaled (and therefore, for the commands in the pipeline before the sync to complete), there are two functions that you can use:

glClientWaitSync(sync, GL_SYNC_FLUSH_COMMANDS_BIT, timeout);
 

or

glWaitSync(sync, 0, GL_TIMEOUT_IGNORED);
 

The first parameter to both functions is the name of the sync object that was returned by glFenceSync. The second and third parameters to the two functions have the same names but must be set differently.

For glClientWaitSync, the second parameter is a bitfield specifying additional behavior of the function. The GL_SYNC_FLUSH_COMMANDS_BIT tells glClientWaitSync to ensure that the sync object has entered the OpenGL pipeline before beginning to wait for it to become signaled. Without this bit, there is a possibility that OpenGL could watch for a sync object that hasn’t been sent down the pipeline yet, and the application could end up waiting forever and hang. It’s a good idea to set this bit unless you have a really good reason not to. The third parameter is a timeout value in nanoseconds to wait. If the sync object doesn’t become signaled within this time, glClientWaitSync returns a status code to indicate so. glClientWaitSync won’t return until either the sync object becomes signaled or a timeout occurs.

There are four possible status codes that might be returned by glClientWaitSync. They are summarized in Table 2.

Table 2. Possible Return Values for glClientWaitSync
| Status Returned by glClientWaitSync | Meaning |
| --- | --- |
| GL_ALREADY_SIGNALED | The sync object was already signaled when glClientWaitSync was called and so the function returned immediately. |
| GL_TIMEOUT_EXPIRED | The timeout specified in the timeout parameter expired, meaning that the sync object never became signaled in the allowed time. |
| GL_CONDITION_SATISFIED | The sync object became signaled within the allowed timeout period (but was not already signaled when glClientWaitSync was called). |
| GL_WAIT_FAILED | An error occurred (such as sync not being a valid sync object), and the user should check the result of glGetError() to get more information. |
 

There are a couple of things to note about the timeout value. First, while the unit of measurement is nanoseconds, there is no accuracy requirement in OpenGL. If you specify that you want to wait for one nanosecond, OpenGL could round this up to the next millisecond or more. Second, if you specify a timeout value of zero, glClientWaitSync will return GL_ALREADY_SIGNALED if the sync object was in a signaled state at the time of the call and GL_TIMEOUT_EXPIRED otherwise. It will never return GL_CONDITION_SATISFIED.
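Because two of the four status codes mean "the sync object is signaled" and the timeout is expressed in nanoseconds, it can help to wrap both details in small helpers. The sketch below is one possible arrangement, not part of OpenGL: `WaitOutcome`, `classify_wait_status`, and `ms_to_ns` are hypothetical names. The token values are the ones defined in the standard OpenGL headers and are repeated here, behind a guard, only so that the fragment compiles without a GL header.

```c
#include <stdint.h>

/* Token values as defined in the OpenGL headers; the guard lets this
   fragment coexist with a real header that was included first. */
#ifndef GL_ALREADY_SIGNALED
#define GL_ALREADY_SIGNALED    0x911A
#define GL_TIMEOUT_EXPIRED     0x911B
#define GL_CONDITION_SATISFIED 0x911C
#define GL_WAIT_FAILED         0x911D
#endif

/* Hypothetical three-way summary of a glClientWaitSync result. */
typedef enum { WAIT_DONE, WAIT_PENDING, WAIT_ERROR } WaitOutcome;

WaitOutcome classify_wait_status(unsigned int status)
{
    switch (status) {
    case GL_ALREADY_SIGNALED:     /* signaled before the call */
    case GL_CONDITION_SATISFIED:  /* became signaled during the wait */
        return WAIT_DONE;
    case GL_TIMEOUT_EXPIRED:      /* still unsignaled; try again later */
        return WAIT_PENDING;
    case GL_WAIT_FAILED:          /* check glGetError() for details */
    default:
        return WAIT_ERROR;
    }
}

/* glClientWaitSync takes its timeout in nanoseconds. */
uint64_t ms_to_ns(uint64_t ms)
{
    return ms * UINT64_C(1000000);
}
```

With a current context, a call might then look like `classify_wait_status(glClientWaitSync(sync, GL_SYNC_FLUSH_COMMANDS_BIT, ms_to_ns(5)))`. Passing a timeout of zero turns the same call into a non-blocking poll that can only yield `WAIT_DONE` (via GL_ALREADY_SIGNALED), `WAIT_PENDING`, or `WAIT_ERROR`.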

For glWaitSync, the behavior is slightly different. The application won’t actually wait for the sync object to become signaled, only the GPU will. Therefore, glWaitSync will return to the application immediately. This makes the second and third parameters somewhat irrelevant. Because the application doesn’t wait for the function to return, there is no danger of hanging, and so the GL_SYNC_FLUSH_COMMANDS_BIT is not needed and would actually cause an error if specified. Also, the timeout will actually be implementation dependent and so the special timeout value GL_TIMEOUT_IGNORED is specified to make this clear. If you’re interested, you can find out what the timeout value used by your implementation is by calling glGetInteger64v with the GL_MAX_SERVER_WAIT_TIMEOUT parameter.

You might be wondering, “What is the point of asking the GPU to wait for a sync object to reach the end of the pipeline?” After all, the sync object will become signaled when it reaches the end of the pipeline, and so if you wait for it to reach the end of the pipeline, it will of course be signaled. Therefore, won’t glWaitSync just do nothing? This would be true if we only considered simple applications that only use a single OpenGL context and that don’t use other APIs. However, the power of sync objects is harnessed when using multiple OpenGL contexts. Sync objects can be shared between OpenGL contexts and between compatible APIs such as OpenCL. That is, a sync object created by a call to glFenceSync on one context can be waited for by a call to glWaitSync (or glClientWaitSync) on another context.

Consider this. You can ask one OpenGL context to hold off rendering something until another context has finished doing something. This allows synchronization between two contexts. You can have an application with two threads and two contexts (or more, if you want). If you create a sync object in each context, and then in each context you wait for the sync objects from the other contexts using glClientWaitSync, you know that when all of the functions have returned, all of those contexts are synchronized with each other. Together with thread synchronization primitives provided by your OS (such as semaphores), you can keep rendering to multiple windows in sync.

An example of this type of usage is when a buffer is shared between two contexts. The first context is writing to the buffer using transform feedback, while the second context wants to draw the results of the transform feedback. The first context would draw using transform feedback mode. After calling glEndTransformFeedback, it immediately calls glFenceSync. Now, the application makes the second context current and calls glWaitSync to wait for the sync object to become signaled. It can then issue more commands to OpenGL (on the new context), and those are queued up by the drivers, ready to execute. Only when the GPU has finished recording data into the transform feedback buffers with the first context does it start to work on the commands using that data in the second context.
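The ordering in this scenario can be illustrated with a toy model of two command queues that share one sync object. This is a simulation for illustration only, not OpenGL code: `ToySync`, `ToyQueue`, and the `toy_*` functions are hypothetical, with `CMD_SIGNAL` standing in for the fence created by glFenceSync and `CMD_WAIT` standing in for glWaitSync. The model assumes each queue executes in order and that a wait command at the head of a queue blocks only that queue.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { bool signaled; } ToySync;

typedef enum { CMD_WORK, CMD_SIGNAL, CMD_WAIT } ToyCmdKind;
typedef struct { ToyCmdKind kind; ToySync *sync; } ToyCmd;

enum { TOY_QUEUE_CAP = 8 };
typedef struct {
    ToyCmd cmds[TOY_QUEUE_CAP];
    int count;  /* commands queued so far */
    int next;   /* index of the next command to execute */
} ToyQueue;

void toy_queue_init(ToyQueue *q)
{
    q->count = 0;
    q->next = 0;
}

/* Queue a command. Like glWaitSync, queuing a wait returns immediately. */
bool toy_enqueue(ToyQueue *q, ToyCmdKind kind, ToySync *sync)
{
    if (q->count >= TOY_QUEUE_CAP)
        return false;
    q->cmds[q->count].kind = kind;
    q->cmds[q->count].sync = sync;
    q->count++;
    return true;
}

/* Execute at most one command; return true if the queue made progress.
   A wait at the head of the queue stalls until its sync is signaled. */
bool toy_step(ToyQueue *q)
{
    ToyCmd *c;
    if (q->next >= q->count)
        return false;
    c = &q->cmds[q->next];
    if (c->kind == CMD_WAIT && !c->sync->signaled)
        return false;
    if (c->kind == CMD_SIGNAL)
        c->sync->signaled = true;
    q->next++;
    return true;
}
```

If the producer queue holds a work command followed by a signal, and the consumer queue holds a wait followed by a work command, stepping the consumer makes no progress until the producer has executed its signal. The consumer's commands are already queued the whole time, which is the behavior described above for the second context.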

There are also extensions and other functionality in APIs like OpenCL that allow asynchronous writes to buffers. You can use glWaitSync to ask a GPU to wait until the data in a buffer is valid by creating a sync object on the context that generates the data and then waiting for that sync object to become signaled on the context that’s going to consume the data.

Sync objects only ever go from the unsignaled to the signaled state. There is no mechanism to put a sync object back into the unsignaled state, even manually. This is because a manual flip of a sync object can cause race conditions and possibly hang the application. Consider the situation where a sync object is created, reaches the end of the pipeline and becomes signaled, and then the application sets it back to unsignaled. If another thread tried to wait for that sync object but didn’t start waiting until after the application had already set the sync object back to the unsignaled state, it would wait forever. Each sync object therefore represents a one-shot event, and every time a synchronization is required, a new sync object must be created by calling glFenceSync. Although it is always important to clean up after yourself by deleting objects when you’re done with them, this is particularly important with sync objects because you might be creating many new ones every frame. To delete a sync object, call

glDeleteSync(sync);

This deletes the sync object. The deletion may not occur immediately; any thread that is watching for the sync object to become signaled will still wait for its respective timeout, and the object will actually be deleted once nobody’s watching it any more. Thus, it is perfectly legal to call glWaitSync followed by glDeleteSync even though the sync object is still in the OpenGL pipeline.