CUDA timer

CUDA events can be used to measure time because the operations within a single CUDA stream execute in order. Calling cudaEventRecord enqueues an event in a stream, and a timestamp is recorded when the stream reaches that event. cudaEventSynchronize waits until the event has been recorded, and cudaEventElapsedTime returns the time, in milliseconds, between two recorded events.
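The sequence above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the kernel scale and the helper time_scale_ms are hypothetical names made up for this example, the default stream (0) is assumed, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to give the GPU some work to time.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: returns the GPU time in milliseconds for one launch of scale().
// Error checking is omitted for brevity.
float time_scale_ms(int n) {
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start, 0);                  // enqueued in stream 0 before the kernel
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);                   // enqueued in stream 0 after the kernel

    cudaEventSynchronize(stop);                 // host waits until stop has been recorded

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);     // time between the two events, in ms

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return ms;
}

int main() {
    printf("kernel time: %.3f ms\n", time_scale_ms(1 << 20));
    return 0;
}
```

Because both events and the kernel go into the same stream, the stop event cannot be recorded before the kernel has finished, so the elapsed time covers the kernel execution.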

Creating an event with or without the cudaEventBlockingSync flag changes how the host waits: with the flag set, the host thread relinquishes the CPU while it waits for the event to complete; by default, the host thread busy-waits until the event completes.
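A rough sketch of how the flag is passed, assuming the events are created with cudaEventCreateWithFlags. The kernel busy_wait and the helper time_with_flags_ms are hypothetical names for this example only, and error checking is again omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel that keeps the GPU busy for roughly `cycles` clock ticks.
__global__ void busy_wait(long long cycles) {
    long long begin = clock64();
    while (clock64() - begin < cycles) { }
}

// Hypothetical helper: times one busy_wait launch with events created using `flags`.
float time_with_flags_ms(unsigned int flags) {
    cudaEvent_t start, stop;
    cudaEventCreateWithFlags(&start, flags);
    cudaEventCreateWithFlags(&stop, flags);

    cudaEventRecord(start, 0);
    busy_wait<<<1, 1>>>(10000000LL);
    cudaEventRecord(stop, 0);

    // With cudaEventBlockingSync the host thread blocks here and yields the CPU;
    // with cudaEventDefault it busy-waits until the event completes.
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    printf("default events:       %.3f ms\n", time_with_flags_ms(cudaEventDefault));
    printf("blocking-sync events: %.3f ms\n", time_with_flags_ms(cudaEventBlockingSync));
    return 0;
}
```

The measured GPU time should be similar in both cases, since the flag only affects how the host thread waits, not what the GPU does. The difference shows up in host CPU usage during cudaEventSynchronize.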

References:
CUDA blocking flags;
How to Implement Performance Metrics in CUDA C/C++;
cudaTimer.
