Sync the gameplay with audio and music

    Still, for some games (mainly rhythm games), player actions may need to be synchronized with something happening in a song (usually in sync with the BPM). For this, more precise timing information about the exact playback position is useful.

    Achieving very precise playback timing is difficult. This is because many factors are at play during audio playback:

    • Mixed chunks of audio are not played immediately.
    • When playing on TVs, some delay may be added due to image processing.

    The most common way to reduce latency is to shrink the audio buffers (again, by editing the latency setting in the project settings). The problem is that when latency is too small, sound mixing requires considerably more CPU. This increases the risk of skipping (a crackle in the sound because a mix callback was missed).

    This is a common tradeoff, so Godot ships with sensible defaults that should not need to be altered.

    The problem, in the end, is not this slight delay but synchronizing graphics and audio for games that require it. Beginning with Godot 3.2, some helpers were added to obtain more precise playback timing.

    As mentioned before, when you start playback, sound will not begin immediately, but only when the audio thread processes the next chunk. This delay cannot be avoided, but it can be estimated.

    The output latency (what happens after the mix) can also be estimated by calling AudioServer.get_output_latency().

    Add these two delays (the time until the next mix and the output latency) and it’s possible to estimate almost exactly when sound or music will begin playing from the speakers during _process():

    GDScript
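
    The sample that belongs here did not survive, so the following is a minimal sketch of the idea rather than the original listing. It assumes Godot 4 (Time.get_ticks_usec(); in Godot 3.x the equivalent call is OS.get_ticks_usec()), a child AudioStreamPlayer node with the hypothetical name "Player", and AudioServer.get_time_to_next_mix() as the estimate of the delay until the next chunk is mixed.

```gdscript
extends Node

# Assumption: a child AudioStreamPlayer named "Player" (hypothetical name).
var time_begin: int = 0      # System ticks (microseconds) when play() was called.
var time_delay: float = 0.0  # Estimated seconds until the sound is audible.


func _ready():
    # Godot 4 API; in Godot 3.x use OS.get_ticks_usec() instead.
    time_begin = Time.get_ticks_usec()
    # Delay until the next mix plus the delay between mixing and the speakers.
    time_delay = AudioServer.get_time_to_next_mix() + AudioServer.get_output_latency()
    $Player.play()


func _process(_delta):
    # Seconds elapsed on the system clock since play() was called.
    var time = (Time.get_ticks_usec() - time_begin) / 1000000.0
    # Compensate for the estimated mix and output delays.
    time -= time_delay
    # A negative value means the sound has not reached the speakers yet.
    time = max(0.0, time)
    print("Time is: ", time)
```

    The delay is sampled once, right before play(), because both estimates describe the moment playback is requested. After that, only the system clock is consulted.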

    In the long run, though, as the sound hardware clock is never exactly in sync with the system clock, the timing information will slowly drift away.

    For a rhythm game where a song begins and ends after a few minutes, this approach is fine (and it’s the recommended approach). For a game where playback can last a much longer time, the game will eventually go out of sync and a different approach is needed.

    Using get_playback_position() to obtain the current position for the song sounds ideal, but it’s not that useful as-is. This value increments in chunks (every time the audio callback mixes a block of sound), so many calls can return the same value. In addition, the value will be out of sync with the speakers because of the previously mentioned reasons.

    Adding the time elapsed since the last mix to the value returned by get_playback_position() increases precision:

    GDScript
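
    As a hedged sketch of this step (the original listing is missing here), the time since the last mix is assumed to come from AudioServer.get_time_since_last_mix(), and "Player" is a hypothetical name for an AudioStreamPlayer child that is already playing.

```gdscript
extends Node


func _process(_delta):
    # Chunk-aligned position reported by the player (hypothetical node "Player"),
    # plus the time that has passed since that chunk was mixed.
    var time = $Player.get_playback_position() + AudioServer.get_time_since_last_mix()
    print("Mixed position: ", time)
```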

    To increase precision further, subtract the output latency (how long it takes for the audio to be heard after it was mixed):

    GDScript
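
    A minimal sketch of the subtraction, under the same assumptions as before: a playing AudioStreamPlayer child with the hypothetical name "Player", and AudioServer.get_time_since_last_mix() as the source of the time elapsed since the last mix.

```gdscript
extends Node


func _process(_delta):
    var time = $Player.get_playback_position() + AudioServer.get_time_since_last_mix()
    # What was mixed "now" is only heard after the output latency has passed.
    time -= AudioServer.get_output_latency()
    print("Audible position: ", time)
```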

    The result may be a bit jittery due to how multiple threads work. Just check that the value is not less than in the previous frame (discard it if so). This approach is also less precise than the previous one, but it works for songs of any length, and for synchronizing anything (sound effects, for example) to music.

    Here is the same code as before using this approach:
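
    The listing itself is missing from this copy, so what follows is a sketch under assumptions, not the original code. It assumes a child AudioStreamPlayer with the hypothetical name "Player" and uses AudioServer.get_time_since_last_mix() for the time since the last mix. The last_time variable is an illustrative addition that implements the "discard values that go backwards" check described above.

```gdscript
extends Node

# Assumption: a child AudioStreamPlayer named "Player" (hypothetical name).
var last_time: float = 0.0  # Last accepted song position, in seconds.


func _ready():
    $Player.play()


func _process(_delta):
    var time = $Player.get_playback_position() + AudioServer.get_time_since_last_mix()
    # Compensate for output latency.
    time -= AudioServer.get_output_latency()
    # Discard a value that moved backwards (thread jitter) and reuse the last one.
    if time < last_time:
        time = last_time
    last_time = time
    print("Time is: ", time)
```

    If the stream loops or is restarted, last_time would need to be reset to 0.0, otherwise every position after the loop point would be discarded.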