Knowing when a WebRTC session has broken
Once a WebRTC session has been established, both WebRTC and the application need to know that media can still flow between the parties and that nothing has broken during the session. Liveness is verified through STUN checks. A key question then arises: at what interval should these STUN checks be made?
The tradeoff is:
- Checking very frequently puts extra traffic on the media path, traffic that carries no media at all
- Checking too rarely leads to a bad user experience when a connection is lost: it takes longer to determine that the connection has failed, which increases the overall time needed to reestablish it
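The two sides of this tradeoff can be put into numbers with a small sketch. It is a simplified model, not how any browser actually decides: the function name `check_tradeoff` and the `missed_before_failure` policy (how many consecutive unanswered checks are tolerated before declaring the connection lost) are assumptions made for illustration.

```python
def check_tradeoff(interval_ms: float, missed_before_failure: int = 3) -> dict:
    """Return the check rate and a rough failure detection time.

    missed_before_failure is a hypothetical policy, not a value taken
    from any WebRTC implementation.
    """
    if interval_ms <= 0 or missed_before_failure < 1:
        raise ValueError("interval and missed-check count must be positive")
    # More frequent checks mean more non-media packets on the media path.
    checks_per_second = 1000.0 / interval_ms
    # Simplified model: failure is declared after N consecutive missed checks.
    detection_ms = interval_ms * missed_before_failure
    return {"checks_per_second": checks_per_second, "detection_ms": detection_ms}


if __name__ == "__main__":
    for interval in (50, 500, 5000):
        print(interval, check_tradeoff(interval))
```

Shorter intervals drive the check rate up and the detection time down; longer intervals do the opposite.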
Product managers should be aware of this tradeoff so they know what to expect from their application, what to communicate to customers, and what to require from R&D. Moreover, STUN intervals may vary between browsers.
The standards currently say little about this. The interval is more of a WebRTC implementation decision than something standardized, so actual intervals may vary between WebRTC implementations (that is, browsers). Guidelines may be included in standards documents at some point in the future.
STUN checks serve two purposes:
- Ensure that the connection is still alive
- Ensure that the remote party still gives permission for media to be delivered
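As a rough illustration of how a single check response can serve both purposes, the sketch below tracks the time of the last answered check and treats the connection as dead (and permission as lapsed) once no response has arrived within a timeout. The class `LivenessTracker`, its method names, and the timeout value are all hypothetical; this is not any browser's implementation.

```python
import time


class LivenessTracker:
    """Tracks whether a peer has answered a check recently enough."""

    def __init__(self, timeout_s: float, now=time.monotonic):
        self._timeout_s = timeout_s
        self._now = now  # injectable clock, which makes the class easy to test
        self._last_response = now()

    def on_check_response(self) -> None:
        # A response shows the path works and the peer still accepts media.
        self._last_response = self._now()

    def is_alive(self) -> bool:
        # Alive only if the last response is within the allowed timeout.
        return (self._now() - self._last_response) <= self._timeout_s
```

An application would call `on_check_response()` whenever a check is answered and poll `is_alive()` to decide when to stop sending media and start reconnecting.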
As mentioned above, there is a tradeoff between how rapidly a device might want to declare a connection failure and how “expensive” it is to do the checks.
Chrome is now running experiments in the background whenever users use WebRTC to get some idea of reasonable values for the interval between STUN checks.
On the lower bound, Chrome has found real problems with intervals shorter than 20 ms (meaning checks that are less than 20 ms apart) due to overall network delays. Interestingly, such checks typically have a data size of 156 bytes, so sending one every 20 ms would amount to roughly 62 kbps (156 bytes × 8 bits × 50 checks per second) in STUN checks alone.
Chrome is currently using an interval of 50 ms, corresponding to about 25 kbps of data. The Chrome team is running additional experiments to determine whether this is truly a good value to use.
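The bandwidth figures for a given interval follow from simple arithmetic. The sketch below counts only the 156-byte check size mentioned above, in one direction, and ignores any lower-layer packet overhead; the function name `stun_overhead_kbps` is an assumption for illustration.

```python
def stun_overhead_kbps(packet_bytes: int, interval_ms: float) -> float:
    """Bitrate in kbps consumed by sending one check every interval_ms."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    bits_per_check = packet_bytes * 8
    checks_per_second = 1000.0 / interval_ms
    return bits_per_check * checks_per_second / 1000.0


if __name__ == "__main__":
    for interval in (20, 50):
        rate = stun_overhead_kbps(156, interval)
        print(f"{interval} ms -> {rate:.2f} kbps")
```

With these inputs the result is 62.4 kbps at 20 ms and 24.96 kbps at 50 ms, close to the figures quoted above.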
Once Google has finished its testing, it will likely propose guidelines to be included in the standards. There is currently disagreement about whether STUN check intervals need to be part of the standard, so it is not clear whether they will be standardized.