5. RTP Media API - 《WebRTC v1.0 Documentation》

Note

There is not an exact 1:1 correspondence between tracks sent by one RTCPeerConnection and received by the other. For one, IDs of tracks sent have no mapping to the IDs of tracks received. Also, replaceTrack changes the track sent by an RTCRtpSender without creating a new track on the receiver side; the corresponding RTCRtpReceiver will only have a single track, potentially representing multiple sources of media stitched together. Both addTransceiver and replaceTrack can be used to cause the same track to be sent multiple times, which will be observed on the receiver side as multiple receivers, each with its own separate track. Thus it’s more accurate to think of a 1:1 relationship between an RTCRtpSender on one side and an RTCRtpReceiver’s track on the other side, matching senders and receivers using the RTCRtpTransceiver’s mid if necessary.
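The matching by mid described in this note could be sketched as follows. This is a minimal illustration, not part of the specification: the helper names midForSender and receiverTrackForMid are hypothetical, and it assumes the application passes the mid value from one side to the other over its own signaling channel.

```javascript
// Hypothetical helper: on the sending side, find the mid of the
// transceiver that owns a given sender (null if not found or not yet associated).
function midForSender(pc, sender) {
  const transceiver = pc.getTransceivers().find((t) => t.sender === sender);
  return transceiver ? transceiver.mid : null;
}

// Hypothetical helper: on the receiving side, find the track of the
// receiver whose transceiver has the given mid (null if there is none).
function receiverTrackForMid(pc, mid) {
  const transceiver = pc.getTransceivers().find((t) => t.mid === mid);
  return transceiver ? transceiver.receiver.track : null;
}
```

Track IDs are deliberately not compared here, since the IDs of sent tracks have no mapping to the IDs of received tracks.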

When sending media, the sender may need to rescale or resample the media to meet various requirements including the envelope negotiated by SDP.

Following the rules in [JSEP] (section 3.6), the video MAY be downscaled in order to fit the SDP constraints. The media MUST NOT be upscaled to create fake data that did not occur in the input source, the media MUST NOT be cropped except as needed to satisfy constraints on pixel counts, and the aspect ratio MUST NOT be changed.
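To make these rules concrete, here is a rough sketch of a downscale-only fit that keeps the aspect ratio. It assumes the negotiated envelope can be expressed as a maximum width and height in integer pixels; that representation and the name fitWithinEnvelope are assumptions made for illustration, not something a user agent exposes.

```javascript
// Hypothetical helper: fit a source resolution inside a maximum
// width/height without upscaling and without cropping.
function fitWithinEnvelope(source, max) {
  // Already inside the envelope: send as-is, never upscale.
  if (source.width <= max.width && source.height <= max.height) {
    return { width: source.width, height: source.height };
  }
  // Compare the two scale factors by cross-multiplication to stay in integers.
  if (max.width * source.height <= max.height * source.width) {
    // Width is the tighter limit; derive height from the same ratio.
    return {
      width: max.width,
      height: Math.trunc((source.height * max.width) / source.width),
    };
  }
  // Height is the tighter limit; derive width from the same ratio.
  return {
    width: Math.trunc((source.width * max.height) / source.height),
    height: max.height,
  };
}
```

Because the derived dimension is truncated to an integer, the aspect ratio is preserved only up to one pixel of rounding in this sketch.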

The WebRTC Working Group is seeking implementation feedback on the need and timeline for a more complex handling of this situation. Some possible designs have been discussed in .

When video is rescaled, for example for certain combinations of width or height and scaleResolutionDownBy values, the resulting width or height may not be an integer. In such situations the user agent MUST use the integer part of the result. What to transmit if the integer part of the scaled width or height is zero is implementation-specific.
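The integer-part rule can be illustrated with a small calculation. This is a sketch only: scaledSize is a hypothetical name, and rejecting factors below 1.0 is included here to mirror the no-upscaling rule above rather than to reproduce any particular implementation.

```javascript
// Hypothetical helper: compute the size that results from dividing a
// source resolution by scaleResolutionDownBy, keeping the integer part.
function scaledSize(width, height, scaleResolutionDownBy = 1.0) {
  if (scaleResolutionDownBy < 1.0) {
    // A factor below 1.0 would mean upscaling.
    throw new RangeError('scaleResolutionDownBy must be at least 1.0');
  }
  return {
    width: Math.trunc(width / scaleResolutionDownBy),
    height: Math.trunc(height / scaleResolutionDownBy),
  };
}
```

For example, scaledSize(1280, 720, 3) yields a width of 426 (from 426.67) and a height of 240. A call such as scaledSize(1, 1, 2) yields zero in both dimensions, which is the case the text leaves implementation-specific.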

The actual encoding and transmission of MediaStreamTracks is managed through objects called RTCRtpSenders. Similarly, the reception and decoding of MediaStreamTracks is managed through objects called RTCRtpReceivers. Each RTCRtpSender is associated with at most one track, and each track to be received is associated with exactly one RTCRtpReceiver.
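As an illustration of the sender and receiver relationship, the following sketch summarizes what a connection is sending and receiving. The helper name summarizeMedia and the shape of its return value are hypothetical; it relies only on getSenders(), getReceivers(), and the track attribute of each object.

```javascript
// Hypothetical helper: list the kinds of tracks being sent and received.
function summarizeMedia(pc) {
  const senders = pc.getSenders();
  return {
    // A sender has at most one track, so its track may be null.
    sending: senders.filter((s) => s.track !== null).map((s) => s.track.kind),
    sendersWithoutTrack: senders.filter((s) => s.track === null).length,
    // Every receiver has exactly one track.
    receiving: pc.getReceivers().map((r) => r.track.kind),
  };
}
```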

The encoding and transmission of each MediaStreamTrack SHOULD be made such that its characteristics (width, height and frameRate for video tracks; sampleSize, sampleRate and channelCount for audio tracks) are to a reasonable degree retained by the track created on the remote side. There are situations when this does not apply: for example, there may be resource constraints at either endpoint or in the network, or there may be settings applied that instruct the implementation to act differently.

In order for an RTCRtpTransceiver to send and/or receive media with another endpoint, this must be negotiated with SDP such that both endpoints have an RTCRtpTransceiver object that is associated with the same media description.

When creating an offer, enough media descriptions will be generated to cover all transceivers on that end. When this offer is set as the local description, any disassociated transceivers get associated with media descriptions in the offer.

When an offer is set as the remote description, any media descriptions in it not yet associated with a transceiver get associated with a new or existing transceiver. In this case, only disassociated transceivers that were created via the addTrack() method may be associated. Disassociated transceivers created via the addTransceiver() method, however, won’t get associated even if media descriptions are available in the remote offer. Instead, new transceivers will be created and associated if there aren’t enough addTrack()-created transceivers. This sets addTrack()-created and addTransceiver()-created transceivers apart in a critical way that is not observable from inspecting their attributes.
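The association rule for a remote offer can be modeled with plain objects. This is a toy model for illustration, not the user agent's algorithm: the createdBy field is hypothetical (the text notes that the real distinction is not observable from a transceiver's attributes), and the model additionally assumes that a transceiver is reused only for a media description of the same kind.

```javascript
// Toy model. Transceivers look like { kind, createdBy, mid } with mid === null
// while disassociated; media descriptions look like { mid, kind }.
function applyRemoteOffer(transceivers, mediaDescriptions) {
  const result = transceivers.map((t) => ({ ...t })); // leave the input untouched
  for (const md of mediaDescriptions) {
    // Skip media descriptions that already have a transceiver.
    if (result.some((t) => t.mid === md.mid)) continue;
    // Only disassociated, addTrack()-created transceivers may be reused.
    let match = result.find(
      (t) => t.mid === null && t.createdBy === 'addTrack' && t.kind === md.kind
    );
    if (!match) {
      // Otherwise a new transceiver is created for this media description.
      match = { kind: md.kind, createdBy: 'remoteOffer', mid: null };
      result.push(match);
    }
    match.mid = md.mid;
  }
  return result;
}
```

In this model, an addTransceiver()-created video transceiver stays disassociated even when the offer contains a video media description, and a third transceiver is created to take that description instead.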

When creating an answer, only media descriptions that were present in the offer may be listed in the answer. As a consequence, any transceivers that were not associated when setting the remote offer remain disassociated after setting the local answer. This can be remedied by the answerer creating a follow-up offer, initiating another offer/answer exchange, or in the case of using addTrack()-created transceivers, making sure that enough media descriptions are offered in the initial exchange.
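One way an answerer might detect this situation is to check for transceivers whose mid is still null once the exchange has completed. The helper name hasDisassociatedTransceivers is hypothetical, and the sketch deliberately ignores stopped transceivers for brevity.

```javascript
// Hypothetical helper: true if any transceiver has not been associated
// with a media description (its mid is still null).
function hasDisassociatedTransceivers(pc) {
  return pc.getTransceivers().some((t) => t.mid === null);
}
```

If this returns true after the local answer has been set, the application could start another offer/answer round from the answerer's side, for example with createOffer() followed by setLocalDescription().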