RTP Protocol
In this tutorial, we look at what the RTP protocol is.
The spread of computers, together with inexpensive audio/video hardware and higher-speed links, has created interest in using the Internet to send audio and video, types of data that were traditionally reserved for specialized networks. For some years now, audio and video conferencing have been common practice.
However, the Internet was not designed for real-time data transmission, and as a result the audio sent over it is, on average, of poor quality. This study analyzes these problems and proposes solutions that allow an audio conferencing or telephony application to adapt its behavior and maintain acceptable audio quality even when the network is heavily congested. These solutions, presented as control mechanisms, have been implemented and tested in Free Phone, the audio conferencing and Internet telephony software that we have developed. A study of how these mechanisms would behave in an Internet that had evolved to integrate the Fair Queueing service discipline showed that they would still be necessary and would perform even better in this type of network.
What is the RTP Protocol?
The purpose of RTP is to provide a uniform way to transmit data with real-time constraints (audio, video, etc.) over IP. Its main role is to number the packets it carries so that the voice or video information can be reconstructed even if the underlying network changes the order of the packets.
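As an illustration of how a receiver might use these numbers, here is a minimal sketch that puts a small batch of packets back into sending order. The function name `reorder` and the `(sequence number, payload)` tuples are hypothetical, and the sketch assumes that all packets in the batch lie within a window of fewer than 32768 sequence numbers, so that the wraparound of the 16-bit counter can be resolved.

```python
def reorder(packets):
    """Return (seq, payload) pairs sorted into sending order.

    Sequence numbers are 16-bit values that wrap from 65535 back to 0,
    so a plain numeric sort is not enough. Each number is instead
    compared by its signed distance from the first packet received.
    """
    if not packets:
        return []
    base = packets[0][0]

    def distance(seq):
        d = (seq - base) & 0xFFFF          # unsigned distance modulo 2**16
        return d - 0x10000 if d >= 0x8000 else d  # map to -32768..32767

    return sorted(packets, key=lambda p: distance(p[0]))


# Packets received out of order, across the 65535 -> 0 wraparound.
received = [(65534, b"a"), (0, b"c"), (65535, b"b"), (1, b"d")]
ordered = reorder(received)
```

A real receiver would usually do this incrementally in a playout buffer rather than sorting a complete batch.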
More generally, RTP makes it possible to:
- identify the type of information being transported,
- add timestamps and sequence numbers to the information being transported,
- monitor the arrival of packets at the destination.
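To show how timestamps help monitor packet arrival, the sketch below computes a smoothed interarrival jitter estimate of the kind described in the RTP specification: for each pair of consecutive packets it compares the spacing at arrival with the spacing of the timestamps, and folds the absolute difference into a running average with a gain of 1/16. The function name `interarrival_jitter` is hypothetical, arrival times are assumed to be expressed in the same units as the RTP timestamps, and timestamp wraparound is ignored for brevity.

```python
def interarrival_jitter(packets):
    """Estimate jitter from (rtp_timestamp, arrival_time) pairs.

    Both values must use the same clock units (an assumption of this
    sketch). Timestamp wraparound is not handled.
    """
    jitter = 0.0
    prev = None
    for timestamp, arrival in packets:
        if prev is not None:
            # Difference between arrival spacing and sending spacing.
            d = (arrival - prev[1]) - (timestamp - prev[0])
            # Running average with gain 1/16.
            jitter += (abs(d) - jitter) / 16.0
        prev = (timestamp, arrival)
    return jitter


# Packets sent every 160 units; the second one arrives 16 units late.
estimate = interarrival_jitter([(0, 0), (160, 176)])
```

A stream whose packets arrive with exactly the spacing they were sent with gives an estimate of zero.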
In addition, RTP can be transported by multicast packets in order to send conversations to multiple recipients.
Use of RTP
RTP (Real-time Transport Protocol) manages multimedia flows (voice, video) over IP and makes it possible to transport and control data streams that have real-time properties. It is an application-level protocol that can use the underlying TCP or UDP transport protocols, although it is generally used over UDP. RTP can use unicast (point-to-point) and multicast (multipoint) modes.
RTP offers an end-to-end service. It adds a header that carries the numbering and timing information needed to synchronize real-time streams such as audio and video. The xxx field is used for synchronization. Each type of stream uses its own RTP channel: one for audio, one for video.
The encoding of the data depends on the type of compression. RFCxxxx specifies RTP, but the adaptation of a compression method to RTP is described in a specific RFC; for example, H.261 over RTP is described in RFCxxxx.
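As a rough sketch of what "adding a header" means in practice, the following code prepends a 12-byte fixed RTP header (version 2, no padding, no extension, no contributing sources) to a payload. The function name `build_rtp_packet` and all the values used are hypothetical and chosen only for illustration.

```python
import struct


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Return payload prefixed with a minimal 12-byte RTP header."""
    byte0 = 2 << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order, no alignment padding
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit source identifier
    )
    return header + payload


# Illustrative values only: payload type 0, four bytes of dummy audio.
packet = build_rtp_packet(0, 1, 160, 0x12345678, b"\x00" * 4)
```

The resulting bytes would typically be handed to a UDP socket, for example with `socket.sendto`, once per packet.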
RTP Header Format:
The RTP header is made up of the following fields:
- The version field V: 2 bits, indicates the version of the protocol (V=2).
- The padding field P: 1 bit, if P=1 the packet ends with additional padding bytes that are not part of the payload.
- The extension field X: 1 bit, if X=1 the fixed header is followed by a header extension.
- The CSRC count field CC: 4 bits, contains the number of CSRC identifiers that follow the fixed header.
- The marker field M: 1 bit, its interpretation is defined by an application profile
- The payload type field PT: 7 bits, identifies the type of payload (audio, video, image, text, html, etc.).
- The sequence number field: 16 bits, its initial value is random and it increments by 1 for each packet sent; it can be used to detect lost packets.
- The timestamp field: 32 bits, reflects the sampling instant of the first byte of the payload in the RTP packet. This instant must be derived from a clock that increases monotonically and linearly in time, to allow synchronization and the calculation of jitter at the destination.
- The SSRC field: 32 bits, identifies the synchronization source (or simply “the source”). Its value is chosen randomly by the application so that it is unique among all the sources of the same session.
- The CSRC field: a list of 32-bit identifiers of the contributing sources, that is, the sources (SSRCs) that contributed to the data contained in the packet. The number of identifiers is given by the CC field.
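The field layout described above can be sketched as a small parser. This is a minimal illustration, not a complete implementation: the names `RtpHeader` and `parse_rtp_header` are hypothetical, and header extensions and padding are reported through their flags but not decoded.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CC: low 4 bits of the first byte
    end = 12 + 4 * cc                   # each CSRC identifier is 32 bits
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack(f"!{cc}I", packet[12:end])
    return RtpHeader(
        version=b0 >> 6,                # V: top 2 bits
        padding=bool(b0 & 0x20),        # P: 1 bit
        extension=bool(b0 & 0x10),      # X: 1 bit
        csrc_count=cc,
        marker=bool(b1 & 0x80),         # M: top bit of the second byte
        payload_type=b1 & 0x7F,         # PT: low 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# Hand-built example packet with one CSRC entry (values are illustrative).
raw = (
    bytes([0x81, 0x60])
    + (1000).to_bytes(2, "big")
    + (160).to_bytes(4, "big")
    + (0xDEADBEEF).to_bytes(4, "big")
    + (0x11111111).to_bytes(4, "big")
    + b"payload"
)
header = parse_rtp_header(raw)
```

The payload starts at byte offset `12 + 4 * header.csrc_count` when the X flag is not set.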