Building Telephony Systems with OpenSER

Most of the time, you will use REGISTER, INVITE, BYE, and CANCEL. Other methods support additional features. For example, INFO is used for DTMF relay and mid-call signaling information. PUBLISH, NOTIFY, and SUBSCRIBE support presence systems. REFER is used for call transfer and MESSAGE for chat applications. New methods may appear as the protocol standardization process continues.

Responses to these messages are in text format as in the HTTP protocol. Some of the most important are shown below:
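As a rough illustration of how these text responses are structured, the sketch below splits a status line and maps the first digit of the code to its response class. The class names follow RFC 3261; the function names are hypothetical and exist only for this example.

```python
# Response classes as named in RFC 3261; the first digit selects the class.
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def parse_status_line(line):
    """Split a status line such as 'SIP/2.0 180 Ringing' into its parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def response_class(code):
    """Return the class name for a three-digit SIP status code."""
    return RESPONSE_CLASSES[code // 100]


version, code, reason = parse_status_line("SIP/2.0 486 Busy Here")
print(code, reason, "->", response_class(code))
```

The same lookup works for any of the responses mentioned in this chapter, such as "100 Trying", "180 Ringing", and "200 OK".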

SIP Dialog Flow

This section introduces some basic SIP operations using a simple example. Let's examine the message sequence between two user agents shown below. You can see several other flows associated with session establishment in RFC3665.

The messages are labeled in sequence. In this example, userA uses an IP phone to call another IP phone over the network. To complete the call, two SIP proxies are used.

UserA calls userB using its SIP identity, called a SIP URI. The URI is similar to an email address, such as sip:[email protected]. A secure SIP URI can be used too, such as sips:[email protected]. A call made using SIPS will use a secure transport (TLS, Transport Layer Security) between the caller and the callee.

The transaction starts with userA sending an INVITE request addressed to userB. The INVITE request contains a certain number of header fields. Header fields are named attributes that provide additional information about the message; they include a unique identifier, the destination, and information about the session.

The first line of the message contains the method name. The following lines contain a list of header fields. This example contains the minimum set required. We will briefly describe these header fields below:

VIA: This contains the address at which userA will be waiting to receive responses to this request. It also contains a parameter called branch that identifies this transaction. The VIA header defines the last SIP hop as IP, transport, and transaction-specific parameters. VIA is used exclusively for routing back the replies. Each proxy adds an additional VIA header. It is a lot easier for replies to find the route back using the VIA header, than to go again to the location server or DNS.
TO: This contains the display name and the SIP URI (that is, sip:[email protected]) of the destination originally selected. The TO header field is not used to route the packets.
FROM: This contains the name and SIP URI (that is, sip:[email protected]) that indicate the caller ID. This header field has a tag parameter containing a random string that the IP phone added to the URI for identification purposes. The tag parameter is used in both the TO and FROM fields. It serves as a general mechanism to identify the dialog, which is the combination of the Call-ID and the two tags, one from each participant in the dialog. Tags can be useful in parallel forking.
CALL-ID: This contains a globally unique identifier for this call, generated by combining a random string with the host name or IP address of the IP phone. The combination of the TO tag, the FROM tag, and the CALL-ID fully defines an end-to-end SIP relation known as a SIP dialog.
CSEQ: The CSEQ or command sequence contains an integer and a method name. The CSEQ number is incremented for each new request inside a SIP dialog and is a traditional sequence number.
CONTACT: This contains a SIP URI that represents a direct route to contact userA, usually composed of a user name and an FQDN (fully qualified domain name). Sometimes the domains are not registered, so IP addresses are permitted too. While the VIA header field tells the other elements where to send a response, the CONTACT header field tells them where to send future requests.
MAX-FORWARDS: This is used to limit the number of allowed hops a request can make in the path to its final destination. It consists of an integer decremented by one on each hop.
CONTENT-TYPE: This describes the type of the message body (for example, application/sdp).
CONTENT-LENGTH: This contains the byte count of the message body.
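To make the header fields above more concrete, here is a minimal sketch that parses an INVITE request into its first line and header fields. The message itself is hypothetical: every address, tag, branch, and Call-ID value is invented for illustration, and the parser ignores details such as header folding and compact header names.

```python
# Hypothetical INVITE; every address, tag, and branch value is made up.
INVITE = (
    "INVITE sip:userB@sipB.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc.sipA.com;branch=z9hG4bK74bf9\r\n"
    "Max-Forwards: 70\r\n"
    "To: UserB <sip:userB@sipB.com>\r\n"
    "From: UserA <sip:userA@sipA.com>;tag=9fxced76sl\r\n"
    "Call-ID: 3848276298220188511@pc.sipA.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:userA@pc.sipA.com>\r\n"
    "Content-Type: application/sdp\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_request(raw):
    """Return (method, uri, version, headers, body) for a SIP request."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ")
    headers = []
    for line in lines[1:]:
        # Split on the first colon only; URIs in the value contain colons too.
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return method, uri, version, headers, body


def header(headers, name):
    """Return all values of a header field, in message order."""
    return [value for key, value in headers if key == name.lower()]


method, uri, version, headers, body = parse_request(INVITE)
print(method, uri)
print("CSeq:", header(headers, "CSeq"))
```

Headers are kept as an ordered list instead of a dictionary because a field such as VIA can appear several times, and its order matters when routing responses.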

Session details, such as media type and codec, are not described using SIP. Instead, SIP relies on a separate session description protocol called SDP (RFC2327). The SDP message is carried in the body of the SIP message, similar to an email attachment.
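The following sketch shows what such an SDP body might look like and how the media line and codec mappings can be read from it. The body is a hypothetical example (the IP address, port, and session identifiers are invented), and the parser only handles the m= and a=rtpmap lines.

```python
# Hypothetical SDP body offering two audio codecs on one RTP port.
SDP_BODY = (
    "v=0\r\n"
    "o=userA 2890844526 2890844526 IN IP4 192.0.2.10\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0 8\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
)


def parse_sdp(body):
    """Collect media descriptions and payload-type-to-codec mappings."""
    media = []
    codecs = {}
    for line in body.splitlines():
        kind, _, value = line.partition("=")
        if kind == "m":
            # m=<media> <port> <proto> <payload types...>
            mtype, port, proto, *formats = value.split()
            media.append(
                {"type": mtype, "port": int(port), "proto": proto, "formats": formats}
            )
        elif kind == "a" and value.startswith("rtpmap:"):
            payload, encoding = value[len("rtpmap:"):].split(" ", 1)
            codecs[payload] = encoding
    return media, codecs


media, codecs = parse_sdp(SDP_BODY)
print(media[0]["type"], "on port", media[0]["port"])
print(codecs)
```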

The sequence is as follows:

The phone does not know the location of userB or the server responsible for the domain sipB. Thus, it sends the INVITE request to the server responsible for the domain sipA.com. The address of this server is configured in userA's phone or can be discovered by DHCP. The server sipA.com is also known as the SIP proxy for the domain sipA.com.

In this example, the proxy receives the INVITE request and sends a "100 Trying" message back to userA, signaling that the proxy received the INVITE and is working to forward the request. SIP responses use a three-digit code followed by a descriptive phrase. This response contains the same TO, FROM, CALL-ID, and CSEQ header fields and the same "branch" parameter in the VIA header field as the INVITE request. This allows userA's phone to correlate the response with the INVITE request it sent.
ProxyA locates proxyB by consulting a DNS server (SRV records) to find which server is responsible for the SIP domain sipB, and forwards the INVITE request. Before sending the request to proxyB, it adds a VIA header field that contains its own address.
ProxyB receives the INVITE request and responds with a "100 Trying" message back to proxyA indicating that it is processing the request.
ProxyB consults its own location database for userB's address, adds another VIA header field with its own address to the INVITE request, and sends it to userB's IP address.
UserB's phone receives the INVITE request and starts ringing. The phone signals this condition by sending back a "180 Ringing" message.
This message is routed back through both proxies in the reverse direction. Each proxy uses the VIA header fields to determine where to send the response and removes its own VIA header from the top. As a result, the "180 Ringing" message can return to userA without any DNS or location service lookups and without the need for stateful processing. Thus, each proxy sees all messages resulting from the INVITE request.
When userA's phone receives the "180 Ringing" response, it starts to play a ringback tone to signal to the user that the call is ringing on the other side. Some phones show this on the display.
In this example, userB decides to answer the call. When userB answers, the phone sends a "200 OK" response to indicate that the call was taken. The body of the "200 OK" message contains a session description specifying codecs, ports, and everything pertaining to the session. It uses the SDP protocol for this duty. As a result, there is a two-phase exchange of messages from A to B (INVITE) and from B to A (200 OK) that negotiates the resources and capabilities used on the call in a simple "offer/answer" model. If userB does not want to receive the call or is busy, the "200 OK" won't be sent and a message signaling the condition (for example, "486 Busy Here") will be sent instead.
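The VIA handling in the steps above can be sketched as a simple stack. This is an illustrative model only: the hop names are placeholders for real VIA header values, and the Max-Forwards check is included to show where the hop limit would be applied.

```python
def forward_request(via_stack, max_forwards, proxy_addr):
    """Proxy step for a request: check the hop limit, push own VIA on top."""
    if max_forwards <= 0:
        raise ValueError("483 Too Many Hops")
    return [proxy_addr] + via_stack, max_forwards - 1


def route_response(via_stack, own_addr):
    """Proxy step for a response: pop own VIA, send to the new top entry."""
    if via_stack[0] != own_addr:
        raise ValueError("top VIA does not belong to this proxy")
    remaining = via_stack[1:]
    return remaining, remaining[0]


# Request path: phoneA -> proxyA -> proxyB -> phoneB (placeholder names).
vias, hops_left = ["phoneA"], 70
vias, hops_left = forward_request(vias, hops_left, "proxyA")
vias, hops_left = forward_request(vias, hops_left, "proxyB")
print(vias, hops_left)

# phoneB copies the VIA list into its "180 Ringing" response.
response_vias = list(vias)
response_vias, next_hop = route_response(response_vias, "proxyB")
print("proxyB sends the response to", next_hop)
response_vias, next_hop = route_response(response_vias, "proxyA")
print("proxyA sends the response to", next_hop)
```

Because each hop only reads the list carried in the response itself, no DNS or location lookup is needed on the way back.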

The first line contains the response code and a description (OK). The following lines contain the header fields. The fields VIA, TO, FROM, CALL-ID, and CSEQ are copied from the INVITE request. There are three VIA fields: one added by userA, one by proxyA, and one by proxyB. UserB's SIP phone adds a tag parameter to the TO header field. Both endpoints incorporate this tag into the dialog, and it will be included in all future requests and responses for this call.
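A short sketch of how the dialog identifier comes together once the TO tag arrives: the Call-ID stays the same, the FROM tag is already present in the INVITE, and the TO tag only appears in the response. The header values and the helper names are hypothetical.

```python
import re
from collections import namedtuple

# From the caller's point of view: local tag = FROM tag, remote tag = TO tag.
DialogId = namedtuple("DialogId", "call_id local_tag remote_tag")


def tag_of(header_value):
    """Extract the tag parameter from a TO or FROM header value, if any."""
    match = re.search(r";tag=([^;>\s]+)", header_value)
    return match.group(1) if match else None


def dialog_id_for_caller(call_id, from_value, to_value):
    return DialogId(call_id, tag_of(from_value), tag_of(to_value))


call_id = "3848276298220188511@pc.sipA.com"  # made-up value
from_value = "UserA <sip:userA@sipA.com>;tag=9fxced76sl"

# In the INVITE the TO header has no tag yet, so the dialog is incomplete.
early = dialog_id_for_caller(call_id, from_value, "UserB <sip:userB@sipB.com>")
# In the 200 OK, userB's phone has added its own tag.
confirmed = dialog_id_for_caller(
    call_id, from_value, "UserB <sip:userB@sipB.com>;tag=314159"
)
print(early)
print(confirmed)
```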

The CONTACT header field contains the URI with which userB can be contacted directly on their own IP phone.

The CONTENT-TYPE and CONTENT-LENGTH header fields describe the SDP body that follows. The SDP body contains the media-related parameters used to establish the RTP session.

In this case, the "200 OK" message is sent back through both proxies and is received by userA. The phone then stops the ringback, indicating that the call was accepted.
Finally, userA sends an ACK message to userB's phone confirming the reception of the "200 OK" message. In this example, the ACK is sent directly from phoneA to phoneB, avoiding both proxies. ACK is the only SIP method that has no reply. The endpoints learned each other's addresses from the CONTACT header fields during the INVITE process. This ends the INVITE/200 OK/ACK cycle, also known as the SIP three-way handshake.
At this moment the session between both users starts and they send media packets to each other using a mutually agreed format established by the SDP protocol. Usually these packets are end-to-end. During the session, the parties can change the session characteristics issuing a new INVITE request. This is called a re-invite. If the re-invite is not acceptable, a message "488 Not Acceptable Here" will be sent, but the session will not fail.
At the session end, userB disconnects the phone and generates a BYE message. This message is routed directly to userA's softphone bypassing both proxies.
UserA confirms the reception of the BYE message with a "200 OK" message ending the session. No ACK is sent. An ACK is sent only for INVITE requests.

In some cases it can be important for proxies to stay in the middle of the signaling to see all messages between endpoints during the whole session. If the proxy wants to stay in the path after the initial INVITE request it has to add the RECORD-ROUTE header field to the request. This information will be received by userB's phone and it will send back the message through the proxies with the RECORD-ROUTE header field included too. Record routing is used in most scenarios.
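As a rough sketch of record routing, the code below models each proxy inserting its own URI at the top of the RECORD-ROUTE list and each endpoint deriving its route set for later requests. It assumes the RFC 3261 rule that the caller reverses the list it receives in the response while the callee keeps the order seen in the request. The proxy URIs are hypothetical.

```python
def add_record_route(record_route, proxy_uri):
    """Proxy step: insert own URI as the first RECORD-ROUTE value."""
    return [proxy_uri] + record_route


# The INVITE travels phoneA -> proxyA -> proxyB -> phoneB.
record_route = []
record_route = add_record_route(record_route, "<sip:sipA.com;lr>")
record_route = add_record_route(record_route, "<sip:sipB.com;lr>")

# phoneB keeps the list as received and copies it into its 200 OK.
callee_route_set = list(record_route)
# phoneA reads the list from the 200 OK and reverses it.
caller_route_set = list(reversed(record_route))

print("BYE from userB goes via", callee_route_set)
print("BYE from userA goes via", caller_route_set)
```

With these route sets in place, a later BYE or re-INVITE from either side passes through both proxies instead of going directly to the other phone.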

The REGISTER request is how proxyB learns the location of userB. When the phone initializes, and at regular intervals afterwards, softphone B sends a REGISTER request to a server on the domain sipB known as the SIP registrar. The REGISTER message associates a URI ([email protected]) with an IP address. This binding is stored in a database in the location server. Usually the registrar, location, and proxy servers run on the same computer and use the same software. OpenSER is capable of playing all three roles. In this example, each URI is registered by a single device at a time, although SIP itself allows several contacts to be bound to the same URI.
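The binding between a URI and a contact address can be pictured as a small table with expiry times. The sketch below is a plain in-memory model with one contact per URI, as in the example; it is not how OpenSER stores its location data, and the class, method names, and addresses are all hypothetical.

```python
import time


class LocationTable:
    """Toy registrar/location store: one contact per URI, with expiry."""

    def __init__(self):
        self._bindings = {}  # URI -> (contact, expiry timestamp)

    def register(self, uri, contact, expires, now=None):
        """Store or refresh a binding; expires == 0 removes it."""
        now = time.time() if now is None else now
        if expires == 0:
            self._bindings.pop(uri, None)
        else:
            self._bindings[uri] = (contact, now + expires)

    def lookup(self, uri, now=None):
        """Return the current contact for a URI, or None if absent or expired."""
        now = time.time() if now is None else now
        entry = self._bindings.get(uri)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]


table = LocationTable()
table.register("sip:userB@sipB.com", "sip:userB@192.0.2.20:5060", expires=3600, now=0)
print(table.lookup("sip:userB@sipB.com", now=10))    # still registered
print(table.lookup("sip:userB@sipB.com", now=4000))  # binding has expired
```

The periodic REGISTER described above corresponds to calling register again before the binding expires.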
