WebRTC Cookbook

Initially, peers do not know each other, and they lack the network information needed to make a direct connection possible. Before establishing a direct connection, peers must exchange this data through an intermediary that is known to each of them, usually a signaling server. Each peer connects to the signaling server, and one peer can then call another by asking the signaling server to relay specific data to the other peer, so that the peers learn about each other.

So, you need a signaling server to run.
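To make the role of the signaling server concrete, here is a minimal in-memory sketch of its routing logic. It is not the server supplied with the book: SignalingRelay, join, and relay are hypothetical names, and a real server would deliver messages over a network transport such as WebSockets rather than through local callbacks.

```javascript
// Minimal in-memory sketch of what a signaling server does: it remembers
// which peers are in which room and relays messages between them.
// SignalingRelay, join, and relay are hypothetical names for illustration.
class SignalingRelay {
  constructor() {
    this.rooms = new Map(); // room name -> Map(peerId -> deliver callback)
  }

  // Register a peer; `deliver` is called with each message routed to it.
  join(room, peerId, deliver) {
    if (!this.rooms.has(room)) this.rooms.set(room, new Map());
    this.rooms.get(room).set(peerId, deliver);
  }

  // Forward a message from one peer to every other peer in the same room.
  relay(room, fromPeerId, message) {
    const peers = this.rooms.get(room) || new Map();
    for (const [peerId, deliver] of peers) {
      if (peerId !== fromPeerId) deliver(message);
    }
  }
}
```

The relay never inspects the payload; it only needs to know who is in the room, which is why any transport protocol can sit underneath it.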

How to do it…

Before two peers can establish a direct connection, they should exchange specific data (ICE candidates and session descriptions) using a middle point—the signaling server. After that, one peer can call another one, and the direct peer-to-peer connection can be established.

Interactive Connectivity Establishment (ICE) is a technique used for Network Address Translator (NAT) traversal when establishing direct peer-to-peer communication. Usually, ICE candidates provide information about the IP address and port of the peer. Typically, an ICE candidate message might look like the following:

a=candidate:1 1 UDP 4257021352 192.168.0.10 1211 typ host
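To see which piece of the candidate line is which, the fields can be split apart with a few lines of JavaScript. This is only a sketch: parseIceCandidate is a hypothetical helper, not part of the WebRTC API, and it handles just the basic fields shown in the example above.

```javascript
// Split an ICE candidate line into named fields.
// parseIceCandidate is a hypothetical helper, not part of the WebRTC API.
function parseIceCandidate(line) {
  // Drop the optional "a=" prefix and the "candidate:" label.
  var body = line.replace(/^a=/, '').replace(/^candidate:/, '');
  var parts = body.trim().split(/\s+/);
  return {
    foundation: parts[0],
    component: Number(parts[1]),        // 1 = RTP, 2 = RTCP
    transport: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7]                      // value after the "typ" keyword
  };
}

var parsed = parseIceCandidate('a=candidate:1 1 UDP 4257021352 192.168.0.10 1211 typ host');
console.log(parsed.address + ':' + parsed.port + ' (' + parsed.type + ')');
// prints "192.168.0.10:1211 (host)"
```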

Session Description Protocol (SDP) is used by peers in WebRTC to negotiate the parameters of the exchange (network configuration, available audio/video codecs, and so on). Every peer sends details of its configuration to the other peer and gets the same details back from it. The following listing shows a part of an SDP packet representing the audio configuration options of a peer:

m=audio 53275 RTP/SAVPF 121 918 100 1 2 102 90 131 16
c=IN IP4 16.0.0.1
a=rtcp:53275 IN IP4 16.0.0.1
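The m= line packs several values into one row: the media type, the port, the transport protocol, and the list of offered formats. As an illustration only, a hypothetical helper named parseMediaLine (not part of any WebRTC API) could split it like this:

```javascript
// Split an SDP media ("m=") line into its parts.
// parseMediaLine is a hypothetical helper for illustration.
function parseMediaLine(line) {
  var parts = line.replace(/^m=/, '').trim().split(/\s+/);
  return {
    media: parts[0],                   // "audio" or "video"
    port: Number(parts[1]),            // transport port for this media
    protocol: parts[2],                // for example, RTP/SAVPF
    formats: parts.slice(3).map(Number) // offered format identifiers
  };
}

var audio = parseMediaLine('m=audio 53275 RTP/SAVPF 121 918 100 1 2 102 90 131 16');
console.log(audio.media + ' on port ' + audio.port + ' via ' + audio.protocol);
// prints "audio on port 53275 via RTP/SAVPF"
```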

In the schema represented in the following diagram, you can see the generic flow of a call establishing process:

Note that TURN is not shown in the schema. If you used TURN, it would be depicted just after the STUN stage (before the first and second stage).
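STUN and TURN servers are passed to the browser through the configuration object given to RTCPeerConnection (named pc_config in the listings of this recipe, where its contents are not shown). A sketch of what such an object might look like follows; the hostnames, username, and credential are placeholders, not real servers.

```javascript
// Example ICE server configuration with one STUN and one TURN server.
// All hostnames and credentials below are placeholders (assumptions).
var pc_config = {
  iceServers: [
    { urls: 'stun:stun.example.org:3478' },
    { urls: 'turn:turn.example.org:3478', username: 'user', credential: 'secret' }
  ]
};
```

Older browser versions expected the key url instead of urls, so check what your target browsers accept.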

Making a call

To make a call, we need to take some steps to prepare (such as getting access to the browser's media):

Get access to the user's media:

function doGetUserMedia() {
    // Request audio and video using the legacy constraint syntax.
    var constraints = {"audio": true, "video": {"mandatory": {}, "optional": []}};
    try {
        getUserMedia(constraints, onUserMediaSuccess,
            function(e) {
                console.log("getUserMedia error " + e.toString());
            });
    } catch (e) {
        console.log(e.toString());
    }
}

If you succeed, create a peer connection object and make a call:

function onUserMediaSuccess(stream) {
    // Show the local stream, then set up the peer connection.
    attachMediaStream(localVideo, stream);
    localStream = stream;
    createPeerConnection();
    pc.addStream(localStream);
    // Only the initiating peer sends the offer.
    if (initiator) doCall();
}

function createPeerConnection() {
    var pc_constraints = {"optional": [{"DtlsSrtpKeyAgreement": true}]};
    try {
        pc = new RTCPeerConnection(pc_config, pc_constraints);
        pc.onicecandidate = onIceCandidate;
    } catch (e) {
        console.log(e.toString());
        pc = null;
        return;
    }
    pc.onaddstream = onRemoteStreamAdded;
}

function onIceCandidate(event) {
    // Forward each local ICE candidate to the remote peer through the signaling server.
    if (event.candidate)
        sendMessage({type: 'candidate', label: event.candidate.sdpMLineIndex, id: event.candidate.sdpMid, candidate: event.candidate.candidate});
}

function onRemoteStreamAdded(event) {
    // Show the remote peer's stream once it arrives.
    attachMediaStream(remoteVideo, event.stream);
    remoteStream = event.stream;
}

function doCall() {
    var constraints = {"optional": [], "mandatory": {"MozDontOfferDataChannel": true}};
    // Remove Mozilla-specific (Moz*) constraints when running in Chrome.
    if (webrtcDetectedBrowser === "chrome") {
        for (var prop in constraints.mandatory) {
            if (prop.indexOf("Moz") != -1) delete constraints.mandatory[prop];
        }
    }

    constraints = mergeConstraints(constraints, sdpConstraints);
    pc.createOffer(setLocalAndSendMessage, errorCallBack, constraints);
}
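The doCall function relies on a mergeConstraints helper that is not shown in this recipe. One possible implementation is sketched here, assuming both arguments use the legacy mandatory/optional shape seen in the listings; the version in the book's source code may differ.

```javascript
// Combine two legacy constraint objects into a new one.
// This is an assumed implementation; the original helper is not shown.
function mergeConstraints(cons1, cons2) {
  var merged = {
    mandatory: {},
    // Keep optional entries from both objects, in order.
    optional: (cons1.optional || []).concat(cons2.optional || [])
  };
  // Copy mandatory keys; values from cons2 win when a key appears in both.
  [cons1.mandatory || {}, cons2.mandatory || {}].forEach(function (mandatory) {
    Object.keys(mandatory).forEach(function (name) {
      merged.mandatory[name] = mandatory[name];
    });
  });
  return merged;
}
```

Returning a new object rather than modifying cons1 keeps the shared sdpConstraints object unchanged between calls.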

Answering a call

Assuming that we will use WebSockets as a transport protocol for exchanging data with the signaling server, every client application should have a function to process messages coming from the server. In general, it looks as follows:

function processSignalingMessage(message) {
    var msg = JSON.parse(message);
    if (msg.type === 'offer') {
        // Incoming call: store the caller's description and answer it.
        pc.setRemoteDescription(new RTCSessionDescription(msg));
        doAnswer();
    } else if (msg.type === 'answer') {
        // The callee accepted our offer.
        pc.setRemoteDescription(new RTCSessionDescription(msg));
    } else if (msg.type === 'candidate') {
        // Rebuild the candidate from the fields sent by onIceCandidate.
        var candidate = new RTCIceCandidate({sdpMLineIndex: msg.label, sdpMid: msg.id, candidate: msg.candidate});
        pc.addIceCandidate(candidate);
    } else if (msg.type === 'GETROOM') {
        room = msg.value;
        onRoomReceived(room);
    } else if (msg.type === 'WRONGROOM') {
        window.location.href = "/";
    }
}

This function receives messages from the signaling server using the WebSockets layer and acts appropriately. For this recipe, we are interested in the offer message type and the doAnswer function.

The doAnswer function is presented in the following listing:

function doAnswer() {
    pc.createAnswer(setLocalAndSendMessage, errorCallBack, sdpConstraints);
}

The sdpConstraints object describes the WebRTC connection options to be used. In general, it looks as follows:

var sdpConstraints = {'mandatory': {'OfferToReceiveAudio':true, 'OfferToReceiveVideo':true }};

Here, we state that we would like to use both audio and video when establishing the WebRTC peer-to-peer connection.

The errorCallBack function is a callback that is called if the createAnswer call fails. In this callback function, you can print a message to the console that might help to debug the application.
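A minimal version of such an error callback might look like the following sketch. The message format is an arbitrary choice, and returning the string is only a convenience for testing; the browser ignores the return value.

```javascript
// Log a readable description of a failed createOffer/createAnswer call.
// The message format here is an assumption, not taken from the book.
function errorCallBack(error) {
  var name = error && error.name ? error.name + ': ' : '';
  var detail = error && error.message ? error.message : String(error);
  var message = 'WebRTC error: ' + name + detail;
  console.log(message);
  return message;
}
```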

The setLocalAndSendMessage function sets the local session description and sends it back to the signaling server. This data will be sent as an answer type of message, and then the signaling server will route this message to the caller:

function setLocalAndSendMessage(sessionDescription) {
    pc.setLocalDescription(sessionDescription);
    sendMessage(sessionDescription);
}
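The sendMessage function used in these listings is not shown in this recipe. Assuming a WebSocket connection to the signaling server, it could be as small as the following sketch; createSendMessage is a hypothetical factory name, and socket is any object with a send method.

```javascript
// Build a sendMessage function bound to an open socket.
// createSendMessage is a hypothetical name; `socket` is assumed to expose
// send(string), as a WebSocket does.
function createSendMessage(socket) {
  return function sendMessage(message) {
    // Session descriptions and plain objects are both serialized as JSON.
    socket.send(JSON.stringify(message));
  };
}
```

In the browser, you would create the function once after the socket opens and use the result wherever the listings call sendMessage.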

Note that you can find the full source code for this example supplied with this book.

How it works…

Firstly, we ask the web browser for access to the user's media (audio and video). The web browser will ask the user to grant these access rights. If access is granted, we can create a peer connection object and send the call message to the signaling server, which will route this message to the remote peer.

The workflow of the code is very simple. The processSignalingMessage function should be called every time we get a message from the signaling server. Usually, you should set it as an onmessage event handler of the WebSocket JavaScript object.
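Wiring the handler to the socket can be sketched as follows. The helper name attachSignalingHandler is hypothetical; socket stands for a WebSocket instance, and handler would be processSignalingMessage.

```javascript
// Route every incoming signaling message to the given handler.
// attachSignalingHandler is a hypothetical helper name.
function attachSignalingHandler(socket, handler) {
  socket.onmessage = function (event) {
    // WebSocket message events carry the payload in event.data.
    handler(event.data);
  };
}
```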

After the message is received, this function detects the message type and acts appropriately. To answer an incoming call, it calls the doAnswer function that will do the rest of the magic—prepare the session description and send it back to the server.
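The listings in this recipe use the callback-based form of the API. Current browsers also provide a promise-based form of the same calls, and the answering steps can be written with it as in the sketch below. The function answerOffer and its parameters are hypothetical names; pc is assumed to be an RTCPeerConnection and sendMessage a function that delivers data to the signaling server.

```javascript
// Answer an incoming offer using the promise-based API.
// answerOffer is a hypothetical name; pc is assumed to be an RTCPeerConnection.
async function answerOffer(pc, offer, sendMessage) {
  await pc.setRemoteDescription(offer);   // apply the caller's description
  var answer = await pc.createAnswer();   // build a matching answer
  await pc.setLocalDescription(answer);   // commit it locally
  sendMessage(pc.localDescription);       // route it back via signaling
}
```

Waiting for each step to finish before starting the next avoids creating an answer before the remote description has been applied.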

The signaling server will get this reply as an answer message and will route it to the remote peer. After that, the peers will have all the necessary data about each other to start establishing a direct connection.

There's more…

This is the basic functionality of WebRTC. Most of your applications will probably have the same code for this task. The only big difference might be communication with the signaling server—you can use any protocol you like.

By : Andrii Sergiienko
