GStreamer RTSP negotiated RTP Transport Streamer Back Channel Communication

From RidgeRun Developer Wiki

Introduction

Imagine if an RTSP server, say a surveillance camera, were able to accept data from the client connected to it. There are many interesting uses of such a capability, including:

  • A client sending data that changes an output signal, such as turning on a light or alarm.
  • An operator speaking into a microphone, with the audio being sent to the surveillance camera and played through a loudspeaker.
  • A camera with a display that stores canned audio and/or video resources: a remote image processing algorithm detects events, such as a dog misbehaving, and causes an audio file containing the owner's voice to be played.

The video feed from the camera might be monitored by a security guard or processed by various DSP algorithms. In either case, it can be useful to provide a back channel through which audio / video / data from the RTSP client is delivered to the RTSP server.

Possible protocol choices

Because of the flexibility of the various protocols supported by A/V over RTSP, there are many options for supporting a communication back channel that allows the client to give data to the RTSP server. Some of those choices include:

  • TCP application - Use an embedded web server, ONVIF, or other existing protocol that makes it easy to configure and control a remote IP camera.
  • RTSP negotiated RTP payload - The ONVIF Streaming Specification describes one such approach.
  • Transport Stream (TS) over RTP - Use RTP payload type 33 with the client creating the Transport Stream and the camera being the consumer. By using TS, you can encode audio, video, and metadata over the single RTP back channel connection.

RTSP negotiated RTP Transport Stream encoded back channel

The rest of this paper discusses how you can encode metadata in a transport stream created by the RTSP client and send it over an RTSP negotiated RTP back channel as defined in the ONVIF Streaming Specification. A GStreamer based implementation is used as the example deployment.

RTSP negotiation

The RTSP protocol (section 1.5 Extending RTSP and section 12.32 Require) defines how the RTSP negotiation can be extended to require the server to support extended capabilities. The ONVIF Streaming Specification uses this flexibility in the RTSP specification to add a back channel extension.
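As a rough illustration of how a server might honor the Require header, the sketch below checks the requested option tags against a set of supported ones. The function name and the SUPPORTED_TAGS set are hypothetical and are not taken from GStreamer; the 551 status code and the Unsupported header are the behavior RFC 2326 describes for an option the server does not understand.

```python
# Feature tag used by the ONVIF back channel extension discussed in this article.
BACKCHANNEL_TAG = "www.onvif.org/ver20/backchannel"

# Hypothetical set of option tags this example server understands.
SUPPORTED_TAGS = {BACKCHANNEL_TAG}


def check_require(headers):
    """Return (status_code, reason, extra_headers) for a request's Require header."""
    # Require may carry several comma-separated option tags.
    required = [t.strip() for t in headers.get("Require", "").split(",") if t.strip()]
    unsupported = [t for t in required if t not in SUPPORTED_TAGS]
    if unsupported:
        # RFC 2326: reply 551 and name the unknown tags in an Unsupported header.
        return 551, "Option not supported", {"Unsupported": ", ".join(unsupported)}
    return 200, "OK", {}
```

A client that receives 551 could retry without the Require header and fall back to a one-way session.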

Extended RTSP DESCRIBE method

An extended RTSP DESCRIBE request along with its response is shown below. The example data comes from GStreamer RTSP support.

RTSP request
   method: 'DESCRIBE'
   uri:    'rtsp://10.251.101.23/test'
   version: '1.0'
 headers:
   key: 'CSeq', value: '2'
   key: 'Accept', value: 'application/sdp'
   key: 'Date', value: 'Tue, 02 Jul 2013 18:39:43 GMT'
   key: 'Require', value: 'www.onvif.org/ver20/backchannel'          <-------- extension

where the key Require and value www.onvif.org/ver20/backchannel are added.

RTSP response
 status line:
   code:   '200'
   reason: 'OK'
   version: '1.0'
 headers:
   key: 'CSeq', value: '2'
   key: 'Content-Type', value: 'application/sdp'
   key: 'Content-Base', value: 'rtsp://10.251.101.23/test/'
   key: 'Server', value: 'GStreamer RTSP server'
 body:
   v=0
   o=- 1188340656180883 1 IN IP4 10.251.101.23
   s=Session streamed with GStreamer with backchannel                <-------- extension
   i=rtsp-server
   e=NONE
   t=0 0
   a=tool:GStreamer
   a=type:broadcast
   a=control:*
   a=range:npt=now-
   m=video 0 RTP/AVP 33
   c=IN IP4 10.251.101.23
   a=rtpmap:33 MP2T-ES/90000
   a=control:stream=0
   a=recvonly                                                        <-------- extension
   m=video 0 RTP/AVP 33                                              <-------- extension
   a=rtpmap:33 MP2T-ES/90000                                         <-------- extension
   a=control:stream=1                                                <-------- extension
   a=sendonly                                                        <-------- extension
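A client needs to tell the normal stream apart from the back channel in the returned SDP. The following is a minimal sketch, assuming the direction attribute is the distinguishing mark as in the response above, where the added m= section (stream=1) carries a=sendonly. The function name is hypothetical.

```python
DIRECTIONS = ("a=sendonly", "a=recvonly", "a=sendrecv", "a=inactive")


def media_directions(sdp_text):
    """Return one (control, direction) tuple per m= section of an SDP body."""
    sections = []
    current = None
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # A new media section starts; SDP defaults to sendrecv if unspecified.
            current = {"control": None, "direction": "sendrecv"}
            sections.append(current)
        elif current is None:
            continue  # session-level line, not part of any media section
        elif line.startswith("a=control:"):
            current["control"] = line[len("a=control:"):]
        elif line in DIRECTIONS:
            current["direction"] = line[2:]
    return [(s["control"], s["direction"]) for s in sections]
```

Applied to the SDP body above, this yields stream=0 as recvonly and stream=1 as sendonly, so the client can pick the control URL to use when setting up the back channel.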

Connection setup

First, RTSP SETUP is used to set up the sessions. In this case we have two independent TS sessions: the normal camera to client session and the new back channel from client to camera.

Setup normal camera to client RTSP session

RTSP request
 request line:
   method: 'SETUP'
   uri:    'rtsp://10.251.101.23/test/normal'
   version: '1.0'
 headers:
   key: 'CSeq', value: '3'
   key: 'Transport', value: 'RTP/AVP;unicast;client_port=36632-36633'
   key: 'Date', value: 'Tue, 02 Jul 2013 18:39:44 GMT'

RTSP response
 status line:
   code:   '200'
   reason: 'OK'
   version: '1.0'
 headers:
   key: 'CSeq', value: '3'
   key: 'Transport', value: 'RTP/AVP;unicast;client_port=36632-36633;server_port=42996-42997;ssrc=9BFD11EE;mode="PLAY"'
   key: 'Server', value: 'GStreamer RTSP server'
   key: 'Session', value: 'tgfobcrsheywfamr'

Setup back channel client to camera RTSP session

This is a normal RTSP SETUP request with the new Require header added.

RTSP request
 request line:
   method: 'SETUP'
   uri:    'rtsp://10.251.101.23/test/backchannel'
   version: '1.0'
 headers:
   key: 'CSeq', value: '3'
   key: 'Transport', value: 'RTP/AVP;unicast;client_port=36632-36633'
   key: 'Date', value: 'Tue, 02 Jul 2013 18:39:44 GMT'
   key: 'Require', value: 'www.onvif.org/ver20/backchannel'          <-------- extension

RTSP response
 status line:
   code:   '200'
   reason: 'OK'
   version: '1.0'
 headers:
   key: 'CSeq', value: '3'
   key: 'Transport', value: 'RTP/AVP;unicast;client_port=36632-36633;server_port=42998-42999;ssrc=9BFD11FF;mode="PLAY"'
   key: 'Server', value: 'GStreamer RTSP server'
   key: 'Session', value: 'tgfobcrsheywfamr'
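The Transport header in the SETUP response tells the client which ports were agreed on. A small sketch of parsing it follows; the function names are hypothetical, and it is an assumption here that the client directs its back channel RTP and RTCP at the server_port pair.

```python
def parse_transport(value):
    """Split an RTSP Transport header value into (spec, params)."""
    parts = [p.strip() for p in value.split(";") if p.strip()]
    spec, params = parts[0], {}
    for part in parts[1:]:
        key, sep, val = part.partition("=")
        # Flags such as 'unicast' have no value; store them as True.
        params[key] = val.strip('"') if sep else True
    return spec, params


def port_pair(params, name):
    """Return (rtp_port, rtcp_port) for 'client_port' or 'server_port'."""
    rtp, _, rtcp = params[name].partition("-")
    rtp_port = int(rtp)
    # If only one port is given, assume RTCP uses the next port up.
    rtcp_port = int(rtcp) if rtcp else rtp_port + 1
    return (rtp_port, rtcp_port)
```

For the back channel response above, port_pair(params, "server_port") gives 42998 for RTP and 42999 for RTCP.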

Start RTSP sessions

Before any data can be exchanged over either the normal or back channel session, an RTSP PLAY request is sent from the client to the camera.

RTSP request
 request line:
   method: 'PLAY'
   uri:    'rtsp://10.251.101.23/test'
   version: '1.0'
 headers:
   key: 'CSeq', value: '4'
   key: 'Range', value: 'npt=now-'
   key: 'Session', value: 'tgfobcrsheywfamr'
   key: 'Date', value: 'Tue, 02 Jul 2013 18:39:44 GMT'
   key: 'Require', value: 'www.onvif.org/ver20/backchannel'          <-------- extension

RTSP response
 status line:
   code:   '200'
   reason: 'OK'
   version: '1.0'
 headers:
   key: 'CSeq', value: '4'
   key: 'RTP-Info', value: 'url=rtsp://10.251.101.23/test/stream=0;seq=56212;rtptime=358582991'
   key: 'Range', value: 'npt=now-'
   key: 'Server', value: 'GStreamer RTSP server'
   key: 'Session', value: 'tgfobcrsheywfamr'

Once the OK is received from the RTSP server, the client can start sending Transport Stream encoded data to the camera.
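To make the client side concrete, here is a minimal sketch of wrapping Transport Stream packets in RTP using payload type 33. It is not how GStreamer's rtpmp2tpay is implemented; the function names are hypothetical, and the timestamp is simply supplied by the caller rather than derived from a 90 kHz clock. Grouping seven 188-byte TS packets per RTP packet is a common choice that keeps the payload at 1316 bytes, below a typical 1500-byte MTU.

```python
import struct

TS_PACKET_SIZE = 188         # fixed MPEG-TS packet length
RTP_PAYLOAD_TYPE_MP2T = 33   # static RTP payload type for MPEG-2 TS
TS_PER_RTP = 7               # 7 * 188 = 1316 bytes of payload per RTP packet


def rtp_header(seq, timestamp, ssrc, payload_type=RTP_PAYLOAD_TYPE_MP2T):
    """Build a 12-byte RTP header: version 2, no padding, extension, CSRC or marker."""
    return struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def packetize(ts_data, ssrc, first_seq=0, timestamp=0):
    """Split whole TS packets into RTP packets with consecutive sequence numbers."""
    if len(ts_data) % TS_PACKET_SIZE:
        raise ValueError("input must be a whole number of 188-byte TS packets")
    chunk = TS_PACKET_SIZE * TS_PER_RTP
    packets = []
    for i, offset in enumerate(range(0, len(ts_data), chunk)):
        header = rtp_header(first_seq + i, timestamp, ssrc)
        packets.append(header + ts_data[offset:offset + chunk])
    return packets
```

Each returned packet could then be sent with a UDP socket's sendto() to the camera's address and the RTP port negotiated during SETUP.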

Including metadata in Transport Stream

The approach described in GStreamer and in-band metadata is one simple way to create a GStreamer pipeline for metadata. When supporting a back channel, the same Transport Stream mux / demux approach can be used by extending the GStreamer mpegtsmux and mpegtsdemux elements to support a metadata sink/source pad.
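Inside a Transport Stream, each elementary stream (audio, video, or metadata) is identified by a PID in the 4-byte packet header, which is what lets a demuxer route metadata to its own pad. The sketch below only illustrates that header layout and is not the mpegtsdemux implementation; the function names are hypothetical, and the PID that carries metadata would depend on how the muxer is configured.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode the fields of a TS packet header needed to route it by PID."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != TS_SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],  # 13-bit stream identifier
        "continuity_counter": packet[3] & 0x0F,
    }


def packets_for_pid(ts_data, pid):
    """Yield every whole TS packet in ts_data that belongs to the given PID."""
    for offset in range(0, len(ts_data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts_data[offset:offset + TS_PACKET_SIZE]
        if parse_ts_header(packet)["pid"] == pid:
            yield packet
```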

Example GStreamer pipelines

Camera with very loud speaker to scare away intruders

In this example, a security camera is monitored by a guard. When the guard observes an intruder, the guard can speak into a microphone, and a speaker connected to the camera blasts the intruder with the guard's words of wisdom. Let's hope this camera isn't a PTZ device with an attached weapon!

A simplified GStreamer pipeline for a TI DM8148/DM8168 based camera device:

gst-launch v4l2src ! omx_h264enc ! mpegtsmux  ! rtpmp2tpay ! rtspsink backchannel=ts rtspbcsrc ! rtpmp2tdepay ! mpegtsdemux ! omx_aacdec ! alsasink

where rtspsink passes the backchannel to the element rtspbcsrc.

Camera with external alarm output signals

A display room can be physically secured when motion is detected by using the camera's external alarm output. A motion detection algorithm monitors the video stream and uses the back channel to trigger the alarm output.

A simplified GStreamer pipeline for an iMX6 based camera device:

gst-launch mfw_v4lsrc ! vpuenc codec=6 ! mpegtsmux ! rtpmp2tpay ! rtspsink backchannel=ts rtspbcsrc ! rtpmp2tdepay ! mpegtsdemux ! appsink

where rtspsink passes the backchannel to the element rtspbcsrc, and appsink is a generic placeholder indicating that a custom application processes the metadata contained in the back channel communication.