Columbia Video Network Tool (CoVit) Program Documentation

Subhrendu Sarkar
Columbia University
New York, NY 10027
USA
ss3295@columbia.edu

Abstract

This document describes the different data structures and libraries used by the CoVit software tool. It also provides an overview of the installation procedure and running of the program.

Software Requirements

The software tool was developed on a Linux platform and compiled with gcc. It comprises a set of C++ source and header files. It uses the RTP library (RTPlib) for the RTP implementation and XML-RPC-C for the XML-RPC server. It also uses the unicap library, which provides a uniform interface to video capture devices and allows applications to use any supported video capture device through a single API.
Follow the documented instructions to download and install the Unicap library.
Follow the documented instructions to download and install the Xml-Rpc-C library.

Run "make" from the shell prompt to compile CoVit and create the executable.
The make command also builds a sample user program, "client", which can be used to interact with the CoVit tool.
Running CoVit:
./covit 8080
(8080 is a command-line argument that specifies the port on which the XML-RPC server listens.)
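
As a rough illustration of how a port argument such as 8080 could be validated before the server starts listening, the sketch below converts a command-line string into a port number. The helper name parsePort is hypothetical and is not taken from the CoVit sources; how covit actually checks its argument is not documented here.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not from the CoVit sources): converts a command-line
// argument such as "8080" into a TCP port number.
// Returns -1 if the text is not a whole number in the range 1..65535.
int parsePort(const char *text)
{
    if (text == nullptr || *text == '\0')
        return -1;
    errno = 0;
    char *end = nullptr;
    long value = std::strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;               // overflow or trailing characters
    if (value < 1 || value > 65535)
        return -1;               // outside the valid port range
    return static_cast<int>(value);
}
```

A main function could call parsePort(argv[1]) and print a usage message when the result is -1.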

Running the client program:
./client http://localhost:8080/RPC2

Program Details

XML-RPC Server

The XML-RPC server is implemented in the file <xmlrpc_server.cpp>.
To support a new method at the server, register the method in the XML-RPC registry and implement the corresponding handler function. The call that registers a method is shown below:
     xmlrpc_registry_add_method(
         &env, registryP, NULL, "QueryInterface", &QueryInterface, NULL);
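
The two steps (register a name, implement a handler) can be modelled with the standard library alone. The sketch below is not the xmlrpc-c API: MethodRegistry, Params, Handler and echoFirst are hypothetical names chosen for illustration, and parameters are simplified to a list of strings.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Standard-library-only model of a method registry. This is NOT the
// xmlrpc-c API; every name in this block is hypothetical.
using Params  = std::vector<std::string>;
using Handler = std::function<std::string(const Params &)>;

class MethodRegistry
{
public:
    // Step 1: register the handler under the name that clients will call.
    void addMethod(const std::string &name, Handler handler)
    {
        methods_[name] = std::move(handler);
    }

    // Dispatch an incoming call; unknown names produce a fault string.
    std::string call(const std::string &name, const Params &params) const
    {
        auto it = methods_.find(name);
        if (it == methods_.end())
            return "fault: no such method";
        return it->second(params);
    }

    // What a QueryInterface-style method could report: the registered names.
    std::vector<std::string> methodNames() const
    {
        std::vector<std::string> names;
        for (const auto &entry : methods_)
            names.push_back(entry.first);
        return names;
    }

private:
    std::map<std::string, Handler> methods_;
};

// Step 2: implement the handler itself (hypothetical example method).
std::string echoFirst(const Params &params)
{
    return params.empty() ? "" : params[0];
}
```

In this model, registering is a single call such as registry.addMethod("Echo", echoFirst), after which registry.call("Echo", params) reaches the handler.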


The following methods are currently implemented at the XML-RPC server.

 
- QueryInterface (returns the set of methods supported by the server) 
     - no input arguments


- CreateSendStream  (creates a new video send stream with the input identifier, identifier value has to be positive)
----------------------------------------------------------------------------------------
input arguments    data type   description
----------------------------------------------------------------------------------------
 stream id           int        video send stream identifier (a positive number)


- CreateRecvStream  (creates a new video receive stream with the input identifier, identifier value has to be positive)
----------------------------------------------------------------------------------------
input arguments    data type   description
----------------------------------------------------------------------------------------
 stream id           int        video receive stream identifier (a positive number)


- StartStreamingRTP  (starts streaming video, outbound)
----------------------------------------------------------------------------------------
input arguments                     data type   description
----------------------------------------------------------------------------------------
data array of three elements        array       IP address (string)
                                                 port (string)
                                                 valid video stream identifier (string)
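
Because all three elements arrive as strings, the server has to convert and check them before use. The sketch below shows one way this could be done. StreamTarget, parseInRange and parseStreamTarget are hypothetical names, and the element order (IP address, port, stream identifier) is assumed from the table above; it should be checked against <xmlrpc_server.cpp>.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical typed view of the three string elements.
struct StreamTarget
{
    std::string ip;   // address, kept as text
    int port;         // 1..65535
    int streamId;     // positive stream identifier
};

// Parses a whole decimal string into [lo, hi]; returns false on any error.
bool parseInRange(const std::string &text, long lo, long hi, int &out)
{
    if (text.empty())
        return false;
    errno = 0;
    char *end = nullptr;
    long value = std::strtol(text.c_str(), &end, 10);
    if (errno != 0 || *end != '\0' || value < lo || value > hi)
        return false;
    out = static_cast<int>(value);
    return true;
}

// Assumed element order: IP address, port, stream identifier.
bool parseStreamTarget(const std::vector<std::string> &args, StreamTarget &out)
{
    if (args.size() != 3 || args[0].empty())
        return false;
    StreamTarget parsed;
    parsed.ip = args[0];
    if (!parseInRange(args[1], 1, 65535, parsed.port))
        return false;
    if (!parseInRange(args[2], 1, INT_MAX, parsed.streamId))
        return false;
    out = parsed;   // only written when every element is valid
    return true;
}
```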


- StartReceivingRTP  (starts receiving video, inbound) 
----------------------------------------------------------------------------------------
input arguments                     data type   description
----------------------------------------------------------------------------------------
data array of three elements        array       IP address (string)
                                                 port (string)
                                                 valid video stream identifier (string)


- setCameraProps     (sets Camera Properties) 
----------------------------------------------------------------------------------------
input arguments                     data type   description
----------------------------------------------------------------------------------------
width                                   int      width of video capture from the camera
height                                  int      height of video capture from the camera


- setEncodingFormats (sets the encoding formats)   
----------------------------------------------------------------------------------------
input arguments                     data type   description
----------------------------------------------------------------------------------------
data array of six elements         array         codec type (string), e.g. MPEG4, MPEG2, MPEG1, etc.
                                                 width (string)
                                                 height (string)
                                                 bitrate (string)
                                                 frame-rate (string)
                                                 valid video stream identifier (string) 


- setDecodingFormats (sets the decoding formats)   
----------------------------------------------------------------------------------------
input arguments                     data type   description
----------------------------------------------------------------------------------------
data array of five elements        array         codec type (string) e.g MPEG4, MPEG2, MPEG1, etc.
                                                 decoding video width (string)
                                                 decoding video height (string)
                                                 


- StopSend           (Stops the RTP Sender and capture threads)
----------------------------------------------------------------------------------------
input arguments     data type       description
----------------------------------------------------------------------------------------
stream id               int         valid video stream identifier


- StopRecv           (Stops the RTP receiver)
----------------------------------------------------------------------------------------
input arguments     data type       description
----------------------------------------------------------------------------------------
stream id               int         valid video stream identifier

 

RTP Sender

The RTP sender thread is implemented mostly in the file <send_rtp.cpp>. "sendrtp" is the thread function.

RTP Receiver

The RTP receiver thread is implemented mostly in the file <recv_rtp.cpp>. "recvrtp" is the thread function.

Capture Thread

The capture thread is implemented mostly in the file <capture.cpp>. "capthread" is the thread function.

Data Structures

The program follows an object-oriented model built around a small number of data structures.
The most important objects are:
- VideoCapture (defined in <covit.h> as CVideoCapture) (this is the object at the RTP Sender layer responsible for RTP sending)
- MediaBuffer (defined in <covit.h> as CMediaBuffer)
- UnicapVidCap (defined in <unicapVidCap.h> as CUnicapVidCap) (this is the object at the Capture layer responsible for putting buffers into the Media Buffers)

The encoding format for each video send stream has the following structure:

 
 
typedef struct encodeformat
{
        //codec type - MPEG4, MPEG2, MPEG1, etc
        CodecID encode_fmt;
        //encoder video width 
        int encode_width;
        //encoder video height
        int encode_height;
        //encoder bitrate
        int encode_bitrate;
        //encoder frame rate
        int fps;
        //encoder GOP size
        int gopSize;
};
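
To illustrate how the string arguments of setEncodingFormats could map onto this structure, the sketch below converts a codec name to an enumerator and checks a filled-in format. SketchCodecID is a stand-in for the real CodecID type, which is assumed to come from the codec library. The enumerators and the helpers codecFromName and isUsable are hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-in for the CodecID type used by CoVit (assumption: the real type is
// supplied by the codec library; these enumerators are hypothetical).
enum SketchCodecID
{
    SKETCH_CODEC_NONE,
    SKETCH_CODEC_MPEG1,
    SKETCH_CODEC_MPEG2,
    SKETCH_CODEC_MPEG4
};

// Same fields as encodeformat, with the stand-in codec type.
struct EncodeFormatSketch
{
    SketchCodecID encode_fmt;
    int encode_width;
    int encode_height;
    int encode_bitrate;
    int fps;
    int gopSize;
};

// Maps the codec name received as a string to an enumerator.
SketchCodecID codecFromName(const std::string &name)
{
    if (name == "MPEG4") return SKETCH_CODEC_MPEG4;
    if (name == "MPEG2") return SKETCH_CODEC_MPEG2;
    if (name == "MPEG1") return SKETCH_CODEC_MPEG1;
    return SKETCH_CODEC_NONE;
}

// Rejects an unknown codec and any non-positive numeric setting.
bool isUsable(const EncodeFormatSketch &f)
{
    return f.encode_fmt != SKETCH_CODEC_NONE &&
           f.encode_width > 0 && f.encode_height > 0 &&
           f.encode_bitrate > 0 && f.fps > 0 && f.gopSize > 0;
}
```

The GOP size is not among the setEncodingFormats arguments listed earlier, so a caller of such a helper would have to supply it from elsewhere.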



The decoding format for each video receive stream has the following structure:

 
typedef struct decodeformat
{
        //codec type - MPEG4, MPEG2, MPEG1, etc
        CodecID decode_fmt;
        //decode video width
        int decode_width;
        //decode video height
        int decode_height;
};


CoVit maintains linked lists of all the send and receive video stream resources.
There are three such linked lists:
- capList (list of capture video stream resources)
- sendList (list of send video stream resources)
- recvList (list of receive video stream resources)
Generally, capList and sendList contain the same number of nodes, and corresponding nodes share the same video send stream identifier.
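
A minimal sketch of such a list, keyed by stream identifier, is given below. It uses std::list instead of the hand-written list in CoVit, and SendNodeSketch and SendListSketch are hypothetical names. Rejecting non-positive and duplicate identifiers reflects the rule that stream identifiers must be positive; the duplicate check is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Reduced, hypothetical version of a list node: identifier plus quit flag.
struct SendNodeSketch
{
    int id;
    int send_quit;
};

class SendListSketch
{
public:
    // Adds a node; fails for non-positive or already-used identifiers.
    bool create(int id)
    {
        if (id <= 0 || find(id) != nullptr)
            return false;
        SendNodeSketch node;
        node.id = id;
        node.send_quit = 0;
        nodes_.push_back(node);
        return true;
    }

    // Returns the node with the given identifier, or nullptr if absent.
    SendNodeSketch *find(int id)
    {
        for (auto it = nodes_.begin(); it != nodes_.end(); ++it)
            if (it->id == id)
                return &*it;
        return nullptr;
    }

    // Removes the node with the given identifier; false if it was absent.
    bool remove(int id)
    {
        for (auto it = nodes_.begin(); it != nodes_.end(); ++it) {
            if (it->id == id) {
                nodes_.erase(it);
                return true;
            }
        }
        return false;
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::list<SendNodeSketch> nodes_;
};
```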

The capList node is as follows:

typedef struct capNode
{
        //capture or send video stream identifier
        int id;
        //capture thread mutex
        pthread_mutex_t capthr_cond_mutex;
        //capture thread conditional for signaling
        pthread_cond_t  capthr_cond;
        //capture thread stop mutex
        pthread_mutex_t capstop_cond_mutex;
        //capture thread conditional for signaling
        pthread_cond_t  capstop_cond;
        //capture thread status mutex
        pthread_mutex_t capstatus_mutex;
        //capture thread stop status
        int stopCapStatus;
        //encode format
        encodeformat encodeFmtNode;
};
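
The status mutex and stop status in this node suggest a flag that one thread sets and another thread polls. The sketch below shows that pattern with the same field names. The helper functions are hypothetical, and the meaning of the values (0: keep capturing, 1: stop requested) is an assumption.

```cpp
#include <cassert>
#include <pthread.h>

// Reduced, hypothetical version of capNode: only the status fields.
struct CapStatusSketch
{
    pthread_mutex_t capstatus_mutex;
    int stopCapStatus;   // assumed: 0 = keep capturing, 1 = stop requested
};

void initStatus(CapStatusSketch &s)
{
    pthread_mutex_init(&s.capstatus_mutex, nullptr);
    s.stopCapStatus = 0;
}

// Called by the controlling thread (for example, on a StopSend request).
void requestStop(CapStatusSketch &s)
{
    pthread_mutex_lock(&s.capstatus_mutex);
    s.stopCapStatus = 1;
    pthread_mutex_unlock(&s.capstatus_mutex);
}

// Called by the capture thread between frames.
int readStopStatus(CapStatusSketch &s)
{
    pthread_mutex_lock(&s.capstatus_mutex);
    int value = s.stopCapStatus;
    pthread_mutex_unlock(&s.capstatus_mutex);
    return value;
}

void destroyStatus(CapStatusSketch &s)
{
    pthread_mutex_destroy(&s.capstatus_mutex);
}
```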



The sendList node is as follows:

typedef struct sendNode
{
        //send or capture video stream identifier
        int id;
        //sender thread mutex
        pthread_mutex_t sendthr_cond_mutex;
        //sender thread conditional for signaling
        pthread_cond_t  sendthr_cond;
        //sender thread stop mutex
        pthread_mutex_t sendstop_cond_mutex;
        //sender thread stop conditional for signaling
        pthread_cond_t sendstop_cond;
        //sender thread status mutex
        pthread_mutex_t sendstatus_mutex;
        //variable for sender thread quit 
        int send_quit;  
};



The recvList node is as follows:

typedef struct recvNode
{
        //receive video stream identifier
        int id;
        //receiver thread mutex
        pthread_mutex_t recvthr_cond_mutex;
        //receiver thread conditional for signaling
        pthread_cond_t  recvthr_cond;
 
        //receiver thread stop mutex
        pthread_mutex_t recvstop_cond_mutex;
        //receiver thread stop conditional for signaling
        pthread_cond_t  recvstop_cond;
        //variable to quit receiver thread associated with the video stream identifier
        int recv_quit;
        //decode format
        decodeformat decodeFmtNode;
};



VideoCapture implements the IVideoCapture interface defined in <interfaces.h>.
The "Process_Raw_Video" method of this interface is the most important one, because it carries the communication between the object at the Capture layer and the object at the RTP Sender layer.
An object of CUnicapVidCap (ObjVidCap) is aggregated within CVideoCapture.
The declaration of CVideoCapture is as follows:

 
class CVideoCapture : public IVideoCapture
{
protected :
        //UnicapVidCap object
        CUnicapVidCap *ObjVidCap;
        //display video window rectangle 
        SDL_Rect rect;
        //SDL surface for video display
        SDL_Surface *m_screen;
        //video capture stop status
        int m_stopStatusBase;
        //capture status mutex
        pthread_mutex_t *m_capstatus_mutex;
        //Capture thread stop mutex
        pthread_mutex_t *m_thrstop_mutex;
        //conditional to signal the stopping the capture thread 
        pthread_cond_t *m_thrstop_cond;       
 
        //mutex conditionals for the 30 Empty Media Buffers shared with sender thread
        pthread_cond_t *m_MediaBuffer_Empty_cond[30];
        //mutex conditionals for the 30 Full Media Buffers shared with sender thread
        pthread_cond_t *m_MediaBuffer_Full_cond[30];
        //mutex for the 30 Media Buffers shared with sender thread
        pthread_mutex_t *m_MediaBuffer_mutex[30];
 
        //mutex conditionals for the Empty Media Buffers for deletion 
        pthread_cond_t *m_delMediaBuffer_Empty_cond;
        //mutex conditionals for the Full Media Buffers for deletion 
        pthread_cond_t *m_delMediaBuffer_Full_cond;
        //mutex for the Media Buffers for deletion 
        pthread_mutex_t *m_delMediaBuffer_mutex;
 
        //Video encode bitrate
        int m_encode_bitrate;
        //video capture frame rate
        int m_fps;
        //video encoder GOP size
        int m_gopSize;
        //video encoder codec format
        CodecID m_encode_fmt;
 
public  :
        //capture thread id associated with the object
        int m_threadid;
        //head of the Media Buffer linked list
        CMediaBuffer *m_MediaBuffer; 
        //Video encoder object
        CVideoEnc *VideoEnc;
        //constructor
        CVideoCapture(int threadid);
        //destructor
        virtual ~CVideoCapture();
        //video capture stop status
        int getStopStatus();
        //set the video capture stop status 
        int setStopStatus();
        //set settings for the device number
        virtual void SetSettings(int device_no);
        //process the video
        void Process_Raw_Video(int buffers_ready,int PixFormat, int bufsize,int width, int height,unsigned char *data, int tsinc_usec);  
        //get the video format associated with the video capture object
        virtual void getVideoFormat();
        //set the video format associated with the video capture object
        virtual void setVideoFormat();
        //get the video device properties associated with the video capture object
        virtual void getVideoDeviceProp();
        //set the MediaBuffer to the video capture object
        virtual void setMediaBuffer(CCapThreadParams*);
        //start the capture
        virtual void start_capture(CCapThreadParams *cparam);
        //stop capture
        virtual int stop_capture();
};
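
The relationship between the two layers can be reduced to a callback through an interface pointer. The sketch below is a simplified model under that assumption: the class names are hypothetical, Process_Raw_Video is given a shortened parameter list, and the sender merely counts frames instead of encoding and buffering them.

```cpp
#include <cassert>
#include <memory>

// Hypothetical, reduced counterpart of IVideoCapture.
class IFrameSinkSketch
{
public:
    virtual ~IFrameSinkSketch() {}
    virtual void Process_Raw_Video(int width, int height,
                                   const unsigned char *data, int bufsize) = 0;
};

// Capture layer: knows its owner only through the interface.
class CaptureLayerSketch
{
public:
    explicit CaptureLayerSketch(IFrameSinkSketch *owner) : owner_(owner) {}

    // Would be called whenever the capture device delivers a frame.
    void frameArrived(int width, int height,
                      const unsigned char *data, int bufsize)
    {
        owner_->Process_Raw_Video(width, height, data, bufsize);
    }

private:
    IFrameSinkSketch *owner_;   // not owned
};

// Sender layer: aggregates the capture object and receives its frames.
class SenderLayerSketch : public IFrameSinkSketch
{
public:
    SenderLayerSketch()
        : capture_(new CaptureLayerSketch(this)), frames_(0), bytes_(0) {}

    void Process_Raw_Video(int, int, const unsigned char *,
                           int bufsize) override
    {
        ++frames_;          // a real sender would encode and queue the frame
        bytes_ += bufsize;
    }

    CaptureLayerSketch &capture() { return *capture_; }
    int frames() const { return frames_; }
    long bytes() const { return bytes_; }

private:
    std::unique_ptr<CaptureLayerSketch> capture_;
    int frames_;
    long bytes_;
};
```

Passing the interface pointer into the capture object's constructor mirrors the CUnicapVidCap(IVideoCapture *, int streamid) constructor and keeps the capture layer independent of the sender's concrete type.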



ObjVidCap, the CUnicapVidCap object aggregated within CVideoCapture, acts as the interface to the capture device. Internally it uses the unicap imaging library to interact with capture devices.
The declaration of CUnicapVidCap is as follows:

 
class CUnicapVidCap
{
public :
        //constructor
        CUnicapVidCap(IVideoCapture *, int streamid);
        //destructor
        virtual ~CUnicapVidCap();      
 
        //open device with the device number
        void open_device(int device_no);
        //returns the number of devices in the system 
        int enumerate_devices();
        //sets formats for the camera devices
        void setformats();
        //start capture; takes as input the address of the stop_status which is used to stop capture externally, fps to indicate the encoding frame rate
        void start_capture(int *stop_status, int fps);
        //stop capture
        void stop_capture(); 
 
        //capture status mutex
        pthread_mutex_t *m_capstatus_mutex;
        //thread stop mutex
        pthread_mutex_t *m_thrstop_mutex;
        //thread stop conditional for signaling purposes
        pthread_cond_t *m_thrstop_cond;
        //thread id associated with the object
        int m_threadid; 
        
private :
        //device number associated with the object
        int m_device_no;
        //handle of the camera
        unicap_handle_t m_handle;
        //device structure for the camera
        unicap_device_t m_device;
        //format for the camera
        unicap_format_t m_format_spec;
        unicap_format_t m_format;
        //array of buffers to hold the captured data from the camera 
        unicap_data_buffer_t m_buffers[BUFFERS];
        unicap_data_buffer_t *m_returned_buffer;
        
        //static CUnicapVidCap* m_instance; 
        int m_PixFmt;
        //CCameraDevice object
        CCameraDevice *m_Cam;
        IVideoCapture *BaseVidCapture;
};




The XML-RPC server needs to configure the capture devices present in the system. CCameraDevice is a singleton, so only one instance of it exists in CoVit, and each instance of CUnicapVidCap holds a pointer to that instance. This object is responsible for initializing the capture devices, creating capture device handles, setting formats on the capture devices, and maintaining a reference count for each capture device. It also maintains a reference count for itself so that it can be deleted once it is no longer referenced.

 
class CCameraDevice
{
public:
        //singleton Camera Device object; returns the single instance of the object
        static CCameraDevice* Instance();
        //constructor of camera device class
        CCameraDevice();
        //destructor
        virtual ~CCameraDevice();
        // returns the number of camera devices in the system
        int getNumberofCams();
        //sets the camera handle for the device number specified
        int getCamHandle(unicap_handle_t *handle, int device_no = 0);
        //closes the camera handle for the device number
        int closeCamHandle(int device_no);
        //destroy the instance for the CCameraDevice
        void destroy();
        //sets the capture format for the device number
        int setformats(int device_no, unicap_format_t *format);
        //starts capture for the device number
        int start_capture(int device_no);
        //stops capture
        void stop_capture();
        //initializes the camera devices 
        int initCamera();
        //deinitializes the camera devices 
        void deinitCams();
        //the single instance of CCameraDevice 
        static CCameraDevice* m_CameraInstance;
        //reference count of the CCameraDevice object
        int m_refCount;
private:
        //number of Camera devices in the system
        int m_NumCams; 
        //buffer to hold captured video from the devices
        unicap_data_buffer_t **m_buffers;
 
        //array of variables to indicate capture state for each camera device; 0: device not started, 1: capture started
        int *m_start_capture;
        //array of reference counts for each capture device
        int *m_hndRefCount;
        //array of camera handles for each device
        unicap_handle_t *m_handle;
        //array of camera device
        unicap_device_t *m_device;
        //array of formats for each camera device
        unicap_format_t *m_format_spec;
        unicap_format_t *m_format;
 
};
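
The reference-counted singleton pattern described for CCameraDevice can be sketched as follows. This is a simplified model, not the CoVit implementation: CameraRegistrySketch is a hypothetical class, it is assumed that Instance() increments the count and destroy() decrements it, and the sketch is not thread-safe.

```cpp
#include <cassert>

// Hypothetical reference-counted singleton (not thread-safe).
class CameraRegistrySketch
{
public:
    // Creates the instance on first use and counts every caller.
    static CameraRegistrySketch *Instance()
    {
        if (instance_ == nullptr)
            instance_ = new CameraRegistrySketch();
        ++instance_->refCount_;
        return instance_;
    }

    // Releases one reference; the last release deletes the object.
    void destroy()
    {
        if (--refCount_ == 0) {
            instance_ = nullptr;
            delete this;    // no member access after this point
        }
    }

    static bool exists() { return instance_ != nullptr; }
    int refCount() const { return refCount_; }

private:
    CameraRegistrySketch() : refCount_(0) {}
    ~CameraRegistrySketch() {}
    CameraRegistrySketch(const CameraRegistrySketch &) = delete;
    CameraRegistrySketch &operator=(const CameraRegistrySketch &) = delete;

    static CameraRegistrySketch *instance_;
    int refCount_;
};

CameraRegistrySketch *CameraRegistrySketch::instance_ = nullptr;
```

Each object that obtains the pointer through Instance() is expected to call destroy() exactly once when it no longer needs the devices.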

Source Code Documentation

 

Program Flow

Figure 5 illustrates the flow of the program for sending video.

Figure 5: The Program Flow

Client Program


The source code for the client program is in <xmlrpc_sample_client.cpp>. The client is an interactive program that waits for user input. Depending on the method name the user enters, it prompts for that method's input arguments, collects them, and then makes the XML-RPC method call on the listening XML-RPC server.
A remote procedure call is made as follows:
    /* Make the remote procedure call */
    result = xmlrpc_client_call(&env, serverUrl, methodName, "(i)", (xmlrpc_int32) 1);

serverUrl is a string, e.g. "http://localhost:8080/RPC2".
methodName is a string, e.g. "CreateSendStream".
The format string "(i)" signifies a single parameter of type int, and 1 is the value of that parameter.

On the wire, XML-RPC values are encoded as XML as shown below:

 
<methodCall>
  <methodName>CreateSendStream</methodName>
  <params>
    <param><value><int>1</int></value></param>
  </params>
</methodCall>
 
<methodCall>
  <methodName>StartStreamingRTP</methodName>
  <params>
    <param><value>
      <array>
        <data>
          <value><string>1</string></value>
          <value><string>128.59.19.222</string></value>
          <value><string>20002</string></value>
        </data>
      </array>
    </value></param>
  </params>
</methodCall>


This encoding follows the XML-RPC specification [7].
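
To make the encoding concrete, the sketch below builds the same kind of methodCall documents as plain strings. It is only an illustration of the wire format: the helper names are hypothetical, the output contains no whitespace between elements, and in CoVit the XML-RPC library performs this encoding.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Escapes the characters that may not appear literally in XML text.
std::string xmlEscape(const std::string &text)
{
    std::string out;
    for (char c : text) {
        if (c == '&')      out += "&amp;";
        else if (c == '<') out += "&lt;";
        else if (c == '>') out += "&gt;";
        else               out += c;
    }
    return out;
}

// <value> element holding an int.
std::string intValue(int v)
{
    return "<value><int>" + std::to_string(v) + "</int></value>";
}

// <value> element holding a string.
std::string stringValue(const std::string &s)
{
    return "<value><string>" + xmlEscape(s) + "</string></value>";
}

// <value> element holding an array of already-encoded <value> elements.
std::string arrayValue(const std::vector<std::string> &elementValues)
{
    std::string out = "<value><array><data>";
    for (const std::string &v : elementValues)
        out += v;
    out += "</data></array></value>";
    return out;
}

// Complete methodCall document; each parameter is an encoded <value>.
std::string methodCall(const std::string &methodName,
                       const std::vector<std::string> &paramValues)
{
    std::string out = "<methodCall><methodName>" + xmlEscape(methodName) +
                      "</methodName><params>";
    for (const std::string &v : paramValues)
        out += "<param>" + v + "</param>";
    out += "</params></methodCall>";
    return out;
}
```

A call with one int parameter passes a single intValue to methodCall. A call with an array parameter passes a single arrayValue built from several stringValue elements.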

References

[1] Unicap Library. http://www.unicap-imaging.org
[2] XML-RPC. http://xmlrpc-c.svn.sourceforge.net
[3] RFC 3550. RTP: A Transport Protocol for Real-Time Applications.
[4] Colin Perkins. RTP: Audio and Video for the Internet.
[5] RTP Library. RTP Library API Specification.
[6] FFMPEG Tutorial. Online FFMPEG Tutorial.
[7] XML RPC. XML RPC Specification.


Last updated: 01-12-2009 by Subhrendu Sarkar