SIP (Session Initiation Protocol), described in RFC 3261, is used for signaling Internet telephony calls. SIP identifies a user by an email-like address called a SIP URI, e.g., sip:alice@cs.columbia.edu. Every domain, e.g., cs.columbia.edu, will typically run a SIP server (proxy, registration, redirect). Sipd is one such SIP server. When a user, say Alice, starts her phone, either a software program or a hardware Ethernet phone, the phone registers her current location (IP address) under her SIP URI, sip:alice@cs.columbia.edu, with the SIP server at cs.columbia.edu. Now, if another user, say Bob with address sip:bob@office.com, wants to contact Alice, he dials (makes a call to) sip:alice@cs.columbia.edu. The call reaches the SIP server at cs.columbia.edu, which in turn forwards (proxies) the call to the current IP address of Alice's phone. We have described the detailed architecture of our SIP telephony test-bed, called CINEMA (Columbia InterNet Extensible Multimedia Architecture), in a technical report.
- Which software are we extending? Building a new component in CINEMA.
- Features to be supported and priorities? Joining SIP IM session from a SIP phone (high priority). Allow MSN or Yahoo instant messages also (low priority). Allow a regular telephone using a SIP/PSTN gateway (low priority).
- Existing libraries or modules? libsip++, librtp, IBM ViaVoice text to speech and speech recognition.
- Programming language? C/C++
- Operating system? Linux
- GUI tool kit? None.
- Component testing? Basic interworking should work.
- Error logging? None.
- Installation mechanism? Integrated into the CINEMA distribution.
- Milestones? Discussed in this document.
Readings: Skim through sections 1, 2, 3.1, 5, 9 and 12 of the CINEMA technical report. We will be building another component in the CINEMA architecture, so it is useful to know the overall system.
SIP can be used for IP telephony call establishment and termination, instant messaging, and presence indication. Sipc is a software tool that supports these functions. We will initially be using sipc in all our tests. sipc supports audio using the Robust Audio Tool (RAT).
Using sipc: Read the sipc documentation. Download the binaries for your platform of choice. You will need two machines to test sipc. If you do not have two machines, you can use the IRT lab; contact me. If you have problems downloading sipc, contact Xiaotao Wu. Test an audio call between two sipc instances running on two different machines. Also test instant message exchanges. Use sipc's monitor option to capture all the messages in each case.
Our initial goal for the project is interworking between a SIP call and a SIP instant messaging session. The following diagram gives the overall architecture. We will be building an IM/call convertor that maps the call (INVITE, BYE, audio) to instant messages (MESSAGE, text). The conversion needs to be done in both directions.
A        INVITE          simvoice           MESSAGE       B
sipc ----------->--- IM/call convertor ----------->----- sipc
         (call)              |               (IM)
                             |
                      Text2speech and
                        speech2text
Consider a caller A who makes a call to the IM/call convertor (let's call
it simvoice: SIP IM/Voice call convertor). Let the IP addresses of A,
simvoice and B be 128.59.19.195, 128.59.19.196 and 128.59.19.197,
respectively. The INVITE message used for establishing the session
looks something like the following:
INVITE sip:"bob@128.59.19.197"@128.59.19.196 SIP/2.0
To: sip:"bob@128.59.19.197"@128.59.19.196
From: sip:alice@128.59.19.195
...
c=IN IP4 128.59.19.195
m=audio 8000 RTP/AVP 0 8

The caller A (for Alice) indicates that it wishes to connect to 128.59.19.196 (i.e., the simvoice host), but the user part of the SIP URI indicates that the final destination is bob@128.59.19.197 (i.e., host B). It also indicates that A can receive audio at 128.59.19.195 on port 8000, for payload types (codecs) G.711 mu-law (0) and G.711 A-law (8). The simvoice application extracts the final destination, bob@128.59.19.197, from the request URI and establishes an IM session with that URI. At this point simvoice can try to send an IM to the destination to probe whether it is active.
MESSAGE sip:bob@128.59.19.197 SIP/2.0
To: sip:bob@128.59.19.197
From: sip:some_identifier@128.59.19.196
Content-Type: text/plain
Content-Length: ...

Hi. This is IM/call convertor on behalf of alice@128.59.19.195

Then simvoice can accept the call from Alice by sending a 200 OK response that indicates the RTP IP address and port for simvoice. We initially support only G.711 mu-law in simvoice.
SIP/2.0 200 OK
...
c=IN IP4 128.59.19.196
m=audio 9000 RTP/AVP 0

At this point simvoice initializes the text2speech and speech2text engines. Whenever audio packets from user agent A are received on 128.59.19.196 at port 9000, simvoice invokes the speech2text convertor to convert the audio to text. The converted text is sent to user agent B using MESSAGE. If there is an error in speech recognition, simvoice flushes the input and asks the caller, user agent A, to repeat what she said. This prompt to the caller requires text2speech conversion. When user agent B sends a MESSAGE to simvoice addressed to the identifier "some_identifier", which maps to Alice's user agent A, the text is converted to speech, packetized as G.711 mu-law audio, and streamed to user agent A.
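The last step above, turning synthesized speech into G.711 mu-law payloads, can be sketched as follows. This is only an illustration under stated assumptions: the text-to-speech engine is assumed to deliver 16-bit linear PCM at 8 kHz, 160 samples (20 ms) per packet is a common choice rather than a project requirement, and the names linearToMuLaw and packetizeMuLaw are hypothetical, not part of librtp. Adding the RTP header (payload type 0, sequence number, timestamp) is left to the RTP library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode one 16-bit linear PCM sample as an 8-bit G.711 mu-law byte.
// (Hypothetical helper name; the algorithm is the standard table-free one.)
std::uint8_t linearToMuLaw(std::int16_t pcm) {
    const int kBias = 0x84;   // 132, added before the segment search
    const int kClip = 32635;  // largest magnitude that still fits after the bias
    int sample = pcm;
    const int sign = (sample < 0) ? 0x80 : 0x00;
    if (sample < 0) sample = -sample;
    if (sample > kClip) sample = kClip;
    sample += kBias;

    // The segment (exponent) is given by the highest set bit, from 7 down to 0.
    int exponent = 7;
    for (int mask = 0x4000; exponent > 0 && (sample & mask) == 0; mask >>= 1) {
        --exponent;
    }
    const int mantissa = (sample >> (exponent + 3)) & 0x0F;
    // G.711 transmits the bit-inverted code word.
    return static_cast<std::uint8_t>(~(sign | (exponent << 4) | mantissa));
}

// Split a PCM buffer into mu-law payloads of samplesPerPacket bytes each.
// Assumption: 160 samples per packet, i.e. 20 ms of audio at 8 kHz.
// The final payload may be shorter than the others.
std::vector<std::vector<std::uint8_t> > packetizeMuLaw(
        const std::vector<std::int16_t>& pcm,
        std::size_t samplesPerPacket = 160) {
    assert(samplesPerPacket > 0);
    std::vector<std::vector<std::uint8_t> > payloads;
    for (std::size_t start = 0; start < pcm.size(); start += samplesPerPacket) {
        const std::size_t end = std::min(start + samplesPerPacket, pcm.size());
        std::vector<std::uint8_t> payload;
        payload.reserve(end - start);
        for (std::size_t i = start; i < end; ++i) {
            payload.push_back(linearToMuLaw(pcm[i]));
        }
        payloads.push_back(payload);
    }
    return payloads;
}
```

Each returned payload would be handed to the RTP layer and sent to the address and port that user agent A advertised in its INVITE (128.59.19.195, port 8000 in the example).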
When user agent A hangs up the call, the association is broken. simvoice should be able to send error messages to A (in voice) and B (in text) if an error occurs. Possible errors include: speech could not be recognized, or an IM is received after the call is closed.
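One way to keep track of the association between an identifier such as "some_identifier" and the caller is a small lookup table. The sketch below is an assumption about how this could be organized, not the actual simvoice design: the class AssociationTable, its methods, and the RouteResult values are all hypothetical names. It also assumes that a closed call is remembered (rather than erased) so that an IM arriving after the hang-up can be answered with a specific "call closed" error.

```cpp
#include <cassert>
#include <map>
#include <string>

// Outcome of routing an incoming instant message (hypothetical names).
enum class RouteResult {
    DeliverAsSpeech,    // call is active: convert the text to speech for the caller
    CallClosed,         // the caller has hung up: reply to B with an error text
    UnknownIdentifier   // no call was ever associated with this identifier
};

// Maps the identifier used in the From header of outgoing MESSAGEs
// to the SIP URI of the caller (user agent A).
class AssociationTable {
public:
    // Called when the INVITE from the caller has been accepted.
    void openCall(const std::string& id, const std::string& callerUri) {
        entries_[id] = Entry{callerUri, true};
    }

    // Called on BYE. Assumption: keep the entry, marked inactive,
    // so that late IMs can be told apart from unknown identifiers.
    void closeCall(const std::string& id) {
        std::map<std::string, Entry>::iterator it = entries_.find(id);
        if (it != entries_.end()) it->second.active = false;
    }

    // Decide what to do with an IM addressed to id. On DeliverAsSpeech,
    // *callerUri receives the caller's URI; otherwise it is left unchanged.
    RouteResult route(const std::string& id, std::string* callerUri) const {
        assert(callerUri != nullptr);
        std::map<std::string, Entry>::const_iterator it = entries_.find(id);
        if (it == entries_.end()) return RouteResult::UnknownIdentifier;
        if (!it->second.active) return RouteResult::CallClosed;
        *callerUri = it->second.callerUri;
        return RouteResult::DeliverAsSpeech;
    }

private:
    struct Entry {
        std::string callerUri;
        bool active;
    };
    std::map<std::string, Entry> entries_;
};
```

With this layout, an IM for "some_identifier" received after Alice hangs up yields CallClosed, which simvoice could turn into a text error message to B.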
There are other ways to do the association, but we will start with the simple scheme described above. We will not consider IM sessions; each instant message is treated as an independent message. Since simvoice may not be allowed to send IMs to user agent B (for lack of authentication), the system may need to be enhanced. Secondly, one should also be able to initiate the conversation from user agent B, in which case simvoice initiates the SIP call to user agent A. However, all of these are for further study and are not considered in this project.
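The simple scheme depends on recovering the final destination from the user part of the request URI, as in the INVITE example earlier. A minimal sketch of that step is given here, assuming the destination is always embedded as a quoted user@host string before the last '@'. The function name extractFinalDestination is hypothetical and is not part of libsip++, which may well offer its own URI parsing; the sketch does no escaping or full RFC 3261 URI validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Given a request URI of the form  sip:"user@host"@convertor-host,
// return the embedded destination as  sip:user@host.
// Returns an empty string if the URI does not carry an embedded destination.
std::string extractFinalDestination(const std::string& requestUri) {
    const std::string scheme = "sip:";
    if (requestUri.compare(0, scheme.size(), scheme) != 0) return std::string();

    // The user part ends at the LAST '@'; an earlier '@' belongs to the
    // embedded destination address.
    const std::size_t lastAt = requestUri.rfind('@');
    if (lastAt == std::string::npos || lastAt <= scheme.size()) return std::string();

    std::string user = requestUri.substr(scheme.size(), lastAt - scheme.size());

    // Strip the surrounding double quotes used in the example INVITE.
    if (user.size() >= 2 && user[0] == '"' && user[user.size() - 1] == '"') {
        user = user.substr(1, user.size() - 2);
    }

    // The embedded destination must itself look like user@host.
    const std::size_t innerAt = user.find('@');
    if (innerAt == std::string::npos || innerAt == 0 || innerAt + 1 == user.size()) {
        return std::string();
    }
    return scheme + user;
}
```

For the example INVITE, the function yields sip:bob@128.59.19.197, which simvoice would use as the request URI of its MESSAGE. A plain URI such as sip:alice@128.59.19.195 yields an empty string, since it carries no embedded destination.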
We have implemented a SIP user agent in C/C++ using our libsip++ library. We also have an RTP library that implements RTP/RTCP. We will be using these libraries to implement the simvoice module. SIP library information is at http://www.cs.columbia.edu/~kns10/software/siplib.
Code review: Check out the CINEMA code from CVS the next time you meet me. Then review the code in sipua/main.cpp and understand how it works.
TBD: more details to be added as and when needed.