\title{SIP as Covert Channel -- Or the Information Capacity of a SIP Channel}

\begin{document}

\section{Introduction}

SIP, like other signaling protocols, can be used as a ``covert channel'', conveying information to the recipient beyond the intended purpose. Network operators typically do not charge for unsuccessful call setup attempts, and are thus loath to see users abuse such attempts to carry useful information. If carriers charge for instant messaging, say, users may quickly discover mechanisms to send IMs as signaling messages, defining a new service called Signaling Message Service (SMS).

This issue is hardly new. Examples include the ``toll saver'' feature of analog answering machines, which would pick up the call after a different number of rings depending on whether there were messages to be retrieved by the owner, thus avoiding the cost of a long-distance call just to find out that ``you have no new messages''. In general, various ways to use non-completed calls for user-to-user signaling, such as ring codes (``if the phone rings twice, I'll be home soon''), have a long history. ISDN Q.931 signaling offered an explicit, size-limited user-to-user (UU) signaling element, which was then promptly disallowed by various operators.

In general, signaling protocols can be used as hidden channels in at least four ways:

\begin{description}
\item[Signaling timing:] The simplest mechanism is to have the caller associate meanings with the spacing between incomplete call attempts.
\item[Signaling duration:] In systems that rely on human reaction, i.e., all kinds of telephony signaling, the caller can control the duration of the signaling attempt.
\item[Signaling destination:] In some scenarios, the called user has access to a range of destination numbers or addresses. Examples include ISDN sub-addresses and SIP user names.
\item[Signaling content:] Most easily, the signaling originator may be able to add or modify parts of the signaling message.
\end{description}

These mechanisms are easiest to employ if the sender and receiver devices are under the control of the end user and, better yet, under program control.

We define a measure of effectiveness similar to spectral efficiency, namely how many bits of user-to-user information can be conveyed for each bit of signaling information needed.

\section{Covert Channels in SIP}

SIP offers a rich set of opportunities in all four categories of covert signaling.

\begin{description}
\item[Signaling timing:] Since call setup delays are likely to be relatively short and since many SIP devices will support multiple concurrent calls, controlling the interarrival time of signaling messages to a high degree of precision would not be hard. Thus, one can build an on-off keying system that sends a signaling request every 100 ms or so, skipping attempts to designate 0 bits or 0/1 transitions. Indeed, if two or more different signaling messages can be conveyed, variations on phase-shift keying become possible. The network would have to employ elaborate timing distortion to defeat such schemes. The achievable data rate requires further study; it could well be around 10 b/s, but the efficiency is rather low.
\item[Signaling duration:] Here, we quantize the signaling duration, e.g., the time between INVITE and CANCEL, and use it to convey information. Obviously, we can combine this with signaling timing, creating a multi-level channel or the equivalent of pulse-width modulation.
\item[Signaling destination:] Unlike PSTN numbers, SIP addresses are inherently plentiful. There is no reason in principle why a user could not have hundreds of user names, each indicating a different bit combination. Even many standard email systems support the ``+'' model of sub-addresses, so that mail to, for example, ``alice+private'' is delivered to the same destination as mail to ``alice'', but can readily be filtered.
Since addresses in SIP are just another part of the message, this approach is really just a special case of the next one.
\item[Signaling content:] SIP offers three kinds of mechanisms to convey user-to-user information:
\begin{description}
\item[User-defined header fields:] SIP UAs can add header fields that are meaningful only to the recipient, similar to the various X- headers common in email messages. It remains to be seen whether users will follow the SIP change process and label these as P-headers.
\item[SIP-defined informational headers:] SIP itself defines a number of headers that convey information to the recipient, including the \header{Subject}, \header{Organization}, \header{Call-Info} and \header{Alert-Info} header fields. The \header{Subject} header field is designed for text messages, while the others can easily be drafted to that purpose. Any field that takes a URL can carry a data URL \cite{data} or encode user information in the URL itself, as in
\begin{verbatim}
Call-Info: <http://meet.me.at/lunch/today/at/noon>
\end{verbatim}
\item[Cloaked data:] A large number of SIP information elements can be used to ``hide'' user-to-user information. This includes all user-generated header fields that contain random components, for example \header{Call-ID} or the tags in \header{To} and \header{From}. SIP also allows new methods to be defined, which are treated like non-{\INVITE} requests. All of these can be replaced by user messages. If the content is encrypted, it will look like randomly generated bytes rather than natural-language text.
\end{description}
\end{description}

As can be seen from the above, treating SIP signaling as a communication channel offers rich opportunities to apply standard coding and modulation techniques. Given such a channel, one can readily construct a layered protocol architecture that would allow one to run normal IP applications.
The efficiency is somewhat inferior to the low benchmark set by using V.22 coding in RFC 2833 \cite{rfc2833}, but the bits are free.

\section{Dealing with Hidden Channels}

A carrier has a number of tools at its disposal to deal with this issue. We discuss a few below, highlighting their drawbacks and advantages.

\subsection{Prevent User-to-User Signaling}

Preventing user-to-user signaling entails closing all opportunities for users to influence the content of signaling messages. As should be clear from the previous section, this is not limited to removing user-defined header fields and SIP informational header fields such as \header{Organization} and \header{Subject}, but would also entail rewriting all randomly generated components of the SIP message. This approach does not eliminate the ability of the user to employ message timing or the callee address to convey information. It also severely limits interoperability and extensibility of the protocol, violating core tenets of the SIP architecture \cite{arch}. It probably forces all SIP elements to be call-stateful. It effectively turns SIP into a text variant of Q.931.

\subsection{Message Length Limitation}

An operator could restrict SIP requests to a size reflecting typical real signaling needs. However, given the variability of a number of SIP elements, it will be difficult to pick a size that does not inadvertently reject legitimate signaling messages. Since many of the SMS messages are just a few tens of bytes long, there is not much observable difference between ``spiked'' messages and a plain {\INVITE}.

\subsection{Restrict Message Volume}

An operator could restrict the number of user-generated SIP requests. This may well be necessary to deal with malfunctioning dialing software or various innovative spam messages. It is unlikely that this will prevent casual messaging at a volume of a few messages a day.
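
A rough bound may help to show why a volume cap alone does little against casual messaging. The symbols and numbers here are illustrative assumptions, not measurements: let $R$ be the number of requests per day that the operator allows, and $b$ the number of user-chosen bits that fit into a single request. The covert payload per day, $C$, is then at most
\[
  C \le R \cdot b .
\]
Assuming, purely for illustration, a cap of $R = 10$ requests per day and $b = 160$ bits per request (a 20-byte text placed in a field such as \header{Subject}), this gives
\[
  C \le 10 \times 160~\mbox{bits} = 1600~\mbox{bits} = 200~\mbox{bytes per day},
\]
which is enough for a handful of short notes. A cap low enough to make $C$ negligible would likely also interfere with ordinary call attempts.
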
\subsection{Charge for Signaling}

The technically cleanest solution is to charge the caller for SIP requests just like any other user data. If we assume that a SIP {\INVITE} is about 200 bytes long when generated by the caller, this would cost about 0.08 cents per message at a typical GPRS data price of \$4/MB (200 bytes $\times$ \$4 per $10^6$ bytes $=$ \$0.0008). Combined with an allowance reflecting normal call failure ratios, this approach offers the right incentives to users, making it more efficient to use the intended means of user-to-user messaging. This does not address messages that originate in a landline network and are addressed to a wireless subscriber. Charging for inbound

\end{document}