It’s bad. No, it’s worse than that.
In security circles it’s generally accepted that email is “available for the world to see”: essentially, it’s transmitted in plain text over the internet. Plain text in this case has nothing to do with HTML vs ASCII text; it refers to the fact that your email isn’t encrypted before being transmitted. VoIP is exactly the same.
This means that anybody with the ability to “sniff” your traffic (that is, make a copy of your VoIP data in transit) can listen to it. It’s almost like tapping an actual telephone wire, except that you can probably do it without leaving any physical evidence for the victim to find! I’m not going to discuss the details of the sniffing itself any further. What I can say is that, fortunately, it is harder than I make it sound here. It’s not trivial unless you happen to “0wn” a router or three on the route that somebody’s VoIP traffic travels along. (This is in the same vein as the paradox of “you’ve either got plaintext passwords on the wire, or plaintext passwords in your database”: you can make both plaintext, or encrypt (hash) one, but not both.)
So, for the sake of illustration, I’ve taken a sample system, configured a SIP account with one of my providers, made a call, and captured it on my system using “tcpdump -i eth1 -s0 udp and host 196.26.201.30 and host 10.0.0.14 -w sample_call.cap”. You can download the sample_call if you would like to follow along and prove to yourself that it works.
Now you can load that into wireshark. Going to statistics -> VoIP Calls, you’ll see that there are two calls: first a failed call, and then, once I got the number right, a proper call. Then go to statistics -> rtp -> show all streams. Here you’ll note four streams; the ones we’re interested in are the ones between ports 20000 and 26350 (you could have gotten these numbers from VoIP Calls by selecting graph and looking for the RTP negotiation packets). You can now either use wireshark to filter these out, or use tcpdump directly from the command line:
tcpdump -r sample_call.cap -w rtp1.cap -s0 src port 20000 and dst port 26350
tcpdump -r sample_call.cap -w rtp2.cap -s0 dst port 20000 and src port 26350
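If you’re curious what is actually inside those filtered packets, here is a minimal Python sketch (not part of the toolchain used in this post) that decodes the 12-byte fixed RTP header defined in RFC 3550. The names RtpHeader and parse_rtp_header, and the hand-built sample packet, are made up for illustration. Payload type 18 is the static RTP payload type assigned to G.729.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(udp_payload: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header from the start of a UDP payload."""
    if len(udp_payload) < 12:
        raise ValueError("too short to be an RTP packet")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", udp_payload[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hypothetical packet: version 2, payload type 18 (G.729), sequence 1,
# followed by 20 bytes of dummy audio.
sample = struct.pack("!BBHII", 0x80, 18, 1, 160, 0x1234ABCD) + b"\x00" * 20
header = parse_rtp_header(sample)
print(header.version, header.payload_type, header.sequence)
```

Everything after the header (plus any CSRC entries or header extension) is the encoded audio, which is what the next step pulls out.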
So now we have a pcap file for each direction. Next we extract the rtp audio data using rtp_dump.pl to generate two .g729 files:
rtp_dump.pl rtp1.cap rtp1.g729
rtp_dump.pl rtp2.cap rtp2.g729
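I haven’t reproduced rtp_dump.pl here, but the general idea of this step can be sketched in Python under some loud assumptions: the capture is a classic (non-pcapng) pcap file, the frames are plain Ethernet carrying IPv4 and UDP with no VLAN tags, and packets are written out in capture order with no reordering or loss handling. The function names rtp_payload and extract_audio are hypothetical.

```python
import struct
import sys


def rtp_payload(udp_payload: bytes) -> bytes:
    """Strip the RTP header (fixed part, CSRC list, optional extension)."""
    if len(udp_payload) < 12 or udp_payload[0] >> 6 != 2:
        return b""  # not RTP version 2, skip it
    b0 = udp_payload[0]
    offset = 12 + 4 * (b0 & 0x0F)            # fixed header + CSRC entries
    if b0 & 0x10:                            # header extension present
        if len(udp_payload) < offset + 4:
            return b""
        ext_words = struct.unpack("!H", udp_payload[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(udp_payload)
    if b0 & 0x20 and end > offset:           # padding: last byte holds the count
        end -= udp_payload[-1]
    return udp_payload[offset:end]


def extract_audio(pcap: bytes) -> bytes:
    """Concatenate RTP payloads from a classic pcap of Ethernet/IPv4/UDP frames."""
    magic = struct.unpack("<I", pcap[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    audio = bytearray()
    pos = 24                                 # skip the global pcap header
    while pos + 16 <= len(pcap):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap[pos:pos + 16])
        frame = pcap[pos + 16:pos + 16 + incl_len]
        pos += 16 + incl_len
        if len(frame) < 14 + 20 + 8 or frame[12:14] != b"\x08\x00":
            continue                         # not IPv4 over Ethernet
        ihl = (frame[14] & 0x0F) * 4         # IPv4 header length in bytes
        if frame[14 + 9] != 17 or len(frame) < 14 + ihl + 8:
            continue                         # not UDP
        udp_start = 14 + ihl
        udp_len = struct.unpack("!H", frame[udp_start + 4:udp_start + 6])[0]
        audio += rtp_payload(frame[udp_start + 8:udp_start + udp_len])
    return bytes(audio)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python extract_audio.py rtp1.cap rtp1.g729
    with open(sys.argv[1], "rb") as src, open(sys.argv[2], "wb") as dst:
        dst.write(extract_audio(src.read()))
```

Because the input files were already filtered down to a single direction with tcpdump, the sketch doesn’t need to look at ports or addresses at all.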
Now you have two ways to proceed:
- Use your pbx (such as asterisk) to play back the g729 call to a g729 capable phone (a simple application with Answer(), Playback(/path/to/rtp1), Hangup() and then another for rtp2 should do the trick).
- Convert the g729 data to wav. There are two ways to do this: use a standalone tool, or abuse asterisk a little (which is what I do at the moment).
To get asterisk to convert the g729 to wav, you need to have it running with the g729 license installed. If it is, you should be able to run a command such as asterisk -rx "file convert /path/to/stream.g729 /path/to/stream.wav", which converts the stream.g729 file into stream.wav. Now you can play the wav file with whatever media player you prefer.
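If you have more than a couple of streams to convert, the command above is easy to wrap. The following is only a sketch: convert_command and convert_streams are hypothetical helper names, the paths are placeholders, and it assumes the paths contain no spaces. By default it only builds the commands; pass run=True on a machine where asterisk is actually running with the g729 codec available.

```python
import os
import subprocess


def convert_command(src: str, dst: str) -> list:
    """Build the argv for the 'file convert' CLI command quoted in the text."""
    return ["asterisk", "-rx", f"file convert {src} {dst}"]


def convert_streams(g729_files: list, run: bool = False) -> list:
    """Return one command per input file, swapping the extension for .wav."""
    commands = []
    for src in g729_files:
        dst = os.path.splitext(src)[0] + ".wav"
        commands.append(convert_command(src, dst))
    if run:
        for argv in commands:
            # check=True raises if the asterisk binary reports a failure.
            subprocess.run(argv, check=True)
    return commands


# Dry run with placeholder paths: just show what would be executed.
for argv in convert_streams(["/path/to/rtp1.g729", "/path/to/rtp2.g729"]):
    print(argv)
```

Passing the command as a list means the shell never sees it, so the inner quoting from the command line version isn’t needed.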
To automate almost all of this I cooked up a little script called extract_streams. This script isn’t particularly clean, nor does it make any attempt to restrict its conversions to RTP only: it tries to convert _all_ UDP traffic not on port 5060 as if it were an RTP stream.
The net result, if you’ve done everything correctly, is two wav files, one for each direction.
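The “everything that isn’t port 5060” approach described above can be illustrated with a few lines of Python. This is not the extract_streams script itself, only a sketch of the same idea: group_streams is a hypothetical name, the packets are made-up tuples rather than real capture data, and which host owns which port in the example is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Flow = Tuple[str, int, str, int]   # (src_ip, src_port, dst_ip, dst_port)
Packet = Tuple[str, int, str, int, bytes]
SIP_PORT = 5060


def group_streams(packets: Iterable[Packet]) -> Dict[Flow, List[bytes]]:
    """Group UDP payloads per direction, skipping anything on the SIP port."""
    streams = defaultdict(list)
    for src_ip, src_port, dst_ip, dst_port, payload in packets:
        if SIP_PORT in (src_port, dst_port):
            continue               # signalling, not audio
        streams[(src_ip, src_port, dst_ip, dst_port)].append(payload)
    return dict(streams)


# Made-up packets: one SIP message and three "RTP" packets in two directions.
packets = [
    ("10.0.0.14", 5060, "196.26.201.30", 5060, b"INVITE ..."),
    ("10.0.0.14", 20000, "196.26.201.30", 26350, b"rtp-a1"),
    ("196.26.201.30", 26350, "10.0.0.14", 20000, b"rtp-b1"),
    ("10.0.0.14", 20000, "196.26.201.30", 26350, b"rtp-a2"),
]
streams = group_streams(packets)
for flow, payloads in streams.items():
    print(flow, len(payloads))
```

The obvious weakness is the one already admitted: any other UDP traffic in the capture, DNS for instance, ends up treated as an audio stream too.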
So the question now is really: what can be done to prevent this kind of thing from being done without permission? Well, firstly, nobody is supposed to have access to routers that they’re not admins of, and admins of routers are generally at least somewhat trustworthy, so sniffing this kind of data isn’t that simple. Then there are also extensions to RTP that use encryption to help protect the data (SRTP, for example). You can also run the SIP and RTP data inside an IPsec protection layer. And that is pretty much all that is required.
What you should be asking yourself is: are you willing to send an email unencrypted? If so, you’re probably fine making calls over the public internet, as VoIP has far fewer intercept points than email. Email can be hijacked in transit on any of a number of servers, whereas VoIP has to be captured in real time off the wire.