How to get a live u-Law WAV stream to Cisco VOIP servers (Updated 2017-07-23)

I will probably get some of the specifics of the Cisco VOIP server a bit wrong, but the following is what I remember from years of piecing together descriptions from customers who did not yet know how to set up on hold audio for their Cisco VOIP server. As scary as it may seem, I think I was better at setting up on hold music with Helix Server than anyone working at Cisco or any of their customers, even though Cisco recommended Helix Server for many years until it was discontinued. Then again, it was my job to know these sorts of things.

In a typical scenario a Cisco engineer and one of their customers would get on a phone call with me to work out how to configure Helix Server for streaming their on hold audio. For everything else Cisco I am currently a knuckle-dragging troglodyte.

When I was working at RealNetworks supporting Helix Server we had a high volume of customers using Cisco VOIP phone systems and they all needed two things:

1) Looped on demand u-Law WAV files. Helix Server supported this using its Simulated Live Transfer Agent (SLTA).

The Cisco VOIP server has the ability for you to upload audio files, which then get converted to u-Law WAV files, for when a customer is on hold in a cost center (customer service, billing, legal, etc…) so that they can have customized music or advertisements for a product the customer might be interested in. I was never fully sure why they used SLTA for this unless they had a super cheap Cisco server that didn’t have that function, if they had more cost centers than the Cisco server supported, or even worse if they didn’t know that they had that functionality on their Cisco server.

Cisco did not seem to have any really good documentation on how to make the u-Law WAV files you needed for SLTA, but this forum post works. Sadly FFmpeg, my encoding tool of choice, supports looping images only. There is currently no audio equivalent that I know of.

Loop over the input stream. Currently it works only for image streams. This option is used for automatic FFserver testing. This option is deprecated, use -loop 1.

I have a few theoretical hacks to get looped content working, but they are both difficult to set up and are also very unstable. In other words they are not ready for deployment in an enterprise environment that demands high uptime. If I find a free solution that is stable I will post it here.

2) Live u-Law WAV stream. Helix Server did not support this. I performed extensive and exhaustive testing and it was unable to properly repacketize an incoming live u-Law stream to either unicast RTSP or multicast SDP no matter the input method. I was hoping that this would get fixed, however our group was laid off before that could happen.

Cisco used to have an audio capture card in their hardware VOIP servers that customers would use to pipe their satellite Muzak feed into (stereo or mono din input if I remember correctly), but that was discontinued because now they apparently only provide an image that goes into a VM so there is no capture card. Their customers had to settle for SLTA.

With that said I can provide option number two for companies that have a Cisco VOIP server that they use. For a proper live u-Law WAV delivery configuration you need to know a few things:

A) A little bit about FFmpeg, or at least a willingness to learn. You can download precompiled versions for Windows over at Zeranoe’s website. If you are on Linux you can either compile FFmpeg yourself or head to the FFmpeg website itself for some static Linux builds.

B) How to create a u-Law file for testing a pseudo live feed.

C) How to create a multicast SDP file with FFmpeg from the u-Law file created above.

D) How to modify the multicast SDP file to work with VLC as a player. This is the first test to see if you have things right for live streaming.

E) How to connect via DirectShow on Windows or ALSA on Linux to an audio source. This is the final step in testing that your device works. If you start here then you may never really know if your device is working, if the SDP file is working, or if you even created the output correctly.

You will learn all of the above in this article.

I just finished converting an MP3 file to u-Law using the following command line:

ffmpeg -i in.mp3 -acodec pcm_mulaw -b:a 64 -ac 1 -ar 8000 -f wav -y out.wav
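
To make the target format a little more concrete, here is a minimal Python sketch of what happens to each audio sample when it is turned into u-Law, following the commonly published G.711 reference algorithm. The names linear_to_ulaw and ulaw_bitrate are my own for illustration, and FFmpeg's internal implementation may well differ in detail. It also shows where the 64 kbit/s figure comes from: 8000 samples per second, one channel, 8 bits per sample.

```python
BIAS = 0x84    # 132, added to the magnitude before the segment is located
CLIP = 32635   # largest magnitude that still fits in 15 bits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one 8-bit u-Law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # The segment (exponent) is found from the highest set bit:
    # bit 14 means segment 7, bit 7 means segment 0.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    # The four bits just below the leading bit become the mantissa.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F

    # u-Law bytes are stored with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_bitrate(sample_rate: int = 8000, channels: int = 1) -> int:
    """Bits per second of a u-Law stream (8 bits per sample)."""
    return sample_rate * channels * 8
```

Silence (a sample value of 0) encodes to the byte 0xFF, which is why a silent u-Law file looks like a long run of FF bytes in a hex editor.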

You can deliver a pseudo live non looping feed of that file:

ffmpeg -re -i out.wav -f rtp rtp://

You can also use the source file if you want:

ffmpeg -re -i in.mp3 -acodec pcm_mulaw -b:a 64 -ac 1 -ar 8000 -f rtp rtp://

FFmpeg is nice in that it dumps the SDP information for the RTP stream to the command prompt even though no SDP file is created:

o=- 0 0 IN IP4
s=Your File Metadata
c=IN IP4
t=0 0
a=tool:libavformat 57.23.100
m=audio 9008 RTP/AVP 0

Sadly, connecting to that SDP output occasionally stutters and then cuts out when listening to the stream with VLC, QuickTime for the PC, or RealPlayer. If you read through all of the RFCs you might get an idea of the complexity of the RTP/SDP specifications.

Or not. I don’t know about you, but reading those RFCs puts me right to sleep. A slightly easier to digest article on SDP structure can be found here. The article explains why your technically 100 percent compliant SDP file doesn’t work with Helix Server. To use your SDP file with that server you are required to add optional flags.

Sadly the mostly working SDP file that FFmpeg creates is missing one important item:

a=rtpmap:0 PCMU/8000/1

The “rtpmap” attribute maps the payload type number listed in the “m” or “media” line (0 here) to an encoding name (PCMU), a clock rate in Hz (8000 for PCMU), and, for an audio stream, the number of audio channels (1). Devices, players, or receivers need this to know what to listen for and how to decode it, especially when there may be two or more streams described in the SDP file.

Playing that modified SDP file fixes everything, at least for VLC and QuickTime:

o=- 0 0 IN IP4
s=Your File Metadata
c=IN IP4
t=0 0
a=tool:libavformat 55.0.100
m=audio 21414 RTP/AVP 0
a=rtpmap:0 PCMU/8000/1
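
If you would rather not edit the SDP by hand every time, the fix can be scripted. The following is a rough Python sketch, not anything FFmpeg or Cisco ships: the function name ensure_pcmu_rtpmap is hypothetical, and it assumes the only thing missing is the rtpmap line for payload type 0. The 8000 is the PCMU clock rate defined in RFC 3551, which also matches the Helix Server output later in this article.

```python
PCMU_RTPMAP = "a=rtpmap:0 PCMU/8000/1"


def ensure_pcmu_rtpmap(sdp_text: str) -> str:
    """Append the PCMU rtpmap line to any audio section that offers payload 0 without one."""
    out = []
    needs_map = False  # True while inside an audio section still lacking rtpmap:0

    for line in sdp_text.splitlines():
        if line.startswith("m="):
            # A new media section starts, so close out the previous one first.
            if needs_map:
                out.append(PCMU_RTPMAP)
            # m=<media> <port> <proto> <fmt> [<fmt> ...]
            fields = line.split()
            needs_map = fields[0] == "m=audio" and "0" in fields[3:]
        elif line.startswith("a=rtpmap:0 "):
            needs_map = False
        out.append(line)

    if needs_map:
        out.append(PCMU_RTPMAP)
    return "\n".join(out) + "\n"
```

Running it a second time on its own output changes nothing, so it should be safe to apply to a file that has already been fixed.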

Please note that if you have multiple live streams running, you need to have each SDP file and each encoder configured to use a different port number for the audio. I make sure to increase the port number by two in each SDP. For example 21414, 21416, 21418, etc…
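
The step of two is there because RTP conventionally sits on an even port while its RTCP companion takes the next odd one, so each stream really occupies a pair of ports. As a small illustration (the helper name allocate_rtp_ports is my own and not part of any tool mentioned here), the numbering could be generated like this:

```python
from typing import List


def allocate_rtp_ports(first_port: int, stream_count: int) -> List[int]:
    """Return one even RTP port per stream, leaving port + 1 free for RTCP."""
    if first_port % 2 != 0:
        raise ValueError("RTP ports are conventionally even")
    if stream_count < 1:
        raise ValueError("need at least one stream")
    last_rtcp_port = first_port + 2 * (stream_count - 1) + 1
    if first_port < 1024 or last_rtcp_port > 65535:
        raise ValueError("port range falls outside 1024-65535")
    return [first_port + 2 * i for i in range(stream_count)]
```

Calling allocate_rtp_ports(21414, 3) gives the same 21414, 21416, 21418 sequence as in the example above.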

Now that you have something that works with a file, let us try with a live source. On Windows you will need to have FFmpeg connect via DirectShow. To find the list of DirectShow devices on your computer, use the command line shown below:

ffmpeg -list_devices true -f dshow -i dummy

Now feel free to try it with your audio device.

ffmpeg -f dshow -i audio="Microphone (HD Pro Webcam C920)" -acodec pcm_mulaw -b:a 64 -ac 1 -ar 8000 -f rtp rtp://

The line above works well for me, especially as FFmpeg now supports crossbar devices.

If you are on Linux you may want to use ALSA to connect to your live feed, but again you need to find the device you want to use first. This will show you the ALSA devices your system has:

$ arecord -L

$ ffmpeg -f alsa -i default:CARD=U0x46d0x809 -acodec pcm_mulaw -b:a 64 -ac 1 -ar 8000 -f rtp rtp://
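
If you end up launching the encoder from a script or a service wrapper, it can help to build the argument list in one place for both platforms. This is only a sketch: the function name build_live_ulaw_command is hypothetical, the flags are copied from the commands above, and the destination address in the usage note is a placeholder that you must replace with your own multicast address and port.

```python
from typing import List


def build_live_ulaw_command(input_format: str, device: str, destination: str) -> List[str]:
    """Build an FFmpeg argument list for a live u-Law RTP feed.

    input_format is "dshow" (Windows) or "alsa" (Linux); device is the name
    reported by the device listing commands shown above.
    """
    if input_format == "dshow":
        source = "audio=" + device   # DirectShow audio devices take an audio= prefix
    elif input_format == "alsa":
        source = device              # ALSA device strings are passed as-is
    else:
        raise ValueError("unsupported input format: " + input_format)

    return [
        "ffmpeg", "-f", input_format, "-i", source,
        "-acodec", "pcm_mulaw", "-b:a", "64", "-ac", "1", "-ar", "8000",
        "-f", "rtp", destination,
    ]
```

Passing the resulting list to subprocess.run, for example subprocess.run(build_live_ulaw_command("alsa", "default:CARD=U0x46d0x809", "rtp://239.0.0.1:21414")), avoids shell quoting problems with device names that contain spaces and parentheses.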

On a side note, you will probably want to host your SDP file or files on a robust web server, or perhaps even behind a load balancer. From the logs that I have parsed over the years, the Cisco VOIP server retrieves the SDP file every time a person is put back into the queue. The highest number of connections I recall seeing was around 3,000 per second, so people who have to support a high volume call center or a large corporation should prepare for this behavior by putting up a web server dedicated to delivering their SDP files, or several web servers behind a load balancer.

The only way this DDoS effect could be either mitigated or resolved is if the Cisco VOIP server were modified to grab the multicast information in the SDP file, retain it for use among the clients, and then check the multicast SDP file every minute or so in case the structure of the audio feed changed or was updated along with the associated SDP file. Frankly I just don’t see that happening.
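
To be clear about what I mean by retaining the SDP and rechecking it every minute, here is a purely illustrative Python sketch of that caching idea. It is not how the Cisco server works, and the class name SdpCache, the fetch callback, and the 60 second default are all my own assumptions.

```python
import time
from typing import Callable, Optional


class SdpCache:
    """Keep one copy of an SDP file and refetch it at most once per max_age_seconds."""

    def __init__(
        self,
        fetch: Callable[[], str],
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch              # e.g. a function that does the HTTP GET
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so the cache can be tested
        self._sdp: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached SDP, refreshing it only when it is missing or stale."""
        now = self._clock()
        if self._sdp is None or now - self._fetched_at >= self._max_age:
            self._sdp = self._fetch()
            self._fetched_at = now
        return self._sdp
```

With something like this in front of the web server, 3,000 callers per second would cost one SDP request per minute instead of 3,000 per second.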

And for those few who are interested in what a Scalable Multicast u-Law SDP file that is generated from Helix Server looks like, or for some reason the SDP file format I describe above doesn’t work for you, then look no further than the output below:

o=- 275648743 275648743 IN IP4
s=War Pigs+Luke’s Wall
i=Black Sabbath
c=IN IP4
t=0 0
a=ASMRuleBook:string;"#($Bandwidth >= 0),Stream0Bandwidth = 64000;"
m=audio 21414 RTP/AVP 0
c=IN IP4
a=rtpmap:0 PCMU/8000/1
a=ASMRuleBook:string;"marker=0, AverageBandwidth=64000, Priority=9, timestampdelivery=true;"

2017-07-23 Update:
If you want to deliver your live stream through a streaming server and have your Cisco server pick up an RTSP feed from that streaming server instead, then please take a look at the following command. This method is both easier and more reliable than the direct SDP method shown above.

$ ffmpeg -f dshow -i audio="Microphone (HD Pro Webcam C920)" -acodec pcm_mulaw -b:a 64 -ac 1 -ar 8000 -f rtsp rtsp://username:password@[server_address]:[port]/live/audiostream
