VoIP Load Balancing over PPPoE links

I initially wanted to say “over DSL”, but then realized that’s not quite appropriate, since we’ve just completed the naïve approach using iBurst.  So on my way back home … I churned over some ideas that I’d like to share (and log, before I forget them).  The ideas build around the IAX/2 protocol as it’s much, much simpler, but the concepts should apply equally to SIP.  Obviously the ideal is to just get a bigger pipe …

The Naïve approach

Yesterday I still thought this would work quite well.  I was wrong.  Essentially, on the server just add three additional IPs for a total of four, so we now have $ip1, $ip2, $ip3 and $ip4, all in the same /24 subnet, and all assigned to the same physical NIC.  The routing table ends up looking like this (ignoring the rest of the NICs on the setup):

${subnet}/24 dev eth0 scope link src ${ip1}
default via ${subnet_gw}

Now anybody familiar with how UDP and the sendto() function work will immediately spot the problem: whatever we send back to the client, no matter on which IP we received the request, the response will always have ${ip1} as its source.  So whilst it’s perfectly legal to set up four IAX accounts on the client pointing to the four different IPs on the server, this just lands you in a spot of trouble (set up the four peers with qualify=yes, route to each of the four server IPs over a separate PPPoE connection, and you’ll notice that three of them end up being unreachable).

To clarify, let’s say the client machine has four PPPoE connections with IPs $dsl1, $dsl2, $dsl3 and $dsl4.  We don’t put a default route on any of them (that goes via our normal gateway anyway), so our routing ends up something like this (there are some additional prohibit routes to keep stuff from going over our data link, but in an ideal case the below is sufficient):

${ip1} scope link dev ppp0
${ip2} scope link dev ppp1
${ip3} scope link dev ppp2
${ip4} scope link dev ppp3
192.168.0.0/24 scope link dev eth0
default via 192.168.0.1

Now, when we try to POKE on $ip2 we’d want to receive an ACK response from $ip2, instead we get an ACK from $ip1, which then evokes an INVAL response back to $ip1 (from $dsl1 no less).

In a similar fashion call setup also fails, never mind actual voice streams.

Making the naïve approach work

The best way of making this work is probably to update asterisk to actually explicitly set the from address on outbound packets which are replies to requests (or related to already associated connections).  This doesn’t look too trivial and the asterisk devs are going to raise more than just a few eyebrows.  Probably not too hard either, we already need to store the peer’s IP somewhere … we could just add the local IP there as well.
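
To illustrate the effect we are after, here is a minimal Python sketch. It is not asterisk code and says nothing about how chan_iax2 is structured. Instead of setting the source address per packet, it swaps in a simpler technique: bind one UDP socket per local IP and always reply on the socket that received the request, so the kernel uses that socket’s address as the source. The function names and the 2048-byte buffer are made up for the example.

```python
import selectors
import socket


def open_per_ip_sockets(local_ips, port=4569):
    """Bind one UDP socket per local IP instead of a single wildcard socket."""
    socks = []
    for ip in local_ips:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((ip, port))
        socks.append(s)
    return socks


def serve_once(socks, handler, timeout=1.0):
    """Wait for one datagram and answer it from the socket it arrived on.

    Returns (local_address, peer_address), or None if nothing arrived.
    """
    sel = selectors.DefaultSelector()
    for s in socks:
        sel.register(s, selectors.EVENT_READ)
    try:
        for key, _events in sel.select(timeout):
            sock = key.fileobj
            data, peer = sock.recvfrom(2048)
            # Replying on the receiving socket keeps the source address
            # equal to the address the client originally sent to.
            sock.sendto(handler(data), peer)
            return sock.getsockname(), peer
    finally:
        sel.close()
    return None
```

With something like this, a POKE sent to $ip2 would be answered from $ip2, whatever the routing table says about the default source.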

You’d reckon I’d simply give up there.  But fortunately there is a relatively simple fix:  Add a specific route for $dsl[234] on the server side.  This works, but is a nasty, nasty hack imho.  Basically you need to perform some kind of DNS/registration tracking on the server side which knows how to keep track of the installed routes, when to remove them and when to add new ones.  It also needs to know which _source_ IP to use for each of these.  The simple, stupid way is a script like:

#! /bin/bash

ip1=??
ip2=??
ip3=??
ip4=??

ppp2_name=???
ppp2_dnsname=???
ppp3_name=???
ppp3_dnsname=???
ppp4_name=???
ppp4_dnsname=???

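function check()
{
    local name=$1
    local dst=$2
    local src=$3
    local olddst=""

    # Read the address we last installed a route for, if any.
    [ -r /var/lib/route-check-${name} ] && olddst=$(< /var/lib/route-check-${name})

    # Assumed logic, following the description above (a sketch, untested):
    # when the client's address has changed, drop the stale route and add
    # one that forces the matching source IP.  ${subnet_gw} is the server's
    # default gateway and is assumed to be set alongside the IPs above.
    if [ -n "${dst}" ] && [ "${dst}" != "${olddst}" ]; then
        [ -n "${olddst}" ] && ip route del "${olddst}/32"
        ip route replace "${dst}/32" via "${subnet_gw}" src "${src}"
        echo "${dst}" > /var/lib/route-check-${name}
    fi
}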

check ${ppp2_name} "$(/usr/bin/dnsip ${ppp2_dnsname} | sed -e 's/ //')" $ip2
check ${ppp3_name} "$(/usr/bin/dnsip ${ppp3_dnsname} | sed -e 's/ //')" $ip3
check ${ppp4_name} "$(/usr/bin/dnsip ${ppp4_dnsname} | sed -e 's/ //')" $ip4

There is no need to explicitly monitor ppp1 as this will just make use of the default IP anyway.  Surprisingly this does actually work.

Abusing policy based routing

This was quite annoying actually.  I set up policy-based routing on the ppp devices anyway, such that any traffic originating from one of those IPs is sent back out over the appropriate device.  This then made me think about the server case … what if we only bound asterisk to ${ip1} and then ran a small relay agent on each of the other IPs?  It would literally consist of about 50 lines of C code that opens port 4569 on the IP, forwards whatever it receives from the client to ${ip1}, and forwards whatever it receives from ${ip1} to the appropriate dynamic IP (since we have three IPs we’re listening on we can accommodate up to three additional links).  The downside here is that asterisk loses some source information regarding the IPs.  Also, I don’t think we can use this for multiple clients unless the agent actually understands at least a small amount of the IAX protocol (or we use a port other than the default 4569).  For the case of two IPs on the server we’d thus have three channels: ${ppp1} -> ${ip1}, ${ppp2} -> ${ip2}, and lastly ${ip2} -> ${ip1}.  This should work as far as I can tell.  Asterisk on the server will see the calls as coming from ${ppp1}, ${ip2}, ${ip3} and ${ip4}, but I’m fine with that.  It does reduce our redirect options significantly though.
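
A rough idea of what such a relay agent could look like, sketched in Python rather than C to keep it short. It assumes exactly one client per relay (the limitation mentioned above) and learns the client’s dynamic address from the last packet that did not come from asterisk. The names forward_target and relay, and the max_packets parameter, are invented for this sketch.

```python
import socket

IAX_PORT = 4569


def forward_target(sender, asterisk, client):
    """Asterisk's packets go to the client, everything else goes to asterisk."""
    return client if sender == asterisk else asterisk


def relay(listen_ip, asterisk_ip, port=IAX_PORT, max_packets=None):
    """Shuttle datagrams between one client and asterisk bound on asterisk_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((listen_ip, port))
    asterisk = (asterisk_ip, port)
    client = None  # learned from traffic, since the client's IP is dynamic
    handled = 0
    while max_packets is None or handled < max_packets:
        data, sender = sock.recvfrom(4096)
        if sender != asterisk:
            client = sender
        target = forward_target(sender, asterisk, client)
        if target is not None:  # nothing to do until a client has spoken
            sock.sendto(data, target)
        handled += 1
    sock.close()
```

Because the relay sends from its own socket, asterisk sees the relay’s IP as the peer, which is exactly the loss of source information described above.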

Each call is also limited to a dsl line, and if that line goes down mid-call the call is screwed.

Extreme routing

This idea stems from the first.  Basically we run a small packet capture program sniffing for packets coming into the system on port 4569 (5060 would actually work cleanly for this too, as far as I can tell).  For each packet we look at {source, dest} and check whether we have a route to source via our normal gateway with src set to dest.  If not, we update our routing table.  This will likely hammer the system quite badly with continuous routing table updates, and result in an insanely large routing table unless the program also flushes routes that it hasn’t seen in longer than some maxexpiry period.

Thus, if a packet comes in from ${ppp2} to ${ip2}, ensure that we have the route ${ppp2} via ${gateway_ip} src ${ip2}.  This will effectively fix the redirect problem too, and at the same time keep latency down as there is not an additional process for packets to go through.  One could even possibly rather make use of ipsets in iptables (along with the SNAT target)… efficiency will need to be trialed though.
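
The bookkeeping side of this can be sketched in a few lines of Python. The sniffing itself is left out. The class only decides which ip route commands would be needed and returns them as argument lists without running anything. RouteTracker, max_expiry and the injectable clock are names chosen for the example, and the 120 second default is an arbitrary assumption.

```python
import time


class RouteTracker:
    """Remember {source: dest} pairings and say when routes must change."""

    def __init__(self, gateway, max_expiry=120.0, clock=time.monotonic):
        self.gateway = gateway
        self.max_expiry = max_expiry
        self.clock = clock
        self.routes = {}  # source IP -> (dest IP, time last seen)

    def observe(self, source, dest):
        """Record one sniffed packet; return a route command or None."""
        known = self.routes.get(source)
        self.routes[source] = (dest, self.clock())
        if known is not None and known[0] == dest:
            return None  # route already in place, only the timestamp moved
        return ["ip", "route", "replace", f"{source}/32",
                "via", self.gateway, "src", dest]

    def expire(self):
        """Return delete commands for sources not seen within max_expiry."""
        now = self.clock()
        stale = [s for s, (_dest, seen) in self.routes.items()
                 if now - seen > self.max_expiry]
        for source in stale:
            del self.routes[source]
        return [["ip", "route", "del", f"{source}/32"] for source in stale]
```

Only the first packet of a pairing (or a change of pairing) produces a command, so a steady voice stream should not cause continuous routing table updates.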

Scarily enough, this actually seems quite feasible to me.  Except for the resources that will be spent keeping track of the IP pairings, I actually quite like this solution.  It’s relatively simple, doesn’t require assistance from asterisk, and can happen in real time without the need for third-party intervention (the less the system needs to rely on other systems the better).

Truly load balancing the IAX data stream

Then I came to the realization that we’re splitting up the calls, and packing them into partitions in order to get load balancing done.  Now if I tell you that the dialplan to get this done is fugly, please understand that I’m not trying to brag about getting it right.  It truly is FUGLY, and I really fail to see a way of cleaning this up.  It’s a hack making use of the GROUP functions to count the number of calls over the individual lines and balancing them out.  Nor does it solve for the inbound case!  And it still drops calls on the lines that die (and with the iBurst connections I’ve seen the last 24 hours or so … damn).
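
Stripped of the dialplan ugliness, the balancing rule itself is tiny. The Python below is only an illustration of that rule, not the dialplan: given the current call count per line (what the GROUP counting provides), pick the least loaded line. pick_line is a made-up name, and breaking ties by device name is an assumption added to make the result deterministic.

```python
def pick_line(active_calls):
    """Return the line with the fewest active calls.

    active_calls maps a line name (e.g. "ppp0") to its current call count.
    Ties are broken by name so the choice is predictable.
    """
    if not active_calls:
        raise ValueError("no lines available")
    return min(active_calls, key=lambda line: (active_calls[line], line))


# Example: ppp1 and ppp2 both carry one call, ppp1 wins the tie.
next_line = pick_line({"ppp0": 2, "ppp1": 1, "ppp2": 1, "ppp3": 3})
```

This makes the limitation obvious too: the decision is taken once per call, so a call stays on its line for its whole lifetime.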

Ok, so the idea becomes to have a really, really dumb relay agent.  Doing something like ip-in-ip, but since the end points don’t give a rats about the original IPs (IAX/2 is very friendly towards both SNAT and DNAT … thanks be to Mark Spencer) we can do some really cool forwarding.

So on both servers we bind the IAX in asterisk to 127.0.0.1, we assign a secondary IP of 127.0.0.2 to the local loopback, and we bind a “proxy” on that, which also binds the four actual IPs.  Now we just need to know what the “pairing” IP is for each of our public facing IPs, so we need to register a set somehow.  So a client will let us know “these are my IPs, I’m going to be sending from all of them to your IPs, please round-robin between them when sending back”, after which it should be able to immediately send a register and get going.  At this point I can address another issue: since we now round-robin the packets, and we know which set of peer IPs belong together, we can actually get rid of our additional IPs on the server!  I’d still keep two public-facing ones: one for the raw IAX/2 asterisk port (for efficiency’s sake, for those clients that don’t need the load balancing), and the other for balancing the IAX/2, which then dynamically creates an alias on local loopback and transmits to the other IP (last I checked the Linux kernel could get quite confused about this … but it should be possible to get something working).
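
The per-packet part of such a proxy boils down to cycling through the set of addresses the client registered. A minimal Python sketch of just that piece follows. How the set is registered on the wire is left open above, so the class simply takes a list. PairedLinks and its method names are invented for the example.

```python
from itertools import cycle


class PairedLinks:
    """The addresses one client registered as belonging together."""

    def __init__(self, peer_addrs):
        self.peer_addrs = tuple(peer_addrs)
        if not self.peer_addrs:
            raise ValueError("a client must register at least one address")
        self._rotation = cycle(self.peer_addrs)

    def owns(self, addr):
        """True if a packet from addr belongs to this client."""
        return addr in self.peer_addrs

    def next_destination(self):
        """Address to send the next outbound packet to (round-robin)."""
        return next(self._rotation)
```

Every outbound packet asks next_destination() for its target, so consecutive voice frames of the same call leave over different links.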

The downside here is that we lose the ability to redirect our calls, but seeing as we’re load balancing we may not actually want those redirected streams as they mess with our ability to “trunk” calls.  Instead I’d say just charge the client a premium for the ability to load balance.

Even more scary with this is that if a link does go down we just gained the ability to not drop the call, provided we notice the dead link quickly enough (we can now actually “jump” the voice data over to different links).  Let’s say we do have a 4-way setup, we do LCP echo requests every 5 seconds, and 3 consecutive no-responses kill the link: that implies a bi-directional packet loss of 25% for around 15 seconds.  This is NOT so hot, but in many cases better than dropping the call entirely.
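
The arithmetic behind those numbers, as a small Python helper. It assumes packets are spread evenly over the links, so the share lost equals the share of links that are dead, and that detection takes the echo interval times the number of missed echoes. The function name is made up.

```python
from fractions import Fraction


def link_failure_impact(links, echo_interval_s, missed_echoes, dead_links=1):
    """Return (fraction of packets lost, seconds until the link is dropped)."""
    loss = Fraction(dead_links, links)
    detect_after_s = echo_interval_s * missed_echoes
    return loss, detect_after_s


loss, seconds = link_failure_impact(links=4, echo_interval_s=5, missed_echoes=3)
print(f"{float(loss):.0%} loss for about {seconds} s")
# prints: 25% loss for about 15 s
```

Shortening the echo interval or the miss count shrinks the window, at the price of killing links that are merely having a bad moment.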

Just flippen update the protocol!

Ok, by now I’m slightly fed up actually (tired).  Come to think of it though: what is there to prevent this being added directly into asterisk and the IAX/2 protocol?  Basically tell asterisk to perform the round-robin on a set of local interfaces (source IPs?) for a particular peer/user, and if the peer supplied us with multiple IPs in its registration, round-robin on that side too!  Really, consider something like this quickly:

[ulsvoip]
type=friend
host=voip.uls.co.za
local_ifaces=ppp0,ppp1,ppp2,ppp3
...

Also assume that voip.uls.co.za resolves (for now) to a single IP only.  When we register we supply all active IPs for ppp0, ppp1, ppp2 and ppp3 (so if one or more of the devices is down they simply get ignored for that round, but if a device does change state we need to refresh our registration).  Now we have a set of local IPs, and we simply alternate between them as source.  Let the kernel handle the actual routing (just make sure your policy-based routing is set up properly, which is not difficult to do).
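
For the policy-based routing part, the usual recipe is one routing table per link plus a rule that selects the table by source address. The Python below only generates the iproute2 commands for a given set of links, it does not run them. The table numbers starting at 101 are an arbitrary assumption, as is the function name.

```python
def policy_routing_commands(links, first_table=101):
    """Build ip commands so traffic sourced from each IP leaves via its device.

    links maps a device name (e.g. "ppp0") to the local IP of that session.
    """
    commands = []
    for table, (dev, ip) in enumerate(sorted(links.items()), start=first_table):
        # A private routing table whose only route is "out via this device".
        commands.append(f"ip route replace default dev {dev} table {table}")
        # Packets carrying this source IP are looked up in that table.
        commands.append(f"ip rule add from {ip} table {table}")
    return commands


for command in policy_routing_commands({"ppp0": "203.0.113.1",
                                        "ppp1": "203.0.113.2"}):
    print(command)
```

With rules like these in place, whichever source IP gets picked for a packet also decides which ppp device it leaves on.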

On the server side it’s dead simple: when the registration comes in it’ll contain multiple IPs, so any packets coming from any of those IPs are treated as coming from a single source.  When sending packets back, we simply rotate the destinations so as to effectively round-robin between the different links on its side.
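
The “treated as coming from a single source” part could look something like the following in Python. This is purely illustrative: nothing like it exists in asterisk or IAX/2 as far as this post is concerned, and PeerRegistry with its methods is a made-up name. A fresh registration replaces the previous set of IPs, which covers the case of a link changing state.

```python
class PeerRegistry:
    """Map every registered IP back to the peer that registered it."""

    def __init__(self):
        self._peer_by_ip = {}
        self._ips_by_peer = {}

    def register(self, peer, ips):
        """Store the peer's current IP set, dropping any it no longer lists."""
        for old_ip in self._ips_by_peer.get(peer, ()):
            self._peer_by_ip.pop(old_ip, None)
        self._ips_by_peer[peer] = tuple(ips)
        for ip in ips:
            self._peer_by_ip[ip] = peer

    def identify(self, source_ip):
        """Return the peer a packet belongs to, or None if unknown."""
        return self._peer_by_ip.get(source_ip)

    def addresses(self, peer):
        """The IPs to rotate over when sending back to this peer."""
        return self._ips_by_peer.get(peer, ())
```

A packet from any of the registered IPs resolves to the same peer name, and addresses() gives the set to rotate through on the way back.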

And lastly, there is no reason why this can’t be done on both ends simultaneously, even with different numbers of links on both sides!  In other words, the example above, but with voip.uls.co.za resolving to 4 or even 5 different IPs (rather moot at this point since all IPs are assigned to the same interface anyway).
