Cell C following in the footsteps of Vodacom?

Most people that know me well will know that I really don’t like the way Vodacom runs their firewalls for their 3G consumers. In fact, they’ve managed to make it onto my blog no less than 3 times now – and not once for anything they’ve done right. And now Cell C have decided to join the crowd of braindead arseholes who can’t run firewalls. I present to you the man-in-the-middle TCP connection reset. As it stands right now I can’t ssh. I can’t connect to my jabber server. I can’t even browse. At least, not using my Cell C internet connection.

UPDATE: Please note that Cell C has already contacted me regarding this. See comment #1 below for more details.

Unfortunately it’s insanely hard to prove conclusively where the TCP resets are coming from. Again, the only evidence I’ve got that it has to be Cell C is the fact that it works flawlessly from everywhere else (SAIX ADSL, Mweb ADSL and Vodacom 3G). The first thing I started noticing yesterday was ssh connections going something along these lines (serenity is my local machine, linux.delter.co.za a relatively big mail server belonging to one of my clients):

jkroon@serenity ~ $ ssh root@linux.delter.co.za 
ssh_exchange_identification: read: Connection reset by peer

Now my employees know: if I can’t ssh and it’s your fault, you’re going to get it. Firstly I will hunt you down, then I will do things which cannot be considered polite. And if your name is bigger than mine, and my client believes that your bigger name implies you’re right and I’m wrong, I will prove them wrong and make very sure that they understand that I don’t take these things lightly. Well, not when it affects my work anyway. I do understand if things break periodically, but at the moment I can’t even browse, and in excess of 95% of the connections I’m pushing out over my Cell C SIM are outright being reset.

So after seeing the above for approximately 3 out of 5 connections this morning whilst sitting in a data center in Johannesburg, I just ran a tcpdump in a different shell on serenity to see what happens:

08:53:56.835503 IP 196.35.70.139.ssh > 10.213.51.133.47019:
    Flags [R.], seq 2902460094, ack 3806794395, win 199,
    options [nop,nop,TS val 26312228 ecr 4876698], length 0

Now I KNOW the way I set up my servers, and if my server is in fact generating that RST then there is something severely broken. But I’m getting this from different servers. So, with one of the craziest weekends in a while going on, I decided to push this to the back of my mind and concentrate on more urgent matters. It was only about 30 minutes ago, when I wanted to quickly check mail, browse a bit and just unwind a little, and found that I couldn’t actually browse, ssh to my servers for a quick checkup after the weekend’s events, or write an official complaint to a certain hosting company, that I decided enough is enough. I got a jump box and ssh’ed via another route to linux.delter.co.za, and (no surprises) it worked flawlessly. Then I fired up pppd, added a route for linux.delter.co.za over that link, fired up tcpdump on both ends, and got this, first on serenity (sorry for the horizontal scrolling, and also note that the time on my laptop is out by ~30 minutes due to ntp failing and the CMOS on this Lenovo being of the ultra crappy kind):

18:46:14.706636 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [S], seq 8197450, win 5840, options [mss 1460,sackOK,TS val 187563 ecr 0,nop,wscale 7], length 0
18:46:15.326350 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [S.], seq 1497501202, ack 8197451, win 5792, options [mss 1460,sackOK,TS val 2837150 ecr 187563,nop,wscale 6], length 0
18:46:15.326445 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [.], ack 1, win 46, options [nop,nop,TS val 187626 ecr 2837150], length 0
18:46:15.659155 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [P.], seq 1:21, ack 1, win 91, options [nop,nop,TS val 2837190 ecr 187626], length 20
18:46:15.659267 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [.], ack 21, win 46, options [nop,nop,TS val 187659 ecr 2837190], length 0
18:46:15.659458 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 1:22, ack 21, win 46, options [nop,nop,TS val 187659 ecr 2837190], length 21
18:46:15.969153 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [.], ack 22, win 91, options [nop,nop,TS val 2837221 ecr 187659], length 0
18:46:15.969221 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 22:814, ack 21, win 46, options [nop,nop,TS val 187690 ecr 2837221], length 792
18:46:16.349157 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [P.], seq 21:805, ack 22, win 91, options [nop,nop,TS val 2837221 ecr 187659], length 784
18:46:16.386434 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [.], ack 805, win 58, options [nop,nop,TS val 187732 ecr 2837221], length 0
18:46:16.599149 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [.], ack 814, win 116, options [nop,nop,TS val 2837284 ecr 187690], length 0
18:46:16.599227 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 814:838, ack 805, win 58, options [nop,nop,TS val 187753 ecr 2837284], length 24
18:46:16.926374 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [.], ack 838, win 116, options [nop,nop,TS val 2837315 ecr 187753], length 0
18:46:16.986396 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [P.], seq 805:957, ack 838, win 116, options [nop,nop,TS val 2837315 ecr 187753], length 152
18:46:16.986458 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [.], ack 957, win 71, options [nop,nop,TS val 187792 ecr 2837315], length 0
18:46:16.988362 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 838:982, ack 957, win 71, options [nop,nop,TS val 187792 ecr 2837315], length 144
18:46:17.477898 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [.], ack 982, win 140, options [nop,nop,TS val 2837372 ecr 187792], length 0
18:46:17.798919 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [P.], seq 957:1677, ack 982, win 140, options [nop,nop,TS val 2837372 ecr 187792], length 720
18:46:17.801802 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 982:998, ack 1677, win 83, options [nop,nop,TS val 187873 ecr 2837372], length 16
18:46:18.048892 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [R.], seq 1677, ack 982, win 0, length 0
18:46:18.088924 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [R.], seq 1677, ack 998, win 0, length 0

And on linux.delter.co.za:

20:17:52.448213 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: S 8197450:8197450(0) win 5840 
20:17:52.448247 IP linux.delter.co.za.ssh > 41.157.80.24.57790: S 1497501202:1497501202(0) ack 8197451 win 5792 
20:17:52.840000 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: . ack 1 win 46 
20:17:52.846789 IP linux.delter.co.za.ssh > 41.157.80.24.57790: P 1:21(20) ack 1 win 91 
20:17:53.079622 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: . ack 21 win 46 
20:17:53.161323 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: P 1:22(21) ack 21 win 46 
20:17:53.161345 IP linux.delter.co.za.ssh > 41.157.80.24.57790: . ack 22 win 91 
20:17:53.162056 IP linux.delter.co.za.ssh > 41.157.80.24.57790: P 21:805(784) ack 22 win 91 
20:17:53.750503 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: P 22:814(792) ack 21 win 46 
20:17:53.784763 IP linux.delter.co.za.ssh > 41.157.80.24.57790: . ack 814 win 116 
20:17:53.839959 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: . ack 805 win 58 
20:17:54.100058 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: P 814:838(24) ack 805 win 58 
20:17:54.100072 IP linux.delter.co.za.ssh > 41.157.80.24.57790: . ack 838 win 116 
20:17:54.102989 IP linux.delter.co.za.ssh > 41.157.80.24.57790: P 805:957(152) ack 838 win 116 
20:17:54.479858 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: . ack 957 win 71 
20:17:54.630280 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: P 838:982(144) ack 957 win 71 
20:17:54.664784 IP linux.delter.co.za.ssh > 41.157.80.24.57790: . ack 982 win 140 
20:17:54.673005 IP linux.delter.co.za.ssh > 41.157.80.24.57790: P 957:1677(720) ack 982 win 140 
20:17:55.236939 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: R 982:982(0) ack 1677 win 0

Upon initial inspection I have to say, I don’t see the tell-tale signs of TCP splicing as I did with Vodacom. There don’t appear to be any sequence number adjustments. There is some NAT going on, which isn’t desirable (and Vodacom moved away from using NAT once their user base started getting beyond a certain point because “it didn’t scale” according to one of their lead technicians).
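One rough way to double-check the “no sequence number adjustments” observation is to compare the initial sequence numbers seen on each side. The Python sketch below is only an illustration, not something used in the original investigation: the helper name `syn_seq` is made up, and its two regexes only cover the two tcpdump output formats shown above. It assumes that a middlebox splicing the connection would normally pick its own initial sequence numbers, so the two captures would disagree.

```python
import re

# SYN and SYN-ACK lines copied verbatim from the two captures above.
CLIENT_SYN = "18:46:14.706636 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [S], seq 8197450, win 5840, options [mss 1460,sackOK,TS val 187563 ecr 0,nop,wscale 7], length 0"
CLIENT_SYNACK = "18:46:15.326350 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [S.], seq 1497501202, ack 8197451, win 5792, options [mss 1460,sackOK,TS val 2837150 ecr 187563,nop,wscale 6], length 0"
SERVER_SYN = "20:17:52.448213 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: S 8197450:8197450(0) win 5840 "
SERVER_SYNACK = "20:17:52.448247 IP linux.delter.co.za.ssh > 41.157.80.24.57790: S 1497501202:1497501202(0) ack 8197451 win 5792 "


def syn_seq(line):
    """Return the absolute sequence number of a SYN or SYN-ACK line, or None."""
    # Newer tcpdump format: "Flags [S], seq N" or "Flags [S.], seq N".
    m = re.search(r"Flags \[S\.?\], seq (\d+)", line)
    if m is None:
        # Older tcpdump format: ": S N:N(0)".
        m = re.search(r": S (\d+):\d+\(0\)", line)
    return int(m.group(1)) if m else None


# If nothing rewrites sequence numbers in transit, both ends see the same values.
print("client ISN unchanged:", syn_seq(CLIENT_SYN) == syn_seq(SERVER_SYN))
print("server ISN unchanged:", syn_seq(CLIENT_SYNACK) == syn_seq(SERVER_SYNACK))
```

For the lines above both comparisons hold, which matches the reading that only the address and port were rewritten by NAT.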

When I say I can’t find signs of tampering I really mean it. Looking at the above you’ll see there are 19 packets on linux.delter.co.za and 21 on serenity. The first 18 of both these traces ARE IDENTICAL (other than the NAT’ed IP). After this 18th packet the server side receives an RST packet directly after it sent the data for 957:1677, along with a correct ACK for 1677. The client side receives this data and, not surprisingly, doesn’t actually respond with an RST but instead with an ACK. Directly after sending this ACK it receives two nearly identical RST packets (they differ only in the ACK number), which again have not been sent by the server.
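The argument boils down to this: each end sees an RST that claims to come from the other end, yet neither capture shows its own host sending one. Here is a minimal Python sketch of that check, using the last few lines of each capture verbatim. The names `parse` and `rst_senders` are made up for illustration, and the regexes only handle the two tcpdump output formats that appear in this post.

```python
import re

# Tail of the capture taken on serenity (client side), copied verbatim.
CLIENT_TAIL = """\
18:46:17.801802 IP 10.212.100.200.46247 > 196.35.70.139.ssh: Flags [P.], seq 982:998, ack 1677, win 83, options [nop,nop,TS val 187873 ecr 2837372], length 16
18:46:18.048892 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [R.], seq 1677, ack 982, win 0, length 0
18:46:18.088924 IP 196.35.70.139.ssh > 10.212.100.200.46247: Flags [R.], seq 1677, ack 998, win 0, length 0
"""

# Tail of the capture taken on linux.delter.co.za (server side), copied verbatim.
SERVER_TAIL = """\
20:17:54.673005 IP linux.delter.co.za.ssh > 41.157.80.24.57790: P 957:1677(720) ack 982 win 140
20:17:55.236939 IP 41.157.80.24.57790 > linux.delter.co.za.ssh: R 982:982(0) ack 1677 win 0
"""

# Newer tcpdump format: "IP SRC > DST: Flags [R.], ..."
NEW_STYLE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]*)\]")
# Older tcpdump format: "IP SRC > DST: R 982:982(0) ..."
OLD_STYLE = re.compile(r"IP (\S+) > (\S+): ([SFPRWE.]+) ")


def parse(line):
    """Return (src, dst, flags) for one tcpdump line, or None if it does not match."""
    m = NEW_STYLE.search(line) or OLD_STYLE.search(line)
    return m.groups() if m else None


def rst_senders(trace):
    """List the apparent source of every RST packet in a capture."""
    senders = []
    for line in trace.splitlines():
        parsed = parse(line)
        if parsed and "R" in parsed[2]:
            senders.append(parsed[0])
    return senders


print("RSTs seen by client, apparently from:", rst_senders(CLIENT_TAIL))
print("RSTs seen by server, apparently from:", rst_senders(SERVER_TAIL))
```

On these lines the client capture attributes its RSTs to the server's address, and the server capture attributes its RST to the client's NAT'ed address. Since a capture taken on a host should also show any RST that host itself sends, and neither capture does, the resets appear to originate somewhere in between. The sketch cannot say which hop is responsible.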

So I ask this – who is generating these RST packets? Who can I have beaten with a blunt object? I want to unwind – it’s been a bad weekend with this little cherry on top.

8 Responses to “Cell C following in the footsteps of Vodacom?”

  1. Jaco Kroon says:

    Well, I must say, Cell C, you’re extremely responsive. I just received an email from the CIO of Cell C, had a chat with the guy, and kudos to Cell C, and thanks for the chat. It’s good to know that some companies are on the lookout for improving their customer service. I am severely impressed. I’m not easily impressed.

  2. Strauss says:

    So did you ever get a workaround for this?

    I’ve recently moved and am Waiting For Telkom™; using my 4Gs stick as a temporary stopgap, my SSH connections keep dropping off, much to my annoyance. :\

    Regards,
    Strauss

    • Jaco Kroon says:

      Hi,

      No, I’m afraid not. I noticed again the other day that the problem persists. At least for Vodacom I could disable selective acks, but alas, that does not solve it for Cell C.

  3. Did you manage to sort out this issue? I have the same problem, and since I need to look after 4 headless boxes which are doing processing for my site from my home (thus allowing me to run my site on a shared server), I need my ssh connections to work right…
    Any advice on this from your side? Do you maybe have reference numbers for the log to CellC we can use to press the matter?
    Thanks for a useful post

  4. Jaco Kroon says:

    Haven’t seen it in a while myself. If it is still the exact issue as per above you’re pretty stuffed. There really isn’t anything you can do from your side to fix it. Cell C, however, beyond initial contact has been a major let-down in all aspects. They looked for user feedback and then, when it was provided, rejected it out of hand; when asked about the NAT issue they disclosed that they don’t have sufficient IP space, and generally admitted that their network is not really up to the task.

    From the sounds of it and the type of responses I’ve received on some of my queries it also doesn’t look like they themselves know what’s really going on.

    My recommendation: Unless you’re a user that doesn’t care about anything but browsing, or doesn’t mind dealing with sporadic problems like those described here: Get a different ISP.

  5. Thank you for the reply. I did find Internet browsing very satisfactory, with data consumption far less than on a Vodacom connection (though I think that relates to the handset/modem). I have not diagnosed the origin of the interruptions I have to the same extent you mention, but it seems worse when using ssh connections to my house (where my router updates dyndns) than when I connect to the webhost in USA.
    Even so, the connections to USA are not really satisfactory – they drop quite often.

  6. Kyle says:

    Hey Jaco, nice article.
    I’m still having issues with using ssh on my Cell C cellular device. It’s been 2 years, have you found a solution yet? I’m pulling my hair out, this is my back up internet device -> Telkom line is down. So this limits me from checking in on servers to see if nothing is going funny…

  7. Jaco Kroon says:

    Kyle, yea, I moved my business elsewhere. But even on my Vodacom modem I can’t seem to maintain a single SSH connection longer than ~15 minutes without it dropping, even with SSH’s keepalive enabled at 15-20s intervals. And more recently, I’ve seen the same thing with SPICE (similar to rdesktop/vnc, but mostly used by KVM and qemu): going over anything other than a mobile network it never drops; going over mobile, it lasts 5 to 10 minutes max.