Traceroute losses through NYC1.gblx.net?
- Subject: Traceroute losses through NYC1.gblx.net?
- From: skbohrer at simons-rock.edu (Steve Bohrer)
- Date: Fri, 16 Sep 2011 14:42:52 -0400
My general question is: what meaning should I give to lossy
traceroutes when pings show no problem?
Can I expect that backbone routers should never give me timeouts on a
traceroute through them, so that lots of asterisks from these systems
indicate a packet loss problem that needs to be fixed?
Or are these traceroute asterisks essentially meaningless, and to be
expected on any busy link?
More specifically, is anyone else getting lots of *s for NYC1.gblx.net
for traceroutes through them? If I do three traceroutes through there,
at least one will show losses at or beyond the NYC1 hops (and, the *s
beyond NYC1 might be getting lost in NYC1, rather than indicating a
different error). But, Global Crossing's on-line tools don't show any
loss.
I am at simons-rock.edu, in Western Mass, and we connect via Boston. A
few days ago, users of a database that's hosted at our parent campus,
bard.edu, started complaining of frequent (but intermittent) delays.
Bard is in the Hudson Valley, and connects via Poughkeepsie. Both of
our local providers connect to Global Crossing.
Once before, we saw similar database symptoms, and that time, Bard had
a problem dropping packets at their gateway. So I think these symptoms
mean packet loss is happening somewhere. However, this time, pings
from Simon's Rock to Bard, and vice versa, show essentially no errors:
typically, a run of 1000 pings gets through with no loss at all.
Still, despite the good pings, traceroutes from either end show lots
of asterisks at or after Global Crossing's NYC1.gblx.net links. I have
opened a ticket with our provider, who has opened one with Global
Crossing, and Bard has done the same on their end, but there has been
no significant response so far. (Bard's Graduate campus, located in New
York City, is having similar poor database performance, so I'm pretty
sure it is not just my end. Staff at the main Bard campus have no
troubles, so it seems to be a network problem, not a server problem.)
As I understand it, an asterisk in traceroute means that the sending
machine did not get any reply to a given packet. Since the traceroute
packets have small TTL values, it expects to get a reply when the TTL
is decremented to zero. But, I don't know if big routers are just lazy
about sending such responses, or if these asterisks really indicate
packets getting lost. (As far as I remember in the past, when things
work well, I never see *s at the central links, but, I have not really
done any baseline testing of the link from here to Bard when the
database was working.)
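To make my understanding of the TTL mechanism concrete, here is a toy
model in Python. It sends no real packets. The hop names and the
"silent" set are made up for illustration, and the model only covers
the case where a router forwards traffic but does not send the reply,
so it cannot tell me which explanation applies on the real path.

```python
def probe(path, ttl, silent=frozenset()):
    """Return the hop that answers a probe sent with this TTL, or '*'.

    path   -- ordered hop names ending with the destination (hypothetical)
    ttl    -- initial TTL of the probe, assumed to be >= 1
    silent -- hops that forward traffic but never send a reply
    """
    # Every hop decrements the TTL by one. The hop where it reaches zero
    # is the one expected to reply; if the TTL outlasts the path, the
    # destination itself answers.
    hops_travelled = min(ttl, len(path))
    responder = path[hops_travelled - 1]
    return "*" if responder in silent else responder


def trace(path, silent=frozenset()):
    """Probe with TTL 1, 2, ... up to the path length, like traceroute."""
    return [probe(path, ttl, silent) for ttl in range(1, len(path) + 1)]


if __name__ == "__main__":
    hops = ["gateway", "BOS1", "NYC1", "destination"]  # illustrative names
    print(trace(hops, silent={"NYC1"}))
```

In this model a silent hop shows up as a single asterisk while the
hops beyond it still answer. That is only one of the two cases I am
asking about; the other is probes really being dropped in transit.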
So, another question is why pings work so well when traceroutes work
so poorly. (By experiment, I believe our database application performs
more like traceroute than like ping.) Is it packet size? Different
handling for different sorts of traffic? Magic?
Here are some sample traceroutes each way:
Simon's Rock to Bard:
2h189:bin skbohrer$ traceroute -q5 -S bip.bard.edu
traceroute to bip.bard.edu (192.246.228.16), 64 hops max, 40 byte packets
 1  10.30.2.1 (10.30.2.1) 1.514 ms 1.791 ms 0.684 ms 0.761 ms 0.712 ms (0% loss)
 2  michael.simons-rock.edu (208.81.88.1) 2.509 ms 1.882 ms 0.899 ms 1.345 ms 2.057 ms (0% loss)
 3  64.213.79.249 (64.213.79.249) 104.294 ms 10.605 ms 17.106 ms 18.987 ms 38.740 ms (0% loss)
 4  pos2-0-155M.cr2.BOS1.gblx.net (67.17.70.166) 21.962 ms 20.411 ms 8.394 ms 23.308 ms 10.192 ms (0% loss)
 5  so1-2-0-2488M.scr2.NYC1.gblx.net (67.17.94.158) 15.738 ms 14.582 ms 17.306 ms 24.444 ms 15.466 ms (0% loss)
 6  ae3-30g.scr3.NYC1.gblx.net (67.17.104.189) 15.586 ms 13.358 ms
    ae0-30G.scr4.NYC1.gblx.net (67.16.139.2) 13.875 ms 13.495 ms 12.780 ms (0% loss)
 7  e5-1-30G.ar9.NYC1.gblx.net (67.16.142.54) 75.184 ms
    lag1.ar9.NYC1.gblx.net (67.16.142.50) 15.766 ms 11.947 ms *
    e5-1-30G.ar9.NYC1.gblx.net (67.16.142.54) 25.916 ms (20% loss)
 8  * * wbs-connect.gigabitethernet1-0-2.asr1.jfk1.gblx.net (64.211.195.6) 55.909 ms 73.803 ms * (60% loss)
 9  * pghknyshj42-xe-0-3-0.lightower.net (72.22.160.150) 16.521 ms 21.817 ms 23.715 ms 17.236 ms (20% loss)
10  pghknyshj91-ae0-66.lightower.net (72.22.160.165) 76.257 ms 27.712 ms 20.372 ms 18.923 ms 55.355 ms (0% loss)
11  kgtnnykgj91-ae3.66.lightower.net (72.22.160.107) 18.088 ms 51.631 ms 19.052 ms 20.876 ms 22.942 ms (0% loss)
12  BardCollege-cust.customer.hvdata.net (64.72.66.234) 51.243 ms 47.800 ms 32.835 ms 19.040 ms 55.661 ms (0% loss)
13  *^C
Bard to SR (their version of traceroute doesn't have the handy -S
option):
SRDB/users/usrsr/finrep: traceroute mail.simons-rock.edu
trying to get source for mail.simons-rock.edu
source should be 10.20.11.23
traceroute to hedwig.simons-rock.edu (208.81.88.14) from 10.20.11.23 (10.20.11.23), 30 hops max
outgoing MTU = 1500
 1  hcrcgw (10.20.11.1) 1 ms 0 ms 0 ms
 2  hyphen (192.246.235.1) 1 ms 1 ms 1 ms
 3  BardCollege-hvdn.customer.hvdata.net (64.72.66.233) 1 ms 1 ms 1 ms
 4  pghknyshj91-xe-5-2-0.lightower.net (72.22.160.106) 2 ms 2 ms 2 ms
 5  pghknyshj42-ae0-66.lightower.net (72.22.160.159) 27 ms 2 ms 2 ms
 6  nycmnyzrj42-xe-0-3-0.lightower.net (72.22.160.151) 4 ms 4 ms 4 ms
 7  ve463.ar9.NYC1.gblx.net (64.211.195.5) 4 ms 4 ms 4 ms
 8  * ae0-40G.scr1.NYC1.gblx.net (67.16.138.253) 4 ms 4 ms
 9  pos5-0-2488M.cr1.BOS1.gblx.net (67.17.94.57) 9 ms
    pos9-0-2488M.cr2.BOS1.gblx.net (67.17.94.157) 9 ms 11 ms
10  pos1-0-0-155M.ar1.BOS1.gblx.net (67.17.70.165) 14 ms 10 ms 9 ms
11  64.213.79.250 (64.213.79.250) 15 ms 15 ms 18 ms
^C
For more automated testing, I used -m10 to set the max hops so that
the traces stop within the backbone network, as this avoids any issue
of the boxes at the ends not really responding to traceroutes. That
way, I could assume any * was a real timeout. I also used -q4 for 4
queries to each hop. With a few hundred traceroutes in each direction,
more than 75% of those from SR to Bard, and more than 94% of those
from Bard to SR, showed an asterisk at or past the NYC1 hops. There
were zero asterisks on the links before NYC1 from either side.
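For anyone who wants to repeat the tally, the counting can be
sketched roughly as below. This is an illustration in Python, not the
exact script I ran. It assumes each saved traceroute output is
unwrapped (one hop per line, with extra responders on indented
continuation lines), and the function names and the sample text with
documentation addresses are made up for the example.

```python
import re

# A hop line starts with the hop number followed by whitespace.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def stars_by_hop(output):
    """Map hop number -> number of '*' probes reported for that hop."""
    counts = {}
    current = None
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            current = int(match.group(1))
            rest = match.group(2)
        elif current is not None:
            rest = line  # continuation line for the current hop
        else:
            continue  # header text before the first hop
        counts[current] = counts.get(current, 0) + rest.split().count("*")
    return counts


def has_star_at_or_past(output, first_hop):
    """True if any hop numbered first_hop or higher shows an asterisk."""
    return any(hop >= first_hop and stars > 0
               for hop, stars in stars_by_hop(output).items())


def fraction_with_stars(outputs, first_hop):
    """Fraction of traces with an asterisk at or past first_hop."""
    if not outputs:
        return 0.0
    hits = sum(has_star_at_or_past(text, first_hop) for text in outputs)
    return hits / len(outputs)


if __name__ == "__main__":
    sample = (
        "traceroute to example (192.0.2.1), 10 hops max\n"
        " 1  gw (10.0.0.1) 1.0 ms 1.1 ms 0.9 ms 1.0 ms\n"
        " 2  core (198.51.100.1) 5.0 ms * 5.2 ms 5.1 ms\n"
    )
    print(stars_by_hop(sample))
```

The first_hop argument is whichever hop number the NYC1 routers first
appear at in a given direction (hop 5 from SR and hop 7 from Bard in
the samples above).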
Thanks for any insights.
Steve Bohrer
Network Administrator
ITS, Bard College at Simon's Rock
413-528-7645