And a happy good morning to you too!

I got up this morning around 8:30, toddled downstairs, and plunked myself on the couch for my ritual morning look at the electronic world. My inbox had a bunch of messages from the automated monitoring system (a personal version of this) that keeps watch on my various servers. From the monitor's point of view, Alpha and Beta (my two production web servers) were going up and down.

We've had a lot of rain in the past few days, so my instant suspect was a problem with the DSL circuit that Alpha and Beta share for their Internet access. Pings from home to the servers were dropping 50% of the time. But as I started to dig deeper, that hypothesis started losing ground. They weren't responding to SSH, and there was an odd period of high traffic on the MRTG traces for both servers. I began to suspect that the two servers had been compromised. (In other words, hacked! cracked! pwned!)

Thankfully that turned out to be wrong as well. I finally managed to SSH into Alpha and Beta from another host. A quick check of logins, active processes, and open network ports showed that both servers were exactly as they should be.

I turned my attention back to the possibility of a network problem, but things weren't adding up. Normally, when you are having connection problems, the problem is in what telco types call the last mile or the local loop, meaning the circuit at the very edge of the carrier's network that runs to you. Yet both my cable service and the DSL service at Alpha's and Beta's undisclosed secure location appeared to be fine. The problem only manifested itself when one site tried to talk to the other. A traceroute from here to there brought the real problem to light:

Microsoft Windows 2000 [Version 5.00.2195]
(C) Copyright 1985-2000 Microsoft Corp.

C:\Documents and Settings\netcurmudgeon>tracert alpha.houseofhum.com

Tracing route to gpip.org []
over a maximum of 30 hops:

  1     7 ms     6 ms     8 ms
  2     7 ms     7 ms     7 ms  glstsysc01-gex0102000.ct.ri.cox.net []
  3    10 ms    11 ms    11 ms  provsysj01-atm020311.rd.ri.cox.net []
  4    10 ms    10 ms    12 ms  provdsrj01-ge600.rd.ri.cox.net []
  5    12 ms    10 ms    12 ms  provbbrj01-ge020.rd.ri.cox.net []
  6    17 ms    15 ms    15 ms  NYRKBBRJ01-so000.R2.ny.cox.net []
  7    17 ms    16 ms    16 ms
  8    22 ms    19 ms    20 ms
  9    28 ms    35 ms    24 ms  mrfdbbrj02-ge030.rd.dc.cox.net []
 10    24 ms    23 ms    23 ms  ashbbbrj01-pos020100.r2.as.cox.net []
 11    21 ms     *        *
 12    23 ms     *        *     sp0-4-ASBNVAAS.broadwing.com []
 13     *        *       25 ms  serial2-0-0.e1.nwrk.broadwing.net []
 14     *        *        *     Request timed out.
 15    28 ms    27 ms     *     p6-0.c0.nwyk.broadwing.net []
 16     *        *        *     Request timed out.
 17    25 ms    27 ms     *
 18     *       26 ms     *     hartford.atm.ntplx.net []
 19     *        *        *     Request timed out.
 20    43 ms     *        *     ip-65-75-17-31.ct.dsl.ntplx.com []
 21     *       40 ms     *     ip-65-75-17-31.ct.dsl.ntplx.com []
 22     *        *        *     Request timed out.
 23    39 ms    40 ms     *     ip-65-75-17-31.ct.dsl.ntplx.com []
 24     *        *       38 ms  ip-65-75-17-31.ct.dsl.ntplx.com []

Trace complete.

C:\Documents and Settings\netcurmudgeon>

There, right in the middle of the trip, we start losing packets (the red asterisks). Further poking showed that whatever router has the IP address was (still is) dropping half of the traffic that gets to it. The address doesn't resolve to a name, so I can't tell if it's Cox's or Broadwing's problem, but right at the border of their networks something is amiss. Hopefully some groggy geek or geekette has been paged in and is looking at it.
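For anyone who would rather not eyeball the asterisks, here is a rough Python sketch of how the first lossy hop could be picked out of Windows tracert output. It is only an illustration: the function names `parse_hop` and `first_lossy_hop` are made up for this example, and it assumes the three-probe column layout shown in the trace above. Keep in mind that a lone asterisk at one hop can simply mean a router is slow to answer probes; the telltale sign here is that the loss carries on through every hop after it.

```python
# Hypothetical helpers for illustration; assumes Windows tracert's
# default layout: hop number, three probe results, optional hostname.

SAMPLE = """\
  9    28 ms    35 ms    24 ms  mrfdbbrj02-ge030.rd.dc.cox.net []
 10    24 ms    23 ms    23 ms  ashbbbrj01-pos020100.r2.as.cox.net []
 11    21 ms     *        *
 12    23 ms     *        *     sp0-4-ASBNVAAS.broadwing.com []
 13     *        *       25 ms  serial2-0-0.e1.nwrk.broadwing.net []
"""


def parse_hop(line):
    """Return (hop_number, probes_lost) for a hop line, else None."""
    tokens = line.split()
    if not tokens or not tokens[0].isdigit():
        return None
    hop = int(tokens[0])
    lost = 0
    probes = 0
    i = 1
    # Each probe is either "*" (no reply) or a time followed by "ms".
    while i < len(tokens) and probes < 3:
        if tokens[i] == "*":
            lost += 1
            probes += 1
            i += 1
        elif i + 1 < len(tokens) and tokens[i + 1] == "ms":
            probes += 1
            i += 2
        else:
            break
    if probes != 3:
        return None
    return hop, lost


def first_lossy_hop(text):
    """Return (hop_number, probes_lost) for the first hop with any loss."""
    for line in text.splitlines():
        parsed = parse_hop(line)
        if parsed is not None and parsed[1] > 0:
            return parsed
    return None


if __name__ == "__main__":
    print(first_lossy_hop(SAMPLE))  # prints (11, 2)
```

Run against the excerpt above, it points at hop 11, the first hop past the last named Cox router and just before the first Broadwing one.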

Hey, from my perspective at least I'm not owned!


( 2 comments — Leave a comment )
Jun. 4th, 2006 07:40 pm (UTC)
Well, 68/8 is a Cox netblock (note the previous Cox hosts in the same range), so I'd yell at Cox first.
Jun. 4th, 2006 08:16 pm (UTC)
...Ultimately I did. The amusing thing is that Cox residential support had no clue that there was a problem, but Cox business support had a recording on their support line announcing the problem. They appear to have fixed it now.