User Guide: Finding and Fixing Potholes on the Information Highway
Anyone who has used the Internet for a while realizes it doesn't
always work as consistently as we would like. Not unlike many
highways, it seems there are frequent potholes, traffic tie-ups,
and endless construction.
This results in a litany of questions:
- Everything was working fine yesterday, but today I can't
get my mail. How come?
- Why is this connection so slow?
- Why can't I access this published URL?
- How do I tell if my Internet software is working properly?
This section of the User Guide describes how to use the IPNetMonitor
tools to identify, report, and get around some of these common
problems.
Internetworking 101
The Internet is actually a network of thousands of privately
run networks using different equipment with only minimal coordination
needed between them. This minimal coordination gives the Internet
the ability to expand and evolve rapidly since almost anyone can
add their network to the Internet. It can also lead to problems
when some piece of equipment you know nothing about breaks down
or coordination fails.
To manage this vast enterprise, the Internet is organized into
hierarchical sections or domains.
Message processing computers called "Routers" or "Gateways"
are used to connect individual networks together, with specialized
Gateways used at exchange points where different carriers or service
providers can exchange traffic for their respective networks.
In order to communicate with another computer on the Internet,
your computer will normally go through four steps:
- Look up the address of the host you wish to communicate with.
Since people aren't very good at remembering lots of numbers,
host computers on the Internet are usually identified by a name.
The name usually includes both the name of the individual host,
and a hierarchy or list of names that describe the domain or
part of the network where it resides. This is like calling directory
service to find someone's telephone number. On the Internet,
this type of directory service is called Domain Name Service
(DNS), and computers that handle these requests are called Name
Servers.
- Determine if the address is local to this network, or if the
message needs to be forwarded to another network. This is kind
of like determining if you need to dial 1 plus the Area Code
before dialing the rest of a phone number.
- If the address is local to this network, the message is delivered
directly.
- If the address is not local to this network, the message is
sent to a router or gateway that can forward the message on
to its destination. This process can be repeated through a dozen
or more networks before the message is actually delivered.
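The four steps above can be sketched in a few lines of Python. This is only an illustration of the decision your computer makes, not how IPNetMonitor or any particular TCP/IP stack is implemented. The function names and the sample addresses are assumptions made up for the example.

```python
import ipaddress
import socket


def look_up(host_name):
    """Step 1: ask the resolver (DNS) for the host's IPv4 address.

    Needs a working name server, so it is not called in the demo below.
    """
    return socket.gethostbyname(host_name)


def next_hop(destination, local_network, gateway):
    """Steps 2 to 4: decide where the message is sent first.

    destination   -- IP address of the host we want to reach
    local_network -- our own network, e.g. "192.168.1.0/24" (hypothetical)
    gateway       -- IP address of our router (hypothetical)
    """
    dest = ipaddress.ip_address(destination)
    network = ipaddress.ip_network(local_network)
    if dest in network:
        return destination  # local: deliver directly
    return gateway          # not local: hand off to the router


if __name__ == "__main__":
    print(next_hop("192.168.1.20", "192.168.1.0/24", "192.168.1.1"))
    print(next_hop("128.214.48.122", "192.168.1.0/24", "192.168.1.1"))
```

Each router along the way repeats the same local-or-forward decision, which is why a message can pass through a dozen networks before it arrives.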
A problem might occur at any of these stages. The IPNetMonitor
tools help you to quickly identify if there is a problem and where.
Is There a Network Problem?
If you experience trouble or excessive delays in accessing a
service on the Internet, a good first step is to test if there
is a network problem. This is what the Test Connectivity tool
is for.
The Test Connectivity tool can be used to verify the following
information:
(1) Are you able to look up the address of the service
you are trying to access? If you type the name of the host in
the name field and press return, the tool will contact your Domain
Name Server to look up the corresponding address. If there is no
response within a reasonable amount of time, or no host name is
found, the next step is to check your connection to the Domain
Name Server. Verify in the TCP/IP control panel that you have
specified the correct name servers for the service provider you
are using, and try pinging their IP addresses directly.
If there is still no response and you are sure you are using
the correct name server (because it has always worked before),
the name server could be down, or your service provider may have
a local routing problem. Press Cmd-R to do a Trace Route to your
name server (refer to the next section for more information on
Trace Route). If you see a response from one or more routers,
your connection to your ISP is working but there appears to be
a local routing problem. You might try hanging up and connecting
a few more times to see if the problem corrects itself or if you
can identify a pattern (the Trace Route stops responding after
a specific router). Copy the Trace Route results and send them to
your service provider, who will probably be very interested to
see them.
(2) Are you able to reach the remote host you are trying
to access? If one or more echo response packets are received,
this means the remote host is reachable. Perhaps the specific
server software on the remote host is down. If the problem is
not fixed in a reasonable amount of time, you can try sending
mail to the organization that maintains the host or server. If
you don't know who to send mail to, you may be able to look it
up using the "Who Is" tool described later in this document.
If no responses are received, this usually means the remote host
is currently unreachable. Press Cmd-R to begin a Trace Route to
help you further isolate the problem.
(3) Is there excessive delay or packet loss en route to
the service you are trying to access? The example above shows
the round trip delay is around 1.5 seconds, which is slower than
usual, but not extraordinary. Packet loss was 10%. Some packet
loss is normal due to congestion, but if losses exceed 30% for
several minutes or longer, this could indicate a more serious
problem (and performance will degrade badly).
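If you want to work out the numbers from item (3) yourself, the arithmetic is simple. The sketch below assumes you have a list of round trip times in seconds, with None standing in for a packet that never came back. The function names and the sample readings are made up for illustration. The 30% threshold is the rule of thumb from this guide.

```python
def summarize_pings(round_trips):
    """Return (loss_percent, average_seconds) for a list of ping results.

    round_trips -- round trip times in seconds, None for a lost packet.
    average_seconds is None when no replies were received at all.
    """
    sent = len(round_trips)
    received = [t for t in round_trips if t is not None]
    loss_percent = 100.0 * (sent - len(received)) / sent if sent else 0.0
    average = sum(received) / len(received) if received else None
    return loss_percent, average


def looks_serious(loss_percent, threshold=30.0):
    """Rule of thumb: loss above the threshold deserves a closer look."""
    return loss_percent > threshold


if __name__ == "__main__":
    # Hypothetical readings: ten packets sent, one lost.
    samples = [1.4, 1.6, None, 1.5, 1.5, 1.4, 1.6, 1.5, 1.5, 1.5]
    loss, average = summarize_pings(samples)
    print(f"loss {loss:.0f}%, average round trip {average:.2f} s")
    print("serious" if looks_serious(loss) else "within normal range")
```

Keep in mind this only judges a single batch of pings. Losses matter most when they stay high for several minutes, so repeat the test before drawing conclusions.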
Where is the Problem?
Once you identify what appears to be a connectivity problem,
the next step is to locate the problem or isolate it in more detail.
Pressing Cmd-R from any IPNetMonitor window will display the Trace
Route tool and initiate a Trace Route to the corresponding address
if any.
In the Test Connectivity example above, we noticed a somewhat
longer round trip delay than expected. The Trace Route tool shows
the sequence of routers our message passes through and the approximate
round trip time to each one. [The time shown is the average round
trip time in seconds from three attempts to solicit a "time
limit exceeded" response from each router. Some routers
do not respond consistently to this condition, and the actual
time may vary widely from one packet to the next. Packets may
even take a different route from one trial to the next which is
indicated by more than one row with the same Hop number. To get
a more accurate reading, you can double-click on any row to start
a ping test to that address].
In the Trace Route example shown above, it's clear that Hop 11
is the slow link.
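One rough way to spot a slow link is to look for the biggest jump in round trip time from one hop to the next. The sketch below does just that. The function name and the sample times are hypothetical, and since individual readings can vary widely from packet to packet, treat the answer as a hint about where to ping next rather than a verdict.

```python
def slowest_link(hops):
    """Find the hop with the largest increase in round trip time.

    hops -- list of (hop_number, round_trip_seconds) in path order.
    Returns (hop_number, increase_in_seconds), or (None, 0.0) if no hop
    is slower than the one before it.
    """
    worst_hop, worst_increase = None, 0.0
    previous = 0.0
    for hop_number, seconds in hops:
        increase = seconds - previous
        if increase > worst_increase:
            worst_hop, worst_increase = hop_number, increase
        previous = seconds
    return worst_hop, worst_increase


if __name__ == "__main__":
    # Hypothetical trace: the time jumps sharply at hop 11.
    trace = [(9, 0.20), (10, 0.22), (11, 1.45), (12, 1.50)]
    hop, increase = slowest_link(trace)
    print(f"largest jump is at hop {hop} (+{increase:.2f} s)")
```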
If the trace continues for several rows with no responses received
(indicated by all red X's in the Received column), this indicates
the message stopped being properly forwarded before reaching its
destination. The last router shown is most likely the one that
was unable to forward the packet correctly. If this problem persists,
you should report it to the appropriate network administrator.
To find out who this is, click on the row containing the last
IP Address and Name found, and press Cmd-I to invoke the "Who
Is" tool described below. [If you are able to ping the destination
but the trace doesn't stop, this could mean the destination is
not responding to the condition that indicates the trace is complete
(port unreachable). Not all IP implementations respond consistently.]
Another problem you might encounter is a "routing loop"
where a router at a later hop forwards the message to a router
from an earlier hop. In this case, each router thinks the other
is closer to the destination so they forward the message back
and forth until it times out. Again, if this problem persists,
you should report it to the appropriate network administrator.
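A routing loop shows up in a trace as the same router address appearing again at a later hop. The sketch below checks a list of hop addresses for the first repeat. The function name and the addresses are made up for illustration, and a repeated address is a strong hint rather than proof, so confirm with a second trace before reporting it.

```python
def find_routing_loop(addresses):
    """Return (first_hop, repeat_hop) for the first repeated address.

    addresses -- router IP addresses in hop order (hop 1 first).
    Returns None if no address appears twice.
    """
    first_seen = {}
    for hop, address in enumerate(addresses, start=1):
        if address in first_seen:
            return first_seen[address], hop
        first_seen[address] = hop
    return None


if __name__ == "__main__":
    # Hypothetical trace: hop 4 sends the message back to the router at hop 2.
    trace = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.2", "10.0.0.3"]
    print(find_routing_loop(trace))
```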
To Whom Should I Report a Problem?
Once you have isolated the problem to a specific network, and
determined it is serious enough to justify reporting to a network
administrator, you need to find out who this is.
"Who Is" searches the InterNIC (Internet Network Information
Center) Database of registered names allowing you to find the
organization and administrative contact responsible for networks
with registered domain names.
For example, if we had discovered a routing problem at "bbnplanet.net",
we could use "Who Is" to find out who is responsible
for "bbnplanet.net". The information returned will normally
include the email address of an administrative contact to be notified
in case of connectivity problems.
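For the curious, a whois lookup is one of the simplest exchanges on the Internet: open a TCP connection to port 43 on a whois server, send the name followed by a carriage return and line feed, and read until the server hangs up. The sketch below shows the idea. It is not how the "Who Is" tool itself is written, the function names are made up, and the default server name "whois.internic.net" is an assumption you may need to change.

```python
import socket


def build_query(domain):
    """A whois query is just the name followed by CR LF."""
    return domain.strip().encode("ascii") + b"\r\n"


def who_is(domain, server="whois.internic.net", port=43, timeout=10.0):
    """Send a whois query and return the server's reply as text.

    The default server name is an assumption; needs a live connection.
    """
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(who_is("bbnplanet.net"))
```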
When you decide to report a problem, it is important to include
enough information for the network administrator to figure out
what went wrong. Usually, the Trace Route window will provide
exactly the information a network administrator needs. Click on
a row in the Trace Route window, choose Select All followed by
Copy, and then paste the results into your email message to the
appropriate network administrator (see example below).
To: LIMAN@SUNET.SE (Technical Contact: nordu.net)
Dear Network Administrator,
I believe I have found a routing problem in your network.
The attached Trace Route shows the problem. There is a routing
loop starting at hop 14.
<traceroute: 128.214.48.122>
Hop  Sent  Received  Seconds  IP Address       Name
1    YYY   YYY       0.15     146.115.101.226  dial-12.mbo.ma.ultra.net
2    YYY   YYY       0.15     146.115.12.67    un-gw-3.mbo.ma.ultra.net
3    YYY   YYY       0.16     199.232.56.71    infra-w-4.mbo.ma.ultra.net
4    YYY   YYY       0.15     206.185.153.37   agis-ultra.boston.agis.net
5    YYY   YYY       0.19     206.185.153.218  a0.1008.washington2.agis.net
6    YYY   YYY       0.25     192.41.177.145   mae-east.agis.net
7    YYY   YYY       0.26     204.130.243.35   h0-0.trenton1.agis.net
8    YYY   YYY       0.24     205.137.61.1     h0-0.pennsauken1.agis.net
9    YYY   YYY       0.20     192.157.69.9     sl-pen-2-f4/0.sprintlink.net
10   YYY   YYY       0.27     144.228.60.101   icm-pen-2-f1/0.icp.net
11   YYY   YYY       0.31     198.67.131.26    icm-uk-1-h0/0-t3.icp.net
12   YYY   YYY       0.44     198.67.131.42    icm-stockholm-1-h0/0-e3.icp.net
13   YYY   YYY       0.30     192.36.148.205   syd-gw.nordu.net
14   YYY   YNY       0.34     192.36.148.54    fi-gw.nordu.net
15   YYY   YYY       0.30     192.36.148.205   syd-gw.nordu.net
16   YYY   YYY       0.34     192.36.148.54    fi-gw.nordu.net
17   YYY   YYY       0.30     192.36.148.205   syd-gw.nordu.net
18   YYY   NYY       0.34     192.36.148.54    fi-gw.nordu.net
Thank you for your attention to this matter.
[Always include your name and email address to be sure the
network administrator can contact you for more information if
needed.]
The better you can describe the problem and report it to the
right person, the faster it is likely to get corrected.
Working Around Problems
It's all very nice to report problems to the right person, but
I have work to do. How can I get around these problems? This is
where the creative art of using the Internet comes in. Depending
on what the problem is, there are a number of steps you may be able
to use to get around it. This section will describe some of them.
You'll probably discover others.
The first thing to realize is that the Internet is not a carefully
regulated utility like the telephone or electric company, and
will not approach their reliability for at least several years
to come. Besides being a relatively new technology that's evolving
and growing rapidly, the original Internet was designed around
the idea of sharing resources to provide low cost universal access
to as many computers as possible. The design trades off guaranteed
reliability in favor of cost and simplicity (this is simple? :-).
You can use as much bandwidth as you can get, but nothing is reserved
or guaranteed. [This difference in design philosophy has led some
people to mistakenly view the Internet as "free". This
is incorrect. It still costs money to provide bandwidth. If we
start using the Internet like a telephone and want it to work
like one, it's eventually going to cost like one too.]
If Internet access is critical to your work, look for a service
provider with connections to more than one backbone carrier, and
consider signing up with more than one provider yourself. If the
site you dial into is frequently busy, consider getting an extended
calling plan so you can dial in to more than one POP (Point Of
Presence) site.
If you are unable to send mail or access some other service through
the normal server you use, try using another one. My ISP provides
multiple SMTP (outgoing mail) servers for different realms. Normally
my mail program is set to use "smtp.ma.ultranet.com"
(Massachusetts realm). If this server goes down, I can sometimes
use "smtp.nh.ultranet.com" (New Hampshire realm) to
send my mail.
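The "try another server" idea is easy to automate. The sketch below walks a list of servers in order of preference and returns the first one that accepts a connection. The function names are made up, port 25 is the usual SMTP port, and accepting a connection only tells you the server is up, not that it will agree to relay your mail.

```python
import socket


def can_connect(host, port=25, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and failed lookups
        return False


def first_reachable(servers, probe=can_connect):
    """Return the first server the probe accepts, or None if all fail."""
    for server in servers:
        if probe(server):
            return server
    return None


if __name__ == "__main__":
    # Server names taken from the example above; list them in order of preference.
    servers = ["smtp.ma.ultranet.com", "smtp.nh.ultranet.com"]
    print(first_reachable(servers))
```

Passing the probe in as a parameter keeps the fallback logic separate from the network check, so the same loop works for ftp or news servers with a different probe.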
Think about what you are trying to accomplish. There are often
alternate servers (mail, ftp, news, etc.) that can help you get
the information you need.
Some problems might be with your own computer. As you begin to
learn more about how the Internet and your own equipment normally
work, you will be able to find and correct more of these problems
as well. The Monitor tool is especially helpful for this because
it lets you get a visual sense for how things should look when
everything is working properly. The probe for the Monitor tool
is actually inserted between IP and the Data Link Provider (PPP)
layer. If you suddenly notice there is no receive data, or long
delays, you can begin isolating the problem immediately.
A while ago I wanted to give an Internet demo from my Powerbook.
I installed the latest versions of Open Transport and PPP only
to discover the performance was terrible. Name Server lookups
were taking forever. I did a Trace Route to the Name Server and
realized I had changed service providers a few months ago and
forgotten to update the Name Servers on my Powerbook (in the TCP/IP
control panel). My faithful Powerbook was trying to look up names
from a Name Server on another network many hops away.
How do you tell if two IP addresses are on the same network?
This is a good use for the Subnet Calculator (it also serves as
a convenient scratch pad). Press Cmd-B from any IPNetMonitor window
to invoke the Subnet Calculator with the corresponding IP address
if any. The Subnet Calculator will determine the default mask
for this class of IP address and separate the address into its
network and host parts. Two IP addresses are on the same network
if they have the same network number. [Determining the correct
Mask is more complicated if subnetting or Classless Inter-Domain
Routing (CIDR) is used, but that is beyond the scope of this guide.]
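Here is a small sketch of the calculation described above, using only the default (classful) masks. It is an illustration of the idea, not the Subnet Calculator's actual code, and the function names are made up. The sample addresses come from the Trace Route example earlier in this guide.

```python
import ipaddress


def default_prefix(address):
    """Return the default mask length for the address's class."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return 8    # Class A, mask 255.0.0.0
    if first_octet < 192:
        return 16   # Class B, mask 255.255.0.0
    if first_octet < 224:
        return 24   # Class C, mask 255.255.255.0
    raise ValueError("Class D and E addresses have no default network mask")


def network_number(address):
    """Return the network part of the address under its default mask."""
    interface = ipaddress.ip_interface(f"{address}/{default_prefix(address)}")
    return str(interface.network.network_address)


def same_network(a, b):
    """Two addresses are on the same network if their network numbers match."""
    return network_number(a) == network_number(b)


if __name__ == "__main__":
    print(network_number("192.36.148.205"))                  # Class C
    print(same_network("192.36.148.205", "192.36.148.54"))
    print(same_network("146.115.101.226", "199.232.56.71"))
```

As noted above, a subnetted or CIDR network needs the real mask from your network administrator instead of the default one.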
It's a Wrap
I hope you've found these tools and hints useful. Whether you're
a real expert, or someone just starting out, I'd welcome your
comments and suggestions.
Peter Sichel
Sustainable Softworks