Communications Technology May 2000 Issue
Feature

High-Speed Troubleshooting for High-Speed Data
Fixing PC Modem Problems, Part 2
By Bruce Bahlmann

Ping, dig, trace, arp, scan, get, set … sound like gibberish? Believe it or not, these are the tools of the next-generation broadband employees who are tasked with resolving customer problems as well as building and maintaining the delivery medium.

In 1996, MediaOne (then Continental Cablevision) began a high-speed data alpha field trial. Back then, we could count our high-speed data customers on a couple of pairs of hands, we operated in makeshift facilities (known as the Bat Cave), and our customers were so pleased to be selected for this early trial phase that they were extremely understanding when technical problems arose.

Back then, we used static Internet protocol (IP) addressing, and each customer’s personal computer (PC) and cable modem IP address was written down for easy reference. We used this information along with one of the simplest network troubleshooting tools, called ping, to check if the customer was connected to our growing broadband network. Beyond that, we either corrected the problem over the phone or resorted to rolling a truck to resolve the problem on-site.

Because we hadn’t built up sufficient experience with cable modems, networking PCs or hybrid fiber/coax (HFC) return path issues, we rolled a lot of trucks. Through the alpha and follow-on beta field trials, we acquired a wealth of information about what it takes to run a successful high-speed data service. This information directly led to several infrastructure and process improvements across all high-speed data-affiliated groups. However, these improvement efforts failed to address one important question: How are we going to diagnose customer problems?

Tools of the trade

Today, we no longer count customers on pairs of hands, but rather by the hundreds of thousands, and those same customers who were so understanding in the past light up our phones at the slightest degradation in service quality. We also no longer use static IP addressing for PCs and cable modems, but rather maintain the largest dynamically addressed (via dynamic host configuration protocol, or DHCP) networks in the world.

Troubleshooting devices on this cutting-edge DHCP network is a major challenge. Since our early trials with DHCP during the alpha and beta phases, we have been asking the following questions:

  • How can we ping a customer’s cable modem and PC when we don’t know their "current" IP address?
  • How can technicians determine whether the customer’s cable modem and PC are reaching the DHCP server?

To address these questions, I began developing a troubleshooting tool that would enable technicians to look up any device’s current IP address and ping it. This troubleshooting tool began merely as a way for a small group of individuals to confirm the provisioning process. Through the years, I continued to add to this tool using information I learned from watching various groups use the tool and implementing every one of their requests.
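The lookup step can be sketched as a table keyed by MAC address, since the MAC stays fixed while the leased IP changes. This is only an illustration: the `leases` dictionary, the function names and the sample addresses are assumptions, and a real tool would read its lease data from the DHCP server.

```python
import re
from typing import Dict, Optional


def normalize_mac(mac: str) -> str:
    """Reduce a MAC address to 12 lowercase hex digits so formats compare equal."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def current_ip(leases: Dict[str, str], mac: str) -> Optional[str]:
    """Return the IP currently leased to `mac`, or None if no lease is known."""
    return leases.get(normalize_mac(mac))


# Hypothetical lease data, keyed by normalized MAC address.
leases = {normalize_mac("00:A0:24:11:22:33"): "24.128.44.17"}
```

Once the current address is known, it can be handed to ping just as in the static-addressing days.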

New releases of this tool came out in Internet-time (often the very same day), and the tool gained popularity because of its timely response to the needs of its user community. Eventually, I began deploying the tool in other MediaOne locations, adding their feature suggestions as well and making it scalable and customizable to the point where today it is known across all of MediaOne as simply the "Bruce Tools" or the "Bahlmann Tools" (patent pending).

This troubleshooting tool gathers and consolidates information from a number of different sources. It shortens the time required to troubleshoot problems and eliminates the need for multiple individuals to have access (shell accounts) to mission-critical servers.

Through consolidation of this information, the troubleshooting tool provides the essence of what every installer, plant operations technician, broadband service representative (BSR), network operations center (NOC) staff and network engineer needs to diagnose most customer and network problems associated with high-speed data. In fact, every one of these groups requires access to this troubleshooting tool to perform their jobs.

Let’s look at the design and function of this troubleshooting tool in resolving customer, network and configuration problems.

Troubleshooting challenges

The troubleshooting tool requires information from, and connectivity to, a number of high-speed data and subscriber management resources, as pictured in Figure 1. Further complicating matters, several different organizations are responsible for this data, including your Internet service provider (ISP), information technology (IT) folks and network operations group. Each of these groups has its own rules about privacy, security, availability and so on.

There is also the problem of where to place the tool relative to the networks it must reach to perform its designed task. Subscriber management system (SMS) information resides within your company’s internal network, but the tool is most likely to sit on the customer network. The tool therefore needs access both to data from your SMS and to various online databases (those out on the Internet). Orchestrating the proper flow of this information into the tool requires significant coordination, and it is best to design this system from the ground up.

Click and go

The troubleshooting tool is a software application consisting of a collection of screens that allow you to look up, troubleshoot, correct and maintain customer premises equipment (CPE) such as cable modems, PCs, set-top boxes and so on. Basically, the troubleshooting tool can be used for anything that has a media access control (MAC) address or an IP address, including the cable modem termination system (CMTS).

The troubleshooting tool starts by generating a lookup screen that provides many different ways to search for the device in question. (See Figure 2.) Through this screen, you can find any and all CPE associated with one or more of these fields. For example, you could enter the MAC address or the city and fiber node for the search criteria and click "Lookup."

Depending on the scope of the search, the tool will either display all known information about the device (see Figure 4) or display a pick list (see Figure 3) of matches associated with the search criteria. Note that the pick list has some interesting functionality that is less obvious than selecting an exact match.

For example, let’s say you just renumbered a network and are waiting to see if CPE is coming up on the new network. Rather than just looking up a CPE that you know will be coming up on the new network, you can search for CPE with the new network’s address (search by IP "24.128.44."). By leaving the last octet off, the tool will look for CPE on the subnet 24.128.44.x and display them in the pick list. If CPE is not coming up on the new network, something could be wrong with the network configuration or routing. This is just one of many ways that network engineers use the troubleshooting tool.
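A partial-address search like this one can be sketched as a comparison of leading octets. The device records and field names below are hypothetical; only the 24.128.44. prefix comes from the example above.

```python
def search_by_ip_prefix(devices, prefix):
    """Return devices whose IP address begins with the octets in `prefix`.

    Whole octets are compared, so "24.128.4." does not match 24.128.44.x.
    """
    wanted = [octet for octet in prefix.split(".") if octet]
    return [d for d in devices if d["ip"].split(".")[:len(wanted)] == wanted]


# Hypothetical CPE records for illustration.
devices = [
    {"name": "modem-a", "ip": "24.128.44.9"},
    {"name": "modem-b", "ip": "24.128.4.9"},
    {"name": "pc-c", "ip": "24.128.44.200"},
]

pick_list = search_by_ip_prefix(devices, "24.128.44.")
```

An empty pick list some time after a renumbering would be the cue to check the network configuration or routing.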

Once an exact match is found for the search criteria, the troubleshooting tool goes out and collects all the information it can about the device in question. The information gathered is organized and displayed to the user in various forms depending on the user’s access rights. Figure 4 shows one such variation. Other variations may include more or less information.

Here, the information is broken down in terms of Server, Device, Owner and Health. Server information comes from the DHCP server, Device information comes from the tools database and DHCP registry, Owner information comes from the billing system, and Health information comes from the device itself as well as the DHCP manager application. From this screen, all troubleshooting, corrective measures and maintenance are performed via a toolkit (which represents functions that are available to the user depending on his or her access rights and the device in question).

Figure 4 represents a portion of the overall toolkit that addresses cable modem functions. This screen also represents a launch point for the user to access additional equipment related to the CPE displayed. For example, the information below provides a link to the customer’s PC and their headend node as well as a way to e-mail this customer.

All the information in Figure 4 has persistence (is stored in a database for later use). The troubleshooting tool uses persistence to enable a "before" vs. "after" function, which allows users to go back and view parameters (for example, Health data). By being able to view former transmission power levels as well as obtain current health data (via Update Health function), a technician can determine what has changed since the last Health update or since initial installation.
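The "before" vs. "after" comparison amounts to a difference between two stored snapshots. The sketch below assumes each Health update is saved as a simple dictionary; the parameter names and values are invented for illustration.

```python
def changed_fields(before, after):
    """Map each parameter whose value differs to its (before, after) pair."""
    keys = sorted(before.keys() | after.keys())
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


# Hypothetical snapshots: one stored at installation, one from Update Health.
at_install = {"tx_power_dbmv": 42.0, "rx_power_dbmv": -3.0, "status": "operational"}
latest = {"tx_power_dbmv": 55.0, "rx_power_dbmv": -3.0, "status": "operational"}

differences = changed_fields(at_install, latest)
```

Here only the transmit power would be reported, which points the technician at the return path rather than at the PC.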

Building bridges

The rest of this article will highlight some unique functions available for troubleshooting high-speed data. Although these functions will be explained in the context of the Bahlmann Tools, the underlying application responsible for providing each function will be identified. Simple network management protocol (SNMP) is used in most cases unless otherwise specified. Note that although a legacy cable modem (LANCity) is used as the example, all these same functions apply to Data Over Cable Service Interface Specification (DOCSIS) modems.

The bridging table represents one of the most useful functions in the toolkit. (See Figure 5.) This function reads the cable modem’s bridge-forwarding table, which is a dynamic holding place for MAC addresses that the modem has "learned." This table works similarly to a router’s bridging table by learning the MAC addresses of devices that have talked recently. The default time-to-live for bridging table entries is 300 seconds, so only devices that have talked within the last five minutes will be found in the table.
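The aging behavior can be pictured with a small model of a learning table. This is a conceptual sketch, not the modem's firmware or the tool's code; the class and method names are assumptions, and only the 300-second default comes from the text.

```python
import time


class LearningTable:
    """Toy model of a bridge-forwarding table whose entries age out."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so aging can be simulated
        self._last_seen = {}         # MAC -> (port, time last heard)

    def learn(self, mac, port):
        """Record (or refresh) the port on which `mac` was last heard."""
        self._last_seen[mac] = (port, self._clock())

    def entries(self):
        """Drop entries older than the time-to-live, then return MAC -> port."""
        now = self._clock()
        self._last_seen = {
            mac: (port, seen)
            for mac, (port, seen) in self._last_seen.items()
            if now - seen < self._ttl
        }
        return {mac: port for mac, (port, _seen) in self._last_seen.items()}
```

A PC that has been switched off for more than the time-to-live simply disappears from the table, so an empty result does not by itself mean the device was never connected.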

Because every modem has two ports, Ethernet and cable TV (LANCity calls this Unilink), two groups of devices are contained in the bridge forwarding table. These two different ports are displayed in two different tables on the application’s screen. You can use this function as a way to find devices on the network. For example, you can verify the actual MAC address used by a customer’s PC (especially useful if you believe the customer is not using DHCP and has a different machine connected to the modem). This information would be found on the Ethernet side of the modem.

The Unilink side of customer modems would enable you to locate other devices on a particular fiber node. This functionality exists for headend modems (or CMTS in DOCSIS) as well; however, in this case the Unilink side represents devices on the fiber node, and the Ethernet side represents servers, switches and routers on that network segment.

Listen to your system

The listen function shown in Figure 6 allows the log file of the DHCP/BOOTP (boot protocol) servers to be viewed for a specific device. The log file contains important information regarding the transaction between the DHCP/BOOTP servers and clients (PCs and modems). The troubleshooting tool has the ability to parse this file, extracting any transactions from specific devices. From these log entries, you can determine what (if any) response is destined for specific clients. You also can determine which clients are requesting, who is getting answered, and what configurations (or DHCP options) are being sent to the clients from the servers.
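Extracting one device's transactions can be sketched as a filter on the device's MAC address. The log lines below are illustrative only and may not match the format of any particular DHCP/BOOTP server; the message type names (DHCPDISCOVER, DHCPOFFER, DHCPACK, DHCPNAK) are the standard DHCP ones.

```python
REPLY_TYPES = ("DHCPOFFER", "DHCPACK", "DHCPNAK")


def transactions_for(log_lines, mac):
    """Return the log lines that mention `mac`, ignoring letter case."""
    needle = mac.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]


def was_answered(lines):
    """True if any of the given lines records a server reply."""
    return any(reply in line for line in lines for reply in REPLY_TYPES)


# Hypothetical log excerpt for illustration.
SAMPLE_LOG = [
    "DHCPDISCOVER from 00:a0:24:11:22:33\n",
    "DHCPOFFER on 24.128.44.17 to 00:a0:24:11:22:33\n",
    "DHCPDISCOVER from 00:a0:24:44:55:66\n",
]
```

In this sample the first device is both requesting and getting answered, while the second is requesting but never receives a reply, which would suggest a provisioning problem rather than a plant problem.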

The ping function shown in Figure 7 remains a useful tool in troubleshooting cable modem and PC problems. The troubleshooting tool uses a particular version of ping that provides statistics on packet loss and round-trip times. These statistics are helpful in summarizing the results.
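The summary statistics are simple to derive once each probe's result is known. The sketch below assumes the round-trip times have already been collected, with `None` marking a probe that got no reply; it does not send any packets itself, and the function name is an assumption.

```python
def summarize_pings(rtts_ms):
    """Summarize probe results; each item is a round-trip time in ms or None."""
    sent = len(rtts_ms)
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    summary = {
        "sent": sent,
        "received": len(replies),
        "loss_pct": 100.0 * (sent - len(replies)) / sent if sent else 0.0,
    }
    if replies:
        summary["min_ms"] = min(replies)
        summary["avg_ms"] = sum(replies) / len(replies)
        summary["max_ms"] = max(replies)
    return summary
```

Steady loss with normal round-trip times tends to point at the plant, while no replies at all points back at addressing or provisioning.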

Fairly regularly, the broadband delivery medium becomes noisy. While noise can show up in many different ways in broadband, one of the most obvious ways it affects the modems is by generating error packets. When customers complain about slow speed or intermittent connections, there is a good chance that the broadband delivery medium is noisy.

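Your plant operations group can use the Clear Errors function shown in Figure 8 to check the progress of cleaning and tuning the broadband delivery medium or to verify modem operation. Clearing a modem’s error counters gives a fresh baseline, so any errors that accumulate afterward reflect the current state of the plant.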
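One way to turn counter readings into a progress check is to compare the share of errored packets before and after the plant work. The counter values below are invented, and the function names are assumptions rather than part of the tool.

```python
def error_ratio(good_packets, error_packets):
    """Fraction of packets received in error since the counters were cleared."""
    total = good_packets + error_packets
    return error_packets / total if total else 0.0


def plant_improved(before, after):
    """Compare two (good, error) counter readings taken after a clear."""
    return error_ratio(*after) < error_ratio(*before)


# Hypothetical readings: (good packets, error packets) since the last clear.
before_tuning = (9900, 100)
after_tuning = (9990, 10)
```

Using a ratio rather than the raw error count keeps the comparison fair when the two readings cover different amounts of traffic.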

Additional functions

Some other functions worth noting, all of which work remotely, include:

  • Reset: resets a cable modem or CMTS
  • IP and Port Filter: reads or sets the port or IP filters on a cable modem or CMTS
  • Port scan: scans a computer’s transmission control protocol (TCP) application ports, thus identifying which applications are running, such as Web, FTP, TELNET, SMTP and so on
  • History: reads the status log of a cable modem or CMTS
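A basic TCP port scan can be sketched with nothing more than connection attempts. This is a minimal illustration using Python's standard `socket` module, not the tool's actual scanner; the function name, the timeout and the short port list are assumptions.

```python
import socket

# Conventional port assignments for the applications named above.
WELL_KNOWN = {21: "FTP", 23: "TELNET", 25: "SMTP", 80: "Web"}


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the ports on `host` (an IPv4 address) that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def describe(ports):
    """Label each open port with its conventional application, if known."""
    return [(port, WELL_KNOWN.get(port, "unknown")) for port in ports]
```

An open port only suggests which application is listening; a customer can run any service on any port, so the label is a starting point for the conversation rather than proof.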

Being able to perform these functions on any given device enables more sophisticated troubleshooting earlier in the problem resolution phase. Before the troubleshooting tool existed, many more problems required sending technicians to repair minor configuration settings. Today, armed with an ever more sophisticated toolkit, BSRs are able to localize and correct an increasing number of problems over the phone, which keeps trucks rolling to more installs rather than routine service calls.

Bottom Line: High-Speed Troubleshooting

Today's high-speed data customers demand an unprecedented level of service from broadband operators. From installation to phone support and service, each broadband operator must seek means of fast and efficient problem resolution. Implementing a consolidated troubleshooting tool can:

  • Reduce install times by up to 5 minutes by localizing the problem to the customer's home
  • Reduce the length of support calls by 2-3 minutes by consolidating all information and toolkits to a single interface
  • Reduce the number of unnecessary truck rolls
  • Increase the efficiency of network engineering in resolving configuration problems

Bruce Bahlmann is senior systems engineer for MediaOne’s Internet Services Group. He can be reached at .


