7.1 The Troubleshooting Process Flashcards
Troubleshooting is the process of identifying, locating and correcting problems.
When troubleshooting, proper documentation must be maintained. This documentation should include as much information as possible about the following:
The problem encountered
Steps taken to determine the cause of the problem
Steps to correct the problem and ensure that it will not reoccur
When a problem is first discovered in the network, it is important to verify it and determine how much of the network is affected by it. After the problem is confirmed, the first step in troubleshooting is to gather information. The following checklist lists some of the important information to gather.
Nature of problem
End-user reports
Problem verification report
Equipment
Manufacturer
Make / model
Firmware version
Operating system version
Ownership / warranty information
Configuration and Topology
Physical and logical topology
Configuration files
Log files
Previous Troubleshooting
Steps taken
Results achieved
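As one way to keep this documentation consistent, the checklist above can be captured in a small record. The sketch below is illustrative only: the class name TroubleReport, its field names, and the missing_fields helper are assumptions made for this example, not part of any standard tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TroubleReport:
    """Hypothetical record mirroring the information-gathering checklist."""
    # Nature of problem
    end_user_reports: List[str] = field(default_factory=list)
    verification_report: str = ""
    # Equipment
    manufacturer: str = ""
    model: str = ""
    firmware_version: str = ""
    os_version: str = ""
    warranty_info: str = ""
    # Configuration and topology
    topology_notes: str = ""
    config_files: List[str] = field(default_factory=list)
    log_files: List[str] = field(default_factory=list)
    # Previous troubleshooting, stored as (step taken, result achieved)
    previous_steps: List[Tuple[str, str]] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of checklist items that are still empty."""
        return [name for name, value in vars(self).items() if not value]


report = TroubleReport(manufacturer="ExampleCo", model="X1")
print(report.missing_fields())
```

Listing the empty fields gives a quick view of what still has to be collected before analysis starts.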
Structured Troubleshooting Methods
There are several structured troubleshooting approaches that can be used. Which one to use will depend on the situation. Each approach has its advantages and disadvantages.
Bottom-Up
In bottom-up troubleshooting, you start with the physical layer and the physical components of the network and move up through the layers of the OSI model until the cause of the problem is identified.
Bottom-up troubleshooting is a good approach to use when the problem is suspected to be a physical one. Most networking problems reside at the lower layers, so the bottom-up approach is often effective.
The disadvantage of the bottom-up approach is that it requires you to check every device and interface on the network until the possible cause of the problem is found. Each conclusion and possibility must be documented, so there can be a lot of paperwork associated with this approach. A further challenge is deciding which devices to examine first.
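The bottom-up order can be sketched in a few lines of Python. This is a minimal illustration, assuming each layer has some health check available; the checks dictionary and its callables are hypothetical stand-ins for real tests such as inspecting cabling or interface status.

```python
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def bottom_up(checks):
    """Test layers from the physical layer upward.

    checks maps each layer name to a zero-argument callable that
    returns True when that layer appears healthy (assumed interface).
    Returns the first failing layer (or None) and a log of every result.
    """
    log = []
    for layer in OSI_LAYERS:
        ok = checks[layer]()
        log.append((layer, ok))  # document each conclusion as you go
        if not ok:
            return layer, log
    return None, log


# Example: everything passes except the network layer.
checks = {layer: (lambda: True) for layer in OSI_LAYERS}
checks["network"] = lambda: False
faulty_layer, log = bottom_up(checks)
print(faulty_layer, log)
```

The log grows with every layer tested, which reflects the documentation overhead of this approach.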
Top-Down
Top-down troubleshooting starts with the end-user applications and moves down through the layers of the OSI model until the cause of the problem has been identified.
End-user applications of an end system are tested before tackling the more specific networking pieces. Use this approach for simpler problems, or when you think the problem is with a piece of software.
The disadvantage of the top-down approach is that it requires checking every network application until the possible cause of the problem is found. Each conclusion and possibility must be documented. The challenge is deciding which application to examine first.
Divide-and-Conquer
The network administrator selects a layer and tests in both directions from that layer.
In divide-and-conquer troubleshooting, you start by collecting user experiences of the problem and documenting the symptoms. Using that information, you make an informed guess about the OSI layer at which to start your investigation. When a layer is verified to be functioning properly, it can be assumed that the layers below it are also functioning, so the administrator can work up the OSI layers. If a layer is not functioning properly, the administrator can work down the OSI layers.
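The up-or-down decision described above can be sketched as follows. This is a simplified illustration under stated assumptions: the checks dictionary is a hypothetical set of per-layer tests, and when working down, the sketch treats the lowest failing layer as the likely location of the fault.

```python
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def divide_and_conquer(checks, start):
    """Start at a chosen layer and test in the direction the result suggests.

    checks maps each layer name to a zero-argument callable returning
    True when the layer works (assumed interface).
    Returns the suspected faulty layer, or None if nothing fails.
    """
    i = OSI_LAYERS.index(start)
    if checks[start]():
        # Start layer works: assume the layers below it work, move up.
        for layer in OSI_LAYERS[i + 1:]:
            if not checks[layer]():
                return layer
        return None
    # Start layer fails: move down until a working layer is found.
    faulty = start
    for layer in reversed(OSI_LAYERS[:i]):
        if checks[layer]():
            break
        faulty = layer
    return faulty


# Example: a data link fault also breaks every layer above it.
status = {layer: i < 1 for i, layer in enumerate(OSI_LAYERS)}
checks = {layer: (lambda ok=ok: ok) for layer, ok in status.items()}
print(divide_and_conquer(checks, "network"))
```

Starting in the middle means fewer layers are tested than a full bottom-up or top-down pass when the initial guess is close.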
Follow-the-Path
The approach first discovers the traffic path all the way from source to destination. The scope of troubleshooting is reduced to just the links and devices that are in the forwarding path. The objective is to eliminate the links and devices that are irrelevant to the troubleshooting task at hand. This approach usually complements one of the other approaches.
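The scope reduction can be illustrated with a small topology model. The sketch below is an assumption-laden stand-in: it finds a fewest-hops path with a breadth-first search over a hypothetical adjacency dictionary, whereas a real forwarding path is determined by the routing and switching tables of the devices involved.

```python
from collections import deque


def forwarding_path(links, source, destination):
    """Return a fewest-hops path from source to destination.

    links is an adjacency dict: device name -> list of neighbors
    (hypothetical topology data). Returns [] if no path exists.
    """
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return []


# Hypothetical topology: PC1 cannot reach SRV.
links = {
    "PC1": ["SW1"],
    "SW1": ["PC1", "R1", "SW2"],
    "R1": ["SW1", "R2"],
    "R2": ["R1", "SRV"],
    "SW2": ["SW1", "PC2"],
    "PC2": ["SW2"],
    "SRV": ["R2"],
}
in_scope = forwarding_path(links, "PC1", "SRV")
out_of_scope = set(links) - set(in_scope)
print(in_scope, out_of_scope)
```

Devices that fall outside the path (here SW2 and PC2) can be set aside, and one of the other approaches is then applied to the devices that remain.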
Substitution
This approach is also called swap-the-component because you physically swap the problematic device with a known, working one. If the problem is fixed, then the problem is with the removed device. If the problem remains, then the cause may be elsewhere.
In specific situations, this can be an ideal method for quick problem resolution, such as with a critical single point of failure. For example, if a border router goes down, it may be more beneficial to simply replace the device and restore service than to troubleshoot the issue.
If the problem lies within multiple devices, it may not be possible to correctly isolate the problem.
Comparison
This approach is also called the spot-the-differences approach and attempts to resolve the problem by changing the nonoperational elements to be consistent with the working ones. You compare configurations, software versions, hardware, or other device properties, links, or processes between working and nonworking situations and spot significant differences between them.
The weakness of this method is that it might lead to a working solution, without clearly revealing the root cause of the problem.
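Spotting the differences between a working and a nonworking configuration can be partly automated. The sketch below uses Python's standard difflib module; the function name config_differences and the sample configuration text are illustrative assumptions, not output from a real device.

```python
import difflib


def config_differences(working, nonworking):
    """Return only the lines that differ between two configurations.

    Lines prefixed with '-' appear only in the working configuration;
    lines prefixed with '+' appear only in the nonworking one.
    """
    diff = list(difflib.unified_diff(
        working.splitlines(),
        nonworking.splitlines(),
        fromfile="working",
        tofile="nonworking",
        lineterm="",
        n=0,  # no context lines, only the differences
    ))
    # Drop the two file header lines and the '@@' hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Illustrative sample configurations (assumed, not from a real device).
working = "interface g0/0\n ip address 10.0.0.1 255.255.255.0\n no shutdown"
nonworking = "interface g0/0\n ip address 10.0.0.1 255.255.255.0\n shutdown"
for line in config_differences(working, nonworking):
    print(line)
```

A diff shows what differs, not why the difference matters, so the result still needs interpretation before changing the nonworking device.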
Educated Guess
This approach is also called the shoot-from-the-hip troubleshooting approach. This is a less-structured troubleshooting method that uses an educated guess based on the symptoms of the problem. The success of this method varies with your troubleshooting experience and ability. Seasoned technicians are more successful because they can rely on their extensive knowledge and experience to decisively isolate and solve network issues. With a less-experienced network administrator, this troubleshooting method may be too random to be effective.