The first step in diagnosing a network problem is to collect information. This includes collecting information from your users as to the nature of the problems they are having, and it includes collecting data from your network.
Your success will depend, in large part, on your efficiency in collecting this information and on the quality of the information you collect. There is an extraordinary variety of tools available for this purpose, and more become available daily.
A small number of tools can be used to solve most problems.
General Approaches to Troubleshooting
Troubleshooting is a complex process that is best learned through experience.
Network troubleshooting is the collective measures and processes used to identify, diagnose and resolve problems and issues within a computer network.
It is a systematic process that aims to resolve problems and restore normal network operations.
Network troubleshooting is primarily done by network engineers or administrators to repair or optimize a network.
It is generally done to recover and establish network or Internet connections on end nodes/devices.
Some of the processes within network troubleshooting include but are not limited to:
Finding and resolving problems and establishing Internet/network connection of a computer/device/node
Configuring a router, switch or any network management device
Installing cables or Wi-Fi devices
Updating firmware on routers and switches
Removing viruses
Adding, configuring and reinstalling a network printer
Network troubleshooting can be a manual or automated task. When using automated tools, network management can be done using network diagnostic software.
Network documentation is important for the following reasons:
Proper documentation can save you from time-consuming research to fix recurring problems.
When everything is in place and everybody follows the same processes and procedures, consistency across the network helps to reduce problems and errors.
You won’t lose important information when a knowledgeable employee leaves the company.
The documentation helps you to onboard new hires much more quickly.
You can troubleshoot your network faster when issues come up.
Troubleshooting and Management
Documentation
Network documentation is a technical record of the hardware, software, servers, directory structure, user profiles, data, and how it all works together.
Network documents should include any information that helps administrators and IT professionals to keep the network up and running smoothly. This information can be in any format you want.
The most important source of information is the local documentation created by you or your predecessor. There are also a couple of sets of standard documentation.
The documentation can be divided into two general categories:
Configuration documentation
Process documentation
Configuration documentation statically describes a system. It assumes that the steps involved in setting up the system are well understood and need no further comments, i.e., that configuration information is sufficient to reconfigure or reconstruct the system. This kind of information can usually be collected at any time.
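Because configuration information can be collected at any time, a small script run on demand is one way to keep it current. The following is a minimal sketch rather than a definitive tool: the function name snapshot, the output file name, and the choice of commands are assumptions made for illustration, and any tool that is not installed is simply skipped.

```shell
# Write a dated snapshot of basic host and network settings into directory $1
# and print the path of the file that was written.
snapshot() {
    out="$1/config-$(date +%Y%m%d).txt"
    {
        echo "== uname -a =="
        uname -a
        for cmd in "ifconfig -a" "netstat -rn"; do
            tool=${cmd%% *}                       # first word: the program name
            if command -v "$tool" >/dev/null 2>&1; then
                echo "== $cmd =="
                $cmd 2>&1 || true                 # record errors, keep going
            fi
        done
    } > "$out"
    echo "$out"
}
```

Calling snapshot with a directory such as /var/tmp produces one file per day. Comparing two such files with diff shows what changed between them.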
Process documentation describes the steps involved in setting up a device, installing software, or resolving a problem. As such, it is best written while you are doing the task. This creates a different set of collection problems. Here the stress from the task at hand often prevents you from documenting the process.
Management Practices
A fundamental assumption is that troubleshooting should be proactive.
It is preferable to avoid a problem than to have to correct it.
Proper management practices can help.
Management practices will determine what you can do and how you do it.
This is true both for avoiding problems and for dealing with problems that can’t be avoided.
Some of the more important management issues are:
1. Professionalism
Administering a system effectively requires a high degree of professionalism. This includes personal honesty and ethical behavior. Everything you do should be done from the perspective of a cost-benefit trade-off. It is too easy to get caught in the trap of doing something “the right way” at a higher cost than the benefits justify. Performance analysis is the key element.
2. Ego management
An administrator may try to make himself indispensable by ensuring that he is the only one who truly understands the system. The most obvious way to do this is to hide what he actually does and how his system works. This can be done in many ways. Failing to document the system is one approach; leaving comments out of code or configuration files is common. He may also try to limit others' access to a system by restricting accounts or access to passwords.
3. Legal and ethical considerations
From the perspective of tools, you must ensure that you use tools in a manner that conforms not just to the policies of your organization, but to all applicable laws as well. Packet capture software is a prime example. It allows you to examine every packet that travels across a link, including application data and every header. Unless data is encrypted, it can be decoded. This means that passwords can be captured and email can be read. For this reason alone, you should be very circumspect in how you use such tools.
4. Economic considerations
Solutions to problems have economic consequences, so you must understand the economic implications of what you do. Knowing how to balance the cost of the time used to repair a system against the cost of replacing a system is an obvious example. Cost management is a more general issue that has important implications when dealing with failures. One particularly difficult task for many system administrators is to come to terms with the economics of networking.
Troubleshooting is a form of problem solving, often applied to repair failed products or processes. It is a logical, systematic search for the source of a problem in order to solve it, so that the product or process can be made operational again.
Host Configurations
When reconstructing a host’s configuration, there are two basic approaches.
One is to examine the system’s configuration files.
This can be a very protracted approach. It works well when you know what you are looking for and when you are looking for a specific detail. But it can be difficult to impossible to find all the details of the system, particularly if someone has taken steps to hide them.
The alternative is to use utilities designed to give snapshots of the current state of the system.
Clearly, by itself, neither approach is totally adequate. Where you start will depend in part on how quickly you must be up to speed and what specific problems you are facing. Each approach will be described in turn.
Utilities
Network utilities are software utilities designed to analyze and configure various aspects of computer networks.
Reviewing system configuration files is a necessary step that you will have to address before you can claim mastery of a system. But this can be a very time-consuming step.
Even if you plan to jump into the configuration files, you will probably want a quick overview of the current state of the system before you begin. For this reason, we will examine status and configuration utilities first.
Using these utilities is much simpler than looking at kernel configuration files.
ps
The first thing any system administrator should do on a new system is run the ps command.
ps, an abbreviation of “Process Status”, is used to view information about the processes on a system.
It lists the currently running processes and their PIDs, along with other information that depends on the options given.
On Linux, it reads process information from the virtual files in the /proc file system. Because /proc contains only virtual files, it is referred to as a virtual file system.
It has numerous options for manipulating its output.
ps produces output with a heading line that labels each column of information. You can find the meaning of all the labels on the ps man page.
If you run the ps command without any arguments, it displays processes for the current shell.
There are a number of options available to ps, although they vary from implementation to implementation.
For example, under FreeBSD, the options -aux show all users’ processes (-a), including those without controlling terminals (-x), in considerable detail (-u). The options -ax will provide fewer details but show more of the command-line arguments.
Alternately, you can use the -w option to extend the displayed information to 132 columns.
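As a rough illustration of working with this output, the sketch below sorts a ps aux listing by CPU usage. The helper name busiest is hypothetical, and the sketch assumes the BSD-style column layout in which %CPU is the third column; check the heading line on your own system before relying on it.

```shell
# Read `ps aux`-style text on stdin; print the heading line followed by
# the N rows with the highest value in column 3 (%CPU). N defaults to 5.
busiest() {
    n="${1:-5}"
    input=$(cat)
    printf '%s\n' "$input" | head -n 1
    printf '%s\n' "$input" | tail -n +2 | sort -k3,3 -nr | head -n "$n"
}

# Live use, if ps is present: all users' processes, busiest first.
if command -v ps >/dev/null 2>&1; then
    ps aux 2>/dev/null | busiest 5
fi
```

Keeping the heading line separate from the sorted rows means the column labels stay at the top of the listing.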
top
Although less ubiquitous, the top command, a useful alternative to ps, is available on many systems.
It was written by William LeFebvre.
When running, top gives a periodically updated listing of processes ranked in order of CPU usage.
Typically, only the top 10 processes are given, but this is implementation dependent, and your implementation may let you select other values.
The advantage to top is that it focuses your attention on resource hogs, and it provides a repetitive update.
top has a large number of options and can provide a wide range of information.
netstat
- One of the most useful and diverse utilities is netstat.
- The netstat command generates displays that show network status and protocol statistics.
- You can display the status of TCP and UDP endpoints in table format, routing table information, and interface information.
- netstat displays various types of network data depending on the command line option selected. These displays are the most useful for system administration.
- One use of netstat is to display the connections and services available on a host.
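The displays described above might be requested as follows. This is a sketch under assumptions: the helper name listening_tcp is made up for illustration, and it relies on the TCP state being the last field of each line, which holds for plain netstat -an output on common systems but may not hold when extra columns are requested.

```shell
# Read `netstat -an` output on stdin; print protocol and local address
# for every TCP socket in the LISTEN state.
listening_tcp() {
    awk '$NF == "LISTEN" { print $1, $4 }'
}

if command -v netstat >/dev/null 2>&1; then
    netstat -rn 2>/dev/null                   # routing table, numeric addresses
    netstat -i  2>/dev/null                   # per-interface information
    netstat -an 2>/dev/null | listening_tcp   # services offered by this host
fi
```

The -n option suppresses name lookups, so the output appears quickly even when name resolution is itself the problem.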
lsof
lsof stands for “list open files”.
This command provides a list of the files that are open.
It shows which files are opened by which process.
Run without arguments, it lists all open files on the output console.
It can list not only regular files but also directories, block special files, shared libraries, character special files, regular pipes, named pipes, Internet sockets, UNIX domain sockets, and many others.
It can be combined with the grep command for more advanced searching and listing.
List all open files: This command lists out all the files that are opened by any process in the system.
~$ lsof
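The full listing is long, so it is usually filtered or summarized. The sketch below shows one way to do that. The helper name open_file_counts is hypothetical, and it assumes the usual lsof layout in which the first column is the command name and the first line is a heading.

```shell
# Read lsof output on stdin; print "count command" pairs, with the
# command holding the most open files first.
open_file_counts() {
    awk 'NR > 1 { n[$1]++ } END { for (c in n) print n[c], c }' |
        sort -k1,1nr -k2,2
}

if command -v lsof >/dev/null 2>&1; then
    lsof -i 2>/dev/null | grep LISTEN                 # sockets awaiting connections
    lsof 2>/dev/null | open_file_counts | head -n 5   # five busiest commands
fi
```

The -i option restricts the listing to Internet sockets. The -p and -u options similarly restrict it to a given process ID or user.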
ifconfig
- The ifconfig (interface configuration) command is used to configure the kernel-resident network interfaces.
- It is used at boot time to set up the interfaces as necessary.
- After that, it is usually needed only for debugging or system tuning.
- It is also used to assign an IP address and netmask to an interface, or to enable or disable a given interface.
Syntax:
ifconfig [...OPTIONS] [INTERFACE]
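The common uses listed above can be sketched as follows. The interface name eth0 and the addresses in the comments are placeholders, and the wrapper names run, configure_iface, and disable_iface are made up for illustration. Changing an interface normally requires root privileges, so the wrappers print the command instead of running it when DRY_RUN is set.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$@"; else "$@"; fi
}

# Assign an address and netmask to an interface and bring it up.
# Example: configure_iface eth0 192.168.1.10 255.255.255.0
configure_iface() {
    run ifconfig "$1" "$2" netmask "$3" up
}

# Disable an interface. Example: disable_iface eth0
disable_iface() {
    run ifconfig "$1" down
}

# Read-only query: show every interface, including those that are down.
if command -v ifconfig >/dev/null 2>&1; then
    ifconfig -a
fi
```

Running with DRY_RUN=1 first is a cheap way to check the exact command before it touches a live interface.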
arp
- The arp command manipulates the system’s ARP cache.
- It also allows a complete dump of the ARP cache.
- ARP stands for Address Resolution Protocol.
- The primary function of this protocol is to resolve the IP address of a system to its MAC address, and hence it works between layer 2 (the data link layer) and layer 3 (the network layer).
Syntax:
arp [-v] [-i if] [-H type] -a [hostname]
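A dump of the cache is easier to scan once it is reduced to address pairs. The sketch below assumes the common arp -a line format "name (IP) at MAC ... on interface". The helper name arp_pairs is hypothetical, and entries that have not yet resolved will show whatever placeholder the system prints in place of the MAC address.

```shell
# Read `arp -a` output on stdin; print "IP MAC" for each cache entry.
arp_pairs() {
    awk '$3 == "at" { gsub(/[()]/, "", $2); print $2, $4 }'
}

if command -v arp >/dev/null 2>&1; then
    arp -a 2>/dev/null | arp_pairs
fi
```

Two different IP addresses mapped to the same MAC address, or one IP address whose MAC address changes between runs, are both worth a closer look.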
Scanning Tools
Port scanners are used to see which ports are active on your system. There are a large number of freely available port scanners. These include programs like gtkportscan, nessus, portscan, and strobe.
System Configuration Files
A major problem with configuration files under Unix is that there are so many of them in so many places. On a multiuser system that provides a variety of services, there may be scores of configuration files scattered among dozens of directories.
Basic Configuration Files
There are a number of fairly standard configuration files that seem to show up on most systems. These are usually, but not always, located in the /etc directory.
Configuration Programs
Configuration programs are utilities that can be used to display as well as change system configurations. Once again, every flavor of Unix will be different. With Solaris, admintool was the torchbearer for years. In recent years, this has been superseded by Solstice AdminSuite. With FreeBSD, select the configure item from the menu presented when you run /stand/sysinstall. With Linux you can use linuxconf. Both the menu and GUI versions of this program are common. The list goes on.
Kernel
The first step in starting a system is loading and initializing the kernel. Network services rely on the kernel being configured correctly. Some services will be available only if first enabled in the kernel. While examining the kernel’s configuration won’t tell you which services are actually being used, it can give some insight into what is not available.
Changes to the kernel will usually be required only when building a new system, installing a new service or new hardware, or tuning system performance.
Startup Files and Scripts
Once the kernel is loaded, the swapper or scheduler is started and then the init process runs. This process will, in turn, run a number of startup scripts that will start the various services and do additional configuration chores. After the standard configuration files, these are the next group of files you might want to examine. These will primarily be scripts, but may include configuration files read by the scripts.
Other Files
There are several other categories of files that are worth mentioning briefly.
Application files
Security files
Trust relationships
Traffic control
Application specific
Log files