We have come a long way, tools-wise, since I started using commercial UNIX systems in the early 1980s. Although most of the system-level tools that were available then, such as sar, still form the core of many diagnostic techniques today, we now have numerous additional methods for determining the health of our systems. The wealth of system administration tools, both commercial and freeware, that have been developed over the years can make our system-management lives considerably easier. Such tools include event managers, network management systems, and system monitoring software, to name just a few.
While the system administration tools at our disposal have become more sophisticated, so have the systems that we are charged with keeping healthy. Few of us have the luxury of managing a single, un-networked UNIX system running only a couple of business applications for ASCII terminal-based users. Most of us now manage multiple UNIX servers that are networked with an array of UNIX workstations, with servers and workstations alike usually coming from various vendors and running different versions of UNIX. Add to that a mix of non-UNIX servers, X Window terminals, Windows PCs, and Apple Mac clients. Add the multiple network protocols needed to serve the various systems, and you have a fair representation of what our systems architectures look like today. Of course, no such system layer cake would be complete without an Internet/Web icing, and all of the performance issues and security concerns that go along with that topping.
The central task of system administration is still that of maintaining a high-level awareness of what is occurring on your systems, and of having the appropriate daemons and background tool processes running to aid your diagnosis when something goes wrong. And, while the tools have improved, I have yet to see the monitoring tool that issues alphanumeric pages along the lines of: "Susan's poorly formed SQL query on network node 421 is causing server 97 to thrash, and is causing excess network traffic on segments 19 and 27. Would you like me to kill PIDs xxxx and yyyy?"
Thus, I remain convinced that the best system administration tools at your disposal are still your own logic and deductive reasoning. And, to feed that logic, you need software tools that allow you to drill down to the core of a problem. A nice graphical interface is fine, but you also need to be able to get at the details below the simplified representation of the problem. Administrative tools that do not provide access to such detail, or that hide the underlying steps that will be taken when you select a particular action, are of little use. How often, for example, have you looked at the choices provided by an administrative shell and wondered exactly what would happen if you selected a particular action? You likely clicked on the help function for that item, were unsatisfied by the explanation, and set off to read the underlying shell code, only to conclude that you were glad you didn't blindly make the menu selection.
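To make that point concrete, here is a minimal Python sketch of a menu action that never hides what it is about to do. It is not drawn from any particular product: the `AdminAction` class, its `preview` and `run` methods, the `dry_run` flag, and the log file paths are all hypothetical names chosen for illustration.

```python
import shlex
import subprocess
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdminAction:
    """A menu entry that keeps its underlying commands visible (hypothetical)."""
    label: str
    commands: List[List[str]] = field(default_factory=list)

    def preview(self) -> str:
        """Return the exact command lines this action would run."""
        return "\n".join(shlex.join(cmd) for cmd in self.commands)

    def run(self, dry_run: bool = True) -> List[int]:
        """Echo each command; execute it only when dry_run is False.

        Returns the exit codes of the commands that were actually executed.
        """
        codes = []
        for cmd in self.commands:
            print("+ " + shlex.join(cmd))
            if not dry_run:
                codes.append(subprocess.run(cmd, check=False).returncode)
        return codes


# Hypothetical menu entry; the paths are placeholders, not real system files.
rotate = AdminAction(
    label="Rotate application log",
    commands=[
        ["mv", "/tmp/example-app.log", "/tmp/example-app.log.1"],
        ["touch", "/tmp/example-app.log"],
    ],
)

# Show the administrator what selecting this entry would really do.
print(rotate.preview())
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch: the administrator sees every underlying step first, and has to ask explicitly before anything is executed.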
So, whether you are building your own system administration tool or selecting one from a commercial vendor, keep in mind the level of detail you will need to diagnose what is really going on. Then, make your design or selection decisions accordingly.