System Garden

Habitat 1.0 User Manual

Contents

  1. A Tour of Habitat
  2. Getting Started
  3. Concepts
  4. Clockwork: The Collection Agent
  5. Graphical Tools
  6. Text Terminal Tools
  7. Command Line Tools
  8. System Performance
  9. Events
  10. Administration
  11. Diagnostics
  12. Appendix

A Tour of Habitat

Habitat is a system capacity monitor with flexible historic data storage, easily extensible for applications or third-party devices.

Historic data is central to the workings of habitat: all collected information is sent to the local lightweight data store and thence to an optional archive for long-term storage. By reducing the samples of data over time (a process called cascading), habitat can also show long-term trends from local data alone whilst keeping storage requirements modest.
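The idea behind cascading can be sketched in a few lines of Python. This is only an illustration, assuming that reduction is done by averaging consecutive samples; the function name cascade, the averaging rule and the sample values are all hypothetical, and habitat's own reduction may work differently.

```python
from statistics import mean

def cascade(samples, factor):
    """Reduce (timestamp, value) samples by averaging each run of `factor` samples.

    Assumption: averaging is used here purely for illustration.
    """
    reduced = []
    for i in range(0, len(samples), factor):
        chunk = samples[i:i + factor]
        # Stamp the reduced sample with the time of the last sample in the chunk.
        timestamp = chunk[-1][0]
        reduced.append((timestamp, mean([value for _, value in chunk])))
    return reduced

# Six one-minute samples: (seconds since start, % utilisation). Illustrative data.
minute_samples = [(60 * i, float(v))
                  for i, v in enumerate([10, 20, 30, 40, 50, 60], start=1)]

# Cascade to three-minute samples: six stored values become two.
three_minute = cascade(minute_samples, 3)
print(three_minute)
```

Each pass keeps the shape of the trend while storing fewer values, which is why long periods can be held in modest local storage.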

A large number of measurements are taken from the system, including simple overall usage, disk storage, processor utilisation and network usage. All the metrics can be examined over arbitrary periods of time to gain a full perspective of the work the machine has done.

Starting Habitat from your desktop menu, or typing ghabitat on the command line, will start habitat's GUI. It opens with a graph of the machine's overall utilisation over recent time.

On the left-hand side is the set of choices that can be displayed, with the right-hand pane showing the visualisation of the data. When the system is first started, there may be little or no data, so a blank screen may be presented for a while. By default, data is collected every minute and the display is refreshed to show the new curve. In the image above, the collector may have been running for three to four samples (the same number of minutes); had the collector been running for longer, or independently of the GUI, there would be more data to see initially. In any case, a blank display will not stay blank for long.


This manual explains how to visualise data that is collected and how to navigate around the sources that are available. However, as a quick taster, this is a sample of what can be seen.

The chart of long-term growth in processor usage, shown to the side, reveals a gradual climb over a period of nine months, with a sudden late surge of activity.

The image below shows growth in local storage on two partitions of a single machine over a period of seven months. The top chart shows the capacity used on the root file system (/) growing gradually from 65% to 85% over the period; the bottom chart, for /home, shows a sudden drop after a similar gradual climb.


The image below takes the previous view and adds the volume of disk requests to each chart (rios and wios for read and write requests), scaling the new curves to fit in the utilisation range. The curves are colour coded against the pick list on the right, to which the scaling controls have been added.
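To see why scaling is needed, consider a request count in the thousands drawn on a 0 to 100 utilisation axis. The Python sketch below picks a power-of-ten factor that brings a curve's peak inside that range. It is an assumption made for illustration: the helper name fit_scale and the sample values are hypothetical, and in ghabitat the factor is set through the scaling controls rather than by this rule.

```python
import math

def fit_scale(values, limit=100.0):
    """Return the largest power of ten that keeps max(values) * scale within `limit`."""
    peak = max(values)
    if peak <= 0:
        return 1.0  # nothing to scale
    exponent = math.floor(math.log10(limit / peak))
    return 10.0 ** exponent

# Illustrative read-request counts, far larger than a 0-100 % axis.
rios = [1200.0, 4500.0, 300.0]
scale = fit_scale(rios)
scaled = [v * scale for v in rios]
# The scale works out to one hundredth here, so the 4500 peak plots at about 45.
print(scale, scaled)
```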


The next pair of images represents a busy network chart with lots of samples. Zooming into the July part of the chart (by dragging a box over the area of interest with your mouse) expands the display to show greater detail, as shown in the lower image. Note how the time scale on the horizontal ruler changes to give the most accurate information possible, which in this case switches the display to dates of the week, with the starting date on the left.


Chart zooming in habitat runs from years to seconds, as do the representations of time on the horizontal axis. A selection of time scales is shown beside.


By default, habitat collects 'interesting' processes only and filters out smaller or mostly idle processes. This reduces the data that needs to be stored and manipulated, which is critical for the successful examination of processes.
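A filter of this kind might look like the Python sketch below. The thresholds, field names and process figures are hypothetical, as is the rule that a process is kept when it is either large or busy; the selection rules habitat actually applies are not described in this section.

```python
# Hypothetical thresholds, chosen only for illustration.
MIN_SIZE_KB = 10_000   # keep processes at least this large
MIN_CPU_PCT = 1.0      # or at least this busy

def interesting(processes, min_size_kb=MIN_SIZE_KB, min_cpu_pct=MIN_CPU_PCT):
    """Keep processes that are large enough or busy enough to be worth storing."""
    return [p for p in processes
            if p["size_kb"] >= min_size_kb or p["cpu_pct"] >= min_cpu_pct]

# Illustrative process table.
procs = [
    {"name": "soffice.bin", "size_kb": 180_000, "cpu_pct": 0.4},
    {"name": "cron",        "size_kb": 900,     "cpu_pct": 0.0},
    {"name": "gzip",        "size_kb": 1_200,   "cpu_pct": 35.0},
]

kept = [p["name"] for p in interesting(procs)]
print(kept)  # the small, idle cron entry is dropped
```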

The next image shows multiple processes being tracked for memory on a system (Open Office Writer and X-Windows in this example). Processes rarely give significant quantities of memory back to a system, so it is often useful to profile an application over time before it goes into production. The image shows the process sizes in MBytes: the underlying metric is KBytes, but ghabitat has been used to scale the curve values by 1/1000.

The memory usage may be augmented with the processor usage for each process by displaying %cpu, which is drawn on the same charts in a different colour. The %cpu measure is the share of the system's cpu taken over the lifetime of the process, so peaks in demand are flattened and spread over time, which can be counter-intuitive. To display both curves, the size values have been reduced in magnitude to 1/10'000, effectively giving size in units of 10 MBytes.
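The flattening effect and the unit arithmetic can both be checked with a short Python sketch. It assumes that the lifetime %cpu is cumulative cpu time divided by process age; the function name and all figures are illustrative, not habitat's actual calculation.

```python
def lifetime_cpu_percent(cpu_seconds, age_seconds):
    """Cumulative cpu time as a percentage of the process's age (an assumed definition)."""
    return 100.0 * cpu_seconds / age_seconds

# A process sampled once a minute for ten minutes: idle for nine, then fully busy for one.
ages = [60 * minute for minute in range(1, 11)]   # 60, 120, ... 600 seconds
cpu_used = [0.0] * 9 + [60.0]                     # cumulative cpu seconds at each sample
curve = [lifetime_cpu_percent(c, a) for c, a in zip(cpu_used, ages)]

# The final minute ran at 100% cpu, yet the lifetime figure only reaches 10%.
print(curve[-1])

# Size scaling: KBytes multiplied by 1/10'000 gives units of 10 MBytes.
size_kbytes = 250_000            # roughly 250 MBytes
size_scaled = size_kbytes / 10_000
print(size_scaled)               # 25.0, that is 25 units of 10 MBytes
```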


Other machines can be displayed in the same tool. To the side is the choice tree with several hosts connected (under my hosts) and also some files from previously saved performance sets (under my files).

Not all data has to be generated by habitat: both habitat and harvest can import arbitrary tabular time series data in the FHA format (explained later). In the example below, data from the Unix data gathering tool sar has been imported into the harvest repository. The repository appears as an option in the choice tree, and the machine sources have been assigned an organisational hierarchy within harvest to help navigation. The node sarsys holds the system information from sar.

Integrating the repository is simple: an administrator provides a URL, together with optional authentication information. All repository data is then grafted into the choice tree.