Around Q3-Q4 of 2014 I started looking at ASCII-based admin tools that I use when administering clusters and general Linux systems. These tools are typically either installed with Linux or are add-ons written by other people. Either way, I’ve come to depend upon them at various times when all I can get is an ssh session to a server or just a crash cart.
One of the articles (link) focused on one of my favorite tools, vmstat. I like to use vmstat because many times I’ve found the reason for a node crashing, locking, or having poor performance is that the user application is swapping. With vmstat I can quickly tell if the app is swapping.
While writing the article I decided it might be interesting to look for ways to plot vmstat data. I wasn’t overly happy with the tools I found. There is an on-line tool (you upload your vmstat data) and there are some instructions on how to use awk and gnuplot to plot the data, but to be honest, I kind of like looking at an html report with the plots. Everything is in one place and it’s easy to save to a pdf. So I decided to write my own vmstat plotting tool.
To start, I made a few assumptions about using vmstat. Since I’m interested in plotting the data that is collected over a period of time, you can’t really plot the output when “-s” is used (event counters and memory information) or the “-f” option which displays the number of forks. If you try plotting that data with this tool it will fail miserably and I will ignore any requests to plot that data (apologies). But it can plot data that is captured for the following modes:
- “VM” mode (default mode for vmstat)
- “disk” mode (“-d”)
- “slabinfo” mode (“-m”)
- “partition” mode (“-p”)
All of these modes are described in the vmstat man page.
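To make the four modes a bit more concrete, here is a minimal Python sketch of how a script could guess the mode from the first header line of a capture file. This is not the code from the plotting tool. The function name is hypothetical, and the header prefixes ("procs", "disk-", "Cache") are an assumption based on the vmstat output I typically see from procps, so check them against your own output.

```python
def detect_vmstat_mode(first_line):
    """Guess the vmstat mode from the first header line of a capture file."""
    tokens = first_line.split()
    if not tokens:
        raise ValueError("empty header line")
    first = tokens[0]
    if first == "procs":
        # Default mode banner: "procs ---memory--- ---swap-- ..."
        return "VM"
    if first.startswith("disk"):
        # "-d" banner: "disk- ----reads---- ----writes---- ---IO---"
        return "disk"
    if first == "Cache":
        # "-m" header: "Cache Num Total Size Pages"
        return "slabinfo"
    # Assumption: anything else is a "-p" header, which starts with
    # the partition name.
    return "partition"
```

Calling it with the first line of a default-mode file should return "VM".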
The other assumptions I made are pretty easy (I think):
- Have to use the “-n” option so that the header is only printed once
- Don’t use the “-S” option to change the units. The code assumes everything defaults to “K” and the tool plots everything in “M”.
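The unit assumption can be sketched in a few lines of Python. This is only an illustration, not the tool's code: the function name and the list of memory columns are my own choices, and it assumes “K” means 1024 bytes and “M” means 1048576 bytes (as in the vmstat man page), so the conversion is a divide by 1024.

```python
# Columns that vmstat reports in K by default ("inact" and "active"
# only appear when "-a" is used).
MEMORY_COLUMNS = ("swpd", "free", "buff", "cache", "inact", "active")

def to_megabytes(sample):
    """Return a copy of one sample with the memory columns converted from K to M."""
    converted = dict(sample)
    for name in MEMORY_COLUMNS:
        if name in converted:
            converted[name] = converted[name] / 1024.0
    return converted
```

Columns that are not memory sizes (such as “r” or “us”) are left untouched.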
After that you can use the other options, including “-t” to capture the time (only in VM mode).
An example of a vmstat command is the following:
[laytonjb@home4 VMSTAT_PLOT]$ vmstat -n 1 100 > vmstat_default.out
This tells vmstat to grab the output every second and do that for 100 seconds (this is the “1 100” part of the command line). Then I just push the output to a file for plotting. Note: Don’t forget the “-n” option so the headers only get printed once.
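Once the output is in a file, reading it back is straightforward. Below is a minimal Python sketch, not the tool's actual parser, that turns default-mode output into a list of dictionaries and checks the “si” and “so” columns for swapping. The function names are hypothetical, and it assumes the first line is the group banner and the second line holds the column names.

```python
def parse_vm_output(lines):
    """Parse default-mode "vmstat -n" output into a list of {column: int} dicts."""
    rows = [line.split() for line in lines if line.strip()]
    # Assumption: row 0 is the banner ("procs ---memory--- ..."),
    # row 1 holds the column names, and every later row is one sample.
    names = rows[1]
    samples = []
    for fields in rows[2:]:
        sample = {}
        for name, value in zip(names, fields):
            # Stop at the first non-numeric field, such as the
            # timestamp that "-t" appends to each line.
            if not value.isdigit():
                break
            sample[name] = int(value)
        samples.append(sample)
    return samples

def swapped(samples):
    """Return True if any sample shows swap-in (si) or swap-out (so) activity."""
    return any(s["si"] > 0 or s["so"] > 0 for s in samples)
```

Keep in mind that the first sample vmstat prints is an average since the last reboot, so you may want to pass `samples[1:]` to `swapped()`.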
I could even add options such as “-a”, which adds the active/inactive memory information to the memory output, and “-t”, which adds the time to the output. An example of using both is,
[laytonjb@home4 VMSTAT_PLOT]$ vmstat -n -a -t 1 100 > vmstat_default_at.out
For “disk” mode I could do something like the following:
[laytonjb@home4 VMSTAT_PLOT]$ vmstat -n -d 1 100 > vmstat_disk.out
This gathers the disk or device metrics for 100 seconds and does it every second. Also don’t forget the “-n” option.
For the “slabinfo” mode, the command line could look something like the following:
[laytonjb@home4 VMSTAT_PLOT]$ vmstat -n -m 1 100 > vmstat_slab.out
Or for “partition” mode you could have something like:
[laytonjb@home4 VMSTAT_PLOT]$ vmstat -n -p /dev/sdc 1 100 > vmstat_partition.out
In fact it might not be a bad idea to run several vmstat streams at the same time if you are really interested in seeing what is happening with the system. But the data gathering and writing might impact the timing a bit so be careful.
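If you want to try several streams at once, one way is to launch them from a small Python script. This is just a sketch under my own assumptions: the function names are hypothetical, the output file names follow the examples above, and “slabinfo” mode may need root on some systems.

```python
import subprocess

def build_commands(interval, count, partition=None):
    """Return {output filename: argv} with one vmstat stream per mode."""
    tail = [str(interval), str(count)]
    commands = {
        "vmstat_default.out": ["vmstat", "-n"] + tail,
        "vmstat_disk.out": ["vmstat", "-n", "-d"] + tail,
        "vmstat_slab.out": ["vmstat", "-n", "-m"] + tail,
    }
    if partition:
        commands["vmstat_partition.out"] = ["vmstat", "-n", "-p", partition] + tail
    return commands

def run_streams(commands):
    """Start every stream at (nearly) the same time, then wait for all of them."""
    running = []
    for filename, argv in commands.items():
        out = open(filename, "w")
        running.append((subprocess.Popen(argv, stdout=out), out))
    for proc, out in running:
        proc.wait()
        out.close()
```

For example, `run_streams(build_commands(1, 100, "/dev/sdc"))` would collect all four modes for 100 samples at one-second intervals.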
One thing to note is that the only vmstat mode that records the time when the data was gathered is the “VM” or default mode. Other modes ignore the “-t” option.
Except for the “VM” mode which can plot the data versus time, the other modes just plot versus the data count. So the x-axis starts at “1”, goes to “2” and so on. You can also do this in “VM” mode if you don’t want to plot versus time. But if you do want to plot versus time, the tool normalizes the time to 0.0 at the start of the data collection. You can change this in the code if you like.
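The time normalization can be sketched as follows. This is a minimal Python illustration rather than the tool's code. It assumes the “-t” timestamp appears on each data line in the form “YYYY-MM-DD HH:MM:SS” (any time zone label is ignored), and the function name is hypothetical.

```python
import re
from datetime import datetime

# Assumed timestamp layout on each "-t" data line.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def elapsed_seconds(lines):
    """Return seconds since the first timestamped sample, one value per sample."""
    stamps = []
    for line in lines:
        match = STAMP.search(line)
        if match:  # header lines carry no timestamp and are skipped
            stamps.append(datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return []
    start = stamps[0]
    return [(stamp - start).total_seconds() for stamp in stamps]
```

The returned list can be used directly as the x-axis, with the first sample at 0.0.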
I wanted to present an example of what the tool does so you have some idea. I ran vmstat while I ran a simple iozone test. The iozone command line I ran was the following:
./iozone -i 2 -w -r 4k -I -O -w -+n -s 4g -t 2 -+n > iozone_random_1.out
This simply runs a random read and write test (“-i 2”) using 4KB records to a 4GB file. It uses two threads (“-t 2”). While this command was running I ran vmstat with the following command line:
vmstat -n -t 1 480 > iozone_random.log
The command just collected data in the default, “VM” mode, for 480 counts and did it every second (so a total of 480 seconds of data). It also added the time output (“-t”).
Then I ran the vmstat plotter code on the vmstat output file.
[laytonjb@home4 VMSTAT_PLOT]$ ./vmstat_plotter_v2.py iozone_random.log
vmstat plotting script
input filename: iozone_random.log
VMSTAT in default mode
The resulting output is posted at this link so you can see what the output looks like.
Getting the Code
The code is downloadable from
The code is broken into several files and without explaining how to import modules in Python, I suggest you just put all of the code in a single directory and then copy the vmstat output file to that directory for processing. The code will put the HTML report in a subdirectory called “HTML_REPORT”. Open the file named “report.html” in that directory in your browser and it should be displayed.
I hope the script proves useful in some way.