vmtouch is a tiny tool that can be very useful if you want to tune performance at a low level or just learn something about the file system cache in Linux.

Database indices are most useful when they are in RAM, but on large databases this often isn't the case, so index files stored on the hard disk are used instead. To get better performance, these files should be held in the file system cache. Of course, this doesn't only apply to database systems. You can touch any type of file to get it into the cache so that it loads fast.


Installing vmtouch isn't complicated. It comes as a single C file without any dependencies. Just make sure gcc is installed on your target system. Download the C file:

wget http://hoytech.com/vmtouch/vmtouch.c

and compile it using gcc:

 gcc -Wall -O3 -o vmtouch vmtouch.c

This will (if all goes well) create an executable file vmtouch, which we will then use to get information about the file system cache and load files into it.

Using vmtouch

First of all, I want to know how much of my machine's /bin directory is in the cache. Simply run `vmtouch /bin` and you will see something like this:

Files: 44
 Directories: 1
 Resident Pages: 715/3948 2M/15M 18.1%
 Elapsed: 0.34154 seconds

The important part of the output is the Resident Pages line: 18.1% of my /bin directory is currently in the file system cache. Note that on large directories vmtouch will surely take longer than a second, and that you can run it on both files and directories.
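If you want to use that percentage in a script, a small filter can pull it out of the summary. This is only a sketch: `resident_pct` is a made-up helper name, and it assumes the "Resident Pages:" line looks exactly like the one shown above, with the percentage as the last field.

```shell
# Hypothetical helper: read vmtouch's summary on stdin and print the
# resident percentage without the trailing "%" sign.
# Assumes the percentage is the last field of the "Resident Pages:" line.
resident_pct() {
    awk '/Resident Pages:/ { sub(/%$/, "", $NF); print $NF }'
}
```

You would call it as `vmtouch /bin | resident_pct`.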

Suppose we have found out that git (which is in /bin) isn't in the cache, but our application relies heavily on it and we want to improve performance, so we want to get that file into the cache. (Note: this is a fictional use case!)

$ vmtouch -vt /bin/git
Files: 1
 Directories: 0
 Touched Pages: 344 (1M)
 Elapsed: 0.1636 seconds

Voilà! git is now in the file system cache, and if you run `vmtouch /bin/git` you will see that 100% of it is resident. I encourage you to experiment with the tool, e.g. use tail/head/cat and see how much of the files you are viewing gets loaded into the cache. Have fun!
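One way to run such an experiment is sketched below. The function name `cache_before_after` and the choice of `head -n 100` are my own assumptions for illustration. It only relies on vmtouch being on your PATH and on the "Resident Pages" line it prints.

```shell
# Hypothetical experiment: show how much of a file is cached before and
# after reading its first 100 lines with head.
# Assumes vmtouch is installed and on PATH.
cache_before_after() {
    file=$1
    echo "before:"
    vmtouch "$file" | grep 'Resident Pages'
    # Read the beginning of the file; the output itself is discarded.
    head -n 100 "$file" > /dev/null
    echo "after:"
    vmtouch "$file" | grep 'Resident Pages'
}
```

Call it with any large file you have lying around, e.g. `cache_before_after some-big-file.log`, and compare the two percentages. If the file was already fully cached, both lines will show the same value.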