I had the fun of trying to find out what was wrong with a long-running process: after about a day, it would slow to a halt. In the past I’ve used, in a drunken haze, a program called strace. It lets you attach to a process and monitor the system calls it makes and the signals it receives.

The script is a network- and database-specific application. I got the process ID from ps aux and attached to the process (including its threads) using the command:

strace -f -s512 -p<process id>
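For a process that only misbehaves after a day, it can help to send the trace to a file rather than the terminal. The following is a minimal sketch, not what I actually ran: the function name attach_trace, the STRACE override variable and the log file name are all made up for illustration. The flags themselves (-f, -s, -o, -p) are standard strace options.

```shell
#!/bin/sh
# Sketch: attach strace to a running process and log the trace to a file.
# STRACE is a hypothetical override so the arguments can be previewed
# (e.g. STRACE=echo) without attaching to anything.
STRACE="${STRACE:-strace}"

attach_trace() {
    pid="$1"
    # Refuse anything that is not a plain number.
    case "$pid" in
        ''|*[!0-9]*) echo "usage: attach_trace <pid>" >&2; return 2 ;;
    esac
    # -f       follow threads and child processes
    # -s512    print up to 512 bytes of each string argument
    # -o FILE  write the trace to FILE instead of the terminal
    # -p PID   attach to an already running process
    "$STRACE" -f -s512 -o "strace.$pid.log" -p "$pid"
}
```

Calling attach_trace with the PID from ps aux would then write to strace.<process id>.log in the current directory until you stop it with Ctrl-C. Attaching usually needs to be done as the same user as the process, or as root.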

Immediately, the issue became clear:


The screen scrolled this thousands and thousands of times, briefly showing chunks of other system calls, including the MySQL calls with the queries inline. While at the moment we are unsure what this means, it certainly goes a long way towards (a) confirming that the issue is with the script and (b) giving a low-level idea of what the problem is.
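When one call floods the screen like this, strace can also count calls instead of printing them, which puts a number on "thousands and thousands". This is a rough sketch under the same assumptions as before: count_calls and the STRACE override are hypothetical names, and I have not shown any output because it depends entirely on the process being traced.

```shell
#!/bin/sh
# Sketch: summarise which system calls a running process makes most often.
# STRACE is a hypothetical override (e.g. STRACE=echo) for previewing arguments.
STRACE="${STRACE:-strace}"

count_calls() {
    pid="$1"
    # -c      count calls, errors and time per system call instead of
    #         printing each one; the table appears when strace detaches
    # -f      include threads and child processes in the counts
    # -p PID  attach to an already running process
    "$STRACE" -c -f -p "$pid"
}
```

Run count_calls with the process ID, let it sit for a while, then press Ctrl-C to detach and see the summary table.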

What is really fun is seeing what things do in the background.

strace <command>

This will start an application and give you the full list of calls it makes. Adding the -r flag prefixes each call with a relative timestamp too (the time since the previous call began), so you can start to see which system calls are taking the longest.
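As a small sketch of the timing options, here are two wrappers around the same command. The function names and the STRACE override are made up for illustration; -r and -T are both real strace flags. The difference is that -r shows the gap between the start of one call and the start of the previous one, while -T shows the time spent inside each call.

```shell
#!/bin/sh
# Sketch: two ways of getting timing information out of strace.
# STRACE is a hypothetical override (e.g. STRACE=echo) for previewing arguments.
STRACE="${STRACE:-strace}"

# Relative timestamps: time elapsed since the previous system call started.
trace_relative() {
    "$STRACE" -r "$@"
}

# Durations: time spent inside each system call, printed in angle
# brackets at the end of each line.
trace_durations() {
    "$STRACE" -T "$@"
}
```

For example, trace_relative ls /tmp runs ls /tmp under strace with relative timestamps, and trace_durations ls /tmp does the same with per-call durations.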

Hopefully there will be more useful information as I get to use it more often!