| Demonstrations of cpudist. |
| |
| This program summarizes task on-CPU time as a histogram, showing how long tasks |
| spent on the CPU before being descheduled. This provides valuable information |
| that can indicate oversubscription (too many tasks for too few processors), |
| overhead due to excessive context switching (e.g. a common shared lock for |
| multiple threads), uneven workload distribution, too-granular tasks, and more. |
| |
| By default CPU idle time are excluded by simply excluding PID 0. |
| |
| Alternatively, the same options are available for summarizing task off-CPU |
| time, which helps understand how often threads are being descheduled and how |
| long they spend waiting for I/O, locks, timers, and other causes of suspension. |
| |
| # ./cpudist.py |
| Tracing on-CPU time... Hit Ctrl-C to end. |
| ^C |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 | | |
| 4 -> 7 : 1 | | |
| 8 -> 15 : 13 |** | |
| 16 -> 31 : 187 |****************************************| |
| 32 -> 63 : 89 |******************* | |
| 64 -> 127 : 26 |***** | |
| 128 -> 255 : 0 | | |
| 256 -> 511 : 1 | | |
| |
| This is from a mostly idle system. Tasks wake up occasionally and run for only |
| a few dozen microseconds, and then get descheduled. |
| |
| Here's some output from a system that is heavily loaded by threads that perform |
| computation but also compete for a lock: |
| |
| # ./cpudist.py |
| Tracing on-CPU time... Hit Ctrl-C to end. |
| ^C |
| usecs : count distribution |
| 0 -> 1 : 51 |* | |
| 2 -> 3 : 395 |*********** | |
| 4 -> 7 : 259 |******* | |
| 8 -> 15 : 61 |* | |
| 16 -> 31 : 75 |** | |
| 32 -> 63 : 31 | | |
| 64 -> 127 : 7 | | |
| 128 -> 255 : 5 | | |
| 256 -> 511 : 3 | | |
| 512 -> 1023 : 5 | | |
| 1024 -> 2047 : 6 | | |
| 2048 -> 4095 : 4 | | |
| 4096 -> 8191 : 1361 |****************************************| |
| 8192 -> 16383 : 523 |*************** | |
| 16384 -> 32767 : 3 | | |
| |
| A bimodal distribution is now clearly visible. Most of the time, tasks were |
| able to run for 4-16ms before being descheduled (this is likely the quantum |
| length). Occasionally, tasks had to be descheduled a lot earlier -- possibly |
| because they competed for a shared lock. |
| |
| If necessary, you can restrict the output to include only threads from a |
| particular process -- this helps reduce noise: |
| |
| # ./cpudist.py -p $(pidof parprimes) |
| Tracing on-CPU time... Hit Ctrl-C to end. |
| ^C |
| usecs : count distribution |
| 0 -> 1 : 3 | | |
| 2 -> 3 : 17 | | |
| 4 -> 7 : 39 | | |
| 8 -> 15 : 52 |* | |
| 16 -> 31 : 43 | | |
| 32 -> 63 : 12 | | |
| 64 -> 127 : 13 | | |
| 128 -> 255 : 0 | | |
| 256 -> 511 : 1 | | |
| 512 -> 1023 : 11 | | |
| 1024 -> 2047 : 15 | | |
| 2048 -> 4095 : 41 | | |
| 4096 -> 8191 : 1134 |************************ | |
| 8192 -> 16383 : 1883 |****************************************| |
| 16384 -> 32767 : 65 |* | |
| |
| You can also ask for output at predefined intervals, and include timestamps for |
| easier interpretation. While we're at it, the -P switch will print a histogram |
| separately for each process: |
| |
| # ./cpudist.py -TP 5 3 |
| Tracing on-CPU time... Hit Ctrl-C to end. |
| |
| 03:46:51 |
| |
| pid = 0 |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 |** | |
| 4 -> 7 : 17 |********************************** | |
| 8 -> 15 : 11 |********************** | |
| 16 -> 31 : 20 |****************************************| |
| 32 -> 63 : 15 |****************************** | |
| 64 -> 127 : 9 |****************** | |
| 128 -> 255 : 6 |************ | |
| 256 -> 511 : 0 | | |
| 512 -> 1023 : 0 | | |
| 1024 -> 2047 : 0 | | |
| 2048 -> 4095 : 1 |** | |
| |
| pid = 5068 |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 |************* | |
| 4 -> 7 : 3 |****************************************| |
| 8 -> 15 : 0 | | |
| 16 -> 31 : 0 | | |
| 32 -> 63 : 1 |************* | |
| |
| 03:46:56 |
| |
| pid = 0 |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 |** | |
| 4 -> 7 : 19 |****************************************| |
| 8 -> 15 : 11 |*********************** | |
| 16 -> 31 : 9 |****************** | |
| 32 -> 63 : 3 |****** | |
| 64 -> 127 : 1 |** | |
| 128 -> 255 : 3 |****** | |
| 256 -> 511 : 0 | | |
| 512 -> 1023 : 1 |** | |
| |
| pid = 5068 |
| usecs : count distribution |
| 0 -> 1 : 1 |******************** | |
| 2 -> 3 : 0 | | |
| 4 -> 7 : 2 |****************************************| |
| |
| 03:47:01 |
| |
| pid = 0 |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 0 | | |
| 4 -> 7 : 12 |******************************** | |
| 8 -> 15 : 15 |****************************************| |
| 16 -> 31 : 15 |****************************************| |
| 32 -> 63 : 0 | | |
| 64 -> 127 : 3 |******** | |
| 128 -> 255 : 1 |** | |
| 256 -> 511 : 0 | | |
| 512 -> 1023 : 1 |** | |
| |
| pid = 5068 |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 |****** | |
| 4 -> 7 : 6 |****************************************| |
| 8 -> 15 : 0 | | |
| 16 -> 31 : 0 | | |
| 32 -> 63 : 2 |************* | |
| |
| This histogram was obtained while executing `dd if=/dev/zero of=/dev/null` with |
| fairly large block sizes. |
| |
| You could also ask for an off-CPU report using the -O switch. Here's a |
| histogram of task block times while the system is heavily loaded: |
| |
| # ./cpudist -O -p $(parprimes) |
| Tracing off-CPU time... Hit Ctrl-C to end. |
| ^C |
| usecs : count distribution |
| 0 -> 1 : 0 | | |
| 2 -> 3 : 1 | | |
| 4 -> 7 : 0 | | |
| 8 -> 15 : 0 | | |
| 16 -> 31 : 0 | | |
| 32 -> 63 : 3 | | |
| 64 -> 127 : 1 | | |
| 128 -> 255 : 1 | | |
| 256 -> 511 : 0 | | |
| 512 -> 1023 : 2 | | |
| 1024 -> 2047 : 4 | | |
| 2048 -> 4095 : 3 | | |
| 4096 -> 8191 : 70 |*** | |
| 8192 -> 16383 : 867 |****************************************| |
| 16384 -> 32767 : 141 |****** | |
| 32768 -> 65535 : 8 | | |
| 65536 -> 131071 : 0 | | |
| 131072 -> 262143 : 1 | | |
| 262144 -> 524287 : 2 | | |
| 524288 -> 1048575 : 3 | | |
| |
| As you can see, threads are switching out for relatively long intervals, even |
| though we know the workload doesn't have any significant blocking. This can be |
| a result of over-subscription -- too many threads contending over too few CPUs. |
| Indeed, there are four available CPUs and more than four runnable threads: |
| |
| # nproc |
| 4 |
| # cat /proc/loadavg |
| 0.04 0.11 0.06 9/147 7494 |
| |
| (This shows we have 9 threads runnable out of 147 total. This is more than 4, |
| the number of available CPUs.) |
| |
| Finally, let's ask for a per-thread report and values in milliseconds instead |
| of microseconds: |
| |
| # ./cpudist.py -p $(pidof parprimes) -mL |
| Tracing on-CPU time... Hit Ctrl-C to end. |
| |
| |
| tid = 5092 |
| msecs : count distribution |
| 0 -> 1 : 3 | | |
| 2 -> 3 : 4 | | |
| 4 -> 7 : 4 | | |
| 8 -> 15 : 535 |****************************************| |
| 16 -> 31 : 14 |* | |
| |
| tid = 5093 |
| msecs : count distribution |
| 0 -> 1 : 8 | | |
| 2 -> 3 : 6 | | |
| 4 -> 7 : 4 | | |
| 8 -> 15 : 534 |****************************************| |
| 16 -> 31 : 12 | | |
| |
| tid = 5094 |
| msecs : count distribution |
| 0 -> 1 : 38 |*** | |
| 2 -> 3 : 5 | | |
| 4 -> 7 : 5 | | |
| 8 -> 15 : 476 |****************************************| |
| 16 -> 31 : 25 |** | |
| |
| tid = 5095 |
| msecs : count distribution |
| 0 -> 1 : 31 |** | |
| 2 -> 3 : 6 | | |
| 4 -> 7 : 10 | | |
| 8 -> 15 : 478 |****************************************| |
| 16 -> 31 : 20 |* | |
| |
| tid = 5096 |
| msecs : count distribution |
| 0 -> 1 : 21 |* | |
| 2 -> 3 : 5 | | |
| 4 -> 7 : 4 | | |
| 8 -> 15 : 523 |****************************************| |
| 16 -> 31 : 16 |* | |
| |
| tid = 5097 |
| msecs : count distribution |
| 0 -> 1 : 11 | | |
| 2 -> 3 : 7 | | |
| 4 -> 7 : 7 | | |
| 8 -> 15 : 502 |****************************************| |
| 16 -> 31 : 23 |* | |
| |
| tid = 5098 |
| msecs : count distribution |
| 0 -> 1 : 21 |* | |
| 2 -> 3 : 5 | | |
| 4 -> 7 : 3 | | |
| 8 -> 15 : 494 |****************************************| |
| 16 -> 31 : 28 |** | |
| |
| tid = 5099 |
| msecs : count distribution |
| 0 -> 1 : 15 |* | |
| 2 -> 3 : 4 | | |
| 4 -> 7 : 6 | | |
| 8 -> 15 : 521 |****************************************| |
| 16 -> 31 : 12 | | |
| |
| It looks like all threads are more-or-less equally busy, and are typically |
| switched out after running for 8-15 milliseconds (again, this is the typical |
| quantum length). |
| |
| |
| USAGE message: |
| |
| # ./cpudist.py -h |
| |
| usage: cpudist.py [-h] [-O] [-T] [-m] [-P] [-L] [-p PID] [-I] [-e] |
| [interval] [count] |
| |
| Summarize on- and off-CPU time per task as a histogram. |
| |
| positional arguments: |
| interval output interval, in seconds |
| count number of outputs |
| |
| optional arguments: |
| -h, --help show this help message and exit |
| -O, --offcpu measure off-CPU time |
| -T, --timestamp include timestamp on output |
| -m, --milliseconds millisecond histogram |
| -P, --pids print a histogram per process ID |
| -L, --tids print a histogram per thread ID |
| -p PID, --pid PID trace this PID only |
| -I, --include-idle include CPU idle time |
| -e, --extension show extension summary (average/total/count) |
| |
| examples: |
| cpudist # summarize on-CPU time as a histogram |
| cpudist -O # summarize off-CPU time as a histogram |
| cpudist 1 10 # print 1 second summaries, 10 times |
| cpudist -mT 1 # 1s summaries, milliseconds, and timestamps |
| cpudist -P # show each PID separately |
| cpudist -p 185 # trace PID 185 only |
| cpudist -I # include CPU idle time |
| cpudist -e # show extension summary (average/total/count) |
| |