| Demonstrations of dirtop, the Linux eBPF/bcc version. |
| |
| |
| dirtop shows reads and writes by directory. For example: |
| |
| # ./dirtop.py -d '/hdfs/uuid/*/yarn' |
| Tracing... Output every 1 secs. Hit Ctrl-C to end |
| |
| 14:28:12 loadavg: 25.00 22.85 21.22 31/2921 66450 |
| |
| READS WRITES R_Kb W_Kb PATH |
| 1030 2852 8 147341 /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn |
| 3308 2459 10980 24893 /hdfs/uuid/bf829d08-1455-45b8-81fa-05c3303e8c45/yarn |
| 2227 7165 6484 11157 /hdfs/uuid/76dc0b77-e2fd-4476-818f-2b5c3c452396/yarn |
| 1985 9576 6431 6616 /hdfs/uuid/99c178d5-a209-4af2-8467-7382c7f03c1b/yarn |
| 1986 398 6474 6486 /hdfs/uuid/7d512fe7-b20d-464c-a75a-dbf8b687ee1c/yarn |
| 764 3685 5 7069 /hdfs/uuid/250b21c8-1714-45fe-8c08-d45d0271c6bd/yarn |
| 432 1603 259 6402 /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn |
| 993 5856 320 129 /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn |
| 612 5645 4 249 /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn |
| 818 21 6 166 /hdfs/uuid/fada8004-53ff-48df-9396-165d8e42925b/yarn |
| 174 23 1 171 /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn |
| 376 6281 2 97 /hdfs/uuid/0cc3683f-4800-4c73-8075-8d77dc7cf116/yarn |
| 370 4588 2 96 /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn |
| 190 6420 1 86 /hdfs/uuid/2c6a7223-cb18-4916-a1b6-8cd02bda1d31/yarn |
| 178 123 1 17 /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn |
| [...] |
| |
| This shows various directories read and written while Hadoop runs. |
| By default the output is sorted on all columns combined (the -s default is |
| "all"); a different sort column can be chosen with the -s option. |
| dirtop instruments the VFS interface, so the counts include reads and writes |
| that may be served entirely from the file system cache (page cache). |
| |
| While not printed, the average read and write sizes can be calculated by |
| dividing R_Kb by READS and W_Kb by WRITES. |
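
The division described above can be sketched in a few lines of Python. This is only an illustration: the helper name average_kb is hypothetical and not part of dirtop, and the numbers are copied from the second row of the sample output.

```python
def average_kb(total_kb, count):
    """Return the average size in Kbytes per operation, or 0.0 if there were none."""
    return total_kb / count if count else 0.0

# Values copied from the second row of the sample output above.
reads, writes, r_kb, w_kb = 3308, 2459, 10980, 24893

avg_read_kb = average_kb(r_kb, reads)
avg_write_kb = average_kb(w_kb, writes)
print(f"avg read: {avg_read_kb:.2f} Kb, avg write: {avg_write_kb:.2f} Kb")
```

For that row, this works out to roughly 3.32 Kbytes per read and 10.12 Kbytes per write. The zero check matters because a directory can show writes but no reads in a given interval.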
| |
| This script works by tracing the vfs_read() and vfs_write() functions using |
| kernel dynamic tracing, which instruments explicit read and write calls. If |
| files are read or written by other means (e.g., via mmap()), that I/O will |
| not be visible to this tool. |
| |
| This should be useful for file system workload characterization when analyzing |
| the performance of applications. |
| |
| Note that VFS-level reads and writes can be very frequent, so this tool can |
| add measurable overhead at high I/O rates. |
| |
| |
| The -C option stops the screen from being cleared, and -r with a number |
| restricts the output to that many rows (20 by default). For example, not |
| clearing the screen and showing the top 5 only: |
| |
| # ./dirtop.py -d '/hdfs/uuid/*/yarn' -Cr 5 |
| Tracing... Output every 1 secs. Hit Ctrl-C to end |
| |
| 14:29:08 loadavg: 25.66 23.42 21.51 17/2850 67167 |
| |
| READS WRITES R_Kb W_Kb PATH |
| 100 8429 0 48243 /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn |
| 2066 4091 8176 26457 /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn |
| 10 2043 0 8172 /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn |
| 38 1368 0 2652 /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn |
| 86 19 0 123 /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn |
| |
| 14:29:09 loadavg: 25.66 23.42 21.51 15/2849 67170 |
| |
| READS WRITES R_Kb W_Kb PATH |
| 1204 5619 4388 33767 /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn |
| 2208 3511 8744 22992 /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn |
| 62 4010 0 21181 /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn |
| 22 2187 0 8748 /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn |
| 74 1097 0 4388 /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn |
| |
| [...] |
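
The effect of choosing a sort column and limiting the row count can be sketched as follows. This is a minimal illustration, not dirtop's own code: the Row type, the /data/... paths, and the top_rows helper are hypothetical, and treating "all" as the sum of the four counters is an assumption.

```python
from collections import namedtuple

# Hypothetical row type; field names follow the -s choices in the usage message.
Row = namedtuple("Row", "path reads writes rbytes wbytes")

SORT_KEYS = {
    "reads": lambda r: r.reads,
    "writes": lambda r: r.writes,
    "rbytes": lambda r: r.rbytes,
    "wbytes": lambda r: r.wbytes,
    # Assumption: "all" ranks by the sum of the four counters.
    "all": lambda r: r.reads + r.writes + r.rbytes + r.wbytes,
}

def top_rows(rows, sort="all", maxrows=20):
    """Return at most maxrows rows, largest sort key first."""
    return sorted(rows, key=SORT_KEYS[sort], reverse=True)[:maxrows]

# Made-up paths and counters, for illustration only.
rows = [
    Row("/data/a", reads=10, writes=2043, rbytes=0, wbytes=8172),
    Row("/data/b", reads=2066, writes=4091, rbytes=8176, wbytes=26457),
    Row("/data/c", reads=86, writes=19, rbytes=0, wbytes=123),
]
for row in top_rows(rows, sort="wbytes", maxrows=2):
    print(row.path, row.wbytes)
```

Sorting the full list and then slicing keeps the logic simple; with only a few hundred directories per interval, the cost of a full sort is unlikely to matter.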
| |
| |
| |
| USAGE message: |
| |
| # ./dirtop.py -h |
| usage: dirtop.py [-h] [-C] [-r MAXROWS] [-s {all,reads,writes,rbytes,wbytes}] |
| [-p PID] -d ROOTDIRS |
| [interval] [count] |
| |
| File reads and writes by process |
| |
| positional arguments: |
| interval output interval, in seconds |
| count number of outputs |
| |
| optional arguments: |
| -h, --help show this help message and exit |
| -C, --noclear don't clear the screen |
| -r MAXROWS, --maxrows MAXROWS |
| maximum rows to print, default 20 |
| -s {all,reads,writes,rbytes,wbytes}, --sort {all,reads,writes,rbytes,wbytes} |
| sort column, default all |
| -p PID, --pid PID trace this PID only |
| -d ROOTDIRS, --root-directories ROOTDIRS |
| select the directories to observe, separated by commas |
| |
| examples: |
| ./dirtop -d '/hdfs/uuid/*/yarn' # directory I/O top, 1 second refresh |
| ./dirtop -d '/hdfs/uuid/*/yarn' -C # don't clear the screen |
| ./dirtop -d '/hdfs/uuid/*/yarn' 5 # 5 second summaries |
| ./dirtop -d '/hdfs/uuid/*/yarn' 5 10 # 5 second summaries, 10 times only |
| ./dirtop -d '/hdfs/uuid/*/yarn,/hdfs/uuid/*/data' # Running dirtop on two set of directories |
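
The -d patterns are single-quoted in the examples so that the shell passes them through unexpanded. How a comma-separated list of such patterns could then be turned into a list of directories can be sketched with Python's glob module. This is an assumption about the approach, not a copy of dirtop's code; expand_root_dirs and the temporary directory tree are hypothetical.

```python
import glob
import os
import tempfile

def expand_root_dirs(spec):
    """Split a comma-separated list of glob patterns and return matching directories."""
    matches = []
    for pattern in spec.split(","):
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                matches.append(path)
    return matches

# Build a throwaway tree so the example runs anywhere.
base = tempfile.mkdtemp()
for uuid in ("aaa", "bbb"):
    for leaf in ("yarn", "data"):
        os.makedirs(os.path.join(base, uuid, leaf))

spec = f"{base}/*/yarn,{base}/*/data"
dirs = expand_root_dirs(spec)
for d in dirs:
    print(d)
```

With the tree above, the two patterns match four directories: the yarn and data subdirectories under each of aaa and bbb.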