| Demonstrations of drsnoop, the Linux eBPF/bcc version. |
| |
| |
| drsnoop traces direct reclaim events system-wide and prints various details. |
| Example output: |
| |
| # ./drsnoop |
| COMM PID LAT(ms) PAGES |
| summond 17678 0.19 143 |
| summond 17669 0.55 313 |
| summond 17669 0.15 145 |
| summond 17669 0.27 237 |
| summond 17669 0.48 111 |
| summond 17669 0.16 75 |
| head 17821 0.29 339 |
| head 17825 0.17 109 |
| summond 17669 0.14 73 |
| summond 17496 104.84 40 |
| summond 17678 0.32 167 |
| summond 17678 0.14 106 |
| summond 17678 0.16 67 |
| summond 17678 0.29 267 |
| summond 17678 0.27 69 |
| summond 17678 0.32 46 |
| base64 17816 0.16 85 |
| summond 17678 0.43 283 |
| summond 17678 0.14 182 |
| head 17736 0.57 135 |
| ^C |
| |
| While tracing, processes allocated pages while the system was short of free |
| memory, so direct reclaim events occurred. Each event adds to the time the |
| allocating process spends waiting. |
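
As a rough illustration of how captured output like the above could be scanned for outliers (such as the 104.84 ms event), here is a minimal Python sketch. The function name `slow_events` and the 1 ms threshold are hypothetical choices for this example and are not part of drsnoop. The sketch assumes the default four-column layout and command names without spaces.

```python
def slow_events(output, threshold_ms):
    """Return (comm, pid, lat_ms, pages) tuples whose latency exceeds threshold_ms."""
    events = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, the trailing ^C, and anything that is not a 4-column row.
        if len(fields) != 4 or fields[0] == "COMM":
            continue
        comm, pid, lat, pages = fields
        try:
            row = (comm, int(pid), float(lat), int(pages))
        except ValueError:
            continue  # not a data row
        if row[2] > threshold_ms:
            events.append(row)
    return events


# A few rows copied from the example output above.
SAMPLE = """\
COMM           PID     LAT(ms) PAGES
summond        17678      0.19   143
head           17821      0.29   339
summond        17496    104.84    40
base64         17816      0.16    85
^C
"""

for comm, pid, lat, pages in slow_events(SAMPLE, 1.0):
    print(f"{comm} (PID {pid}): {lat:.2f} ms, {pages} pages")
```

With the sample rows, only the summond event from PID 17496 is reported.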
| |
| drsnoop can be useful when allocstall (/proc/vmstat) keeps increasing, to |
| discover whether the increase is caused by critical processes or not. |
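
To check whether allocstall is in fact increasing before reaching for drsnoop, the counter can be read from /proc/vmstat. The sketch below is an assumption-laden example, not part of drsnoop: the helper name `allocstall_total` is hypothetical, and it sums every field whose name starts with `allocstall`, because some kernels expose a single `allocstall` counter while others split it per zone (for example `allocstall_normal`). The sample text stands in for the real file and its values are made up.

```python
def allocstall_total(vmstat_text):
    """Sum every allocstall* counter found in /proc/vmstat-style text."""
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("allocstall"):
            total += int(parts[1])
    return total


# Stand-in for the contents of /proc/vmstat (illustrative values only).
SAMPLE_VMSTAT = """\
nr_free_pages 21565
allocstall_dma 0
allocstall_normal 12
allocstall_movable 30
pgscan_direct 4096
"""

print(allocstall_total(SAMPLE_VMSTAT))
```

On a live system the same function could be given the text of /proc/vmstat twice, a few seconds apart, and the two totals compared to see whether direct reclaim is ongoing.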
| |
| The -p option can be used to filter on a PID, which is filtered in-kernel. Here |
| I've used it with -T to print timestamps: |
| |
| # ./drsnoop -Tp 17491 |
| TIME(s) COMM PID LAT(ms) PAGES |
| 107.364115000 summond 17491 0.24 50 |
| 107.364550000 summond 17491 0.26 38 |
| 107.365266000 summond 17491 0.36 72 |
| 107.365753000 summond 17491 0.22 49 |
| ^C |
| |
| This shows the summond process allocating pages and triggering direct reclaim |
| events. The resulting delays are small (well under 1 ms each). |
| |
| The -U option includes the UID in the output: |
| |
| # ./drsnoop -U |
| UID COMM PID LAT(ms) PAGES |
| 1000 summond 17678 0.32 46 |
| 0 base64 17816 0.16 85 |
| 1000 summond 17678 0.43 283 |
| 1000 summond 17678 0.14 182 |
| 0 head 17821 0.29 339 |
| 0 head 17825 0.17 109 |
| ^C |
| |
| The -u option filters on a UID. Here it is combined with -U to show the UID column: |
| |
| # ./drsnoop -Uu 1000 |
| UID COMM PID LAT(ms) PAGES |
| 1000 summond 17678 0.19 143 |
| 1000 summond 17669 0.55 313 |
| 1000 summond 17669 0.15 145 |
| 1000 summond 17669 0.27 237 |
| 1000 summond 17669 0.48 111 |
| 1000 summond 17669 0.16 75 |
| 1000 summond 17669 0.14 73 |
| 1000 summond 17678 0.32 167 |
| ^C |
| |
| A maximum tracing duration can be set with the -d option. For example, to trace |
| for 2 seconds: |
| |
| # ./drsnoop -d 2 |
| COMM PID LAT(ms) PAGES |
| head 21715 0.15 195 |
| |
| The -n option can be used to filter on process name using partial matches: |
| |
| # ./drsnoop -n mond |
| COMM PID LAT(ms) PAGES |
| summond 10271 0.03 51 |
| summond 10271 0.03 51 |
| summond 10259 0.05 51 |
| summond 10269 319.41 37 |
| summond 10270 111.73 35 |
| summond 10270 0.11 78 |
| summond 10270 0.12 71 |
| summond 10270 0.03 35 |
| summond 10277 111.62 41 |
| summond 10277 0.08 45 |
| summond 10277 0.06 32 |
| ^C |
| |
| This caught the 'summond' command because its name contains 'mond', the string |
| passed to the '-n' option. |
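
The partial matching described above behaves like a substring test. The following Python sketch only illustrates that behavior. It is not taken from the drsnoop source, the function name `name_matches` is hypothetical, and case-sensitive matching is an assumption.

```python
def name_matches(comm, pattern=None):
    """Return True if no pattern is given or if pattern occurs anywhere in comm."""
    if pattern is None:
        return True
    return pattern in comm


comms = ["summond", "head", "base64"]
print([c for c in comms if name_matches(c, "mond")])
```

Here only "summond" passes the filter, matching what the -n mond example shows.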
| |
| |
| The -v option can be used to show the system memory state (currently only free |
| memory) at the start of each direct reclaim: |
| |
| # ./drsnoop.py -v |
| COMM PID LAT(ms) PAGES FREE(KB) |
| base64 34924 0.23 151 86260 |
| base64 34962 0.26 149 86260 |
| head 34931 0.24 150 86260 |
| base64 34902 0.19 148 86260 |
| head 34963 0.19 151 86228 |
| base64 34959 0.17 151 86228 |
| head 34965 0.29 190 86228 |
| base64 34957 0.24 152 86228 |
| summond 34870 0.15 151 86080 |
| summond 34870 0.12 115 86184 |
| |
| USAGE message: |
| |
| # ./drsnoop -h |
| usage: drsnoop.py [-h] [-T] [-U] [-p PID] [-t TID] [-u UID] [-d DURATION] |
| [-n NAME] |
| |
| Trace direct reclaim |
| |
| optional arguments: |
| -h, --help show this help message and exit |
| -T, --timestamp include timestamp on output |
| -U, --print-uid print UID column |
| -p PID, --pid PID trace this PID only |
| -t TID, --tid TID trace this TID only |
| -u UID, --uid UID trace this UID only |
| -d DURATION, --duration DURATION |
| total duration of trace in seconds |
| -n NAME, --name NAME only print process names containing this name |
| |
| examples: |
| ./drsnoop # trace all direct reclaim |
| ./drsnoop -T # include timestamps |
| ./drsnoop -U # include UID |
| ./drsnoop -p 181 # only trace PID 181 |
| ./drsnoop -t 123 # only trace TID 123 |
| ./drsnoop -u 1000 # only trace UID 1000 |
| ./drsnoop -d 10 # trace for 10 seconds only |
| ./drsnoop -n main # only print process names containing "main" |