Demonstrations of biolatency, the Linux eBPF/bcc version.
biolatency traces block device I/O (disk I/O), and records the distribution
of I/O latency (time), printing this as a histogram when Ctrl-C is hit.
For example:
# ./biolatency
Tracing block device I/O... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 1 | |
128 -> 255 : 12 |******** |
256 -> 511 : 15 |********** |
512 -> 1023 : 43 |******************************* |
1024 -> 2047 : 52 |**************************************|
2048 -> 4095 : 47 |********************************** |
4096 -> 8191 : 52 |**************************************|
8192 -> 16383 : 36 |************************** |
16384 -> 32767 : 15 |********** |
32768 -> 65535 : 2 |* |
65536 -> 131071 : 2 |* |
The latency of the disk I/O is measured from the issue to the device to its
completion. A -Q option can be used to include time queued in the kernel.
This example output shows a large mode of latency from about 128 microseconds
to about 32767 microseconds (33 milliseconds). The bulk of the I/O was
between 1 and 8 ms, which is the expected block device latency for
rotational storage devices.
The highest latency seen while tracing was between 65 and 131 milliseconds:
the last row printed, for which there were 2 I/O.
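As a rough check of the reading above, the counts in this histogram can be
tallied with a few lines of Python. This is only a sketch: the rows are copied
by hand from the example output, and the helper name share_between is made up
for illustration and is not part of biolatency.

```python
# Non-empty rows copied from the example output above: (low_us, high_us, count).
rows = [
    (64, 127, 1), (128, 255, 12), (256, 511, 15), (512, 1023, 43),
    (1024, 2047, 52), (2048, 4095, 47), (4096, 8191, 52),
    (8192, 16383, 36), (16384, 32767, 15), (32768, 65535, 2),
    (65536, 131071, 2),
]


def share_between(rows, low_us, high_us):
    """Fraction of I/O whose row lies entirely within [low_us, high_us]."""
    total = sum(count for _, _, count in rows)
    inside = sum(count for lo, hi, count in rows
                 if lo >= low_us and hi <= high_us)
    return inside / total


total_io = sum(count for _, _, count in rows)
bulk = share_between(rows, 1024, 8191)   # roughly 1 ms to 8 ms
slowest = max(rows, key=lambda row: row[0])

print(f"{total_io} I/O traced")
print(f"{bulk:.0%} of I/O fell between 1024 and 8191 usecs")
print(f"slowest row: {slowest[0]} -> {slowest[1]} usecs, {slowest[2]} I/O")
```

Because each row only records a power-of-two range, the result is accurate to
the row boundaries and not to individual I/O latencies.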
For efficiency, biolatency uses an in-kernel eBPF map to store timestamps
with requests, and another in-kernel map to store the histogram (the "count"
column), which is copied to user-space only when output is printed. These
methods lower the performance overhead when tracing is performed.
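The two-map idea can be modelled in plain Python to make the data flow
concrete. This is a user-space sketch under stated assumptions, not the tool's
eBPF code: the names LatencyTracer, bucket_index and bucket_range are
hypothetical, request ids are arbitrary strings, and the bucketing is chosen
only so that the rows match the ranges printed in the histograms above.

```python
def bucket_index(value):
    """Map a non-negative integer to its power-of-two row (0 and 1 share row 0)."""
    return max(value.bit_length(), 1) - 1


def bucket_range(index):
    """Return the (low, high) bounds printed for a row index."""
    low = 0 if index == 0 else 1 << index
    high = (1 << (index + 1)) - 1
    return low, high


class LatencyTracer:
    """Toy model: one map for start timestamps, one map for the histogram."""

    def __init__(self):
        self.start = {}  # request id -> issue timestamp in nanoseconds
        self.dist = {}   # row index -> count

    def issue(self, req_id, ts_ns):
        self.start[req_id] = ts_ns

    def complete(self, req_id, ts_ns):
        issued = self.start.pop(req_id, None)
        if issued is None:
            return  # issue was not seen, so no latency can be computed
        delta_us = (ts_ns - issued) // 1000
        index = bucket_index(delta_us)
        self.dist[index] = self.dist.get(index, 0) + 1


tracer = LatencyTracer()
tracer.issue("req-a", 1_000_000)
tracer.issue("req-b", 1_200_000)
tracer.complete("req-a", 1_700_000)   # 700 usecs after issue
tracer.complete("req-b", 4_200_000)   # 3000 usecs after issue
tracer.complete("req-c", 5_000_000)   # issued before tracing began: ignored

# Only the small histogram map is read when output is printed.
for index in sorted(tracer.dist):
    low, high = bucket_range(index)
    print(f"{low:>8} -> {high:<8} : {tracer.dist[index]}")
```

In this model only the per-row counts are read at print time, which mirrors why
copying just the histogram to user-space keeps the overhead low.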
In the following example, the -m option is used to print a histogram using
milliseconds as the units (which eliminates the first several rows), -T to
print timestamps with the output, and the arguments "1 5" to print 1 second
summaries 5 times:
# ./biolatency -mT 1 5
Tracing block device I/O... Hit Ctrl-C to end.
06:20:16
msecs : count distribution
0 -> 1 : 36 |**************************************|
2 -> 3 : 1 |* |
4 -> 7 : 3 |*** |
8 -> 15 : 17 |***************** |
16 -> 31 : 33 |********************************** |
32 -> 63 : 7 |******* |
64 -> 127 : 6 |****** |
06:20:17
msecs : count distribution
0 -> 1 : 96 |************************************ |
2 -> 3 : 25 |********* |
4 -> 7 : 29 |*********** |
8 -> 15 : 62 |*********************** |
16 -> 31 : 100 |**************************************|
32 -> 63 : 62 |*********************** |
64 -> 127 : 18 |****** |
06:20:18
msecs : count distribution
0 -> 1 : 68 |************************* |
2 -> 3 : 76 |**************************** |
4 -> 7 : 20 |******* |
8 -> 15 : 48 |***************** |
16 -> 31 : 103 |**************************************|
32 -> 63 : 49 |****************** |
64 -> 127 : 17 |****** |
06:20:19
msecs : count distribution
0 -> 1 : 522 |*************************************+|
2 -> 3 : 225 |**************** |
4 -> 7 : 38 |** |
8 -> 15 : 8 | |
16 -> 31 : 1 | |
06:20:20
msecs : count distribution
0 -> 1 : 436 |**************************************|
2 -> 3 : 106 |********* |
4 -> 7 : 34 |** |
8 -> 15 : 19 |* |
16 -> 31 : 1 | |
This shows how the I/O latency distribution changes over time.
The -Q option begins measuring I/O latency from when the request was first
queued in the kernel, and includes queuing latency:
# ./biolatency -Q
Tracing block device I/O... Hit Ctrl-C to end.
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 3 |* |
256 -> 511 : 37 |************** |
512 -> 1023 : 30 |*********** |
1024 -> 2047 : 18 |******* |
2048 -> 4095 : 22 |******** |
4096 -> 8191 : 14 |***** |
8192 -> 16383 : 48 |******************* |
16384 -> 32767 : 96 |**************************************|
32768 -> 65535 : 31 |************ |
65536 -> 131071 : 26 |********** |
131072 -> 262143 : 12 |**** |
This better reflects the latency suffered by the application (if it is
synchronous I/O), whereas the default mode without kernel queueing better
reflects the performance of the device.
Note that the storage device (and storage device controller) usually have
queues of their own, which are always included in the latency, with or
without -Q.
The -D option prints a histogram per disk. For example:
# ./biolatency -D
Tracing block device I/O... Hit Ctrl-C to end.
^C
Bucket disk = 'xvdb'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 1 | |
256 -> 511 : 33 |********************** |
512 -> 1023 : 36 |************************ |
1024 -> 2047 : 58 |****************************************|
2048 -> 4095 : 51 |*********************************** |
4096 -> 8191 : 21 |************** |
8192 -> 16383 : 2 |* |
Bucket disk = 'xvdc'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 1 | |
256 -> 511 : 38 |*********************** |
512 -> 1023 : 42 |************************* |
1024 -> 2047 : 66 |****************************************|
2048 -> 4095 : 40 |************************ |
4096 -> 8191 : 14 |******** |
Bucket disk = 'xvda1'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 18 |********** |
512 -> 1023 : 67 |************************************* |
1024 -> 2047 : 35 |******************* |
2048 -> 4095 : 71 |****************************************|
4096 -> 8191 : 65 |************************************ |
8192 -> 16383 : 65 |************************************ |
16384 -> 32767 : 20 |*********** |
32768 -> 65535 : 7 |*** |
This output shows that xvda1 has much higher latency, usually between 0.5 ms
and 32 ms, whereas xvdc is usually between 0.2 ms and 4 ms.
The -F option prints a separate histogram for each unique set of request
flags. For example:
# ./biolatency.py -Fm
Tracing block device I/O... Hit Ctrl-C to end.
^C
flags = Read
msecs : count distribution
0 -> 1 : 180 |************* |
2 -> 3 : 519 |****************************************|
4 -> 7 : 60 |**** |
8 -> 15 : 123 |********* |
16 -> 31 : 68 |***** |
32 -> 63 : 0 | |
64 -> 127 : 2 | |
128 -> 255 : 12 | |
256 -> 511 : 0 | |
512 -> 1023 : 1 | |
flags = Sync-Write
msecs : count distribution
0 -> 1 : 5 |****************************************|
flags = Flush
msecs : count distribution
0 -> 1 : 2 |****************************************|
flags = Metadata-Read
msecs : count distribution
0 -> 1 : 3 |****************************************|
2 -> 3 : 2 |************************** |
4 -> 7 : 0 | |
8 -> 15 : 1 |************* |
16 -> 31 : 1 |************* |
flags = Write
msecs : count distribution
0 -> 1 : 103 |******************************* |
2 -> 3 : 106 |******************************** |
4 -> 7 : 130 |****************************************|
8 -> 15 : 79 |************************ |
16 -> 31 : 5 |* |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 1 | |
flags = NoMerge-Read
msecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 5 |****************************************|
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 1 |******** |
flags = NoMerge-Write
msecs : count distribution
0 -> 1 : 30 |** |
2 -> 3 : 293 |******************** |
4 -> 7 : 564 |****************************************|
8 -> 15 : 463 |******************************** |
16 -> 31 : 21 |* |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 5 | |
flags = Priority-Metadata-Read
msecs : count distribution
0 -> 1 : 1 |****************************************|
2 -> 3 : 0 | |
4 -> 7 : 1 |****************************************|
8 -> 15 : 1 |****************************************|
flags = ForcedUnitAccess-Metadata-Sync-Write
msecs : count distribution
0 -> 1 : 2 |****************************************|
flags = ReadAhead-Read
msecs : count distribution
0 -> 1 : 15 |*************************** |
2 -> 3 : 22 |****************************************|
4 -> 7 : 14 |************************* |
8 -> 15 : 8 |************** |
16 -> 31 : 1 |* |
flags = Priority-Metadata-Write
msecs : count distribution
0 -> 1 : 9 |****************************************|
These can be handled differently by the storage device, and this mode lets us
examine their performance in isolation.
The -e option adds an extension summary (total, average) after the histogram.
For example:
# ./biolatency.py -e
^C
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 4 |*********** |
128 -> 255 : 2 |***** |
256 -> 511 : 4 |*********** |
512 -> 1023 : 14 |****************************************|
1024 -> 2047 : 0 | |
2048 -> 4095 : 1 |** |
avg = 663 usecs, total: 16575 usecs, count: 25
Sometimes knowing only that most I/O fell in the 512 -> 1023 usecs row is not
precise enough for throughput tuning, especially when looking for a small
performance regression. With this extension, we can see that the average
latency is about 663 usecs.
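The summary line can be cross-checked against the histogram by hand. The
sketch below assumes the rows and the total are copied from the -e example
above, and that the average is the total divided by the number of I/O; the
variable names are made up for illustration.

```python
# Non-empty region of the -e example above: (low_us, high_us, count).
rows = [
    (64, 127, 4), (128, 255, 2), (256, 511, 4),
    (512, 1023, 14), (1024, 2047, 0), (2048, 4095, 1),
]
total_usecs = 16575  # "total" from the summary line

count = sum(c for _, _, c in rows)   # number of I/O in the histogram
avg_usecs = total_usecs // count     # integer average latency

# Locate the row that contains the average.
avg_row = next((lo, hi) for lo, hi, _ in rows if lo <= avg_usecs <= hi)

print(f"avg = {avg_usecs} usecs, total: {total_usecs} usecs, count: {count}")
print(f"average falls in the {avg_row[0]} -> {avg_row[1]} usecs row")
```

The average lands in the busiest row here, but it gives a single figure that
can be compared between runs, which a row spanning 512 to 1023 usecs cannot.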
The -j option prints a dictionary of the histogram.
For example:
# ./biolatency.py -j
^C
{'ts': '2020-12-30 14:33:03', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 0}, {'interval-start': 32, 'interval-end': 63, 'count': 2}, {'interval-start': 64, 'interval-end': 127, 'count': 75}, {'interval-start': 128, 'interval-end': 255, 'count': 7}, {'interval-start': 256, 'interval-end': 511, 'count': 0}, {'interval-start': 512, 'interval-end': 1023, 'count': 6}, {'interval-start': 1024, 'interval-end': 2047, 'count': 3}, {'interval-start': 2048, 'interval-end': 4095, 'count': 31}]}
The key `data` is the list of the log2 histogram intervals. The `interval-start` and `interval-end` define the
latency bucket and `count` is the number of I/Os that lie in that latency range.
# ./biolatency.py -jF
^C
{'ts': '2020-12-30 14:37:59', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 1}, {'interval-start': 32, 'interval-end': 63, 'count': 1}, {'interval-start': 64, 'interval-end': 127, 'count': 0}, {'interval-start': 128, 'interval-end': 255, 'count': 0}, {'interval-start': 256, 'interval-end': 511, 'count': 0}, {'interval-start': 512, 'interval-end': 1023, 'count': 0}, {'interval-start': 1024, 'interval-end': 2047, 'count': 2}], 'flags': 'Sync-Write'}
{'ts': '2020-12-30 14:37:59', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 0}, {'interval-start': 32, 'interval-end': 63, 'count': 0}, {'interval-start': 64, 'interval-end': 127, 'count': 0}, {'interval-start': 128, 'interval-end': 255, 'count': 2}, {'interval-start': 256, 'interval-end': 511, 'count': 0}, {'interval-start': 512, 'interval-end': 1023, 'count': 2}, {'interval-start': 1024, 'interval-end': 2047, 'count': 1}], 'flags': 'Unknown'}
{'ts': '2020-12-30 14:37:59', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 0}, {'interval-start': 32, 'interval-end': 63, 'count': 0}, {'interval-start': 64, 'interval-end': 127, 'count': 0}, {'interval-start': 128, 'interval-end': 255, 'count': 0}, {'interval-start': 256, 'interval-end': 511, 'count': 0}, {'interval-start': 512, 'interval-end': 1023, 'count': 0}, {'interval-start': 1024, 'interval-end': 2047, 'count': 1}], 'flags': 'Write'}
{'ts': '2020-12-30 14:37:59', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 0}, {'interval-start': 32, 'interval-end': 63, 'count': 0}, {'interval-start': 64, 'interval-end': 127, 'count': 0}, {'interval-start': 128, 'interval-end': 255, 'count': 0}, {'interval-start': 256, 'interval-end': 511, 'count': 0}, {'interval-start': 512, 'interval-end': 1023, 'count': 4}], 'flags': 'Flush'}
The -j option used with -F prints a histogram dictionary per set of I/O flags.
# ./biolatency.py -jD
^C
{'ts': '2020-12-30 14:40:00', 'val_type': 'usecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 0}, {'interval-start': 2, 'interval-end': 3, 'count': 0}, {'interval-start': 4, 'interval-end': 7, 'count': 0}, {'interval-start': 8, 'interval-end': 15, 'count': 0}, {'interval-start': 16, 'interval-end': 31, 'count': 0}, {'interval-start': 32, 'interval-end': 63, 'count': 1}, {'interval-start': 64, 'interval-end': 127, 'count': 1}, {'interval-start': 128, 'interval-end': 255, 'count': 1}, {'interval-start': 256, 'interval-end': 511, 'count': 1}, {'interval-start': 512, 'interval-end': 1023, 'count': 6}, {'interval-start': 1024, 'interval-end': 2047, 'count': 1}, {'interval-start': 2048, 'interval-end': 4095, 'count': 3}], 'Bucket ptr': b'sda'}
The -j option used with -D prints a histogram dictionary per disk device.
# ./biolatency.py -jm
^C
{'ts': '2020-12-30 14:42:03', 'val_type': 'msecs', 'data': [{'interval-start': 0, 'interval-end': 1, 'count': 11}, {'interval-start': 2, 'interval-end': 3, 'count': 3}]}
The -j option used with -m prints a millisecond histogram dictionary. The `val_type` key is set to msecs.
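The dictionaries in the -j examples above are printed with single quotes (and,
with -D, a bytes value such as b'sda'), so they read as Python literals rather
than strict JSON. A minimal sketch for loading one such line, assuming the
output format matches the samples shown here, is to use ast.literal_eval from
the Python standard library:

```python
import ast

# One output line, copied from the -jm example above.
line = (
    "{'ts': '2020-12-30 14:42:03', 'val_type': 'msecs', 'data': ["
    "{'interval-start': 0, 'interval-end': 1, 'count': 11}, "
    "{'interval-start': 2, 'interval-end': 3, 'count': 3}]}"
)

# literal_eval only accepts Python literals, so it does not execute code.
record = ast.literal_eval(line)

total = sum(bucket["count"] for bucket in record["data"])
busiest = max(record["data"], key=lambda bucket: bucket["count"])

print(f"{record['ts']}: {total} I/O, units = {record['val_type']}")
print(f"busiest interval: {busiest['interval-start']} -> "
      f"{busiest['interval-end']} ({busiest['count']} I/O)")
```

If a given version of the tool emits strict JSON instead, json.loads would be
the natural replacement for ast.literal_eval.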
USAGE message:
# ./biolatency -h
usage: biolatency.py [-h] [-T] [-Q] [-m] [-D] [-F] [-e] [-j] [-d DISK]
[interval] [count]
Summarize block device I/O latency as a histogram
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-Q, --queued include OS queued time in I/O time
-m, --milliseconds millisecond histogram
-D, --disks print a histogram per disk device
-F, --flags print a histogram per set of I/O flags
-e, --extension summarize average/total value
-j, --json json output
-d DISK, --disk DISK Trace this disk only
examples:
./biolatency # summarize block I/O latency as a histogram
./biolatency 1 10 # print 1 second summaries, 10 times
./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps
./biolatency -Q # include OS queued time in I/O time
./biolatency -D # show each disk device separately
./biolatency -F # show I/O flags separately
./biolatency -j # print a dictionary
./biolatency -e # show extension summary(total, average)
./biolatency -d sdc # Trace sdc only