| '\"! tbl | nroff \-man |
| '\" t macro stdmacro |
| |
| .de SAMPLE |
| .br |
| .RS 0 |
| .nf |
| .nh |
| .. |
| .de ESAMPLE |
| .hy |
| .fi |
| .RE |
| .. |
| .TH DEBUGINFOD 8 |
| .SH NAME |
| debuginfod \- debuginfo-related HTTP file-server daemon |
| |
| .SH SYNOPSIS |
| .B debuginfod |
| [\fIOPTION\fP]... [\fIPATH\fP]... |
| |
| .SH DESCRIPTION |
| \fBdebuginfod\fP serves debuginfo-related artifacts over HTTP. It |
| periodically scans a set of directories for ELF/DWARF files and their |
| associated source code, as well as archive files containing the above, to |
| build an index by their buildid. This index is used when remote |
| clients use the HTTP webapi, to fetch these files by the same buildid. |
| |
| If a debuginfod cannot service a given buildid artifact request |
| itself, and it is configured with information about upstream |
| debuginfod servers, it queries them for the same information, just as |
| \fBdebuginfod-find\fP would. If successful, it caches the file content |
| locally, then relays it to the original requester. |
| |
| Indexing the given PATHs proceeds using multiple threads. One thread |
| periodically traverses all the given PATHs logically or physically |
| (see the \fB\-L\fP option). Duplicate PATHs are ignored. You may use |
| a file name for a PATH, but source code indexing may be incomplete; |
| prefer using a directory that contains the binaries. The traversal |
| thread enumerates all matching files (see the \fB\-I\fP and \fB\-X\fP |
| options) into a work queue. A collection of scanner threads (see the |
| \fB\-c\fP option) wait at the work queue to analyze files in parallel. |
| |
| If the \fB\-F\fP option is given, each file is scanned as an ELF/DWARF |
| file. Source files are matched with DWARF files based on the |
| AT_comp_dir (compilation directory) attributes inside it. Caution: |
| source files listed in the DWARF may be a path \fIanywhere\fP in the |
| file system, and debuginfod will readily serve their content on |
| demand. (Imagine a doctored DWARF file that lists \fI/etc/passwd\fP |
| as a source file.) If this is a concern, audit your binaries with |
| tools such as: |
| |
| .SAMPLE |
| % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p' |
| or |
| % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p' |
| or even use debuginfod itself: |
| % debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source' |
| ^C |
| .ESAMPLE |
| |
| If any of the \fB\-R\fP, \fB\-U\fP, or \fB\-Z\fP options is given, each |
| file is scanned as an archive file that may contain ELF/DWARF/source |
| files. Archive files are recognized by extension. If \-R is given, |
| ".rpm" files are scanned; if \-U is given, ".deb" and ".ddeb" files |
| are scanned; if \-Z is given, the listed extensions are scanned. |
| Because of complications such as DWZ-compressed debuginfo, debuginfod |
| may require \fItwo\fP traversal passes to identify all source code. |
| Source files for RPMs are only served from other RPMs, so the caution |
| for \-F does not apply. Note that due to Debian/Ubuntu packaging |
| policies and mechanisms, debuginfod cannot resolve source files for |
| DEB/DDEB at all. |
| |
| If no PATH is listed, or none of the scanning options is given, then |
| \fBdebuginfod\fP will simply serve content that it accumulated into |
| its index in all previous runs, periodically groom the database, and |
| federate to any upstream debuginfod servers. In \fIpassive\fP mode, |
| \fBdebuginfod\fP will only serve content from a read-only index and |
| federated upstream servers, but will not scan or groom. |
| |
| .SH OPTIONS |
| |
| .TP |
| .B "\-F" |
| Activate ELF/DWARF file scanning. The default is off. |
| |
| .TP |
| .B "\-Z EXT" "\-Z EXT=CMD" |
| Activate an additional pattern in archive scanning. Files with name |
| extension EXT (include the dot) will be processed. If CMD is given, |
| it is invoked with the file name added to its argument list, and |
| should produce a common archive on its standard output. Otherwise, |
| the file is read as if CMD were "cat". Since debuginfod internally |
| uses \fBlibarchive\fP to read archive files, it can accept a wide |
| range of archive formats and compression modes. The default is no |
| additional patterns. This option may be repeated. |
| |
| .TP |
| .B "\-R" |
| Activate RPM patterns in archive scanning. The default is off. |
| Equivalent to \fB\%\-Z\~.rpm=cat\fP, since libarchive can natively |
| process RPM archives. If your version of libarchive is much older |
| than 2020, be aware that some distributions have switched to an |
| incompatible zstd compression for their payload. You may experiment |
| with \fB\%\-Z\ .rpm='(rpm2cpio|zstdcat)<'\fP instead of \fB\-R\fP. |
| |
| .TP |
| .B "\-U" |
| Activate DEB/DDEB patterns in archive scanning. The default is off. |
| Equivalent to \fB\%\-Z\ .deb='dpkg-deb\ \-\-fsys\-tarfile'\fP |
| \fB\%\-Z\ .ddeb='dpkg-deb\ \-\-fsys\-tarfile'\fP. |
| |
| .TP |
| .B "\-d FILE" "\-\-database=FILE" |
| Set the path of the sqlite database used to store the index. This |
| file is disposable in the sense that a later rescan will repopulate |
| data. It will contain absolute file path names, so it may not be |
| portable across machines. It may be frequently read/written, so it |
| should be on a fast filesystem. It should not be shared across |
| machines or users, to maximize sqlite locking performance. For quick |
| testing, the magic string ":memory:" can be used to select a one-time |
| memory-only database. The default database file is |
| \%$HOME/.debuginfod.sqlite. |
| |
| .TP |
| .B "\-\-passive" |
| Set the server to passive mode, where it only services webapi |
| requests, including participating in federation. It performs no |
| scanning, no grooming, and so only opens the sqlite database |
| read-only. This way a database can be safely shared between an active |
| scanner/groomer server and multiple passive ones, thereby sharing |
| service load. Archive pattern options must still be given, so |
| debuginfod can recognize file name extensions for unpacking. |
| |
| .TP |
| .B "\-D SQL" "\-\-ddl=SQL" |
| Execute the given sqlite statement after the database is opened and |
| initialized as extra DDL (SQL data definition language). This may be |
| useful to tune performance-related pragmas or indexes. May be |
| repeated. The default is nothing extra. |
| |
| .TP |
| .B "\-p NUM" "\-\-port=NUM" |
| Set the TCP port number (0 < NUM < 65536) on which debuginfod should |
| listen, to service HTTP requests. Both IPv4 and IPv6 sockets are |
| opened, if possible. The webapi is documented below. The default |
| port number is 8002. |
| |
| .TP |
| .B "\-I REGEX" "\-\-include=REGEX" "\-X REGEX" "\-\-exclude=REGEX" |
| Govern the inclusion and exclusion of file names under the search |
| paths. The regular expressions are interpreted as unanchored POSIX |
| extended REs, thus may include alternation. They are evaluated |
| against the full path of each file, based on its \fBrealpath(3)\fP |
| canonicalization. By default, all files are included and none are |
| excluded. A file that matches both include and exclude REGEX is |
| excluded. (The \fIcontents\fP of archive files are not subject to |
| inclusion or exclusion filtering: they are all processed.) Only the |
| last of each type of regular expression given is used. |
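| |
| As a rough illustration of these matching rules (not debuginfod's own |
| code), the hypothetical shell helper below uses \fBgrep \-E\fP, which |
| also matches unanchored POSIX extended REs. The function name and the |
| sample paths are assumptions made for this sketch. |

```shell
# Hypothetical helper: would_index PATH INCLUDE_RE [EXCLUDE_RE]
# Succeeds if PATH matches INCLUDE_RE and does not match EXCLUDE_RE.
# An empty EXCLUDE_RE is treated here as "exclude nothing".
would_index() {
    # The include regex must match somewhere in the full path (unanchored).
    printf '%s\n' "$1" | grep -Eq -- "$2" || return 1
    # A path that matches both regexes is excluded.
    if [ -n "$3" ] && printf '%s\n' "$1" | grep -Eq -- "$3"; then
        return 1
    fi
    return 0
}

# Example: prints "skip", because the exclude regex also matches the path.
if would_index /srv/debug/tmp/foo.rpm '\.rpm$' '/tmp/'; then
    echo index
else
    echo skip
fi
```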
| |
| .TP |
| .B "\-t SECONDS" "\-\-rescan\-time=SECONDS" |
| Set the rescan time for the file and archive directories. This is the |
| amount of time the traversal thread will wait after finishing a scan, |
| before doing it again. A rescan for unchanged files is fast (because |
| the index also stores the file mtimes). A time of zero is acceptable, |
| and means that only one initial scan should be performed. The default |
| rescan time is 300 seconds. Receiving a SIGUSR1 signal triggers a new |
| scan, independent of the rescan time (including if it was zero), |
| interrupting a groom pass (if any). |
| |
| .TP |
| .B "\-r" |
| Apply the \-I and \-X regexes during groom cycles, so that files excluded |
| by the regexes are removed from the index. These criteria are applied in |
| addition to what normally qualifies a file for grooming, not as a |
| replacement. |
| |
| .TP |
| .B "\-g SECONDS" "\-\-groom\-time=SECONDS" |
| Set the groom time for the index database. This is the amount of time |
| the grooming thread will wait after finishing a grooming pass before |
| doing it again. A groom operation quickly rescans all previously |
| scanned files, only to see if they are still present and current, so |
| it can deindex obsolete files. See also the \fIDATA MANAGEMENT\fP |
| section. The default groom time is 86400 seconds (1 day). A time of |
| zero is acceptable, and means that only one initial groom should be |
| performed. Receiving a SIGUSR2 signal triggers a new grooming pass, |
| independent of the groom time (including if it was zero), interrupting |
| a rescan pass (if any). |
| |
| .TP |
| .B "\-G" |
| Run an extraordinary maximal-grooming pass at debuginfod startup. |
| This pass can take considerable time, because it tries to remove any |
| debuginfo-unrelated content from the archive-related parts of the index. |
| It should not be run if any recent archive-related indexing operations |
| were aborted early. It can take considerable space, because it |
| finishes up with an sqlite "vacuum" operation, which repacks the |
| database file by triplicating it temporarily. The default is not to |
| do maximal-grooming. See also the \fIDATA MANAGEMENT\fP section. |
| |
| .TP |
| .B "\-c NUM" "\-\-concurrency=NUM" |
| Set the concurrency limit for the scanning queue threads, which work |
| together to process archives & files located by the traversal thread. |
| This is important for controlling CPU-intensive operations like parsing |
| an ELF file and especially decompressing archives. The default is the |
| number of processors on the system; the minimum is 1. |
| |
| .TP |
| .B "\-L" |
| Traverse symbolic links encountered during traversal of the PATHs, |
| including across devices - as in \fIfind\ -L\fP. The default is to |
| traverse the physical directory structure only, stay on the same |
| device, and ignore symlinks - as in \fIfind\ -P\ -xdev\fP. Caution: a |
| loop in the symbolic directory tree might lead to \fIinfinite |
| traversal\fP. |
| |
| .TP |
| .B "\-\-fdcache\-fds=NUM" "\-\-fdcache\-mbs=MB" "\-\-fdcache\-prefetch=NUM2" |
| Configure limits on a cache that keeps recently extracted files from |
| archives. Up to NUM requested files and up to a total of MB megabytes |
| will be kept extracted, in order to avoid having to decompress their |
| archives over and over again. In addition, up to NUM2 other files |
| from an archive may be prefetched into the cache before they are even |
| requested. The default NUM, NUM2, and MB values depend on the |
| concurrency of the system, and on the available disk space on the |
| $TMPDIR or \fB/tmp\fP filesystem. This is because that is where the |
| most recently used extracted files are kept. Grooming cleans this |
| cache. |
| |
| .TP |
| .B "\-\-fdcache\-prefetch\-fds=NUM" "\-\-fdcache\-prefetch\-mbs=MB" |
| Configure how many file descriptors (fds) and megabytes (mbs) are |
| allocated to the prefetch fdcache. If unspecified, values of |
| \fB\-\-fdcache\-prefetch\-fds\fP and \fB\-\-fdcache\-prefetch\-mbs\fP |
| depend on the concurrency of the system and on the available disk space |
| on $TMPDIR. Allocating more to the prefetch cache will improve |
| performance in environments where different parts of several large |
| archives are being accessed. |
| |
| .TP |
| .B "\-\-fdcache\-mintmp=NUM" |
| Configure a disk space threshold for emergency flushing of the cache. |
| The filesystem holding the cache is checked periodically. If the |
| available space falls below the given percentage, the cache is |
| flushed, and the fdcache will stay disabled until the next groom |
| cycle. This mechanism, along with a few associated /metrics on the |
| webapi, is intended to give an operator notice about storage scarcity - which |
| can translate to RAM scarcity if the disk happens to be on a RAM |
| virtual disk. The default threshold is 25%. |
| |
| .TP |
| .B "\-\-forwarded\-ttl\-limit=NUM" |
| Configure the limit on X-Forwarded-For hops. If the X-Forwarded-For |
| header of a request exceeds NUM hops, debuginfod will not delegate a |
| local lookup miss to upstream debuginfods. The default limit is 8. |
| |
| .TP |
| .B "\-v" |
| Increase verbosity of logging to the standard error file descriptor. |
| May be repeated to increase details. The default verbosity is 0. |
| |
| .SH WEBAPI |
| |
| .\" Much of the following text is duplicated with debuginfod-find.1 |
| |
| debuginfod's webapi resembles ordinary file service, where a GET |
| request with a path containing a known buildid results in a file. |
| Unknown buildid / request combinations result in HTTP error codes. |
| This file service resemblance is intentional, so that an installation |
| can take advantage of standard HTTP management infrastructure. |
| |
| Upon finding a file in an archive or simply in the database, some |
| custom HTTP headers are added to the response. For files in the |
| database, X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE are added. |
| X-DEBUGINFOD-FILE is simply the unescaped filename and |
| X-DEBUGINFOD-SIZE is the size of the file. For files found in archives, |
| in addition to X-DEBUGINFOD-FILE and X-DEBUGINFOD-SIZE, |
| X-DEBUGINFOD-ARCHIVE is added. X-DEBUGINFOD-ARCHIVE is the name of the |
| archive the file was found in. |
| |
| There are three buildid-based requests. In each case, the buildid is encoded as a |
| lowercase hexadecimal string. For example, for a program \fI/bin/ls\fP, |
| look at the ELF note GNU_BUILD_ID: |
| |
| .SAMPLE |
| % readelf -n /bin/ls | grep -A4 build.id |
| Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340: |
| Owner Data size Type |
| GNU 20 GNU_BUILD_ID |
| Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d |
| .ESAMPLE |
| |
| Then the hexadecimal BUILDID is simply: |
| |
| .SAMPLE |
| 8713b9c3fb8a720137a4a08b325905c7aaf8429d |
| .ESAMPLE |
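| |
| The build ID can also be picked out of that output mechanically. The |
| sketch below assumes the output layout shown above (a line of the form |
| "Build ID: HEX"); other readelf versions may format the note |
| differently. The helper name extract_buildid is hypothetical. |

```shell
# Hypothetical helper: read "readelf -n" output on stdin and print only the
# hexadecimal build ID. Assumes a "Build ID: <hex>" line as shown above.
extract_buildid() {
    sed -n 's/^[[:space:]]*Build ID:[[:space:]]*\([0-9a-fA-F]\{1,\}\)[[:space:]]*$/\1/p'
}

# Usage (assumes readelf is installed and /bin/ls carries a build ID note):
#   readelf -n /bin/ls | extract_buildid
```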
| |
| .SS /buildid/\fIBUILDID\fP/debuginfo |
| |
| If the given buildid is known to the server, this request will result |
| in a binary object that contains the customary \fB.*debug_*\fP |
| sections. This may be a split debuginfo file as created by |
| \fBstrip\fP, or it may be an original unstripped executable. |
| |
| .SS /buildid/\fIBUILDID\fP/executable |
| |
| If the given buildid is known to the server, this request will result |
| in a binary object that contains the normal executable segments. This |
| may be an executable stripped by \fBstrip\fP, or it may be an original |
| unstripped executable. \fBET_DYN\fP shared libraries are considered |
| to be a type of executable. |
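| |
| As an illustration, the request paths for these two artifact types can |
| be assembled as follows. This is only a sketch: the helper name |
| buildid_url and the server address http://localhost:8002 (the default |
| port on the local machine) are assumptions. |

```shell
# Hypothetical helper: buildid_url SERVER BUILDID TYPE
# Prints the webapi URL for a debuginfo or executable request.
buildid_url() {
    server=${1%/}                                  # drop one trailing "/"
    buildid=$(printf '%s' "$2" | tr 'A-F' 'a-f')   # webapi uses lowercase hex
    case $3 in
        debuginfo|executable)
            printf '%s/buildid/%s/%s\n' "$server" "$buildid" "$3" ;;
        *)
            echo "unsupported request type: $3" >&2
            return 1 ;;
    esac
}

buildid_url http://localhost:8002 8713b9c3fb8a720137a4a08b325905c7aaf8429d debuginfo
# prints http://localhost:8002/buildid/8713b9c3fb8a720137a4a08b325905c7aaf8429d/debuginfo
```

| The resulting URL can be fetched with any ordinary HTTP client, or the |
| lookup can be left to \fBdebuginfod-find\fP. |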
| |
| .SS /buildid/\fIBUILDID\fP/source\fI/SOURCE/FILE\fP |
| |
| If the given buildid is known to the server, this request will result |
| in a binary object that contains the source file mentioned. The path |
| should be absolute. Relative path names commonly appear in the DWARF |
| file's source directory, but these paths are relative to |
| individual compilation unit AT_comp_dir paths, and yet an executable |
| is made up of multiple CUs. Therefore, to disambiguate, debuginfod |
| expects source queries to prefix relative path names with the CU |
| compilation-directory, followed by a mandatory "/". |
| |
| Note: the caller may or may not elide \fB../\fP or \fB/./\fP or extraneous |
| \fB///\fP sorts of path components in the directory names. debuginfod |
| accepts both forms. Specifically, debuginfod canonicalizes path names |
| according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing |
| any \fB//\fP to \fB/\fP in the path. |
| |
| For example: |
| .TS |
| l l. |
| #include <stdio.h>	/buildid/BUILDID/source/usr/include/stdio.h |
| /path/to/foo.c	/buildid/BUILDID/source/path/to/foo.c |
| \../bar/foo.c AT_comp_dir=/zoo/	/buildid/BUILDID/source/zoo//../bar/foo.c |
| .TE |
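| |
| The path canonicalization described above can be approximated in a few |
| lines of shell. This is a sketch for absolute file paths only, not |
| debuginfod's implementation. The name canon_path is hypothetical, and |
| the sketch collapses repeated slashes before removing dot segments. |

```shell
# Hypothetical helper: canon_path /ABSOLUTE/PATH
# Collapses runs of "/" and removes "." and ".." segments.
# The body runs in a subshell so IFS and "set -f" do not leak to the caller.
canon_path() (
    in=$(printf '%s\n' "$1" | sed 's|//*|/|g')   # reduce "//" to "/"
    out=
    IFS=/        # split the path into segments at each "/"
    set -f       # no globbing, so a segment like "*" stays literal
    for seg in $in; do
        case $seg in
            ''|.) ;;                   # empty or "." segment: drop it
            ..)   out=${out%/*} ;;     # "..": remove the last kept segment
            *)    out=$out/$seg ;;     # ordinary segment: keep it
        esac
    done
    printf '%s\n' "${out:-/}"
)

canon_path /zoo//../bar/foo.c
# prints /bar/foo.c
```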
| |
| Note: the client should %-escape characters in /SOURCE/FILE that are |
| not shown as "unreserved" in section 2.3 of RFC3986. Some characters |
| that will be escaped include "+", "\\", "$", "!", the 'space' character, |
| and ";". RFC3986 includes a more comprehensive list of these characters. |
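| |
| A minimal sketch of such %-escaping for ASCII path names follows. The |
| helper name pct_escape is hypothetical, and leaving "/" unescaped so |
| that it keeps acting as the path separator is an assumption of this |
| sketch. |

```shell
# Hypothetical helper: pct_escape /SOURCE/FILE
# Keeps RFC 3986 "unreserved" characters (letters, digits, "-", ".", "_",
# "~") and "/" as-is, and %-escapes every other ASCII character.
pct_escape() (
    LC_ALL=C                 # treat the string byte by byte
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~/-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;   # e.g. " " becomes %20
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

pct_escape '/path/to/my file+v1;x.c'
# prints /path/to/my%20file%2Bv1%3Bx.c
```
| |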
| .SS /metrics |
| |
| This endpoint returns a Prometheus-formatted text/plain dump of a |
| variety of statistics about the operation of the debuginfod server. |
| The exact set of metrics and their meanings may change in future |
| versions. Caution: configuration information (path names, versions) |
| may be disclosed. |
| |
| .SH DATA MANAGEMENT |
| |
| debuginfod stores its index in an sqlite database in a densely packed |
| set of interlinked tables. While the representation is as efficient |
| as we have been able to make it, it still takes a considerable amount |
| of data to record all debuginfo-related data of potentially a great |
| many files. This section offers some advice about the implications. |
| |
| As a general explanation for size, consider that when debuginfod |
| indexes ELF/DWARF files, it stores their names, their buildids, and the |
| names of the source files they reference. When indexing archives, it stores |
| every file name \fIof or in\fP an archive, every buildid, plus every |
| source file name referenced from a DWARF file. (Indexing archives |
| takes more space because the source files often reside in separate |
| subpackages that may not be indexed at the same pass, so extra |
| metadata has to be kept.) |
| |
| Getting down to numbers, in the case of Fedora RPMs (essentially, |
| gzip-compressed cpio files), the sqlite index database tends to be |
| from 0.5% to 3% of their size. It's larger for binaries that are |
| assembled out of a great many source files, or packages that carry |
| much debuginfo-unrelated content. It may be even larger during the |
| indexing phase due to temporary sqlite write-ahead-logging files; |
| these are checkpointed (cleaned out and removed) at shutdown. It may |
| be helpful to apply tight \-I or \-X regular-expression constraints to |
| exclude files from scanning that you know have no debuginfo-relevant |
| content. |
| |
| As debuginfod runs in normal \fIactive\fP mode, it periodically |
| rescans its target directories, and any new content found is added to |
| the database. Old content, such as data for files that have |
| disappeared or that have been replaced with newer versions, is removed |
| at a periodic \fIgrooming\fP pass. This means that the sqlite files |
| grow fast during initial indexing, slowly during index rescans, and |
| periodically shrink during grooming. An optional one-shot |
| \fImaximal grooming\fP pass is also available. It removes |
| debuginfo-unrelated data from the archive content index, such as |
| file names found in archives ("archive sdef" records) that are not |
| referred to as source files from any binaries found in archives |
| ("archive sref" records). This can save considerable disk space. |
| However, it is slow and temporarily requires up to twice the database |
| size as free space. Worse: it may result in missing source-code info |
| if the archive traversals were interrupted, so that not all source |
| file references were known. Use it rarely to polish a complete index. |
| |
| You should ensure that ample disk space remains available. (The flood |
| of error messages on -ENOSPC is ugly and nagging. But, like for most |
| other errors, debuginfod will resume when resources permit.) If |
| necessary, debuginfod can be stopped, the database file moved or |
| removed, and debuginfod restarted. |
| |
| sqlite offers several performance-related options in the form of |
| pragmas. Some may be useful to fine-tune the defaults plus the |
| debuginfod extras. The \-D option may be useful to tell debuginfod to |
| execute the given bits of SQL after the basic schema creation |
| commands. For example, the "synchronous", "cache_size", |
| "auto_vacuum", "threads", "journal_mode" pragmas may be fun to tweak |
| via \-D, if you're searching for peak performance. The "optimize", |
| "wal_checkpoint" pragmas may be useful to run periodically, outside |
| debuginfod. The default settings are performance- rather than |
| reliability-oriented, so a hardware crash might corrupt the database. |
| In these cases, it may be necessary to manually delete the sqlite |
| database and start over. |
| |
| As debuginfod changes in the future, we may have no choice but to |
| change the database schema in an incompatible manner. If this |
| happens, new versions of debuginfod will issue SQL statements to |
| \fIdrop\fP all prior schema & data, and start over. So, disk space |
| will not be wasted for retaining a no-longer-useable dataset. |
| |
| In summary, if your system can bear a 0.5%-3% index-to-archive-dataset |
| size ratio, and slow growth afterwards, you should not need to |
| worry about disk space. If a system crash corrupts the database, |
| or you want to force debuginfod to reset and start over, simply |
| erase the sqlite file before restarting debuginfod. |
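| |
| For rough planning, the 0.5%-3% ratio can be turned into a size range. |
| The sketch below is simple integer arithmetic based on the Fedora RPM |
| figures quoted above. The helper name and the 200000 MB dataset size |
| are assumptions, and other archive formats may behave differently. |

```shell
# Hypothetical helper: index_size_range ARCHIVE_MB
# Prints the low (0.5%) and high (3%) index size estimates in megabytes.
index_size_range() {
    low=$(( $1 * 5 / 1000 ))    # 0.5% of the archive dataset size
    high=$(( $1 * 3 / 100 ))    # 3% of the archive dataset size
    printf '%s %s\n' "$low" "$high"
}

index_size_range 200000
# prints 1000 6000
```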
| |
| In contrast, in \fIpassive\fP mode, all scanning and grooming is |
| disabled, and the index database remains read-only. This makes the |
| database more suitable for sharing between servers or sites with |
| simple one-way replication, and data management considerations are |
| generally moot. |
| |
| .SH SECURITY |
| |
| debuginfod \fBdoes not\fP include any particular security features. |
| While it is robust with respect to inputs, some abuse is possible. It |
| forks a new thread for each incoming HTTP request, which could lead to |
| a denial-of-service in terms of RAM, CPU, disk I/O, or network I/O. |
| If this is a problem, users are advised to install debuginfod with an |
| HTTPS reverse-proxy front-end that enforces site policies for |
| firewalling, authentication, integrity, authorization, and load |
| control. The \fI/metrics\fP webapi endpoint is probably not |
| appropriate for disclosure to the public. |
| |
| When relaying queries to upstream debuginfods, debuginfod \fBdoes not\fP |
| include any particular security features. It trusts that the binaries |
| returned by the debuginfods are accurate. Therefore, the list of |
| servers should include only trustworthy ones. If accessed across HTTP |
| rather than HTTPS, the network should be trustworthy. Authentication |
| information through the internal \fIlibcurl\fP library is not currently |
| enabled. |
| |
| .nr zZ 1 |
| .so man7/debuginfod-client-config.7 |
| |
| .SH ADDITIONAL FILES |
| .TP |
| .B $HOME/.debuginfod.sqlite |
| Default database file. |
| .PD |
| |
| |
| .SH "SEE ALSO" |
| .I "debuginfod-find(1)" |
| .I "sqlite3(1)" |
| .I \%https://prometheus.io/docs/instrumenting/exporters/ |