STG stands for Symbol-Type Graph.
STG models Application Binary Interfaces. It supports extraction of ABIs from DWARF and ingestion of BTF and libabigail XML into its model. Its primary purpose is monitoring an ABI for changes over time and reporting such changes in a comprehensible fashion.
STG captures symbol information, the size and layout of structs, function argument and return types and much more, in a graph representation. Difference reporting happens via a graph comparison.
Currently, STG functionality is exposed as two command-line tools, stg (for ABI extraction) and stgdiff (for ABI comparison), and a native file format.
STG's model is an abstraction which does not and cannot capture every possible interface property, invariant or behaviour. Conversely, the model includes distinctions which are API significant but not ABI significant.
Concretely, STG's model is a rooted, connected, directed graph where each kind of node corresponds to a meaningful ABI entity such as a symbol, function type or struct member.
Nodes have specific attributes, such as name or size. Outgoing edges specify things like return type. STG's model does not impose any constraints on which nodes may be joined by edges.
Each node has an identity. However, for the purpose of comparison, nodes are considered equal if they are of the same kind, have the same attributes and matching outgoing edges and all nodes reachable via a pair of matching edges are (recursively) equal. Renumbering nodes, (de)duplicating nodes and adding/removing unreachable nodes do not affect this relationship.
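The equality relation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not STG's implementation: the Node class, the use of labelled edges held in a dict, and the helper linked_list are all hypothetical. The point it shows is that a pair of nodes already under comparison is provisionally treated as equal, so the recursion terminates on cyclic graphs and duplicated nodes make no difference.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for an STG node (not STG's real data structure)."""
    kind: str
    attrs: dict = field(default_factory=dict)   # e.g. {"name": "list"}
    edges: dict = field(default_factory=dict)   # edge label -> Node


def equal(a, b, assumed=None):
    """Return True if a and b are equal in the recursive sense described above.

    A pair of nodes that is already under comparison is assumed equal,
    which is what lets the recursion terminate on cyclic graphs.
    """
    if assumed is None:
        assumed = set()
    pair = (id(a), id(b))
    if pair in assumed:
        return True
    if a.kind != b.kind or a.attrs != b.attrs:
        return False
    if a.edges.keys() != b.edges.keys():
        return False
    assumed.add(pair)
    return all(equal(a.edges[k], b.edges[k], assumed) for k in a.edges)


def linked_list(copies):
    """Build struct list { struct list *next; } using `copies` struct nodes."""
    structs = [Node("struct", {"name": "list"}) for _ in range(copies)]
    for i, s in enumerate(structs):
        pointer = Node("pointer")
        pointer.edges["pointee"] = structs[(i + 1) % copies]
        s.edges["next"] = pointer
    return structs[0]


compact = linked_list(1)      # one struct node pointing back at itself
duplicated = linked_list(3)   # same type, with the struct node duplicated
print(equal(compact, duplicated))   # True
```

The two graphs have different numbers of nodes and different node identities, yet compare equal, which matches the statement that (de)duplicating nodes does not affect the relationship.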
As modelled by STG, symbols correspond closely to ELF symbols as seen in .dynsym for shared object files or in .symtab for object files. In the case of the Linux kernel, the .symtab is enriched with metadata and the effective “ksymtab” is actually a subset of the ELF symbols together with CRC and namespace information.
STG links symbols to their source-level types where these are known. Symbols defined purely in assembly language will not have type information.
The symbol table is contained in the root node of the graph, which is an Interface node.
STG models the C, C++ and (to a limited extent) Rust type systems.
For example, C++ template value parameters are poorly modelled for the simple reason that this would require modelling C++ values as well as types, something that DWARF itself doesn't do to the full extent permitted by C++20.
As type definitions are in general mutually recursive, an STG ABI is in general a cyclic graph.
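To make the cyclic case concrete, the following sketch takes two mutually recursive C structs and finds the cycle in the resulting type graph. The adjacency-dict representation and the function find_cycle are assumptions made for illustration; STG's own graph representation is not shown here.

```python
def find_cycle(graph, root):
    """Return a list of nodes forming a cycle reachable from root, or None."""
    path, on_path, done = [], set(), set()

    def visit(node):
        path.append(node)
        on_path.add(node)
        for target in graph.get(node, []):
            if target in on_path:
                # Back edge: slice out the cycle and close it.
                return path[path.index(target):] + [target]
            if target not in done:
                found = visit(target)
                if found:
                    return found
        on_path.discard(node)
        done.add(node)
        path.pop()
        return None

    return visit(root)


# struct a { struct b *b; };  struct b { struct a *a; };
graph = {
    "struct a": ["struct b *"],
    "struct b *": ["struct b"],
    "struct b": ["struct a *"],
    "struct a *": ["struct a"],
}
cycle = find_cycle(graph, "struct a")
print(" -> ".join(cycle))
# struct a -> struct b * -> struct b -> struct a * -> struct a
```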
The root node of the graph can also contain a list of interface types, which may not necessarily be reachable from the interface symbols.
STG can read its own native format for processing or comparison. It can also process libabigail XML and BTF (.BTF ELF sections), with some limitations due to model, design and implementation differences including missing features.
STG has the following kinds of node.
- void and ...
- *, & and &&
- foo::*
- typedef and using ... = ...
- const and friends
- int and friends
- foo[N] - there is no distinction between zero and indeterminate length in the model
- struct foo etc., Rust tuples too

An STG ABI consists of a rooted, connected graph of such nodes, and nothing else. STG is blind to anything that cannot be represented by its model.
STG's native file format is a protocol buffer text format. It is suitable for revision control, rather than human consumption. It is effectively described by stg.proto.
In this textual serialisation of ABI graphs, external node identifiers and node order are chosen to minimise file changes when a small subset of the graph changes.
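One way to obtain this property is to derive each external identifier from the node's own content and to order nodes by identifier, so that adding a node neither renumbers nor reorders the others. The sketch below illustrates that idea only. The (kind, name) descriptor, the use of CRC-32 and the linear collision handling are all assumptions chosen for brevity and should not be read as a description of what stg actually does.

```python
import zlib


def assign_ids(nodes):
    """Map each (kind, name) pair to a 32-bit id derived from its content."""
    ids, used = {}, set()
    for kind, name in sorted(nodes):
        candidate = zlib.crc32(f"{kind}:{name}".encode())
        while candidate in used:              # resolve the rare collision
            candidate = (candidate + 1) & 0xFFFFFFFF
        used.add(candidate)
        ids[(kind, name)] = candidate
    return ids


def serialise(nodes):
    """Emit one line per node, ordered by id."""
    ids = assign_ids(nodes)
    return [f"{ids[n]:#010x} {n[0]} {n[1]}" for n in sorted(nodes, key=ids.get)]


before = [("struct", "foo"), ("typedef", "foo_t"), ("primitive", "int")]
after = before + [("struct", "bar")]

old, new = serialise(before), serialise(after)
added = [line for line in new if line not in old]
print(added)   # only the line for the new node differs
```

With sequential numbering, by contrast, inserting one node could shift every later identifier and turn a one-node change into a whole-file diff.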
As an example, this is the definition of the Typedef node kind:
message Typedef {
  fixed32 id = 1;
  string name = 2;
  fixed32 referred_type_id = 3;
}
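As a rough illustration of what such a node carries, the sketch below mirrors the three fields in a Python dataclass and prints them using generic protocol buffer text format syntax (field: value). The class, the to_text helper, the example identifiers and the choice to print identifiers in hexadecimal are assumptions for illustration; string escaping is ignored and the exact layout of a real .stg file may differ.

```python
from dataclasses import dataclass


@dataclass
class Typedef:
    """Hypothetical mirror of the Typedef message fields."""
    id: int                 # this node's external identifier
    name: str               # the typedef name
    referred_type_id: int   # identifier of the node for the aliased type


def to_text(t):
    """Render the fields in protobuf text format style (no string escaping)."""
    return (f"id: {t.id:#010x}\n"
            f'name: "{t.name}"\n'
            f"referred_type_id: {t.referred_type_id:#010x}\n")


# typedef int foo;  with made-up identifiers for the two nodes
print(to_text(Typedef(id=0x1, name="foo", referred_type_id=0x2)), end="")
```

The only edge out of a Typedef node is referred_type_id, which is how the typedef points at the node for the type it names.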
libabigail is another project for ABI monitoring. It uses a format that can be parsed as XML.
This command will transform Abigail into STG:
stg --abi library.xml --output library.stg
The main features modelled in Abigail but not STG are:
The Abigail reader has these distinct phases of operation:
BTF is typically used for the Linux kernel where it is generated by pahole -J from ELF and DWARF information. It can also be generated natively instead of DWARF using gcc -gbtf and by Clang, but only for eBPF targets.
This command will transform BTF into STG:
stg --btf vmlinux --output vmlinux.stg
STG has primarily been tested against the pahole (libbtf) dialect of BTF and support is not complete.
- .BTF.ext section is just ignored
- BTF_KIND_DATASEC - skip
- BTF_KIND_DECL_TAG - abort
- BTF_KIND_TYPE_TAG - abort

The BTF reader has these distinct phases of operation:
- .BTF section data found

The ELF / DWARF reader operates similarly to the other readers at a high level, but much more work has to be done to turn ELF symbols and DWARF DIEs into STG nodes.

- .dynsym in the case of shared object file)

Before stg outputs a serialised graph, it performs: