This repository contains .abilist files from every version of glibc. These
files are consolidated to generate a single 166 KB symbol mapping file that is
shipped with Zig to target any version of glibc. This repository is for Zig
maintainers to use when a new glibc version is tagged upstream; Zig users have
no need for this repository.
Adding new glibc version .abilist files
- Clone glibc:
  git clone git://sourceware.org/git/glibc.git
- Check out the new glibc version git tag, e.g. glibc-2.34.
- Run the tool to grab the new abilist files:
  zig run collect_abilist_files.zig -- $GLIBC_GIT_REPO_PATH
  This mirrors the directory structure into the glibc subdirectory,
  namespaced under the version number, but copies only files with the
  .abilist extension.
- Inspect the changes and then commit these new files into git.
Updating Zig
- Run consolidate.zig at the root of this repo.
  This generates the file abilists, which you can then inspect to make sure
  it is OK. Copy it to $ZIG_GIT_REPO_PATH/lib/libc/glibc/abilist.
Debugging an abilists file
zig run list_symbols.zig -- abilists
Technique
The abilist files from the latest glibc are almost enough to completely
encode all the information that we need to generate the symbols db. The only
problem is when a function migrates from one library to another. For example,
in glibc 2.32, the function pthread_sigmask migrated from libpthread to libc,
and the latest abilist files only show it in libc. However, if a user targets
glibc 2.31, Zig needs to know to put the symbol into libpthread.so and not
libc.so.
In glibc upstream, they simply renamed the abilist files from pthread.abilist to
libc.abilist. This resulted in the following line being present in libc.abilist
in glibc 2.32 and later:
GLIBC_2.0 pthread_sigmask F
This implies that in glibc 2.0, libc.so has the pthread_sigmask symbol, which
is false, since it was only found in libpthread.so.
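To make the shape of such a line concrete, here is a minimal Python sketch of a parser for one abilist line. It is illustrative only: the repository's tools are written in Zig, the names AbilistEntry and parse_abilist_line are hypothetical, and the handling of object lines (a fourth hexadecimal size field) is an assumption that goes beyond the single function line shown above.

```python
from typing import NamedTuple, Optional, Tuple


class AbilistEntry(NamedTuple):
    version: Tuple[int, ...]  # e.g. (2, 0) for GLIBC_2.0
    symbol: str
    kind: str                 # "F" for a function; other letters for objects
    size: Optional[int]       # object size in bytes, None when no size field


def parse_abilist_line(line: str) -> Optional[AbilistEntry]:
    """Parse a line such as 'GLIBC_2.0 pthread_sigmask F'.

    Returns None for lines this sketch does not understand
    (blank lines, non-numeric version tags).
    """
    fields = line.split()
    if len(fields) < 3 or not fields[0].startswith("GLIBC_"):
        return None
    try:
        version = tuple(int(p) for p in fields[0][len("GLIBC_"):].split("."))
    except ValueError:
        return None  # version tag is not purely numeric
    # Assumption: object lines carry a hexadecimal size as a fourth field.
    size = int(fields[3], 16) if len(fields) > 3 else None
    return AbilistEntry(version, fields[1], fields[2], size)
```

Parsing the line quoted above yields version (2, 0), symbol pthread_sigmask, and kind F, which is exactly the claim that later turns out to be false for libc.so.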
This is why this repository contains abilist files from all past
versions of glibc as well as the latest one: it allows us to
detect this situation and generate a corrected symbols database.
The strategy is to start with the earliest glibc version, consume the abilist
files, and then treat that data as true. Next we move on to the next
earliest glibc version, but now we have to detect contradictions: if the newer
glibc version claims that e.g. pthread_sigmask is available in glibc 2.0,
when our true data says that it is not, we ignore that false piece of
data. However, we must accept new data if the version it talks about is greater
than the version corresponding to the “true” data set.
After merging in the newer glibc version, we mark the new dataset as
“true” and move on to the next, and so on until we have processed all the
sets of abilist files.
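The merge rule described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual consolidate.zig logic: targets are ignored, each release is reduced to a set of (library, symbol, introduced_version) claims, and the function name merge_releases and the sample data are hypothetical.

```python
def merge_releases(releases):
    """Merge abilist claims from successive glibc releases.

    releases: list of (release_version, claims), sorted ascending by
    release_version. Each claim is (library, symbol, introduced_version),
    with versions as integer tuples such as (2, 31).
    """
    truth = set()
    prev_release = None
    for release_version, claims in releases:
        for claim in claims:
            _library, _symbol, introduced = claim
            if claim in truth:
                continue  # already known, nothing to do
            if prev_release is not None and introduced <= prev_release:
                # The newer release talks about a version that the trusted
                # data already covers, and the trusted data disagrees:
                # ignore this claim as false.
                continue
            truth.add(claim)  # genuinely new information
        prev_release = release_version  # merged data set is now "true"
    return truth


# Hypothetical input modelled on the pthread_sigmask example.
releases = [
    ((2, 31), {("pthread", "pthread_sigmask", (2, 0))}),
    ((2, 32), {("c", "pthread_sigmask", (2, 0)),
               ("c", "pthread_sigmask", (2, 32))}),
]
merged = merge_releases(releases)
```

With this input, the claim that libc had pthread_sigmask in 2.0 is dropped, while the libpthread entry for 2.0 and the libc entry for 2.32 are both kept.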
When this process completes, we have in memory something that looks like this:
- For each glibc symbol
  - For each glibc library
    - For each target
      - For each glibc version
        - Whether the symbol is absent, a function, or an object+size
And our job is now to encode this information into a file that does not waste
installation size and yet remains simple to decode and use in the Zig compiler.
Inclusions
Next, the script generates the minimal number of “inclusions” to encode all the
information. An “inclusion” is:
- A symbol name.
- The set of targets this inclusion applies to.
- The set of glibc versions this inclusion applies to.
- The set of libraries this inclusion applies to.
- Whether it is a function or object, and if an object, its size in bytes.
For example, consider dlopen. An inclusion is something like this:
dlopen
- targets: aarch64-linux-gnu powerpc64le-linux-gnu
- versions: 2.17 2.34
- libraries: libdl.so
- type: function
This does not cover all the places dlopen can be found, however. There will
need to be more inclusions for more targets, for example:
dlopen
- targets: x86_64-linux-gnu
- versions: 2.2.5 2.34
- libraries: libdl.so
- type: function
Now we have more coverage of all the places dlopen can be found, but there are
yet more that need to be emitted. The script emits as many inclusions as
necessary so that all the information is represented.
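One simple way to arrive at inclusions like the two dlopen examples is to group targets that share the same version and library sets. The Python sketch below shows that grouping only; it is an assumption about how such records could be built, it makes no claim of producing the minimal set, and the names Inclusion and group_into_inclusions are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Inclusion:
    symbol: str
    targets: FrozenSet[str]
    versions: FrozenSet[str]
    libraries: FrozenSet[str]
    size: Optional[int] = None  # None means function; otherwise object size


def group_into_inclusions(symbol, per_target, size=None):
    """per_target maps a target triple to (versions, libraries) for symbol.

    Targets with identical version and library sets share one inclusion.
    """
    groups = defaultdict(set)
    for target, (versions, libraries) in per_target.items():
        groups[(frozenset(versions), frozenset(libraries))].add(target)
    return [
        Inclusion(symbol, frozenset(targets), versions, libraries, size)
        for (versions, libraries), targets in groups.items()
    ]


# Data taken from the two dlopen inclusions shown above.
dlopen = group_into_inclusions("dlopen", {
    "aarch64-linux-gnu": ({"2.17", "2.34"}, {"libdl.so"}),
    "powerpc64le-linux-gnu": ({"2.17", "2.34"}, {"libdl.so"}),
    "x86_64-linux-gnu": ({"2.2.5", "2.34"}, {"libdl.so"}),
})
```

Here the three targets collapse into two inclusions, matching the example: one covering aarch64 and powerpc64le, and one covering x86_64.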
Next we make a few observations which lead to a more compact data encoding.
Observation: All symbols are consistently either functions or objects
There is no symbol that is a function on one target and an object on another
target. Similarly, there is no symbol that is a function in one glibc version
but an object in another, and there is no symbol that is a function in one
shared library but an object in another.
We exploit this by encoding function and object symbols in separate lists.
Observation: Over half of the objects are exactly 4 bytes
51% of all object entries are 4 bytes, and 68% of all object entries are either
4 or 8 bytes.
Total object inclusions are 765. If we stored 4 and 8 byte objects in separate
lists, this would save 2 bytes from 520 inclusions, totaling 1 KB. Not worth it.
Observation: Average number of different versions per inclusion is 1.02
Nearly every inclusion has just 1 version attached to it, rarely more.
This makes a u64 bitset uneconomical. With 19530 total inclusions, this comes
out to 153 KB spent on the version bitset. However, if we encoded it as one byte
per version, using 1 bit of the byte to indicate the terminal item, this would
bring the 153 KB down to 19 KB. That is almost a 50% reduction from the total
size of the encoded abilists file. Definitely worth it.
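The arithmetic behind these figures, and one possible byte layout, can be checked with a short Python sketch. Which bit marks the terminal item is not stated above; the sketch assumes the high bit (0x80), leaving 7 bits for the version index. The function names are hypothetical.

```python
INCLUSIONS = 19530
AVG_VERSIONS_PER_INCLUSION = 1.02

# A u64 bitset costs 8 bytes per inclusion regardless of how many bits are set.
bitset_bytes = INCLUSIONS * 8                                # 156240, about 153 KB
# One byte per version costs roughly 1.02 bytes per inclusion on average.
per_version_bytes = round(INCLUSIONS * AVG_VERSIONS_PER_INCLUSION)  # about 19 KB


def encode_versions(indexes):
    """Encode version indexes as one byte each; the last byte has 0x80 set.

    Assumption: indexes fit in 7 bits and the high bit is the terminator.
    """
    if not indexes:
        raise ValueError("an inclusion needs at least one version")
    if any(i < 0 or i > 0x7F for i in indexes):
        raise ValueError("version index does not fit in 7 bits")
    out = bytearray(sorted(indexes))
    out[-1] |= 0x80  # mark the terminal item
    return bytes(out)


def decode_versions(data, offset=0):
    """Return (indexes, next_offset) by reading until the terminal byte."""
    indexes = []
    while True:
        byte = data[offset]
        offset += 1
        indexes.append(byte & 0x7F)
        if byte & 0x80:
            return indexes, offset
```

Since the average inclusion has about one version, most inclusions spend a single byte here instead of eight.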
Binary encoding format:
All integers are stored little-endian.
- u8 number of glibc libraries (7). For each:
  - null-terminated name, e.g. “c”, “m”, “dl”, “ld”, “pthread”
- u8 number of glibc versions (44), sorted ascending. For each:
  - u8 major
  - u8 minor
  - u8 patch
- u8 number of targets (20). For each:
  - null-terminated target triple
- u16 number of function inclusions (18765)
  - null-terminated symbol name (not repeated for subsequent same-symbol inclusions)
  - Set of Unsized Inclusions
- u16 number of object inclusion sets (2165)
  - null-terminated symbol name (not repeated for subsequent same-symbol inclusions)
  - Set of Sized Inclusions
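As an illustration of the header portion of this layout, the following Python sketch reads the library list, version list, target list, and the u16 function inclusion count from a byte buffer. It stops before the inclusion sets themselves. The helper names read_cstr and parse_header are hypothetical, and the sample buffer is synthetic rather than a real abilists file; the Zig compiler's own decoder may be organized differently.

```python
import struct


def read_cstr(buf, pos):
    """Read a null-terminated ASCII string; return (text, next_pos)."""
    end = buf.index(b"\0", pos)
    return buf[pos:end].decode("ascii"), end + 1


def parse_header(buf):
    """Parse libraries, versions, targets, and the function inclusion count."""
    pos = 0

    n_libs = buf[pos]; pos += 1
    libs = []
    for _ in range(n_libs):
        name, pos = read_cstr(buf, pos)
        libs.append(name)

    n_versions = buf[pos]; pos += 1
    versions = []
    for _ in range(n_versions):
        versions.append(tuple(buf[pos:pos + 3]))  # (major, minor, patch)
        pos += 3

    n_targets = buf[pos]; pos += 1
    targets = []
    for _ in range(n_targets):
        triple, pos = read_cstr(buf, pos)
        targets.append(triple)

    (n_fn_inclusions,) = struct.unpack_from("<H", buf, pos)  # little-endian u16
    pos += 2
    return libs, versions, targets, n_fn_inclusions, pos


# Synthetic buffer: 2 libraries, 1 version, 1 target, then the u16 count.
sample = (bytes([2]) + b"c\0m\0"
          + bytes([1, 2, 34, 0])
          + bytes([1]) + b"x86_64-linux-gnu\0"
          + struct.pack("<H", 18765))
header = parse_header(sample)
```

The returned position points at the first function symbol name, which is where a full decoder would continue.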
Set of Unsized Inclusions:
- u32 set of targets this inclusion applies to (1 << INDEX_IN_TARGET_LIST)
- final inclusion is indicat