A RISC-V gcc pitfall revealed by a glibc update
UPDATE 2023-05-11: Finally we have builtin inline subword atomic support in GCC!
- [PATCH] riscv: Don't add -latomic with -pthread https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617355.html
- [PATCH v7] RISCV: Inline subword atomic ops https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616807.html
So we won't be bothered by this issue anymore.
The original blog post:
TL;DR: see "Wrap Up" and "Solution".
Problem
Recently, when we're working on a series of package rebuild pipelines triggered by a glibc
update (from 2.33 to 2.34), we discovered that some previously compiling packages, mostly configured to use CMake, refuse to compile now. Their error logs look like this:
1 | /usr/bin/ld: libbson-1.0.so.0.0.0: undefined reference to `__atomic_exchange_1' |
This is quite strange to us, because we used to think we have got rid of this for all by manually appending set(THREADS_PREFER_PTHREAD_FLAG ON)
to these packages' CMakeLists.txt
, despite replacing all -lpthread
with -pthread
. This was done in a per-package manner, modifying the patch to meet the need of the tech stack used by the specific package, and we are sure that they used to work before the glibc
upgrade.
Before proceeding, you need to know the difference between
-pthread
and-lpthread
, and that-pthread
is the most preferable way if you want to link to thepthread
library, at least forglibc<=2.33
.
So, what's happening? Is there anything vital got changed in this glibc
upgrade?
Cause
After reading the release note of glibc
2.34, we did notice something related:
...all functionality formerly implemented in the libraries
libpthread
,libdl
,libutil
,libanl
has been integrated intolibc
. New applications do not need to link with-lpthread
,-ldl
,-lutil
,-lanl
anymore. For backwards compatibility, empty static archiveslibpthread.a
,libdl.a
,libutil.a
,libanl.a
are provided, so that the linker options keep working.
Hmm, good, sounds like we don't need -pthread
anymore. Actually, it appeared to us that CMake thinks the same. After running cmake
for the failing packages, we can confirm that nothing looks like -pthread
is appended to either CXXFLAGS
or LDFLAGS
. However, if forced to compile with -pthread
, the previous-mentioned link error disappears. In other words, applying this patch fixes the error.
1 | - make |
At this point, every clue we have gathered so far seems to indicate a bug inside cmake
:
- Other tools work fine
set(THREADS_PREFER_PTHREAD_FLAG ON)
appears to be "broken"- Manually appending
-pthread
fixes the issue, socmake
failed to recognize that-pthread
is necessary
Root Cause
Blame CMake?
So we dig into CMake's source code to see what happened. To our surprise, we didn't notice any change to the detection code it used for determining whether -pthread
is necessary. CMake's detection code, at the time of writing, looks like this:
1 |
|
It turned out that according to the glibc
upgrade, this detection code now compiles without additional arguments:
1 | gcc test_pthread.c |
As far as I can tell, this does not look like a CMake bug. The expected behavior, which is exactly what we have seen, is that if the test code can be compiled without -pthread
, then the argument should not be appended to the command like (or {C,CXX}FLAGS
, correspondingly). This lead to another question: why those packages are failing now, provided that the detection code is unchanged, and it used to work properly?
Dive into Source Code
Let's take the package mongo-c-driver
as an example. As of version 1.21.1, we can see the code listed below in its src/libbson/src/bson/bson-atomic.h
(Feel familiar with this path? Remember the libbson-1.0.so.0.0.0: undefined reference
error log mentioned before?):
1 |
|
As we can tell from libbson
's source code, it attempts to use the __atomic_*
builtins of GCC, like __atomic_fetch_add
and __atomic_compare_exchange
. We can construct some test cases to check whether we're right:
1 | // test1.c |
1 | // test2.c |
1 | // test4.c |
1 | // test8.c |
Here, we use the gcc built-in
__atomic_*
becauselibbson
used it. Replacing this withstd::atomic<T>
(and compile it withg++
) yields identical results correspondingly. You may also useuint*_t
provided by thelinux/types.h
header, as their sizes are promised to be the same across all architectures.
Compile results:
1 | gcc test1.c |
If you are kind of familiar with C++, you may have heard that codes using atomic operations might need to link libatomic
(I remembered once reading this, but I can't find it now). It turned out that if we provide -latomic
to gcc, the code compiles:
1 | gcc test1.c -latomic |
And this also works for -pthread
:
1 | gcc test1.c -pthread |
Smart readers might have felt something unusual:
First, as we can see, gcc compiles test4.c
and test8.c
successfully without -latomic
or -pthread
, but it couldn't handle the atomic operations in test1.c
and test2.c
without making a call to libatomic
:
1 | gcc test1.c -latomic -S |
This is partially explained in a GitHub issue (riscv-collab/riscv-gcc#12). Till April 2022, i.e. when this blog is written, gcc does not support inlining subword atomics, and this is the reason why -latomic
must be presented when invoking gcc
.
But what about -pthread
? Why -pthread
works either? I mean, libpthread
itself does not provide those atomic symbols, right? Actually, if you use ldd
to check a.out
generated by gcc test1.c -pthread
, you will notice that libatomic
is linked, instead of libpthread
as we are expecting from -pthread
. Changing -pthread
with -lpthread
won't work, so gcc must have done something internally for -pthread
despite linking libpthread
. In order to figure out the difference, we have to check the gcc spec:
1 | gcc -dumpspecs | grep pthread |
Oh, the answer finally reveals: gcc silently append --as-needed -latomic
if -pthread
is provided. This also explains why changing -lpthread
to -pthread
works.
The reason why gcc decided to link libatomic
when -pthread
is provided had been lost in a squashed commit. My guess is that they tried to cover up the subword atomic pitfall using this strategy. (UPDATE: aswaterman
's initial fix did not limit the scope to -pthread
. The limitation was added one week later.) Generally, atomics are used with multi-threading, and you use pthread in such cases. However, one may use atomics without pthread, like when writing SIGNAL handlers. So IMO this workaround is not good enough, and should be replaced by subword atomic inline support.
UPDATE: We discussed this with some gcc RISC-V devs, and it turned out that when compiling gcc, at stage 1 it may not recognize libatomic (i.e.
-latomic
may not work) if you are compiling it with a gcc that uses the old spec So you'll have to set--with-spec
at stage 1 to workaround this. Hopefully there would be an--always-link-libatomic
configuration flag in the future.
Fun facts: gcc once had added --as-needed -latomic
to its LIB_SPEC
unconditionally, in this commit (2017-02-01):
1 | + #undef LIB_SPEC |
But later in the final port commit submitted to gcc (2017-02-07; svn r245224), they limited the change to -pthread
:
1 | - " " LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION \ |
The reason of making such limitation remains unknown.
Wrap Up
glibc
moved libpthread
into libc
, and CMake's pthread sample code compiles directly. Hence, CMake thinks there's no need to provide -pthread
when invoking gcc
/ g++
. Since we used to patch such packages with -pthread
instead of -latomic
, and are actually relying on -pthread
's side effect (--as-needed -latomic
), this led to the consequence that neither -pthread
nor -latomic
is provided, so subword atomics refuse to compile as libatomic
is not linked. For ways of fixing this problem, please refer to the Solution chapter.
Also note that the dependence on libatomic
causes another problem. Some packages / frameworks, e.g. JUCE
insists that its atomic wrap only accepts lock-free values. However, when a program is compiled, the compiler cannot know whether a call to libatomic
will be lock-free or not (actually, the emitted sequence will be lock-free, but the compiler doesn't know). Hence, the compiler will return false for std::atomic<bool>::always_lock_free
, and this also happens for the macro ATOMIC_BOOL_LOCK_FREE
defined in atomic.h
. As a result, if you use juce::Atomic<bool>
, some static asserts will fail, and the package refuses to compile.
Solution
Solutions vary according to the scale of projects.
- As a developer / user on behalf of yourself,
export CFLAGS="$CFLAGS --as-needed -latomic"
and likewise forCXXFLAGS
solve the problem. As an alternative, you can also patch theMakefile
file, and edit the flags inside it. - As a packager maintaining multiple packages with the same toolchain, you may want to write a module / universal patch. For projects with CMake, you can use LLVM's
CheckAtomic.cmake
, which (if you are using a newer version) has taken subword atomics into consideration. - As a developer maintaining a relatively large project that has gained some users, you may need to consider the compatibility with
clang
andcompiler-rt
. Clang should be able to handle-latomic
, but better perform a double-check. You can temporarily rely on the side-effect of gcc-pthread
(which is not recommended), or edit your configure script manually to disable-latomic
whenclang
andcompiler-rt
are detected. - As a developer of a widely-used toolchain project, despite considering
clang
andcompiler-rt
, compatibility with older compilers / linkers that do not support--as-needed
and even-pthread
might need to be considered. CMake reverted their-pthread
fix on this because a compiler named XL does not support-pthread
, and interprets it as-p -t hread
, returning zero (oops! false negative) as its exit code. - The final approach of solving this is to add subword atomic inlining support into gcc. There's a pending patch regarding this, but this patch only adds support for
__atomic_fetch_add
, so the problem would not be mitigated (but this patch indicates a good start!).
Complicated solutions may increase the burden on project developers / maintainers, and their willingness to port their projects to RISC-V might be reduced. Aiming to tackle the problem, we have proposed to change LIB_SPEC
back to its originally patched form to mitigate this issue partially (still not lock-free), and hope that the subword atomic pitfall can be solved completely in the future.
UPDATE: After discussing this with gcc RISC-V devs, we reached the consensus that maybe a gcc configuration argument
--always-link-libatomic
can be added.
UPDATE2: This breaks the bootstrap process of GCC. GCC compiles itself twice to get rid of e.g. dirty compiling environments stuffs. Modifying the spec string too early causes glitches, like libatomic calls itself infinitely. So we have to use
sed
to replace the spec string inside the first-stage GCC binary carefully, filling blanks (U+0020
) to make sure old and new spec strings have the same length.