A RISC-V gcc pitfall revealed by a glibc update
UPDATE 2023-05-11: Finally we have builtin inline subword atomic support in GCC!
- [PATCH] riscv: Don't add -latomic with -pthread https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617355.html
- [PATCH v7] RISCV: Inline subword atomic ops https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616807.html
So we won't be bothered by this issue anymore.
The original blog post:
TL;DR: see "Wrap Up" and "Solution".
Problem
Recently, when we're working on a series of package rebuild pipelines triggered by a glibc update (from 2.33 to 2.34), we discovered that some previously compiling packages, mostly configured to use CMake, refuse to compile now. Their error logs look like this:
1 | /usr/bin/ld: libbson-1.0.so.0.0.0: undefined reference to `__atomic_exchange_1' |
This is quite strange to us, because we used to think we have got rid of this for all by manually appending set(THREADS_PREFER_PTHREAD_FLAG ON) to these packages' CMakeLists.txt, despite replacing all -lpthread with -pthread. This was done in a per-package manner, modifying the patch to meet the need of the tech stack used by the specific package, and we are sure that they used to work before the glibc upgrade.
Before proceeding, you need to know the difference between
-pthreadand-lpthread, and that-pthreadis the most preferable way if you want to link to thepthreadlibrary, at least forglibc<=2.33.
So, what's happening? Is there anything vital got changed in this glibc upgrade?
Cause
After reading the release note of glibc 2.34, we did notice something related:
...all functionality formerly implemented in the libraries
libpthread,libdl,libutil,libanlhas been integrated intolibc. New applications do not need to link with-lpthread,-ldl,-lutil,-lanlanymore. For backwards compatibility, empty static archiveslibpthread.a,libdl.a,libutil.a,libanl.aare provided, so that the linker options keep working.
Hmm, good, sounds like we don't need -pthread anymore. Actually, it appeared to us that CMake thinks the same. After running cmake for the failing packages, we can confirm that nothing looks like -pthread is appended to either CXXFLAGS or LDFLAGS. However, if forced to compile with -pthread, the previous-mentioned link error disappears. In other words, applying this patch fixes the error.
1 | - make |
At this point, every clue we have gathered so far seems to indicate a bug inside cmake:
- Other tools work fine
set(THREADS_PREFER_PTHREAD_FLAG ON)appears to be "broken"- Manually appending
-pthreadfixes the issue, socmakefailed to recognize that-pthreadis necessary
Root Cause
Blame CMake?
So we dig into CMake's source code to see what happened. To our surprise, we didn't notice any change to the detection code it used for determining whether -pthread is necessary. CMake's detection code, at the time of writing, looks like this:
1 |
|
It turned out that according to the glibc upgrade, this detection code now compiles without additional arguments:
1 | gcc test_pthread.c |
As far as I can tell, this does not look like a CMake bug. The expected behavior, which is exactly what we have seen, is that if the test code can be compiled without -pthread, then the argument should not be appended to the command like (or {C,CXX}FLAGS, correspondingly). This lead to another question: why those packages are failing now, provided that the detection code is unchanged, and it used to work properly?
Dive into Source Code
Let's take the package mongo-c-driver as an example. As of version 1.21.1, we can see the code listed below in its src/libbson/src/bson/bson-atomic.h (Feel familiar with this path? Remember the libbson-1.0.so.0.0.0: undefined reference error log mentioned before?):
1 |
|
As we can tell from libbson's source code, it attempts to use the __atomic_* builtins of GCC, like __atomic_fetch_add and __atomic_compare_exchange. We can construct some test cases to check whether we're right:
1 | // test1.c |
1 | // test2.c |
1 | // test4.c |
1 | // test8.c |
Here, we use the gcc built-in
__atomic_*becauselibbsonused it. Replacing this withstd::atomic<T>(and compile it withg++) yields identical results correspondingly. You may also useuint*_tprovided by thelinux/types.hheader, as their sizes are promised to be the same across all architectures.
Compile results:
1 | gcc test1.c |
If you are kind of familiar with C++, you may have heard that codes using atomic operations might need to link libatomic (I remembered once reading this, but I can't find it now). It turned out that if we provide -latomic to gcc, the code compiles:
1 | gcc test1.c -latomic |
And this also works for -pthread:
1 | gcc test1.c -pthread |
Smart readers might have felt something unusual:
First, as we can see, gcc compiles test4.c and test8.c successfully without -latomic or -pthread, but it couldn't handle the atomic operations in test1.c and test2.c without making a call to libatomic:
1 | gcc test1.c -latomic -S |
This is partially explained in a GitHub issue (riscv-collab/riscv-gcc#12). Till April 2022, i.e. when this blog is written, gcc does not support inlining subword atomics, and this is the reason why -latomic must be presented when invoking gcc.
But what about -pthread? Why -pthread works either? I mean, libpthread itself does not provide those atomic symbols, right? Actually, if you use ldd to check a.out generated by gcc test1.c -pthread, you will notice that libatomic is linked, instead of libpthread as we are expecting from -pthread. Changing -pthread with -lpthread won't work, so gcc must have done something internally for -pthread despite linking libpthread. In order to figure out the difference, we have to check the gcc spec:
1 | gcc -dumpspecs | grep pthread |
Oh, the answer finally reveals: gcc silently append --as-needed -latomic if -pthread is provided. This also explains why changing -lpthread to -pthread works.
The reason why gcc decided to link libatomic when -pthread is provided had been lost in a squashed commit. My guess is that they tried to cover up the subword atomic pitfall using this strategy. (UPDATE: aswaterman's initial fix did not limit the scope to -pthread. The limitation was added one week later.) Generally, atomics are used with multi-threading, and you use pthread in such cases. However, one may use atomics without pthread, like when writing SIGNAL handlers. So IMO this workaround is not good enough, and should be replaced by subword atomic inline support.
UPDATE: We discussed this with some gcc RISC-V devs, and it turned out that when compiling gcc, at stage 1 it may not recognize libatomic (i.e.
-latomicmay not work) if you are compiling it with a gcc that uses the old spec So you'll have to set--with-specat stage 1 to workaround this. Hopefully there would be an--always-link-libatomicconfiguration flag in the future.
Fun facts: gcc once had added --as-needed -latomic to its LIB_SPEC unconditionally, in this commit (2017-02-01):
1 | + #undef LIB_SPEC |
But later in the final port commit submitted to gcc (2017-02-07; svn r245224), they limited the change to -pthread:
1 | - " " LD_AS_NEEDED_OPTION " -latomic " LD_NO_AS_NEEDED_OPTION \ |
The reason of making such limitation remains unknown.
Wrap Up
glibc moved libpthread into libc, and CMake's pthread sample code compiles directly. Hence, CMake thinks there's no need to provide -pthread when invoking gcc / g++. Since we used to patch such packages with -pthread instead of -latomic, and are actually relying on -pthread's side effect (--as-needed -latomic), this led to the consequence that neither -pthread nor -latomic is provided, so subword atomics refuse to compile as libatomic is not linked. For ways of fixing this problem, please refer to the Solution chapter.
Also note that the dependence on libatomic causes another problem. Some packages / frameworks, e.g. JUCE insists that its atomic wrap only accepts lock-free values. However, when a program is compiled, the compiler cannot know whether a call to libatomic will be lock-free or not (actually, the emitted sequence will be lock-free, but the compiler doesn't know). Hence, the compiler will return false for std::atomic<bool>::always_lock_free, and this also happens for the macro ATOMIC_BOOL_LOCK_FREE defined in atomic.h. As a result, if you use juce::Atomic<bool>, some static asserts will fail, and the package refuses to compile.
Solution
Solutions vary according to the scale of projects.
- As a developer / user on behalf of yourself,
export CFLAGS="$CFLAGS --as-needed -latomic"and likewise forCXXFLAGSsolve the problem. As an alternative, you can also patch theMakefilefile, and edit the flags inside it. - As a packager maintaining multiple packages with the same toolchain, you may want to write a module / universal patch. For projects with CMake, you can use LLVM's
CheckAtomic.cmake, which (if you are using a newer version) has taken subword atomics into consideration. - As a developer maintaining a relatively large project that has gained some users, you may need to consider the compatibility with
clangandcompiler-rt. Clang should be able to handle-latomic, but better perform a double-check. You can temporarily rely on the side-effect of gcc-pthread(which is not recommended), or edit your configure script manually to disable-latomicwhenclangandcompiler-rtare detected. - As a developer of a widely-used toolchain project, despite considering
clangandcompiler-rt, compatibility with older compilers / linkers that do not support--as-neededand even-pthreadmight need to be considered. CMake reverted their-pthreadfix on this because a compiler named XL does not support-pthread, and interprets it as-p -t hread, returning zero (oops! false negative) as its exit code. - The final approach of solving this is to add subword atomic inlining support into gcc. There's a pending patch regarding this, but this patch only adds support for
__atomic_fetch_add, so the problem would not be mitigated (but this patch indicates a good start!).
Complicated solutions may increase the burden on project developers / maintainers, and their willingness to port their projects to RISC-V might be reduced. Aiming to tackle the problem, we have proposed to change LIB_SPEC back to its originally patched form to mitigate this issue partially (still not lock-free), and hope that the subword atomic pitfall can be solved completely in the future.
UPDATE: After discussing this with gcc RISC-V devs, we reached the consensus that maybe a gcc configuration argument
--always-link-libatomiccan be added.
UPDATE2: This breaks the bootstrap process of GCC. GCC compiles itself twice to get rid of e.g. dirty compiling environments stuffs. Modifying the spec string too early causes glitches, like libatomic calls itself infinitely. So we have to use
sedto replace the spec string inside the first-stage GCC binary carefully, filling blanks (U+0020) to make sure old and new spec strings have the same length.