Rust is incompatible with LLVM, at least partially
Lately we've been compiling Rust 1.59.1 on riscv64gc (referred to as rv64 in this article). We are packaging for PLCT's Arch Linux RISC-V project, and thanks to the Arch wisdom, the tools in our toolchain are always the latest. Still, we met a strange compile error:
error: unknown directive
After performing some simple queries (cs.github.com is awesome!), we managed to locate the following code (ref):
//! Shared RISC-V intrinsics
pause, fence and .insn
TL;DR:
- pause is fence w,0, which is .insn i 0x0F, 0, x0, x0, 0x010.
- pause is provided by the Zihintpause extension. Though the widely adopted riscv64gc does not contain Zihintpause, fence w,0 won't trigger a SIGILL, because it is treated similarly to a nop instruction, so we can use it safely without worrying about compatibility.
Firstly, by reading the comments, we can infer that this .insn assembly acts as the pause instruction. This stunned me a bit because AFAIK, the HINT feature in the RISC-V ISA has always been in the reserved state (at least until December 2021):
No standard hints are presently defined. We anticipate standard hints to eventually include memory-system spatial and temporal locality hints, branch prediction hints, thread-scheduling hints, security tags, and instrumentation flags for simulation/emulation.
It's not hard to realize, though, that the pause instruction is introduced by the Zihintpause extension, as an alias for fence w,0. Previously, we had to use nop to polyfill the pause function implemented for other architectures:
From 4e559dabe28e57ee27cb45c8297e1e387beed1d3 Mon Sep 17 00:00:00 2001
However, in RISC-V, the nop instruction simply stands for "no operation", and does not provide any further clues to relax the CPU, hence not saving any energy (but instead wasting it). So, it's definitely better to replace the fake pause (i.e. nop) with the real pause, as provided in the Zihintpause extension.
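From the Rust side, the portable way to ask for such a hint is `std::hint::spin_loop()`. The sketch below shows a spin-wait that uses it; `spin_until_set` is a hypothetical helper written for this article, and which machine instruction the hint lowers to on a given target (a pause-style hint, a plain nop, or nothing at all) is decided by the toolchain, so treat the RISC-V mapping as an assumption rather than a guarantee.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Busy-wait until `flag` becomes true, hinting the CPU on every iteration.
/// Returns how many times the loop body ran.
fn spin_until_set(flag: &AtomicBool) -> u64 {
    let mut spins = 0u64;
    while !flag.load(Ordering::Acquire) {
        // Tells the CPU we are in a spin-wait loop; the effect is target-specific.
        hint::spin_loop();
        spins += 1;
    }
    spins
}

fn main() {
    let flag = Arc::new(AtomicBool::new(false));
    let setter = {
        let flag = Arc::clone(&flag);
        thread::spawn(move || flag.store(true, Ordering::Release))
    };
    let spins = spin_until_set(&flag);
    setter.join().unwrap();
    println!("flag observed after {spins} spins");
}
```

The loop is correct with or without the hint; the hint only changes how politely the core waits.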
One may say that, hey, this extension is not part of the riscv64gc extension set! This argument is valid, as riscv64gc stands for riscv64imafdc_Zicsr_Zifencei (it used to be riscv64imafdc, before the I baseline was split into I + Zicsr + Zifencei). Let's look at the pause instruction in detail:
Before doing this, you need to grab a RISC-V run-time, either via QEMU or by buying a board from SiFive.
cat pause.asm
Uh-oh, seems like the assembler does not know this instruction! That's because 1) the pause instruction is too new; 2) I'm using riscv64gc, which does not include Zihintpause. Never mind, we can still use fence w,0 as noted in the spec:
cat pause.asm
Not good. But at least the Rust version should work, since RISC-V is a tier-2 target there and the release managed to pass the CI tests, right?
cat pause.asm
Hmm, seems like the .insn i is working (compiling, at least), but what does this fence w,unknown stand for? Let's have a look at the spec:
Shouldn't it be fence w,0? Actually, it is fence w,0, as denoted by the hex 0100000f:
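The field breakdown of that word can be sketched in a few lines. This is an illustrative decoder, not the LLVM or binutils one: `decode_fence` and `set_name` are hypothetical names, and the layout (fm in bits 31..28, pred in 27..24, succ in 23..20, opcode 0x0F, funct3 0) is assumed from the FENCE description in the RISC-V unprivileged spec. For simplicity it does not inspect rd and rs1.

```rust
/// The three fields packed into the upper 12 bits of a FENCE instruction.
#[derive(Debug, PartialEq)]
struct Fence {
    fm: u32,
    pred: u32,
    succ: u32,
}

/// Decode `word` as a FENCE, or return None if it is some other instruction.
fn decode_fence(word: u32) -> Option<Fence> {
    // Opcode MISC-MEM is 0x0F; funct3 == 0 selects FENCE.
    if word & 0x7F != 0x0F || (word >> 12) & 0x7 != 0 {
        return None;
    }
    Some(Fence {
        fm: (word >> 28) & 0xF,
        pred: (word >> 24) & 0xF,
        succ: (word >> 20) & 0xF,
    })
}

/// Render a pred/succ set as letters; the empty set is printed as "0".
fn set_name(bits: u32) -> String {
    if bits == 0 {
        return "0".to_string();
    }
    let mut name = String::new();
    for (mask, letter) in [(8, 'i'), (4, 'o'), (2, 'r'), (1, 'w')] {
        if bits & mask != 0 {
            name.push(letter);
        }
    }
    name
}

fn main() {
    let fence = decode_fence(0x0100_000F).expect("not a FENCE");
    println!("fence {},{}", set_name(fence.pred), set_name(fence.succ));
}
```

The successor set is simply empty here. A printer that has no name for the empty set has to fall back on something, and "unknown" is one such fallback.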
IMO this is a subtle bug (used to be a feature) of the disassembler, and as we can see, it has already been fixed in the LLVM toolchain:
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
After digging into the underlying mechanism of the pause instruction, we can easily conclude that it will not trigger a SIGILL, as the fence instruction is part of the RV32I baseline instruction set, hence available in all valid RISC-V instruction sets.
cat pause.c
In conclusion, it's 100% safe to replace pause with .insn i 0x0F, 0, x0, x0, 0x010 to make the code compile, regardless of which RISC-V extensions you are using.
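In Rust, the same replacement can be written with inline assembly. The following is a sketch in the spirit of the intrinsic discussed above, not a copy of it: the function name `pause`, the `nomem, nostack` options, and the non-RISC-V fallback are all choices made here for illustration. It assumes Rust 1.59 or later (stable `asm!`) and, on RISC-V, an assembler that understands the `.insn` directive.

```rust
/// RISC-V: emit the raw encoding of `fence w,0` (which is what pause is),
/// so the assembler does not need to know the Zihintpause mnemonic.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
#[inline]
pub fn pause() {
    // SAFETY: the instruction is used purely as a hint here; it clobbers no
    // registers and does not touch the stack.
    unsafe { core::arch::asm!(".insn i 0x0F, 0, x0, x0, 0x010", options(nomem, nostack)) }
}

/// Every other target: defer to the portable spin-loop hint.
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
#[inline]
pub fn pause() {
    std::hint::spin_loop();
}

fn main() {
    for _ in 0..4 {
        pause();
    }
    println!("ok");
}
```

Because the RISC-V branch is selected with `cfg`, the `.insn` line is only ever handed to an assembler when actually targeting RISC-V, which is exactly where the assembler's version starts to matter.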
Tracing the Problem
Ah, I know someone must already be complaining: you've written so much analysis, but how is it related to the error that occurred when compiling Rust? Actually, the direct answer to this question is simple and naive. Let's take pause.c, compile it with clang 13, and see what happens:
clang -v
Obviously, clang 13.0.1 still doesn't support assembling the .insn directive. Support arrives in clang 14.0, as we can see from the target branch of llvm commit 28387979: [RISCV] Initial support .insn directive for the assembler. This was imported into Rust in this pull request (#91528).
Still, one tiny issue remains: the Rust PR (#91528) was merged months before the initial release of llvm 14.0.0-rc1. Sure, the Rust folks are always keen on trying nightly, bleeding-edge stuff, but how could they grab the 14.x llvm toolchain before upstream had even released it?
After investigating PR #91528, things become clear. The Rust team maintains a fork of llvm at rust-lang/llvm-project, and they cherry-picked commit 28387979 to make the .insn stuff compile when using llvm 13.
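For a packager, the situation boils down to a small compatibility rule. The sketch below encodes only what is stated in this article: upstream llvm 13 lacks `.insn`, upstream 14 has it, and the Rust fork of 13 has it through a cherry-pick. `LlvmFlavor`, `parse_major`, and `assembles_insn` are hypothetical names, and the `RustFork` branch assumes a fork revision that already contains the cherry-picked commit.

```rust
/// Which llvm tree a toolchain was built from.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LlvmFlavor {
    Upstream,
    RustFork,
}

/// Extract the major version from strings such as "13.0.1" or "14.0.0-rc1".
fn parse_major(version: &str) -> Option<u32> {
    version.trim().split('.').next()?.parse().ok()
}

/// Does this toolchain assemble the RISC-V `.insn` directive?
/// Returns None when the version string cannot be parsed.
fn assembles_insn(flavor: LlvmFlavor, version: &str) -> Option<bool> {
    let major = parse_major(version)?;
    Some(match flavor {
        // Upstream gained the directive in the 14.x series.
        LlvmFlavor::Upstream => major >= 14,
        // Assumption: the fork's 13.x branch carries the cherry-pick.
        LlvmFlavor::RustFork => major >= 13,
    })
}

fn main() {
    for (flavor, version) in [
        (LlvmFlavor::Upstream, "13.0.1"),
        (LlvmFlavor::RustFork, "13.0.0"),
        (LlvmFlavor::Upstream, "14.0.0-rc1"),
    ] {
        println!("{:?} {} -> {:?}", flavor, version, assembles_insn(flavor, version));
    }
}
```

The uncomfortable part is the second row: the same version number means different things depending on whose tree it came from.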
By maintaining a fork and constantly modifying / cherry-picking on demand, the Rust team is able to benefit from unreleased changes, or add support for older OS/platforms that are not supported by upstream. However, this is definitely not good news for downstream packagers:
- From time to time, you can't compile Rust with the original toolchain: when there are cherry-picked commits, or when the Rust team adds new features to their fork and fails to submit them upstream quickly enough.
- If you compile rust-lang/llvm-project first, and use the compiled llvm toolchain (let's call it rust-llvm) to compile Rust itself, then the rust package would conflict with the llvm package, as rustc may need to link to .so files provided by llvm (or rust-llvm, depending on which one you use to compile rustc), and the .so files may have different ABIs (Application Binary Interfaces), causing incompatibility. Also,
  - Arch Linux only provides shared builds, so static linking is not a preferable way to solve this.
  - Splitting the rust package into rust-llvm + rust won't help. That's because the linked .so files need to be present at run-time, so rust-llvm will be put into rust's depends array, hence failing to resolve the conflict.
Currently, we can still hide the problem by letting the build fail for some time, and waiting for newer llvm releases that contain the features required to build rustc. Sure, this solution is not elegant, and things might get worse if the difference between rust-llvm and llvm becomes so huge that it's impossible to compile rustc with the upstream llvm. But we can't take on the burden of making rust incompatible with llvm: there are way too many packages that depend on both rust and llvm now. Fortunately, considering the Rust team's claim about rust-llvm, that they will always attempt to submit their new features upstream, maybe we don't need to worry too much.