Lately we've been compiling Rust 1.59.1 on riscv64gc (referred to as rv64 in this article). We are packaging for PLCT's Arch Linux RISC-V project, and in keeping with Arch tradition, the tools in our toolchain are always the latest releases. Still, we hit a strange compile error:

After performing some simple queries (cs.github.com is awesome!), we managed to locate the following code (ref):

## pause, fence and .insn

TL;DR: pause is fence w,0, which is .insn i 0x0F, 0, x0, x0, 0x010. pause is provided by the Zihintpause extension. Although the widely adopted riscv64gc does not contain Zihintpause, fence w,0 won't trigger a SIGILL, because it is an ordinary fence encoding that hardware without Zihintpause treats much like a nop. We can therefore use it safely without worrying about compatibility.

First, by reading the comments, we can infer that this .insn assembly acts as the pause instruction. This stunned me a bit because AFAIK, the HINT feature of the RISC-V ISA has always been in the reserved state (at least until December 2021):

> No standard hints are presently defined. We anticipate standard hints to eventually include memory-system spatial and temporal locality hints, branch prediction hints, thread-scheduling hints, security tags, and instrumentation flags for simulation/emulation.

It's not hard to find out, though, that the pause instruction is introduced by the Zihintpause extension, as an alias for fence w,0. Previously, we had to use nop to polyfill the pause function implemented for other architectures:

However, in RISC-V, the nop instruction simply stands for "no operation". It gives the CPU no hint that it may relax, so it saves no energy (and in fact wastes some). So, it's definitely better to replace the fake pause (i.e. nop) with the real pause, as provided by the Zihintpause extension.
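To make the use case concrete, here is a minimal sketch of the kind of busy-wait loop where a pause hint belongs. It uses `std::hint::spin_loop()` from the Rust standard library; how that call is lowered on a given target is up to the standard library, so reading it as a pause on RISC-V is an assumption here. The names `wait_until_set` and `run_demo` are made up for illustration.

```rust
use std::hint::spin_loop;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Busy-wait until `flag` becomes true and return how many times we spun.
/// (Hypothetical helper, not taken from the Rust sources.)
fn wait_until_set(flag: &AtomicBool) -> u64 {
    let mut spins = 0u64;
    while !flag.load(Ordering::Acquire) {
        // Tell the CPU we are in a spin-wait loop so it may relax.
        spin_loop();
        spins += 1;
    }
    spins
}

/// Spawn a thread that sets the flag, spin until we observe it,
/// and return the final value of the flag.
fn run_demo() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let setter = {
        let flag = Arc::clone(&flag);
        thread::spawn(move || flag.store(true, Ordering::Release))
    };
    let spins = wait_until_set(&flag);
    setter.join().unwrap();
    println!("flag observed after {} spins", spins);
    flag.load(Ordering::Acquire)
}

fn main() {
    assert!(run_demo());
}
```

The loop is correct with or without the hint; the hint only affects how politely the core waits.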

One may say that, hey, this extension is not part of the riscv64gc extension set! This argument is valid, as riscv64gc stands for riscv64imafdc_Zicsr_Zifencei (it used to be riscv64imafdc, before the I base was split into I + Zicsr + Zifencei). Let's look at the pause instruction in detail:

Before trying this, you need a RISC-V runtime environment, either QEMU or a physical board such as one from SiFive.

Uh-oh, seems like the assembler does not know this instruction! That's because 1) the pause instruction is too new; 2) I'm using riscv64gc, which does not include Zihintpause. Never mind, we can still use fence w,0 as noted in the spec:

Not good. But at least the Rust version should work, as Rust has RISC-V as a tier-2 target and managed to get the release through its CI tests, right?

Hmm, seems like the .insn i is working (compiling, at least), but what does this fence w,unknown stand for? Let's have a look at the spec:

Shouldn't it be fence w,0? Actually, it is fence w,0, as denoted by the hex 0100000f:
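The arithmetic can also be checked by hand. The sketch below packs the fields of `.insn i 0x0F, 0, x0, x0, 0x010` into a 32-bit I-type word and then splits the immediate into the fence's fm, pred and succ fields. The function names are invented for this illustration; the bit layout follows the standard I-type and FENCE formats.

```rust
/// Pack I-type fields the way `.insn i opcode, funct3, rd, rs1, simm12` does:
/// imm[11:0] | rs1 | funct3 | rd | opcode.
fn encode_i_type(opcode: u32, funct3: u32, rd: u32, rs1: u32, imm12: u32) -> u32 {
    ((imm12 & 0xfff) << 20)
        | ((rs1 & 0x1f) << 15)
        | ((funct3 & 0x7) << 12)
        | ((rd & 0x1f) << 7)
        | (opcode & 0x7f)
}

/// Split the 12-bit immediate of a FENCE word into (fm, pred, succ).
fn fence_fields(word: u32) -> (u32, u32, u32) {
    let imm = word >> 20;
    ((imm >> 8) & 0xf, (imm >> 4) & 0xf, imm & 0xf)
}

/// Render a 4-bit predecessor/successor set as letters (i, o, r, w).
/// An empty set is printed as "0", matching the `fence w,0` spelling.
fn fence_set(bits: u32) -> String {
    let mut s = String::new();
    for &(mask, ch) in [(8u32, 'i'), (4, 'o'), (2, 'r'), (1, 'w')].iter() {
        if bits & mask != 0 {
            s.push(ch);
        }
    }
    if s.is_empty() {
        s.push('0');
    }
    s
}

fn main() {
    let word = encode_i_type(0x0f, 0, 0, 0, 0x010);
    let (fm, pred, succ) = fence_fields(word);
    // Prints "0100000f: fm=0 fence w,0"
    println!("{:08x}: fm={} fence {},{}", word, fm, fence_set(pred), fence_set(succ));
}
```

The predecessor set contains only W and the successor set is empty, which is exactly the case an older disassembler has no name for.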

IMO this is a subtle bug (it used to be a feature) of the disassembler, and as we can see, it is already fixed in the LLVM toolchain:

After digging into the underlying mechanism of the pause instruction, we can easily conclude that it will not trigger a SIGILL, as the fence instruction is part of the RV32I baseline instruction set, hence available in all valid RISC-V instruction sets.

In conclusion, it's 100% safe to replace pause with .insn i 0x0F, 0, x0, x0, 0x010 to make the code compile, regardless of what RISC-V extensions you are using.
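As a rough sketch of what such a replacement could look like in Rust, the hypothetical `cpu_relax` helper below emits the raw encoding on RISC-V targets and falls back to `std::hint::spin_loop()` elsewhere. It assumes a compiler with stable `core::arch::asm!` (Rust 1.59 or later) whose assembler accepts the .insn directive; the `options(nomem, nostack)` choice is also an assumption, made on the grounds that a hint is not meant to order memory accesses.

```rust
/// Hypothetical helper: emit the pause hint as a raw encoding, so the
/// assembler does not need to know the `pause` mnemonic.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
#[inline(always)]
fn cpu_relax() {
    // `fence w,0`, i.e. `pause`, written with the encoding from this article.
    unsafe {
        core::arch::asm!(".insn i 0x0F, 0, x0, x0, 0x010", options(nomem, nostack));
    }
}

/// On every other architecture, defer to the standard library's hint.
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
#[inline(always)]
fn cpu_relax() {
    std::hint::spin_loop();
}

/// Call the hint `n` times and report how many calls completed.
fn relax_n(n: u32) -> u32 {
    let mut done = 0;
    for _ in 0..n {
        cpu_relax();
        done += 1;
    }
    done
}

fn main() {
    println!("relaxed {} times", relax_n(5));
}
```

Whether the RISC-V branch assembles still depends on the toolchain accepting .insn at all.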

## Tracing the Problem

Ah, I know someone must already be complaining: you've written so much analysis, but how is it related to the error that occurred when compiling Rust? Actually, the direct answer to this question is simple. Let's take pause.c, compile it with clang 13, and see what happens:

Obviously, clang 13.0.1 still doesn't support the .insn directive. Support was added in clang 14.0, as we can see from the target branch of llvm commit 28387979: [RISCV] Initial support .insn directive for the assembler. This commit was imported into Rust in pull request #91528.

Still, one small question remains: the Rust PR (#91528) was merged months before the initial release of llvm 14.0.0-rc1. Sure, the Rust folks are always keen on trying nightly, bleeding-edge stuff, but how could they get hold of the 14.x llvm toolchain before upstream had released it?

After investigating PR #91528, things become clear. The Rust team maintains a fork of llvm at rust-lang/llvm-project, and they cherry-picked commit 28387979 so that the .insn directive compiles with llvm 13.

By maintaining a fork and constantly modifying / cherry-picking on demand, the Rust team is able to benefit from unreleased changes, or add support for older OSes/platforms that are not supported by upstream. However, this is definitely not good news for downstream packagers:

• From time to time you can't compile Rust with the original toolchain, namely whenever the fork carries cherry-picked commits, or when the Rust team adds new features to their fork and fails to submit them upstream quickly enough.
• If you compile rust-lang/llvm-project first, and use the resulting llvm toolchain (let's call it rust-llvm) to compile Rust itself, then the rust package conflicts with the llvm package: rustc may need to link to .so files provided by llvm (or rust-llvm, depending on which one you use to compile rustc), and those .so files may have a different ABI (Application Binary Interface), causing incompatibility. Also:
  • Arch Linux only provides shared builds, so static linking is not a preferable way to solve this.
  • Splitting the rust package into rust-llvm + rust won't help. The linked .so files need to be present at run-time, so rust-llvm would be put into rust's depends array, and the conflict would remain unresolved.

Currently, we can still hide the problem by letting the build fail for some time and waiting for newer llvm releases that contain the features required to build rustc. Sure, this solution is not elegant, and things might get worse if the difference between rust-llvm and llvm becomes so large that it's impossible to compile rustc with upstream llvm. But we can't afford to make rust incompatible with llvm, since there are far too many packages that depend on both rust and llvm now. Fortunately, considering the Rust team's claim about rust-llvm, that they will always attempt to submit their new features upstream, maybe we don't need to worry too much.