Achievement unlocked: rustc segfault

$ cargo build --example basic --features usdt-probes
[...snip...]
error: could not compile `dropshot`

Caused by:
  process didn't exit successfully: `rustc [...snip...]` (signal: 11, SIGSEGV: invalid memory reference)

Achievement unlocked: rustc segfault.

Stack trace
fffffc7fce3fcbc0 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityInfo::computeEestimateBlockWeight(llvm::Function const&, llvm::DominatorTree*, llvm::PostDominatorTree*)+0xd84()
fffffc7fce3fd370 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityInfo::calculate(llvm::Function const&, llvm::LoopInfo const&, llvm::TargetLibraryInfo const*, llvm::DominatorTree*, llvm::PostDominatorTree*)+0x131()
fffffc7fce3fd3c0 librustc_driver-77cef3efbfa7284c.so`llvm::BranchProbabilityAnalysis::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&)+0x134()
fffffc7fce3fd5f0 librustc_driver-77cef3efbfa7284c.so`llvm::detail::AnalysisPassModel<llvm::Function, llvm::BranchProbabilityAnalysis, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>::Invalidator>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&)+0x2f()
fffffc7fce3fd6a0 librustc_driver-77cef3efbfa7284c.so`llvm::AnalysisManager<llvm::Function>::getResultImpl(llvm::AnalysisKey*, llvm::Function&)+0x2de()
fffffc7fce3fd6d0 librustc_driver-77cef3efbfa7284c.so`llvm::BlockFrequencyAnalysis::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&)+0x3f()
fffffc7fce3fd710 librustc_driver-77cef3efbfa7284c.so`llvm::detail::AnalysisPassModel<llvm::Function, llvm::BlockFrequencyAnalysis, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>::Invalidator>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&)+0x26()
fffffc7fce3fd7c0 librustc_driver-77cef3efbfa7284c.so`llvm::AnalysisManager<llvm::Function>::getResultImpl(llvm::AnalysisKey*, llvm::Function&)+0x2de()
fffffc7fce3fdc30 librustc_driver-77cef3efbfa7284c.so`llvm::AlwaysInlinerPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)+0xa2c()
fffffc7fce3fdc50 librustc_driver-77cef3efbfa7284c.so`llvm::detail::PassModel<llvm::Module, llvm::AlwaysInlinerPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)+0x15()
fffffc7fce3fddc0 librustc_driver-77cef3efbfa7284c.so`llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&)+0x4b5()
fffffc7fce3ff170 librustc_driver-77cef3efbfa7284c.so`LLVMRustOptimizeWithNewPassManager+0x7f2()
fffffc7fce3ff3a0 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_llvm::back::write::optimize_with_new_llvm_pass_manager+0x372()
fffffc7fce3ff5b0 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_llvm::back::write::optimize+0x388()
fffffc7fce3ff900 librustc_driver-77cef3efbfa7284c.so`rustc_codegen_ssa::back::write::execute_work_item::<rustc_codegen_llvm::LlvmCodegenBackend>+0x1f3()
fffffc7fce3ffdb0 librustc_driver-77cef3efbfa7284c.so`std::sys_common::backtrace::__rust_begin_short_backtrace::<<rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::backend::ExtraBackendMethods>::spawn_named_thread<rustc_codegen_ssa::back::write::spawn_work<rustc_codegen_llvm::LlvmCodegenBackend>::{closure#0}, ()>::{closure#0}, ()>+0xf7()
fffffc7fce3fff60 librustc_driver-77cef3efbfa7284c.so`<<std::thread::Builder>::spawn_unchecked_<<rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_ssa::traits::backend::ExtraBackendMethods>::spawn_named_thread<rustc_codegen_ssa::back::write::spawn_work<rustc_codegen_llvm::LlvmCodegenBackend>::{closure#0}, ()>::{closure#0}, ()>::{closure#1} as core::ops::function::FnOnce<()>>::call_once::{shim:vtable#0}+0xa9()
fffffc7fce3fffb0 libstd-ef15f81a900bedf3.so`std::sys::unix::thread::Thread::new::thread_start::h24133bfe318082b5+0x27()
fffffc7fce3fffe0 libc.so.1`_thrp_setup+0x6c(fffffc7fed642280)
fffffc7fce3ffff0 libc.so.1`_lwp_start()

Okay, so it looks like we’re faulting somewhere in LLVM. From Cliff’s initial investigation:

Anyway, yeah, something about the CFG construction there is producing either an empty basic block or a basic block ending in an unexpected kind of instruction (something that’s not an LLVM IR terminator instruction) and triggering https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/BasicBlock.h#L121

First order of business then is to just check if the IR is valid. LLVM has a pass to do just that and we can ask rustc to run it first by passing -Z verify-llvm-ir=yes (note we need to switch to nightly to use -Z flags):

$ RUSTFLAGS="-Z verify-llvm-ir=yes" cargo +nightly build --example basic --features usdt-probes

Haha, nope:

Basic Block in function '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE' does not have terminator!
label %bb24
LLVM ERROR: Broken module found, compilation aborted!

# Demangle w/ rustfilt (c++filt works well enough too)
# Single quotes are important so that $ is not misread as shell vars!

$ rustfilt '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE'
dropshot::server::http_request_handle_wrap::{{closure}}

The IR generated for a closure in dropshot::server::http_request_handle_wrap is invalid: some basic block is missing a terminator.
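To make the demangling step less magical, here is a minimal sketch of what a tool like rustfilt does with this particular symbol. It is a deliberately simplified decoder for the legacy `_ZN...E` scheme, not rustfilt's actual implementation: the function names `decode_segment` and `demangle_legacy` are made up for illustration, and only `$uXX$` escapes are handled.

```rust
/// Decode `$uXX$` escapes in one legacy-mangled path segment.
/// Simplified on purpose: other escapes (`$LT$`, `$GT$`, ...) are left untouched.
fn decode_segment(seg: &str) -> String {
    // The leading `_` in `_$u7b$...` only keeps the identifier valid; drop it.
    let seg = seg
        .strip_prefix('_')
        .filter(|s| s.starts_with('$'))
        .unwrap_or(seg);
    let mut out = String::new();
    let mut rest = seg;
    while let Some(start) = rest.find('$') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        match after.find('$') {
            Some(end) => {
                let esc = &after[..end];
                let decoded = esc
                    .strip_prefix('u')
                    .and_then(|hex| u32::from_str_radix(hex, 16).ok())
                    .and_then(char::from_u32);
                match decoded {
                    Some(c) => out.push(c),
                    // Unknown escape: keep it verbatim.
                    None => {
                        out.push('$');
                        out.push_str(esc);
                        out.push('$');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unpaired `$`: keep it and stop looking for escapes.
                out.push('$');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Split `_ZN<len><name>...E` into segments, drop the trailing hash, join with `::`.
fn demangle_legacy(sym: &str) -> Option<String> {
    let mut rest = sym.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts: Vec<&str> = Vec::new();
    while !rest.is_empty() {
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let seg = rest.get(digits..digits + len)?;
        parts.push(seg);
        rest = &rest[digits + len..];
    }
    // The last segment is usually a hash: `h` followed by 16 hex digits.
    let has_hash = parts.last().map_or(false, |last| {
        last.len() == 17
            && last.starts_with('h')
            && last[1..].chars().all(|c| c.is_ascii_hexdigit())
    });
    if has_hash {
        parts.pop();
    }
    let decoded: Vec<String> = parts.iter().map(|p| decode_segment(p)).collect();
    Some(decoded.join("::"))
}

fn main() {
    let sym = "_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE";
    // prints: dropshot::server::http_request_handle_wrap::{{closure}}
    println!("{}", demangle_legacy(sym).unwrap());
}
```

The `$u7b$` and `$u7d$` escapes are just `{` and `}` in hex, which is also why the symbol needs single quotes on the shell.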

Okay, is it rustc generating the invalid IR directly or is it the result of some transformation pass miscompiling it?

But first, let’s cheat and just get the final failing rustc command so we don’t need to rebuild all the deps every time we change RUSTFLAGS. Re-running the failing cargo command should just output the failing rustc invocation:

$ cargo +nightly build --example basic --features usdt-probes
   Compiling dropshot v0.6.1-dev (/src/dropshot/dropshot)
error: could not compile `dropshot`

Caused by:
  process didn't exit successfully: `rustc [...snip...]` (signal: 11, SIGSEGV: invalid memory reference)

From this point we can just run the rustc command directly as outputted, with a few modifications:

  • add +nightly, otherwise the rustc wrapper will try to use the Rust version mentioned in rust-toolchain.toml
  • remove the --error-format=json and --json=... flags for human-readable output
  • add -Z verify-llvm-ir=yes
  • change the --emit argument to --emit=llvm-ir because that should be enough to trigger the issue and we would like to look at the IR later
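The same edits can be written down as a small function over the argument list, which is handy if you end up doing this more than once. This is only a sketch: `repro_args` is a made-up helper, and the recorded arguments in `main` are a hypothetical stand-in for the real (snipped) invocation.

```rust
/// Rewrite the arguments of a recorded `rustc` invocation for debugging.
fn repro_args(original: &[&str]) -> Vec<String> {
    // `+nightly` must come first so the rustup wrapper picks the toolchain.
    let mut args = vec!["+nightly".to_string()];
    for arg in original {
        // Drop the machine-readable output flags to get human-readable errors.
        if arg.starts_with("--error-format=") || arg.starts_with("--json=") {
            continue;
        }
        // Emit LLVM IR instead of whatever cargo originally asked for.
        if arg.starts_with("--emit=") {
            args.push("--emit=llvm-ir".to_string());
            continue;
        }
        args.push(arg.to_string());
    }
    // Ask rustc to run the LLVM IR verifier.
    args.push("-Z".to_string());
    args.push("verify-llvm-ir=yes".to_string());
    args
}

fn main() {
    // Hypothetical, heavily shortened stand-in for the recorded invocation.
    let recorded = [
        "--crate-name",
        "basic",
        "--error-format=json",
        "--json=diagnostic-rendered-ansi",
        "--emit=dep-info,link",
        "-C",
        "debuginfo=2",
    ];
    println!("rustc {}", repro_args(&recorded).join(" "));
}
```

Everything not mentioned in the list is passed through untouched, so the repro stays as close as possible to what cargo actually ran.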

Stick this in a simple shell script so it is easy to modify and run; call it repro.sh. Check that it still fails as expected:

$ ./repro.sh
Basic Block in function '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE' does not have terminator!
label %bb24
LLVM ERROR: Broken module found, compilation aborted!

Now back to figuring out where this invalid IR is coming from. Even though we’re doing a debug build, there are still some LLVM passes that get run. So if we’re trying to check the IR that rustc directly generated, we need to make sure no LLVM passes are run at all (other than the verify pass itself). The way to do that is via -C no-prepopulate-passes, so let’s edit our repro.sh and run it all again:

Okay, rustc has been proven innocent. Looks like some LLVM pass generates invalid IR, which really shouldn’t happen! ⚠️

Well, now what? Let’s try to find out which pass is to blame!

Our first try is asking LLVM to print the IR after each pass: maybe we will get lucky and see the offending pass last. We do this by editing repro.sh once again:

  • remove -C no-prepopulate-passes & -Z verify-llvm-ir=yes
  • add -C llvm-args=--print-after-all to print the IR after each pass
  • add -C codegen-units=1 -Z no-parallel-llvm to make the output a bit more readable

Alas, this doesn’t go the way we want as we get the same segfault as before without any of the actual output we wanted 🙁

Okay, new try. Let’s skip rustc and see if we can just invoke the LLVM machinery directly via opt. For that, let’s first install it:

$ rustup component add --toolchain nightly llvm-tools-preview

It is not the most discoverable because it just gets plopped somewhere into rustc’s sysroot directory:

$ OPT=$(find $(rustc +nightly --print sysroot) -name opt)

We also need the actual IR to pass to opt, so let’s go back and modify our repro.sh to only pass -C no-prepopulate-passes. We should then find our initial rustc-generated IR. It’s also worth removing -C debuginfo=2 to make the IR a bit smaller:

$ ls ./target/debug/examples/basic*.ll
./target/debug/examples/basic-5f5f0491fbb5b7d3.ll

Let’s try something simple first and just run the IR through opt without any flags as a smoke test:

$ $OPT ./target/debug/examples/basic-5f5f0491fbb5b7d3.ll
opt: ./target/debug/examples/basic-5f5f0491fbb5b7d3.ll:425470:1: error: expected instruction opcode
bb25:                                             ; preds = %bb24
^

😐 Wat. Taking a look at the IR around that line, we find this:

bb24:
; [...snip...]
  %186 = invoke i64 asm sideeffect inteldialect "990:   clr rax\0A\0A                    .pushsection set_dtrace_probes,\22aw\22,\22progbits\22\0A                    .balign 8\0A            991: \0A                    .4byte 992f-991b    // length\0A                    .byte 1\0A                    .byte 0\0A                    .2byte 1\0A                    .8byte 990b         // address\0A                    .asciz \22dropshot\22\0A                    .asciz \22request-start\22\0A                             // null-terminated strings for each argument\0A                    .balign 8\0A            992:    .popsection\0A                    \0A                    .pushsection yeet_dtrace_probes\0A                    .8byte 991b\0A                    .popsection\0A                \0A        ", "=&{ax}"() #23
          to label %bb25 unwind label %cleanup26, !srcloc !38
  store i64 %186, i64* %is_enabled, align 8

bb25:                                             ; preds = %bb24
  %_78 = load i64, i64* %is_enabled, align 8
  %187 = icmp eq i64 %_78, 0
  br i1 %187, label %bb46, label %bb26

Well, that looks awfully like the error the LLVM IR verifier was telling us about (%bb24 not having a terminator)!
So it looks like our assumption that rustc was producing valid IR is wrong. Where did we go wrong?

Some light digging into the rustc source reveals that using the new LLVM pass manager (default for LLVM >= 13, thus Rust >= 1.56) means -Z verify-llvm-ir=yes is ignored when combined with -C no-prepopulate-passes. Whelp :/

(To be clear, it’s not really the new pass manager’s fault but rather the way it is set up in LLVMRustOptimizeWithNewPassManager.)

So now we have found one (minor) rustc bug so far, but that doesn’t help resolve our original question. No worries, we can either switch to the old pass manager (-Z new-llvm-pass-manager=no) or just manually add the verifier pass (-C passes="verify"); either way we get the same ol’ error:

$ ./repro.sh
Basic Block in function '_ZN8dropshot6server24http_request_handle_wrap28_$u7b$$u7b$closure$u7d$$u7d$17h503b14ddd4edd1deE' does not have terminator!
label %bb24
LLVM ERROR: Broken module found, compilation aborted!

This brings us back to the actual culprit: rustc!

Scooby Doo Mask Reveal Meme: Panel 1 w/ Mask on

(We really should have suspected this after the brief foray with trying to print the resulting IR after each LLVM pass didn’t give us anything: the IR we fed it was botched to begin with!)

So back to the invalid IR: at the end of bb24 we’re using invoke (a terminator) with our inline assembly from the usdt probes in dropshot, followed by a store instruction. Clearly that is wrong because store isn’t a terminator and thus we shouldn’t end a basic block with it. Let’s see what the corresponding MIR (Rust’s Mid-level IR) looks like.
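The rule being violated is simple enough to model in a few lines. The following is a toy sketch of the check, not LLVM's verifier: the `Inst` enum and `verify_block` are invented here, and the instruction set is cut down to what appears in bb24.

```rust
/// A tiny stand-in for LLVM instructions; only the terminator/non-terminator
/// distinction matters here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Inst {
    Store,
    Load,
    Invoke,
    Br,
    Ret,
}

impl Inst {
    fn is_terminator(self) -> bool {
        matches!(self, Inst::Invoke | Inst::Br | Inst::Ret)
    }
}

/// A block is well formed when it ends in a terminator and has none earlier.
fn verify_block(name: &str, insts: &[Inst]) -> Result<(), String> {
    match insts.split_last() {
        None => Err(format!("block {name} is empty")),
        Some((last, body)) => {
            if !last.is_terminator() {
                return Err(format!("block {name} does not have terminator"));
            }
            if body.iter().any(|i| i.is_terminator()) {
                return Err(format!("terminator in the middle of block {name}"));
            }
            Ok(())
        }
    }
}

fn main() {
    // The shape of the broken bb24: an invoke followed by a store.
    let broken = [Inst::Load, Inst::Invoke, Inst::Store];
    // The shape we would expect: the invoke is the last instruction.
    let valid = [Inst::Load, Inst::Invoke];
    println!("{:?}", verify_block("bb24", &broken));
    println!("{:?}", verify_block("bb24", &valid));
}
```

With the store trailing the invoke, the block's last instruction is not a terminator, which is exactly the complaint in the verifier output.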

Since the failing code is coming from dropshot and not the basic example itself, we can’t use our repro.sh hack, so back we go to cargo and RUSTFLAGS, using -Z dump-mir='http_request_handle_wrap':

$ RUSTFLAGS="-Z dump-mir='http_request_handle_wrap'" cargo +nightly build --example basic --features usdt-probes
[...snip...]
(signal: 11, SIGSEGV: invalid memory reference)
$ ls mir_dump
mir_dump: No such file or directory

Okay, that’s not really working as expected (-Z dump-mir=F should print just the MIR for functions matching the filter F and place it in a mir_dump folder). A little annoying, and we can only shave so many yaks right now, but nothing a bigger hammer can’t fix (just use --emit=mir to dump all the MIR into the target folder and find the corresponding one for dropshot):

$ RUSTFLAGS="--emit=mir" cargo +nightly build --example basic --features usdt-probes
$ ls target/debug/deps/dropshot-*.mir
target/debug/deps/dropshot-30d947b7471013cc.mir

Okay, now this looks reasonable:

bb24: {
[...snip...]
        asm!("990:   clr rax

                    .pushsection set_dtrace_probes,"aw","progbits"
                    .balign 8
            991: 
                    .4byte 992f-991b    // length
                    .byte 1
                    .byte 0
                    .2byte 1
                    .8byte 990b         // address
                    .asciz "dropshot"
                    .asciz "request-start"
                             // null-terminated strings for each argument
                    .balign 8
            992:    .popsection
                    
                    .pushsection yeet_dtrace_probes
                    .8byte 991b
                    .popsection
                
        ", out("ax") _77, options(NOMEM | PRESERVES_FLAGS | NOSTACK)) -> [return: bb25, unwind: bb217]; // scope 10 at dropshot/src/lib.rs:581:1: 581:41
    }

    bb25: {
        _78 = _77;                       // scope 9 at dropshot/src/lib.rs:581:1: 581:41
        switchInt(move _78) -> [0_u64: bb46, otherwise: bb26]; // scope 9 at dropshot/src/lib.rs:581:1: 581:41
    }

Note that in MIR, asm itself is a terminator, so bb24 here correctly says that under normal control flow we go to bb25, or to bb217 if unwinding. In bb25 we see a simple statement, _78 = _77;, which assigns the output (_77, i.e. is_enabled) from the asm, and this should correspond to the store we saw in the LLVM IR.

So it looks like something in the lowering from Rust MIR to LLVM IR is not quite right. Just eyeballing it, it looks like the store of the asm output is getting added to the wrong LLVM basic block. It should be part of the “normal” basic block taken by the invoke (indicated by the to label %bb25 argument), but instead it is incorrectly placed directly after the invoke.

That seems to happen here in rustc.

Now that we have a fairly good understanding of why things fail, we should be able to make a smaller repro […snip…]:

#![feature(asm_unwind)]

fn main() {
    let _x = String::from("string here just cause we need something with a non-trivial drop");
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
            options(may_unwind)
        );
    }
    println!("{}", foo);
}

$ rustc +nightly asm-miscompile.rs
[1]    6057 segmentation fault (core dumped)  rustc +nightly asm-miscompile.rs

Cool, now we’ve got our smaller repro segfaulting, but is it the same issue? Let’s take a look at what the LLVM IR says:

$ rustc +nightly asm-miscompile.rs --emit=llvm-ir -C no-prepopulate-passes
$ less asm-miscompile.ll

[...snip...]
bb1:                                              ; preds = %start
  %1 = invoke i64 asm sideeffect alignstack inteldialect unwind "mov ${0:q}, 1", "=&r,~{dirflag},~{fpsr},~{flags},~{memory}"()
          to label %bb2 unwind label %cleanup, !srcloc !9
  store i64 %1, i64* %foo, align 8

bb2:
[...snip...]

Would you look at that, we’ve got a store as the last instruction in the basic block with the asm.

Now that we have a small repro and know roughly where the issue is in rustc, we can try fixing it:

diff --git a/compiler/rustc_codegen_llvm/src/asm.rs b/compiler/rustc_codegen_llvm/src/asm.rs
index 03c390b4bd4..91d132eb343 100644
--- a/compiler/rustc_codegen_llvm/src/asm.rs
+++ b/compiler/rustc_codegen_llvm/src/asm.rs
@@ -290,6 +290,11 @@ fn codegen_inline_asm(
         }
         attributes::apply_to_callsite(result, llvm::AttributePlace::Function, &{ attrs });
 
+        // Switch to the 'normal' basic block if we did an `invoke` instead of a `call`
+        if let Some((dest, _, _)) = dest_catch_funclet {
+            self.switch_to_block(dest);
+        }
+
         // Write results to outputs
         for (idx, op) in operands.iter().enumerate() {
             if let InlineAsmOperandRef::Out { reg, place: Some(place), .. }

A bit of waiting later and we can try compiling our small repro again:

$ rustc +stage1 asm-miscompile.rs
$ echo $?
0

🎉 Success! So what does the LLVM IR look like now?

$ rustc +stage1 asm-miscompile.rs --emit=llvm-ir -C no-prepopulate-passes
$ less asm-miscompile.ll

[...snip...]
bb1:                                              ; preds = %start
  %1 = invoke i64 asm sideeffect alignstack inteldialect unwind "mov ${0:q}, 1", "=&r,~{dirflag},~{fpsr},~{flags},~{memory}"()
          to label %bb2 unwind label %cleanup, !srcloc !9

bb2:
  store i64 %1, i64* %foo, align 8
[...snip...]

The store has moved down into %bb2, right where it should be.
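To see why a one-line insertion-point change is enough, here is a toy model of a codegen builder. It is a sketch only: `Builder`, `lower_asm` and the `fixed` flag are invented for illustration and have nothing to do with rustc's real builder types beyond borrowing the `switch_to_block` name from the patch.

```rust
#[derive(Debug, PartialEq)]
enum Inst {
    /// Inline asm emitted as an `invoke`, with its two successor blocks.
    InvokeAsm { normal: usize, unwind: usize },
    /// Store of the asm output into its destination.
    Store,
}

/// A toy builder: instructions are appended to the current block.
struct Builder {
    blocks: Vec<Vec<Inst>>,
    current: usize,
}

impl Builder {
    fn new(block_count: usize) -> Self {
        Builder {
            blocks: (0..block_count).map(|_| Vec::new()).collect(),
            current: 0,
        }
    }

    fn emit(&mut self, inst: Inst) {
        self.blocks[self.current].push(inst);
    }

    fn switch_to_block(&mut self, block: usize) {
        self.current = block;
    }
}

/// Lower an asm with one output. `fixed` toggles the insertion-point switch.
fn lower_asm(b: &mut Builder, normal: usize, unwind: usize, fixed: bool) {
    b.emit(Inst::InvokeAsm { normal, unwind });
    if fixed {
        // The invoke ended the current block; outputs belong in the
        // 'normal' successor.
        b.switch_to_block(normal);
    }
    b.emit(Inst::Store);
}

fn main() {
    let mut buggy = Builder::new(3);
    lower_asm(&mut buggy, 1, 2, false);
    println!("buggy: {:?}", buggy.blocks);

    let mut fixed = Builder::new(3);
    lower_asm(&mut fixed, 1, 2, true);
    println!("fixed: {:?}", fixed.blocks);
}
```

In the buggy run the store lands after the invoke in block 0; with the switch it lands at the top of block 1, mirroring the before and after IR.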

[…snip…] Note: we had to add options(may_unwind) and an unused String variable to actually get it to fail. Removing either of those will stop it from segfaulting. The difference is in the LLVM IR that rustc generates. Without both pieces, rustc just uses a simple call instruction for the inline asm, whereas in the broken case it uses invoke, which, unlike call, is considered a terminator. After the invoke, control transfers to either the ‘normal’ label or the ‘unwind’ label. By marking our asm with options(may_unwind) we essentially tell rustc to opt our inline assembly into participating in unwinding. The unused String is there so that there is actually something to clean up in the case that we do unwind.

However, looking at the asm from the original failing code in dropshot, there is no mention of MAY_UNWIND:

asm!("[...snip...]", out("ax") _77, options(NOMEM | PRESERVES_FLAGS | NOSTACK)) -> [return: bb25, unwind: bb217]; // scope 10 at dropshot/src/lib.rs:581:1: 581:41

usdt certainly isn’t adding it, so what gives?

Poking through rustc, it looks like that could be due to the way MIR inlining is implemented. A cleanup block will be assigned if a terminator (like inline asm) gets inlined. The lowering pass will use the presence of such a cleanup target to ultimately decide whether to use call or invoke.

However, that theory is quickly shot down because the MIR inliner is still disabled by default.

Just searching through rustc for other places where the unwind target of a terminator might be set yields something promising by way of generators. This would track, as the original dropshot failure case is in the context of an async method. Further confirmation that this is where our InlineAsm terminator is getting an unwind target set is that the poison_block mentioned in that bit of code lines up. It has a single statement, and if we hop back to the MIR of the failing dropshot example, lo and behold:

    asm!("[...snip...]", out("ax") _77, options(NOMEM | PRESERVES_FLAGS | NOSTACK)) -> [return: bb25, unwind: bb217];

[...snip...]

bb217 (cleanup): {
    discriminant((*(_1.0: &mut [static generator@dropshot/src/server.rs:651:43: 741:2]))) = 2; // scope 0 at dropshot/src/server.rs:651:43: 741:2
    resume;                          // scope 0 at dropshot/src/server.rs:651:43: 741:2
}

bb217 there is the unwind target, and it matches exactly with the poison_block as constructed in rustc.
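Putting the two observations together, the chain of events can be sketched with a toy MIR. This is an assumption-laden model, not rustc's generator transform: the `Terminator` and `Block` types, `add_poison_block` and `lowers_to_invoke` are all invented, and only inline asm terminators are handled.

```rust
/// Toy MIR terminators; `unwind` is the optional cleanup target.
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    InlineAsm { ret: usize, unwind: Option<usize> },
    Goto(usize),
    Resume,
}

#[derive(Debug, Clone)]
struct Block {
    statements: Vec<&'static str>,
    terminator: Terminator,
    is_cleanup: bool,
}

/// Append a poison block and route otherwise-unhandled unwinds to it.
/// Returns the index of the new block.
fn add_poison_block(body: &mut Vec<Block>) -> usize {
    let poison = body.len();
    for block in body.iter_mut().filter(|b| !b.is_cleanup) {
        if let Terminator::InlineAsm { unwind, .. } = &mut block.terminator {
            if unwind.is_none() {
                *unwind = Some(poison);
            }
        }
    }
    body.push(Block {
        // Mirrors the single statement seen in bb217 above.
        statements: vec!["discriminant(generator) = 2"],
        terminator: Terminator::Resume,
        is_cleanup: true,
    });
    poison
}

/// The lowering decision described in the text: a cleanup target means `invoke`.
fn lowers_to_invoke(t: &Terminator) -> bool {
    matches!(t, Terminator::InlineAsm { unwind: Some(_), .. })
}

fn main() {
    let mut body = vec![
        Block {
            statements: vec![],
            terminator: Terminator::InlineAsm { ret: 1, unwind: None },
            is_cleanup: false,
        },
        Block {
            statements: vec!["_78 = _77"],
            terminator: Terminator::Goto(1),
            is_cleanup: false,
        },
    ];
    println!("invoke before: {}", lowers_to_invoke(&body[0].terminator));
    let poison = add_poison_block(&mut body);
    println!("poison block: bb{poison}, statements: {:?}", body[poison].statements);
    println!("invoke after: {}", lowers_to_invoke(&body[0].terminator));
}
```

Before the transform the asm has no unwind target and would lower to a plain call; afterwards it points at the poison block and takes the invoke path, with no may_unwind in sight.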

Armed with some new evidence, we can adapt our simple repro to more closely match the async issue encountered in dropshot:

extern crate futures; // 0.3.21

async fn bar() {
    let foo: u64;
    unsafe {
        std::arch::asm!(
            "mov {}, 1",
            out(reg) foo,
        );
    }
    println!("{}", foo);
}

fn main() {
    futures::executor::block_on(bar());
}

(Segfaults on playground.)

Thus all the mysteries are solved:

  1. the MIR -> LLVM IR lowering for inline assembly outputted invalid LLVM IR when generated with an invoke instruction (fix submitted here).
  2. every async fn in Rust is implemented as a generator and, as part of that, terminators in the basic blocks of such a function are modified to include a cleanup target if they can unwind […snip…] in order to poison the generator.
  3. We also found a bug with -Z verify-llvm-ir=yes and -C no-prepopulate-passes along the way (fix submitted here).

[…snip…] Note that the segfault in the above playground goes away if you remove the println! of foo, because that is the only part that can actually unwind.

If you’ve ever seen a panic saying “async fn resumed after panicking”, this is why.

Written by Vanic
