Why I use a debugger

51
Why I use a debugger

I have been thinking about the Rust debugging experience, and I want to try
tease apart what value a debugger provides to me.

Last year, I gave a talk at QCon that promoted the use of rr as a tool
for exploring Rust code. I wanted to convey an excitement about debugging,
about using this tool to dive into the weeds of your program’s behavior.

Since then, I have continued to talk to developers, promoting rr (and the
associated service pernos.co). And I heard Rust developers say that they are
happy with their experience using instrumentation such as logging statements or
the tracing infrastructure.

I have to agree: Logging statements are useful, and sometimes they are all you
need. Here is a great quote from “The Practice of Programming”, by Kernighan and
Pike:

As personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient.

This quote is apt for many reasons; I will revisit it again before the end of
this post.

I do not know the typical way that people use debuggers. But I can imagine it
looks much like the image depicted above: you start your program, you use a
single-step command to go through it line by line, and you examine raw memory
dumps to try to figure out what the machine is doing. After that initial
exploration, you might set breakpoints at specific lines of code, and then tell
the debugger to continue running until it hits any of those breakpoints.


Advanced debugger users will set conditional breakpoints, which only halt the
program if a specific predicate holds when it hits the breakpoint, or even
scripted breakpoints, which will automatically execute a series of debugger
commands each time they hit the breakpoint. One neat trick is that you often can
end a script with continue, so that it seems as though the program never
stopped at all, except for the side-effects on the debugger itself (such as
setting other breakpoints).

But that typical usage pattern doesn’t really provide much value over just
adding log statements. If all you want to do is learn what steps the computer
took, and what data was in certain variables at those steps, printing to a log
is a better way to achieve that.

So, why am I advocating for more people to include a debugger as one of the tools
to keep ready on their Rust development utility belt?

There are four main features that a debugger gives me that are not always
fulfilled by adding log statements:

Of the four, i think hardware watchpoints are the best example of a capability
that is both broadly useful and not typically available outside of a debugger.
I will delve into each of these in turn below.

Object Code Inspection

By “object code inspection”, I mean that I can attach a debugger to any binary,
without having to recompile anything. When your workflow is to add new print
statements each time you want to inspect something new, that means editing your
code and recompiling it. That edit-recompile-debug cycle has two problems: it can
be slow (especially the recompilation step). Worse still, the edits can cause
bugs to mysteriously go away, aka the “observer effect”.
(To be fair, using a debugger can also cause those Heisenbugs to change
behavior. This is where tools like rr can really shine.)

How valuable is “object code inspection”? I have given two reasons to avoid the
edit-recompile steps: reduced latency, and reducing the observer effect.

Honestly: It is not a frequent need for my current work. Much of my work
involves either fixing bugs that reproduce readily in the face of source code
changes, or adding features (which inherently requires source code changes).


However, in practice this can be very
painful: We do not have much support for building an unoptimized rustc (it has
a very long bootstrap time), but once you enable optimizations, the debugging
experience degrades quite a bit in Rust today.

There is one important exception here: rustc itself has a long bootstrap time;
so I have often tried to use a debugger to debug rustc itself, rather than
rely on an edit-recompile-debug cycle.

Raw Control-Flow

By “raw control-flow”, I mean that one can step through the individual assembly
code instructions to see exactly what the machine is executing.

For example, at a method invocation, one can step forward and discover how
exactly to which function that invocation dispatches as it goes through Deref
implementations until it finally reaches the target method.

As another example, one can step through the the instructions corresponding to
the subcomponents of a match pattern. This way, one might discover which parts
matched and which failed to match, and in what order they were evaluated.

Here is a concrete example of the latter:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
struct P { a: char, b: char, c: &'static str }

fn main() {
    let p = P { a: 'h', b: 'i', c: "world" };
    foo(p);
}

fn foo(p: P) {
    match p {
        P { b: 'i', a: 'y', c } => println!("yi, {}", c),
        P { a: 'h', b, c } => println!("h{}, {}", b, c),
        _ => panic!("unmatched"),
    }
}

After compiling the above with debuginfo enabled and running it under gdb,
I can investigate the body of foo while its running:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
(gdb) step
boom::foo (p=...) at boom.rs:9
9     match p {
(gdb) disassemble
Dump of assembler code for function _ZN4boom3foo17h3035bcff6328b12bE:
   0x000055555555db70 <+0>:   sub    $0x128,%rsp
   0x000055555555db77 <+7>:   mov    %rdi,0x38(%rsp)
=> 0x000055555555db7c <+12>:    cmpl   $0x69,0x14(%rdi)
   0x000055555555db80 <+16>:  jne    0x55555555db8d <_ZN4boom3foo17h3035bcff6328b12bE+29>
   0x000055555555db82 <+18>:  mov    0x38(%rsp),%rax
   0x000055555555db87 <+23>:  cmpl   $0x79,0x10(%rax)
   0x000055555555db8b <+27>:  je     0x55555555db9d <_ZN4boom3foo17h3035bcff6328b12bE+45>
   0x000055555555db8d <+29>:  mov    0x38(%rsp),%rax
   0x000055555555db92 <+34>:  cmpl   $0x68,0x10(%rax)
   0x000055555555db96 <+38>:  je     0x55555555dbeb <_ZN4boom3foo17h3035bcff6328b12bE+123>
   0x000055555555db98 <+40>:  jmp    0x55555555dc6e <_ZN4boom3foo17h3035bcff6328b12bE+254>

In the above, we can see three comparison instructions: a comparison of 0x14(%rdi) with $0x69 (the character ‘i’), then a comparison of 0x10(%rax) with $0x79 (the character ‘y’). If they’re both equal, then it jumps to 0x55555555db9d (which is presumably the first arm’s body). If they’re not, then it does a comparison of 0x10(%rax) with $0x68 (the character ‘h’), and if that is equal, then it jumps to 0x55555555dbeb (which is preumably the second arm’s body).


The people I imagine doing this kind of investigation are the ones working out the low-level semantic guarantees when it comes to investigating
things like rust#87520.

Pretty low-level details, admittedly. Someone new to Rust cannot be expected to do this kind of investigation to gain insight into the language.

In short, this seems like a case where Kernighan and Pike’s quote above is
especially apt: we can easily get lost in the details of the
control flow as rendered here.

So, how valuable is “raw control-flow”?

Honestly: The more experience I gain in a language or a particular project,
the less and less I find myself relying on this kind of spulunking. The
evaluation order of match shouldn’t matter in most cases, so I would not
optimize the debugging UX for that. Seeing how method dispatch resolves is
useful, but in practice logging statements provide enough insight that people
rarely find themselves wanting to step through the individual assembly
instructions.

Insight into the raw control-flow is probably most useful for someone peeking
into low-level details of the semantics of the language (or a given project,
especially a code base with tricky macros that generate control-flow
structures). One might hope that that group generalizes to beginners who are
learning the language. But I am not certain that raw control-flow will
efficiently provide practical insight. For example, consider the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
#[derive(PartialEq, Eq, PartialOrd, Ord, Debug)]
struct D(&'static str);

fn main() {
    let d = D("my name is D");
    println!("{:?}", d);
}

impl Drop for D {
    fn drop(&mut self) {
        println!("Goodbye, {}", self.0);
    }
}

A newcomer to Rust who is not familiar with the destructor system might think
that the above code only prints a single line of text. Running the code (even
outside of a debugger) will show that to be incorrect. But: Will stepping
through it in a debugger provide the insight about where the ::drop
call comes from? It is not clear to me that it would.

Memory Exploration

By “memory exploration”, I mean that when my debugger is attached to a paused
program, I can inspect values on the stack or heap, walking down the nodes of a
binary tree or the entries in a hashtable. A good debugger will do some amount
of memory traversal and semantic interpretation on my behalf, using type
information from the program’s associated debuginfo to determine how best to
interpret the bits it finds and print it nicely (e.g. showing comma-separated
array elements).

How valuable is “memory exploration”?

Honestly: In Rust, one can add #[derive(Debug)] to type definitions to get
methods that print the structure for their values. The debugger printouts were
useful when the language didn’t provide it out of the box (like in C and C++),
but they less value in Rust.


Another counter-argument to this supposed first benefit: In a language like Rust,
anyone can use unsafe code to transmute the value to raw bits, and subsequently print them. This power is not solely in the debugger’s hands.

One benefit of using a debugger here, rather than just using log statements, is
that a debugger will let you immediately see the concrete representation (such
as concrete hashtable layout) in places where the developer has chosen a more
abstract view for the Debug trait implementation. But again we hear echos of
the Kernighan and Pike quote: Seeing that concrete representation may be crucial
in some scenarios, but in many cases it will just be unnecessary detail
distracting you from the problem at hand.

A second benefit with respect to memory exploration is that a debugger will also
let you inspect values that were not part of the original log statement. But,
one can view that as a special case of bypassing the edit-recompile cycle, which
we already discussed above. If the edit-recompile cycle is short, then there is
little cost to adding a new log statement that includes the value of interest.

Hardware Watchpoints

So we have come to perhaps my favorite debugger feature: Hardware watchpoints.

This is the main feature that most programming languages do not provide.

The idea is simple: the gdb command watch -location will
break the next time the memory at the given place is modified.
For example, watch -location array[i] will evaluate &array[i] to
its address, and then the next time array[i] is changed, the debugger
will stop and print out both the old value and the new value.

Consider the following (buggy) code for merging two sorted sequences into a
single sorted sequence:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
fn main() {
    let mut r = [99, 98, 97, 96, 95];
    merge(&mut r, &[10, 30], &[20, 40, 60]);
    dbg!(r);
}

// Supposedly merges sorted `a` and `b` into  `recv`.
fn merge(recv: &mut [i32], a: &[i32], b: &[i32]) {
    let mut ai = a.iter();
    let mut bi = b.iter();
    let (mut a_cur, mut b_cur) = (ai.next(), bi.next());
    for i in 0..recv.len() {
        match (a_cur, b_cur) {
            (Some(x), Some(y)) => {
                if *x <= *y {
                    recv[i] = *x;
                    a_cur = ai.next();
                } else {
                    recv[i] = *y;
                    a_cur = ai.next();
                }
            }
            (Some(x), None) => {
                recv[i] = *x;
                a_cur = ai.next();
            }
            (None, Some(y)) => {
                recv[i] = *y;
                b_cur = bi.next();
            }
            (None, None) => {
                return;
            }
        }
    }
}

When I run the code above, I get the output:

1
2
3
4
5
6
7
[boom.rs:4] r = [
    10,
    20,
    20,
    40,
    60,
]

which is not quite right! The third element of the generated array is 20,
instead of 30.

This toy example is small enough that you might be able to see the bug from
immediate inspection. Adding more test cases to the run might also make the
problem apparent. But I am going to use it to illustrate how to use a hardware
watchpoint to jump immediately to the point in the control flow where the third
element of the recv array is overwritten.

After starting up the debugger, we let it evaluate the line that creates
the destination array:

1
2
3
4
5
6
(gdb) start
Temporary breakpoint 1, boom::main () at boom.rs:2
2     let mut r = [99, 98, 97, 96, 95];
(gdb) next
3     merge(&mut r, &[10, 30], &[20, 40, 60]);
(gdb)

Next, lets first check that we understand what we are looking at:

1
2
3
4
(gdb) print r
$1 = [99, 98, 97, 96, 95]
(gdb) print r[2]
$2 = 97

Yep: We have our destination array r, and it currently has 97 as its
third element r[2].

So we tell the debugger that we want it to break as soon as that location
is changed, and then let the program run!

1
2
3
4
5
6
7
8
9
10
11
12
(gdb) watch -location r[2]
Hardware watchpoint 2: -location r[2]
(gdb) continue
Continuing.

Hardware watchpoint 2: -location r[2]

Old value = 97
New value = 20
boom::merge (recv=&mut [i32](size=5) = {...}, a=&[i32](size=2) = {...}, b=&[i32](size=3) = {...}) at boom.rs:29
29                    b_cur = bi.next();
(gdb)

We are now within the body of fn merge instead of fn main (as you might have
expected), immediately after the mutation occured:

1
2
3
4
        (None, Some(y)) => {
            recv[i] = *y;
            b_cur = bi.next();
        }

And indeed, if we print out recv at this point, we can see that the
array has been “corrupted”, since we now see the two copies of 20 in the
output:

1
2
3
(gdb) p recv
$3 = &mut [i32](size=5) = {10, 20, 20, 96, 95}
(gdb) ```

I am running with the rust-gdb command here, which has extensions for
printing slices nicely. If you were doing this under just normal gdb, then
you would see something like this instead.

1
2
(gdb) p recv
$3 = &mut [i32] {data_ptr: 0x7fffffffe238, length: 5}

This kind of break-on-write is most useful for unsafe languages, when you have
data being corrupted due to out of bounds accesses. One certainly can encode
examples of that in unsafe Rust, but we also do not expect them to arise as
frequently.

But honestly (a ha): Is this actually revealing anything about the core problem
here? The logic at this match-arm is entirely sound: if we have (None,
Some(y))
for our two cursors traversing the slices, then we should pass along
y and keep traversing down the right cursor. Whatever the bug is, it is
arising somewhere else, before we even get here.

Reversible Debugging

If I were restricting my attention to gdb/rust-gdb alone, this would be a
point where I would start talking about re-running the program and stepping
through it carefully, or trying to hypothesize points of interest based on
analyzing the source code and setting breakpoints at those points. But if we
were going to do that, then we might as well invest our time adding some of
those logging statements discussed above, because I think those would show the
problem here just as fast, if not faster!

My goal is to show you an alternative strategy, that works for code bases that
you are not that familiar with.

Lets continue looking at the merge example, but let us now run it under rr.

First we record a run of interest:

1
2
3
4
5
6
7
8
9
10
% rr record ./boom
rr: Saving execution to trace directory `/home/pnkfelix/.local/share/rr/boom-2'.
[boom.rs:4] r = [
    10,
    20,
    20,
    40,
    60,
]
%

With that recording in place, we can replay the run under rr, which gives an
interface similar to gdb. (And we will tell rr that we want it to use
rust-gdb as our debugger to get those extensions for rendering Rust
collections.)

1
% rr replay -d ~/.cargo/bin/rust-gdb

Our initial interaction with rr will be much like how we interacted with gdb
above: we’ll start the program, set a hardware watchpoint for the third array
element, and let it run forward.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
(rr) break main
Breakpoint 1 at 0x55e4f162f437: main. (2 locations)
(rr) c
Continuing.

Breakpoint 1, 0x000055e4f162fae0 in main ()
(rr) next
Single stepping until exit from function main,
which has no line number information.

Breakpoint 1, boom::main () at boom.rs:2
2     let mut r = [99, 98, 97, 96, 95];
(rr)
3     merge(&mut r, &[10, 30], &[20, 40, 60]);
(rr) watch -location r[2]
Hardware watchpoint 2: -location r[2]
(rr) c
Continuing.

Hardware watchpoint 2: -location r[2]

Old value = 97
New value = 20
boom::merge (recv=&mut [i32](size=5) = {...}, a=&[i32](size=2) = {...}, b=&[i32](size=3) = {...}) at boom.rs:29
29                    b_cur = bi.next();
(rr)

But now that we have reached this point in replay, we can rewind time.
We can ask questions like “how did a_cur (the left-hand side of the tuple) become None already, without us ever emitting 30 to the recv array?
Well, the question doesn’t quite look like that. But we can ask “When was a_cur last written to before this point?”, which ends up effectively the same question.

1
2
3
(rr) watch a_cur
Hardware watchpoint 3: a_cur
(rr)

Now we tell the system to run backwards.

1
2
3
4
5
6
7
8
9
10
(rr) reverse-continue
Continuing.

Hardware watchpoint 2: -location r[2]

Old value = 20
New value = 97
0x000055e4f162fa95 in boom::merge (recv=&mut [i32](size=5) = {...}, a=&[i32](size=2) = {...}, b=&[i32](size=3) = {...}) at boom.rs:28
28                    recv[i] = *y;
(rr)


The gdb output is potentially confusing because the notions of “Old” and “New”
are inverted here. To dive deeper into this topic, you might consider the
watching the films Primer or TENET.

We still have our first watch point in place for r[2], so it halted almost
immediately. (When we were running forward, a watchpoint makes us stop right
after the write occurs. When we run backwards, a watchpoint makes us stop right
before the write occurs.)

Let us keep going backwards.

1
2
3
4
5
6
7
8
9
10
(rr) reverse-continue
Continuing.

Hardware watchpoint 3: a_cur

Old value = core::option::Option<&i32>::None
New value = core::option::Optio

NOW WITH OVER +8500 USERS. people can Join Knowasiak for free. Sign up on Knowasiak.com
Read More

Vanic
WRITTEN BY

Vanic

“Simplicity, patience, compassion.
These three are your greatest treasures.
Simple in actions and thoughts, you return to the source of being.
Patient with both friends and enemies,
you accord with the way things are.
Compassionate toward yourself,
you reconcile all beings in the world.”
― Lao Tzu, Tao Te ChingBio: About: