[ONGOING] The hitchhiker's guide to LLVM debugging tools
8888-08-08
Prologue
Hello everyone, how's everybody doing? I'm still chugging away at solving LLVM issues at Igalia :) Fall and winter are coming soon, and by the time this blog's up, I will be turning a year older. Welp, another year, a few more blogs, am I right? Hahaha.
A few weekends ago, I went up to Thousand Oaks and Malibu beach to visit my friend from Berkeley and then took an Amtrak with a date to San Diego. Isn't life just beautiful like that? I couldn't have asked for more.
This blog details my debugging methods since I started working on LLVM. For the foreseeable future (while the blog title carries [ONGOING]), this blog will be continually updated with new knowledge to cover any shortcomings, as I'm still new to LLVM.
For external reviewers and discussion of this blog and others, you can join the discord server here or via the discord link on my front page.
For internal reviewers at Igalia, you can also dm me on Matrix, up to your preferences!
As tradition, here's a song for interested readers :) I hope everyone enjoys :)
Introduction
Debugging has always been a quintessential tool in a software engineer's toolbox.
As a piece of software or framework gets more complicated, the flow of the program invariably gets more complex, which requires software engineers to carefully and methodically investigate an issue or bug instead of blindly following their instincts.
In this sense, although this article talks about debugging, it inherently provides a sample of the author's problem-solving toolbox.
I hope this helps junior LLVM developers, or low-level developers just starting out on their journey!
Welp, let's dig in!
Resources
In writing this blog, I utilized these resources, including, but not limited to:
- How to reduce LLVM crashes
- Clear step by step to oneshot debugging any LLVM issue with ChatGPT
- Git manual
- LLVM lang ref
Compiling and testing time
Note: Some parts of this section take inspiration from Nikita Popov's How to contribute to LLVM and LLVM's Building LLVM with CMake. I highly recommend you take a few minutes to read through them!
Compile time on a big codebase has always been a pain point for programmers. A few seconds here, a few minutes there, and you have wasted at least half an hour extra waiting to compile the LLVM repository.
Once you hit compile, if you have to wait a while, your mind wanders off to other topics and you start to drift away. Point is, the context switching causes you to lose your focusing power.
The following subsections describe how to save time on compiling and executing. This will shorten your development cycle and TODO
(Release + Assertion) + (Debug)
A common rule of thumb is to build a codebase in Debug mode while you're developing, and in Release mode when shipping it out to users. But this is not always the case...
If you're not stepping through the code with the LLDB debugger, chances are the trade-off of being in Debug mode (accurate stack trace locations and source layout positions) is not worth it: Debug mode takes longer to compile and link, and test cases compiled in Debug mode are often slower to run.
Release mode with assertions turned on, on the other hand, compiles and runs tests blazingly fast while still giving good diagnostics.
Therefore, it's very beneficial to have two build modes when contributing to LLVM:
You can set the mode for a build by appending one of these two sets of options when configuring CMake:
Debug : -DCMAKE_BUILD_TYPE=Debug
Release + Assert : -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON
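For example, the two trees can live side by side; a sketch, assuming an llvm-project checkout and Ninja (the build directory names are my own convention):

```shell
# Day-to-day tree: fast builds, assertions still catch bugs.
cmake -S llvm -B build-release -G Ninja \
  -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON

# Debugger tree: only for when I need to step through code in lldb.
cmake -S llvm -B build-debug -G Ninja -DCMAKE_BUILD_TYPE=Debug
```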
Build your target/subproject only
You can also save development time by only building the subprojects that are of interest to you.
For example, if you're doing mostly WebAssembly or x86 work, don't build RISCV on top of it. Some TableGen changes shared between RISCV and WebAssembly will inevitably trigger recompilation of the RISCV target, taking dozens of seconds. Enabling extra targets will also enable extra tests for those targets.
The flag to include/exclude backend targets is LLVM_TARGETS_TO_BUILD.
From LLVM:
Semicolon-separated list of targets to build, or all for building all targets. Case-sensitive. Defaults to all. Example: -DLLVM_TARGETS_TO_BUILD="X86;PowerPC".
The full list, as of March 2023, is: AArch64, AMDGPU, ARM, AVR, BPF, Hexagon, Lanai, LoongArch, Mips, MSP430, NVPTX, PowerPC, RISCV, Sparc, SystemZ, VE, WebAssembly, X86, XCore.
Use LLVM-release compilers
If you're using a package manager's clang, or a clang that comes with your operating system, chances are you're losing out on development time.
With PGO, BOLT, and various other tips and tricks turned on, LLVM maintainers and volunteers have been hard at work making the official LLVM release builds of clang highly optimized, which improves the compiler development experience. How good is it? You can see for yourself with this claim from the BOLT paper:
For datacenter applications, BOLT achieves up to 7.0% performance speedups on top of profile-guided function reordering and LTO. For the GCC and Clang compilers, our evaluation shows that BOLT speeds up their binaries by up to 20.4% on top of FDO and LTO, and up to 52.1% if the binaries are built without FDO and LTO.
Unfortunately for macOS users, BOLT-ed release binaries only come for Linux x86 and AArch64. However, I'd still recommend downloading your OS's version of the compiler from LLVM's release page, as anecdotally I can still feel some extra performance when compiling LLVM.
After you've downloaded the LLVM release compiler to a folder, you can use it in your CMake configuration command as follows:
-DCMAKE_C_COMPILER=$HOME/Developer/igalia/LLVM-20.1.7-macOS-ARM64/bin/clang \
-DCMAKE_CXX_COMPILER=$HOME/Developer/igalia/LLVM-20.1.7-macOS-ARM64/bin/clang++
provided that you downloaded and extracted it to a folder at $HOME/Developer/igalia/LLVM-20.1.7-macOS-ARM64
Btw, I learned about this through my conversation with David Spicket:
bdbt: i'm a new grad joining a company as a compiler engineer, would compiling a new, separate compiler that's BOLT-activated specifically for compiling the compiler make sense? It would give me a sizable reduction in compile time and allow for faster iteration, but i'm not sure if there will be errors when using the special version? any insights would be appreciated
David Spicket: At least for Linux x86 and AArch64, LLVM's release archives on GitHub are already BOLT-ed (you can check with objdump -h clang | grep bolt). So if there are potential issues, we're all at risk 🙂 If you want to try BOLT-ing something yourself, https://github.com/llvm/llvm-project/blob/main/bolt/docs/OptimizingClang.md seems like a good place to start (I've not done this myself).
ccache, ninja and LLD
Haha, you thought we were done huh. Even on top of all this, we can use extra tools to squeeze out extra time in our day-to-day work.
ccache
For incremental builds, ccache "speeds up recompilation by caching previous compilations and detecting when the same compilation is being done again."
In my use case, when compiling the codebase from different commits (because some branches were opened a few weeks ago and some recently, after I've pulled from upstream with git pull), compiling in the same build folder would often take a lot of time. This is where ccache comes in.
You can enable ccache together with your other CMake flags via -DLLVM_CCACHE_BUILD=true.
In addition to this, you can create a ccache configuration in $HOME/.ccache/ccache.conf, allowing more flexibility and power over ccache.
file_clone = true
inode_cache = true
max_size = 200G
base_dir = /
absolute_paths_in_stderr = true
Below is an explanation for each configuration option:
- file_clone = true: performs copy-on-write cloning where possible, improving caching performance.
- inode_cache = true: instructs ccache to identify a file by its device, inode, and timestamps. This cuts down on time spent hashing files and therefore improves caching performance, since some files in LLVM contain tens of thousands of lines.
- max_size = 200G: allocates 200 gigabytes of storage on your computer for ccache to use.
- base_dir = /: allows caching across all directories (any directory path that starts with /).
- absolute_paths_in_stderr = true: instructs ccache to rewrite relative paths to absolute paths in the compiler's textual output, in lieu of incorrect relative paths in warning or error messages.
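As a quick way to get started, that config can be dropped in place from the shell (note: this overwrites any existing ccache.conf, so back yours up first):

```shell
# Write the ccache config from this section to its default location.
mkdir -p "$HOME/.ccache"
cat > "$HOME/.ccache/ccache.conf" <<'EOF'
file_clone = true
inode_cache = true
max_size = 200G
base_dir = /
absolute_paths_in_stderr = true
EOF
```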
I want to thank my colleague Alex Bradbury for introducing me to these options :)
Ninja
Ninja's a build system that's focused on speed and is designed to run builds as fast as possible.
From Ninja's wikipedia:
In essence, Ninja is meant to replace Make, which is slow when performing incremental (or no-op) builds. This can considerably slow down developers working on large projects, such as Google Chrome which compiles 40,000 input files into a single executable. In fact, Google Chrome is a main user and motivation for Ninja. It's also used to build Android (via Makefile translation by Kati), and is used by most developers working on LLVM.
You can opt into Ninja when configuring CMake with -GNinja, then build your code with a separate command:
ninja -C <your_build_folder> <your_build_target>
For example, to build the codebase and test everything:
ninja -C build check-all
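You can also ask Ninja for just one tool and then drive the test runner on a single directory; a sketch, assuming a configured build folder named build (the llvm-lit script is generated in build/bin at configure time):

```shell
# Build only llc, then run one test directory instead of check-all.
ninja -C build llc
build/bin/llvm-lit llvm/test/CodeGen/WebAssembly
```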
LLD
From the LLVM's LLD page:
LLD is a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them.
In fact, it is so fast that I think it's a necessity to enable it by default when developing LLVM, especially when your lld executable comes from the LLVM release page with PGO and BOLT enabled, providing even faster link times.
As of Monday, August 11, 2025, here is the comparison from the LLVM LLD page, with the percentage difference between ld, gold and lld.
Program | Output size | GNU ld (time) | GNU gold w/ threads (time) | lld w/ threads (time) |
---|---|---|---|---|
ffmpeg dbg | 92 MiB | 1.72s, (4.91x) | 1.01s, (2.89x) | 0.35s (1.00x) |
mysqld dbg | 154 MiB | 8.50s, (12.50x) | 2.68s, (3.94x) | 0.68s (1.00x) |
clang dbg | 1.67 GiB | 104.03s, (19.70x) | 23.49s, (4.45x) | 5.28s (1.00x) |
chromium dbg | 1.14 GiB | 209.05s, (12.52x) | 60.82s, (3.64x) | 16.70s (1.00x) |
You can enable the LLD linker with -DLLVM_USE_LINKER=lld
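Putting the flags from this whole section together, a full configure line might look like this; a sketch, with the compiler paths taken from my machine and the target list as an example (adjust both to yours):

```shell
cmake -S llvm -B build -GNinja \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD="WebAssembly;X86" \
  -DLLVM_CCACHE_BUILD=true \
  -DLLVM_USE_LINKER=lld \
  -DCMAKE_C_COMPILER=$HOME/Developer/igalia/LLVM-20.1.7-macOS-ARM64/bin/clang \
  -DCMAKE_CXX_COMPILER=$HOME/Developer/igalia/LLVM-20.1.7-macOS-ARM64/bin/clang++
```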
Grepping
Well, now all that's out of the way, let's talk about grepping.
In a big and unfamiliar codebase, sometimes, you just don't have much of a clue on how things go. In cases like this, you can start querying random keywords that are related to your issue and work your way from there.
Sometimes, when you jump to a function used in a test case via an LSP, the LSP takes you to the source from the OS compiler or the LLVM release compiler (since you built the project with it). In cases like that, grepping might be one of the few ways to get to the function's source in the repository instead.
In my case, I opt for ripgrep instead of grep due to the performance difference.
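For example, when an error message is all you have to go on, grepping for a fragment of it is often the fastest route to the source; a sketch, with a made-up message:

```shell
# Search only the compiler sources, with two lines of context around hits.
rg -C 2 "expected an instruction opcode" llvm/lib llvm/include
```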
Given this importance, in the following sections I will try to show how grepping integrates with each area.
Git potpourri on the pull requests
For a beginner in a codebase (even a seasoned programmer), the ability to obtain more information and context, extending further than the code sitting in front of them, is extremely helpful.
Blame & show a commit
When exploring a new area/file, I would often combine git blame together with git show to obtain more context for what I'm working with.
The workflow is: git blame the file -> get the commit hash for the line I'm interested in -> feed that commit hash to git show
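A sketch of that loop on a hypothetical file (the path and hash are made up for illustration):

```shell
# Step 1: whose commit touched the line I care about?
git blame llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
# Step 2: read the full commit for context.
git show 0123abc
```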
When investigating an issue through commits, it's also helpful to look up the commit's pull request on the internet, either on GitHub or on reviews.llvm.org. For example, in this pull request, the author and reviewers added a TODO; without reading the PR, the TODO would seem very unclear and hazy, leaving code readers wanting more context.
In other words, exploring the code is the first step; after that, commit messages provide greater additional context on the problems being solved; and finally, the pull request and the reviews.llvm.org discussion provide opinions and direction.
grepping it
When trying to explore what's happening in a subproject or subsection of LLVM, you can also utilize grepping in git without going through the GitHub online GUI.
Below is a picture of me applying git log --grep to the topic of WebAssembly, where there's a mention of either "fold" or "DAG", via this command:
git log --grep='WebAssembly' --grep='fold\|DAG' --all-match
godbolt, -debug-only, -print-after-all, -print-changed and all that
If you haven't heard of godbolt, think of it as a way for compiler developers to say no to "but but but it works on my machineeee :("
Formally, godbolt (Compiler Explorer) is an interactive online compiler that shows the assembly output of compiled C++, Rust, Go (and many more) code.
Buttt, such a simple description couldn't possibly do the site justice bla bla blah
TODO: HEY JASMINE: Ok, the mindset here is while providing users how to use godbolt, I should also provide a way to do this via the command line.
Inspecting IR changes
TODO: Explain this in a way that's akin to -print-changed
Cutting out the noise
-print-changed
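A minimal sketch of how -print-changed is typically invoked through opt (the pass pipeline and input.ll are made up for illustration; the output gets long, so pipe it through a pager):

```shell
# Print the IR only after passes that actually changed it.
opt -passes='default<O2>' -print-changed -S input.ll -o /dev/null
```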
Deep diving into a specific pass
After you've filtered out the specific pass that you think is affecting your
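One handy flag here is -debug-only, which enables the LLVM_DEBUG output of a single DEBUG_TYPE (this requires an assertions-enabled build); a sketch, with a made-up input file:

```shell
# Only print SelectionDAG instruction-selection debug output.
llc -debug-only=isel input.ll -o /dev/null
```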
lldb and debuggers
A quick google on "gdb versus lldb" would give a similar generic answer to how similar the two tools are and it's often up to personal preferences. This article won't try to persuade you to use either gdb or lldb, but perhaps it will introduce you more to the strong integration between lldb and the LLVM codebase :)
Default
lldb has auto-repeat: pressing return on an empty line repeats the last command you entered.
dump.*
The dump() helper function in LLVM is basically a pretty printer for the object it's called on.
A lot of the time, you can discover these helpers and their variants by grepping:
ripgrep "void dump.*\(\)"
For example:
- the Loop class has dumpVerbose
- the RegAlloc class has dumpState
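Inside a debugger session, you can call these helpers directly on live objects; a hypothetical sketch, assuming a Loop *L happens to be in scope:

```
(lldb) expr L->dump()
(lldb) expr L->dumpVerbose()
```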
llvm/utils/lldbDataFormatters.py
lldbDataFormatters.py (LDF) is a Python script that integrates with LLDB. In LLDB, when you print a specific LLVM data structure that doesn't have the dump() helper function, things can get a bit messy because the debugger treats the data structure as a pointer only.
For LLVM-specific data structures, LDF provides helpful pretty printers for DenseMap, DenseSet, StringRef, SmallVector, SmallString, and more.
To utilize the script, once you fire up LLDB, you can provide LLDB with command script import PathToScript/lldbDataFormatters.py
where PathToScript is the directory path to LDF.
For example, if you're debugging in llvm-project
, which houses the lldbDataFormatters naturally in llvm/utils:
...
(lldb) command script import llvm/utils/lldbDataFormatters.py
...
Here's a picture showing the before and after importing the formatter script:
Alas, a programmer would not be a programmer if not for their automation. It would be an inconvenience (and a detriment to productivity) if a programmer had to remember the script import command and type it in every time.
Instead, you can add this to .lldbinit
in your home directory and let the computer perform said steps for you:
script import os; "llvm-project" in os.getcwd() and lldb.debugger.HandleCommand("command script import llvm/utils/lldbDataFormatters.py")
Conditional breakpoints
Personally, I set up a keymap in my Neovim so I don't have to type so much. Here's the Lua code for the keymap:
local yank_for_conditional_break = function()
  local path = vim.fn.expand('%:.')      -- file path, relative to cwd
  local line = vim.fn.line('.')          -- current line number
  local word = vim.fn.expand('<cword>')  -- word under the cursor
  -- Note: the condition's opening quote is left unclosed, so the
  -- condition can be finished off after pasting it into lldb.
  local result = 'breakpoint set --file ' .. path .. ' --line ' .. line
      .. " --condition '" .. word
  vim.fn.setreg('+', result)             -- copy to the system clipboard
end
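For completeness, here's what a finished command from that yank might look like once pasted and completed in lldb (file, line, and condition are made up):

```
(lldb) breakpoint set --file WebAssemblyISelLowering.cpp --line 1234 --condition 'VT == MVT::i64'
```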
tablegen
For a short overview of TableGen, the LLVM TableGen docs couldn't be more succinct:
TableGen’s purpose is to help a human develop and maintain records of domain-specific information. ... [reducing] the amount of duplication, the chance of error, and [making] it easier to structure domain specific information.
.td (TableGen) files are ubiquitous in the LLVM world; if you have contributed to LLVM, chances are you've read a .td file.
If something's wrong with a TableGen file, a programmer needs to know at least the basic techniques for debugging it.
When a .td file is run through TableGen, it spits out an .inc (include) file in the build folder.
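When I want to see what a record actually expands to, I find it handy to dump the fully resolved records with llvm-tblgen; a sketch, where the .td file and include paths are illustrative (point -I at whatever directories your file includes from):

```shell
llvm-tblgen llvm/lib/Target/WebAssembly/WebAssembly.td \
  -I llvm/include -I llvm/lib/Target/WebAssembly --print-records | less
```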
TODO: Mention a developer's love hate relationship with tablegen, for example, with Jeremy Kun's perspective on mlir's tablegen:
It sounds nice, but I have a love hate relationship with tablegen. I personally find it to be unpleasant to use, primarily because it provides poor diagnostic information when you do things wrong. Today, though, I realize that my part of my frustration came from having the wrong initial mindset around tablegen. I thought, incorrectly, that tablegen was an abstraction layer. That is, I could write my tablegen files, build them, and only think about the parts of the generated code that I needed to implement.
Personally, my experience with tablegen in the backend is akin to this: for certain issues/problems, if you don't use tablegen, then you'll end up writing more code and spend more time/effort maintaining said code.
Grepping tablegen
At the time of writing this article, with version 14.1.1, ripgrep provides a way to filter by file type, so you can use this to search .td files exclusively:
rg --type-add 'td:*.td' -ttd "<what you want to search here>" <where you want to search>
gitbisect
It is quite frustrating that on a big repository, after ~100,000 commits from you and a few thousand contributors, you're asked to solve a bug that happens on some new releases but not older ones. You say to yourself: "Welp, I can't really build the LLVM codebase and run the test case 100,000 times to find the commit that caused the bug."
Instead, git bisect performs a binary search to find the commit that introduced the bug, reducing those ~100,000 candidates down to roughly 17 checkouts (log2 of 100,000). One note: your test script should spawn its own shell so that git bisect run is not affected.
The rest of this section focuses on helping you set up a basic script to automate the bug-finding process with git bisect.
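A basic sketch of such a setup (the good ref, build folder, and reproducer file are placeholders for your own):

```shell
git bisect start
git bisect bad HEAD
git bisect good llvmorg-19.1.0
cat > /tmp/bisect-test.sh <<'EOF'
#!/bin/sh
# Exit code 125 tells git bisect to skip commits that don't build.
ninja -C build llc || exit 125
# Bug reproduces => non-zero exit => commit is marked bad.
build/bin/llc < reduced.ll > /dev/null 2>&1
EOF
chmod +x /tmp/bisect-test.sh
git bisect run /tmp/bisect-test.sh
```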
llvm-reduce
Often, when a bug fails on a big test case, it is imperative to reduce the test case to a smaller one, so that other people can pinpoint exactly where the bug occurs.
llvm-reduce works by the programmer telling it which test case executions are interesting, via an interestingness script. In Unix fashion, success means the script exits with code 0, and that is what llvm-reduce treats as interesting. We can then let llvm-reduce shrink the test case, as in this before/after:
--- BEFORE llvm-reduce
define i64 @stest_f64i64(double %x) {
entry:
%conv = fptosi double %x to i128
%0 = icmp slt i128 %conv, 9223372036854775807
%spec.store.select = select i1 %0, i128 %conv, i128 9223372036854775807
%1 = icmp sgt i128 %spec.store.select, -9223372036854775808
%spec.store.select7 = select i1 %1, i128 %spec.store.select, i128 -9223372036854775808
%conv6 = trunc i128 %spec.store.select7 to i64
ret i64 %conv6
}
--- AFTER llvm-reduce
define <2 x i128> @stest_f64i64() {
entry:
%conv = fptosi <2 x double> splat (double 0x7FF8000000000000) to <2 x i128>
ret <2 x i128> %conv
}
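To reproduce something like the reduction above, you hand llvm-reduce an interestingness script; a sketch where "interesting" means llc still crashes or fails on the candidate (llvm-reduce passes the candidate file as the script's first argument, and exit code 0 marks it interesting):

```shell
cat > interesting.sh <<'EOF'
#!/bin/sh
# Interesting iff llc fails on the candidate ($1).
! build/bin/llc < "$1" > /dev/null 2>&1
EOF
chmod +x interesting.sh
build/bin/llvm-reduce --test=interesting.sh bug.ll
```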
rsp_bisect.py
rsp file is response file? not sure how relevant this will be or if newcomers will ever use it
llvm-extract
Reading skills, experience, (or think harder)
Yep yep, you read it right
debugging knowledge also comes from the LangRef: for example, if you're creating a new instruction from another instruction, such as llvm.reduce.and i32 to i1 i