Here are some notes on the setup needed for profiling LLVM.
First, we will want to use a Release build, and it is sufficient to build only clang and the X86 backend (or AArch64 if that’s your host architecture). Additionally, we should specify -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer in both the C and C++ compiler flags to allow easy call-graph profiling with perf:
cmake -G Ninja -B build/ -H llvm/ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS="clang" \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DCMAKE_C_FLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer" \
-DCMAKE_CXX_FLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer" \
-DLLVM_CCACHE_BUILD=true \
-DLLVM_USE_LINKER=lld
# Last two lines are just to reduce build time, otherwise optional.
ninja -C build
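Before profiling, it can be worth confirming that the frame-pointer flags actually made it into the build configuration. The following is a minimal sketch, assuming the build directory contains the usual CMakeCache.txt and that the flags are recorded there under CMAKE_C_FLAGS and CMAKE_CXX_FLAGS; has_frame_pointers is a hypothetical helper name, not part of LLVM or CMake.

```shell
# Hypothetical helper: succeeds if the CMake cache in the given build
# directory records -fno-omit-frame-pointer for both C and C++ flags.
# Assumes cache entries look like "CMAKE_C_FLAGS:STRING=...".
has_frame_pointers() {
    cache=$1/CMakeCache.txt
    for var in CMAKE_C_FLAGS CMAKE_CXX_FLAGS; do
        # Match the variable exactly, not CMAKE_C_FLAGS_RELEASE and friends.
        grep -E "^${var}(:[A-Z]+)?=" "$cache" \
            | grep -q -F -- "-fno-omit-frame-pointer" || return 1
    done
}
```

Usage would be something like has_frame_pointers build && echo "frame pointers enabled", run from the llvm-project checkout.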
To use CTMark, we should download llvm/llvm-test-suite from GitHub and then configure it as follows (for the -O3 configuration):
cmake -G Ninja -B build-O3 -H. \
-DCMAKE_C_COMPILER=$PATH_TO/llvm-project/build/bin/clang \
-DTEST_SUITE_SUBDIRS=CTMark \
-DTEST_SUITE_RUN_BENCHMARKS=false \
-C cmake/caches/O3.cmake
ninja -C build-O3 -v > out
This will build all files in CTMark. It’s possible to collect compile-time stats while doing so, but it’s not really possible to get high-accuracy compile-time statistics with vanilla CTMark. What we’re actually interested in here is manually profiling individual files. The above command collected all the build commands into the file out, and we can now pick out one (e.g. for sqlite3.c) and run it through a profiler:
cd build-O3
perf record -g /home/npopov/repos/llvm-project/build/bin/clang -DNDEBUG -O3 -w -Werror=date-time -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -I. -MD -MT MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o -MF MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o.d -o MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o -c /home/npopov/repos/llvm-test-suite/MultiSource/Applications/sqlite3/sqlite3.c
perf report
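Copying the command out of the log by hand gets tedious when trying several files. The following is a minimal sketch for extracting it, assuming the log is the out file written by ninja -v above, that each command sits on a single line behind ninja's [N/M] progress prefix, and that compile commands contain -c while link commands do not. find_compile_cmd is a hypothetical helper name.

```shell
# Hypothetical helper: print the first compile command in a `ninja -v`
# log that mentions the given source file name.
find_compile_cmd() {
    log=$1
    src=$2
    # Keep compile steps only, then the line naming the source file,
    # then strip the leading "[N/M] " progress marker.
    grep -F -- " -c " "$log" \
        | grep -F -- "/$src" \
        | head -n 1 \
        | sed 's|^\[[0-9]*/[0-9]*] ||'
}
```

From inside build-O3 this could be used as perf record -g $(find_compile_cmd ../out sqlite3.c). If the build wraps the compiler in another tool, that prefix remains in the output and has to be removed by hand.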
We can replace perf record -g with valgrind --tool=callgrind to profile under the valgrind emulator instead. This is much slower, but it does not require frame pointers and gives a more detailed profile.
Instead of using a profiler, we can also append -ftime-report to the command line, which will print an overview of pass execution timings to stderr.
Finally, we can use -ftime-trace. This will produce a .json file next to the object file, so it will be in some location like MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.json. This file can then be viewed by using about:tracing in Google Chrome, for example.
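The four variants differ only in what is put in front of or behind the same compile command, so they can be wrapped in one small helper. This is a sketch under stated assumptions: profile and its mode names are hypothetical, as is the DRY_RUN variable, which prints the composed command instead of running it. Only the tools and flags themselves come from the notes above.

```shell
# Hypothetical helper: run one compile command under a chosen mode.
# Usage: profile MODE clang ARGS...
# Set DRY_RUN=1 to print the composed command instead of running it.
profile() {
    mode=$1
    shift
    case $mode in
        perf)        set -- perf record -g "$@" ;;              # sampling, needs frame pointers
        callgrind)   set -- valgrind --tool=callgrind "$@" ;;   # emulated, slower
        time-report) set -- "$@" -ftime-report ;;               # pass timings on stderr
        time-trace)  set -- "$@" -ftime-trace ;;                # .json next to the object file
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, profile time-trace followed by the full clang command for sqlite3.c would run that command with -ftime-trace appended. The dry-run output joins arguments with spaces, so it is only safe to paste back when no argument contains whitespace.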