Hacker News
A thorough introduction to bpftrace (brendangregg.com)
66 points by kiyanwang on Aug 9, 2021 | 12 comments


I went to use bpftrace to solve a real problem recently, and unfortunately found I had to resort to systemtap.

I wanted to print a debug log as it was being saved in a kernel module. That should be fine: the module has an equivalent of save_log(char *foo), so you just probe on entry and print the argument, right?

...except bpftrace has a hard cap of 200-odd bytes on how much of a char * you can get out with str() at a time. [1]

Fine, so you just do some pointer math and print foo, foo+200, etc, right?

There's no strlen and no printf return value, so you don't know where the string ends.

At that point, I said "sod it" and broke out systemtap.

[1] - https://github.com/iovisor/bpftrace/issues/305
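To make the chunking problem concrete, here is a small Python model of it (plain userspace Python, not bpftrace). `read_str` is a hypothetical stand-in for a bounded string read, and `CHUNK = 200` is an assumed value for the "200-odd bytes" cap. The loop only terminates because it can ask for the length of each piece, which is exactly the step described above as unavailable.

```python
CHUNK = 200  # assumed cap, standing in for the "200-odd bytes" limit


def read_str(mem: bytes, addr: int, limit: int = CHUNK) -> bytes:
    """Hypothetical bounded read: up to `limit` bytes from `addr`, stopping at NUL."""
    end = mem.find(b"\0", addr, addr + limit)
    if end == -1:
        # No terminator inside this window: return a full (or final) chunk.
        end = min(addr + limit, len(mem))
    return mem[addr:end]


def read_whole(mem: bytes, addr: int) -> bytes:
    """Reassemble a NUL-terminated string from fixed-size chunks."""
    out = bytearray()
    while True:
        piece = read_str(mem, addr)
        out += piece
        # A short piece means the terminator was reached. Without a way to
        # measure the piece (no strlen), this check cannot be written.
        if len(piece) < CHUNK:
            return bytes(out)
        addr += CHUNK
```

With a length check the pointer math from the comment works; without one, the loop has no stopping condition.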


What is the main advantage of this over native strace?


bpftrace has lower overhead because, thanks to BPF, it can accumulate all the data it needs in the kernel and send only a summary to userspace for display. strace, on the other hand, uses the ptrace(2) syscall to stop the traced process on each syscall, including the ones you are not tracing (e.g. when you filter calls), and on each syscall data travels between the kernel and strace.

It's possible that in the future strace will use BPF as well to lower its overhead.

bpftrace is also more versatile. You could use it, for example, to collect stack traces for all the syscalls a program makes. You can also attach actions to more things than just syscalls: with kprobes you can inject code into kernel functions that aren't exported as syscalls.
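The overhead argument in this comment can be illustrated with a toy model in Python. It only counts how many messages would cross from producer to consumer under each approach; the function names and the `(syscall_name, nbytes)` event shape are made up for the illustration, and nothing here touches ptrace or BPF.

```python
from collections import Counter


def per_event(events):
    """strace-like model: every event is handed to the consumer individually."""
    return [name for name, _ in events]  # one message per event


def aggregated(events):
    """bpftrace-like model: count where the events happen, hand over one summary."""
    return Counter(name for name, _ in events)  # one message per distinct key


# Hypothetical events as (syscall_name, nbytes) pairs.
events = [("read", 10), ("write", 5), ("read", 7), ("read", 1)]
print(len(per_event(events)), "messages vs", len(aggregated(events)), "summary rows")
```

The number of messages in the first model grows with the number of events, while the summary in the second grows only with the number of distinct keys.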


strace only traces system calls while bpftrace does much more. Just looking at the first example, vfs_read isn't a system call (read() may also operate on sockets or whatever) and strace doesn't calculate histograms AFAIK.
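For readers who haven't seen the histograms mentioned here, this is a rough Python sketch of power-of-two bucketing, the general idea behind that kind of summary. It is an approximation for illustration, not bpftrace's implementation; `log2_bucket` and `log2_hist` are hypothetical names, and negative values are simply rejected.

```python
from collections import Counter


def log2_bucket(value: int) -> int:
    """Return the lower bound of the power-of-two bucket containing `value`."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    if value == 0:
        return 0
    # bit_length() - 1 is floor(log2(value)) for positive integers.
    return 1 << (value.bit_length() - 1)


def log2_hist(values):
    """Count how many values fall into each power-of-two bucket."""
    return Counter(log2_bucket(v) for v in values)


# Hypothetical read sizes in bytes.
for lower, count in sorted(log2_hist([1, 2, 3, 8, 9, 15, 4096]).items()):
    print(f"[{lower}, ...) {'@' * count}")
```

Only the per-bucket counts need to be reported, no matter how many values were observed.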


I use it a lot to figure out why things fail. For example, what if you get an -EPERM from mount()? Was it denied by seccomp, by an LSM, or because you don't own the user namespace that owns your mount namespace?

strace will tell you it failed, but bpftrace can help you understand why.

Note that I said "help": bpftrace can tell you "this function failed with EPERM", but e.g. ovl_fill_super() can fail with EPERM for lots of different reasons. So it's a bit like printf debugging. And you're SOL if the error is generated within that function or from an inlined function :(
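When you do this kind of "which function returned the error" debugging, the traced return values are usually raw integers. A minimal Python helper for turning them back into names, assuming the usual kernel convention that functions return a negative errno on failure (`describe_retval` is a hypothetical name):

```python
import errno
import os


def describe_retval(ret: int) -> str:
    """Turn a kernel-style return value (negative errno on failure) into text."""
    if ret >= 0:
        return f"{ret} (success)"
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


print(describe_retval(-errno.EPERM))
print(describe_retval(-errno.EACCES))
```

This only names the error. As the comment says, it still doesn't tell you which of the possible checks produced it.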


bpftrace requires about 318 MB to install

strace requires less than 2 MB

Bigger is better :)


The standalone binary the project publishes is 45MB. https://github.com/iovisor/bpftrace/releases/download/v0.13....


I just tried two commands.

apt install bpftrace: needs 1,201 kB

apt remove strace: frees 1,792 kB


You need to look at the bpftrace dependencies.

For example, is clang already installed?


How large is the strace standalone binary?

45MB is still huge; it's not even statically linked.

The largest programs I use, even when statically linked against musl, are all under 6MB.


And I can count on one hand the number of programs I use that are over 1MB.


Title needs a [2019] tag.



