Monthly Archives: April 2017

Looking for users of new syscalls

Recently Linux got a new syscall for getting extended information about files, a super ‘stat’, if you will. Read more about it at LWN.

So I copied the headers with the definitions for the statx arguments to tools/include/ so that ‘perf trace’ can use them to beautify those arguments, i.e. to show them as a bitmap of strings, as described in this cset.
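To make "a bitmap of strings" concrete, here is a minimal sketch of such a beautifier for the statx mask. It is not the code from the cset: the table and the function name statx_mask_to_str are made up for illustration, and the only thing taken from the kernel headers is the bit order, STATX_TYPE (0x1) up to STATX_BTIME (0x800).

```c
#include <stdio.h>
#include <string.h>

/*
 * Names in bit order: entry i describes bit (1u << i), which matches the
 * uapi values STATX_TYPE (0x1) through STATX_BTIME (0x800).
 * Illustrative table, not the one 'perf trace' uses.
 */
static const char *const statx_mask_names[] = {
	"TYPE", "MODE", "NLINK", "UID", "GID", "ATIME",
	"MTIME", "CTIME", "INO", "SIZE", "BLOCKS", "BTIME",
};

/*
 * Hypothetical helper: format 'mask' into 'buf' as "A|B|C".
 * Bits without a name are appended as one hex number.
 * Returns the length of the resulting string.
 */
static size_t statx_mask_to_str(unsigned int mask, char *buf, size_t size)
{
	size_t n = sizeof(statx_mask_names) / sizeof(statx_mask_names[0]);
	size_t len = 0, i;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		if (!(mask & (1u << i)))
			continue;
		len += snprintf(buf + len, size - len, "%s%s",
				len ? "|" : "", statx_mask_names[i]);
		if (len >= size)	/* output was truncated */
			return size - 1;
		mask &= ~(1u << i);	/* this bit is now accounted for */
	}

	if (mask) {			/* leftover bits we have no name for */
		len += snprintf(buf + len, size - len, "%s%#x",
				len ? "|" : "", mask);
		if (len >= size)
			return size - 1;
	}
	return len;
}
```

Calling it with 0x800 and a large enough buffer leaves "BTIME" in the buffer, and 0xfff produces the full TYPE|...|BTIME list.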

To test it I used one of the things ‘perf trace’ can do and that ‘strace’ does not: system wide stracing. To check whether any of the programs running on my machine were using the new syscall I simply did, using strace-like syntax:

# perf trace -e statx

After a few minutes, nothing… So this Fedora 25 system isn’t using it in any of the utilities I ran in the meantime. Not surprising: glibc still needs to wire statx up.

So I found out about samples/statx/test-statx.c, and after installing the kernel headers and pointing the compiler to where those files were installed, I restarted that system wide ‘perf trace’ session and ran the test program, much better:

# trace -e statx
16612.967 ( 0.028 ms): statx/562 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
33064.447 ( 0.011 ms): statx/569 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffc5484c790) = 0
36050.891 ( 0.023 ms): statx/576 statx(dfd: CWD, filename: /etc/motd, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffeb18b66e0) = 0
38039.889 ( 0.023 ms): statx/584 statx(dfd: CWD, filename: /home/acme/.bashrc, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff1db0ea90) = 0
^C#
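Since glibc has no wrapper yet, a program has to go through syscall(2) to issue calls like the ones traced above. A minimal sketch follows, assuming kernel headers that provide struct statx in <linux/stat.h> and SYS_statx in <sys/syscall.h>. The helper name statx_nofollow is made up for illustration; this is not a copy of samples/statx/test-statx.c.

```c
#define _GNU_SOURCE
#include <fcntl.h>		/* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <linux/stat.h>		/* struct statx, STATX_* mask bits */
#include <sys/syscall.h>	/* SYS_statx */
#include <unistd.h>		/* syscall() */

/*
 * Hypothetical helper: call statx directly, relative to the current
 * working directory and without following a trailing symlink, i.e.
 * dfd: CWD, flags: SYMLINK_NOFOLLOW in the trace output.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int statx_nofollow(const char *path, unsigned int mask,
			  struct statx *stx)
{
	return (int)syscall(SYS_statx, AT_FDCWD, path,
			    AT_SYMLINK_NOFOLLOW, mask, stx);
}
```

After a successful call, stx->stx_mask says which of the requested fields the filesystem actually filled in, so a caller asking for STATX_BTIME should check that bit before trusting stx->stx_btime.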

Ah, to get filenames fetched we need to put in place a special probe that collects filenames passed to the kernel right after the kernel copies them from user memory:

[root@jouet ~]# perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
Added new event:
probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_getname -aR sleep 1

[root@jouet ~]# trace -e open touch /etc/passwd
0.024 ( 0.011 ms): touch/649 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.056 ( 0.018 ms): touch/649 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.481 ( 0.014 ms): touch/649 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.553 ( 0.012 ms): touch/649 open(filename: /etc/passwd, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3
[root@jouet ~]#

Make sure you have CONFIG_DEBUG_INFO set in your kernel build or that the matching debuginfo packages are installed. The probe needs to be added just once per boot; ‘perf trace’ will find it in place and use it.

Lastly, if ‘perf’ is hardlinked to ‘trace’, then the latter will be the same as ‘perf trace’.
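The hardlink trick works because a program can look at the name it was started under. Here is a minimal sketch of that idea, with made-up helper names. It shows the general technique and is not a copy of what perf does internally.

```c
#include <string.h>

/* Hypothetical helper: last path component of argv[0]. */
static const char *invoked_name(const char *argv0)
{
	const char *slash = strrchr(argv0, '/');

	return slash ? slash + 1 : argv0;
}

/*
 * Returns 1 when the binary was started as "trace", e.g. through a
 * hardlink, so that main() could behave as if "perf trace" was typed.
 */
static int invoked_as_trace(const char *argv0)
{
	return strcmp(invoked_name(argv0), "trace") == 0;
}
```

A main() would call invoked_as_trace(argv[0]) before its normal subcommand parsing and, when it returns 1, dispatch straight to the trace subcommand.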