I’m working on adding a thin python binding for the evsel and evlist abstractions I introduced in the perf tools when I stumbled on this, how peterz calls “friggin mistery mmap events”…
Lets start with the simple python script that started this:
import perf def main(): cpus = perf.cpu_map() threads = perf.thread_map() evsel = perf.evsel(task = 1, comm = 1, wakeup_events = 1, sample_period = 1, sample_id_all = 1, sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_CPU | perf.SAMPLE_TID) evsel.open(cpus = cpus, threads = threads); evlist = perf.evlist() evlist.add(evsel) evlist.mmap(cpus = cpus, threads = threads) while True: evlist.poll(timeout = -1) for cpu in cpus: event = evlist.read_on_cpu(cpu) if not event: continue print "cpu: %2d, pid: %4d, tid: %4d" % (event.sample_cpu, event.sample_pid, event.sample_tid), print event if __name__ == '__main__': main()
So what does this script? It will ask the kernel perf infrastructure to generate events every time a thread is created and when they exit.
Running it:
[root@emilia linux]# tools/perf/python/twatch.py cpu: 0, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0, length: 0x694ae632, offset: 0, filename: } cpu: 3, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x3, length: 0xc804d5ca, offset: 0, filename: } cpu: 6, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x6, length: 0xbba638e9, offset: 0, filename: } cpu: 1, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0x1, length: 0x7899c915, offset: 0, filename: } cpu: 2, pid: 43, tid: 43 { type: mmap, pid: 43, tid: 43, start: 0x2, length: 0xca5a622e, offset: 0, filename: } cpu: 5, pid: 1523, tid: 1525 { type: mmap, pid: 1523, tid: 1525, start: 0x5, length: 0x9a4cd05f, offset: 0, filename: } cpu: 4, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x4, length: 0x7a4d4989, offset: 0, filename: } cpu: 2, pid: 1543, tid: 1543 { type: fork, pid: 1970, ppid: 1543, tid: 1970, ptid: 1543, time: 515738951270} cpu: 6, pid: 1970, tid: 1970 { type: comm, pid: 1970, tid: 1970, comm: sshd } cpu: 2, pid: 60, tid: 60 { type: fork, pid: 1971, ppid: 60, tid: 1971, ptid: 60, time: 515743769565} cpu: 3, pid: 1971, tid: 1971 { type: fork, pid: 1972, ppid: 1971, tid: 1972, ptid: 1971, time: 515743810636} cpu: 7, pid: 1972, tid: 1972 { type: comm, pid: 1972, tid: 1972, comm: modprobe } cpu: 7, pid: 1972, tid: 1972 { type: exit, pid: 1972, ppid: 1972, tid: 1972, ptid: 1972, time: 515745080846} cpu: 1, pid: 1971, tid: 1971 { type: exit, pid: 1971, ppid: 1971, tid: 1971, ptid: 1971, time: 515746027272} cpu: 7, pid: 1972, tid: 1972 { type: exit, pid: 1972, ppid: 1972, tid: 1972, ptid: 1972, time: 515745086164} cpu: 1, pid: 1971, tid: 1971 { type: exit, pid: 1971, ppid: 1971, tid: 1971, ptid: 1971, time: 515746034982} cpu: 7, pid: 1970, tid: 1970 { type: fork, pid: 1973, ppid: 1970, tid: 1973, ptid: 1970, time: 515747725784} cpu: 0, pid: 1278, tid: 1281 { type: fork, pid: 1278, ppid: 1278, tid: 1974, ptid: 1281, time: 515803257141} cpu: 3, pid: 1278, tid: 1974 { type: comm, pid: 1278, tid: 1974, comm: rs:main Q:Reg } cpu: 2, pid: 1973, tid: 1973 { type: exit, pid: 1973, ppid: 1973, tid: 1973, ptid: 1973, time: 515803913973} cpu: 0, pid: 1970, tid: 1970 { type: fork, pid: 1975, ppid: 1970, tid: 1975, ptid: 1970, time: 515804961019} cpu: 2, pid: 1973, tid: 1973 { type: exit, pid: 1973, ppid: 1973, tid: 1973, ptid: 1973, time: 515803917806} cpu: 3, pid: 1975, tid: 1975 { type: fork, pid: 1976, ppid: 1975, tid: 1976, ptid: 1975, time: 515808031062} cpu: 7, pid: 1976, tid: 1976 { type: comm, pid: 1976, tid: 1976, comm: bash }
Everything seems nice except for those first 8 mmap events, one per CPU, that I didn’t ask for in the evsel (event selector) constructor, here:
evsel = perf.evsel(task = 1, comm = 1, wakeup_events = 1, sample_period = 1, sample_id_all = 1, sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_CPU | perf.SAMPLE_TID
See ‘task = 1’? That asks for fork and exit events to be generated in the ring buffer, ‘comm = 1’ asks for the COMM (short, 16 bytes max, process name), but I didn’t specified ‘mmap = 1’, to ask for events to be generated when executable mmaps take place…
As I know that those events are stashed in the ring buffer by the perf_event_mmap_output routine, I thought about using ‘perf probe’ to figure out what was causing those events, so I did:
[root@emilia linux]# perf probe mmap_out=perf_event_mmap_output Add new event: probe:mmap_out (on perf_event_mmap_output) You can now use it on all perf tools, such as: perf record -e probe:mmap_out -aR sleep 1 [root@emilia linux]#
That created a new event, one that will take place every time the perf_event_mmap_output is called. To reduce typing I added it aliased to ‘mmap_out’.
Now lets use it, asking for callchains to be taken every time this event happens:
[root@emilia linux]# perf record -e probe:mmap_out -g tools/perf/python/twatch.py cpu: 7, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x7, length: 0xc6b8162f, offset: 0, filename: } cpu: 2, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x2, length: 0xdf243434, offset: 0, filename: } cpu: 3, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x3, length: 0xf9f2d7ab, offset: 0, filename: } cpu: 5, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0x5, length: 0xa00963c1, offset: 0, filename: } cpu: 1, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x1, length: 0x95744474, offset: 0, filename: } cpu: 0, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0, length: 0xa0655031, offset: 0, filename: } cpu: 4, pid: 0, tid: 0 { type: mmap, pid: 0, tid: 0, start: 0x4, length: 0xb2b21883, offset: 0, filename: } ^CTraceback (most recent call last): File "tools/perf/python/twatch.py", line 41, in main() [ perf record: Woken up 1 times to write data ] File "tools/perf/python/twatch.py", line 30, in main evlist.poll(timeout = -1) [ perf record: Captured and wrote 0.012 MB perf.data (~545 samples) ] KeyboardInterrupt [root@emilia linux]#
And there they are, lets see what caused them?
[root@emilia linux]# perf report --stdio # Events: 30 # # Overhead Command Shared Object Symbol # ........ ......... ................. ...................... # 100.00% twatch.py [kernel.kallsyms] [k] perf_event_mmap_output | --- perf_event_mmap_output perf_event_mmap | |--96.67%-- mmap_region | do_mmap_pgoff | | | |--93.10%-- sys_mmap_pgoff | | sys_mmap | | system_call | | __mmap | | | | | --100.00%-- _dl_map_object | | | --6.90%-- elf_map | load_elf_binary | search_binary_handler | load_script | search_binary_handler | do_execve | sys_execve | stub_execve | 0x3d068a6dd7 | --3.33%-- install_special_mapping arch_setup_additional_pages load_elf_binary search_binary_handler load_script search_binary_handler do_execve sys_execve stub_execve 0x3d068a6dd7 # # (For a higher level overview, try: perf report --sort comm,dso) # [root@emilia linux]#
Ok, loading ELF files, the python interpreter…
The investigation is still underway, but using the perf tools in this way sounds so cool that I thought about sharing it here 🙂