Friggin mistery…

I’m working on adding a thin python binding for the evsel and evlist abstractions I introduced in the perf tools when I stumbled on this, how peterz calls “friggin mistery mmap events”…

Lets start with the simple python script that started this:

import perf

def main():
	cpus = perf.cpu_map()
	threads = perf.thread_map()
	evsel = perf.evsel(task = 1, comm = 1,
			   wakeup_events = 1, sample_period = 1,
			   sample_id_all = 1,
			   sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_CPU | perf.SAMPLE_TID)
	evsel.open(cpus = cpus, threads = threads);
	evlist = perf.evlist()
	evlist.add(evsel)
	evlist.mmap(cpus = cpus, threads = threads)
	while True:
		evlist.poll(timeout = -1)
		for cpu in cpus:
			event = evlist.read_on_cpu(cpu)
			if not event:
				continue
			print "cpu: %2d, pid: %4d, tid: %4d" % (event.sample_cpu,
								event.sample_pid,
								event.sample_tid),
			print event

if __name__ == '__main__':
    main()

So what does this script? It will ask the kernel perf infrastructure to generate events every time a thread is created and when they exit.

Running it:

[root@emilia linux]# tools/perf/python/twatch.py
cpu:  0, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0, length: 0x694ae632, offset: 0, filename:  }
cpu:  3, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x3, length: 0xc804d5ca, offset: 0, filename:  }
cpu:  6, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x6, length: 0xbba638e9, offset: 0, filename:  }
cpu:  1, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0x1, length: 0x7899c915, offset: 0, filename:  }
cpu:  2, pid:   43, tid:   43 { type: mmap, pid: 43, tid: 43, start: 0x2, length: 0xca5a622e, offset: 0, filename:  }
cpu:  5, pid: 1523, tid: 1525 { type: mmap, pid: 1523, tid: 1525, start: 0x5, length: 0x9a4cd05f, offset: 0, filename:  }
cpu:  4, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x4, length: 0x7a4d4989, offset: 0, filename:  }

cpu:  2, pid: 1543, tid: 1543 { type: fork, pid: 1970, ppid: 1543, tid: 1970, ptid: 1543, time: 515738951270}
cpu:  6, pid: 1970, tid: 1970 { type: comm, pid: 1970, tid: 1970, comm: sshd }
cpu:  2, pid:   60, tid:   60 { type: fork, pid: 1971, ppid: 60, tid: 1971, ptid: 60, time: 515743769565}
cpu:  3, pid: 1971, tid: 1971 { type: fork, pid: 1972, ppid: 1971, tid: 1972, ptid: 1971, time: 515743810636}
cpu:  7, pid: 1972, tid: 1972 { type: comm, pid: 1972, tid: 1972, comm: modprobe }
cpu:  7, pid: 1972, tid: 1972 { type: exit, pid: 1972, ppid: 1972, tid: 1972, ptid: 1972, time: 515745080846}
cpu:  1, pid: 1971, tid: 1971 { type: exit, pid: 1971, ppid: 1971, tid: 1971, ptid: 1971, time: 515746027272}
cpu:  7, pid: 1972, tid: 1972 { type: exit, pid: 1972, ppid: 1972, tid: 1972, ptid: 1972, time: 515745086164}
cpu:  1, pid: 1971, tid: 1971 { type: exit, pid: 1971, ppid: 1971, tid: 1971, ptid: 1971, time: 515746034982}
cpu:  7, pid: 1970, tid: 1970 { type: fork, pid: 1973, ppid: 1970, tid: 1973, ptid: 1970, time: 515747725784}
cpu:  0, pid: 1278, tid: 1281 { type: fork, pid: 1278, ppid: 1278, tid: 1974, ptid: 1281, time: 515803257141}
cpu:  3, pid: 1278, tid: 1974 { type: comm, pid: 1278, tid: 1974, comm: rs:main Q:Reg }
cpu:  2, pid: 1973, tid: 1973 { type: exit, pid: 1973, ppid: 1973, tid: 1973, ptid: 1973, time: 515803913973}
cpu:  0, pid: 1970, tid: 1970 { type: fork, pid: 1975, ppid: 1970, tid: 1975, ptid: 1970, time: 515804961019}
cpu:  2, pid: 1973, tid: 1973 { type: exit, pid: 1973, ppid: 1973, tid: 1973, ptid: 1973, time: 515803917806}
cpu:  3, pid: 1975, tid: 1975 { type: fork, pid: 1976, ppid: 1975, tid: 1976, ptid: 1975, time: 515808031062}
cpu:  7, pid: 1976, tid: 1976 { type: comm, pid: 1976, tid: 1976, comm: bash }

Everything seems nice except for those first 8 mmap events, one per CPU, that I didn’t ask for in the evsel (event selector) constructor, here:

	evsel = perf.evsel(task = 1, comm = 1,
			   wakeup_events = 1, sample_period = 1,
			   sample_id_all = 1,
			   sample_type = perf.SAMPLE_PERIOD | perf.SAMPLE_CPU | perf.SAMPLE_TID

See ‘task = 1’? That asks for fork and exit events to be generated in the ring buffer, ‘comm = 1’ asks for the COMM (short, 16 bytes max, process name), but I didn’t specified ‘mmap = 1’, to ask for events to be generated when executable mmaps take place…

As I know that those events are stashed in the ring buffer by the perf_event_mmap_output routine, I thought about using ‘perf probe’ to figure out what was causing those events, so I did:

[root@emilia linux]# perf probe mmap_out=perf_event_mmap_output
Add new event:
  probe:mmap_out       (on perf_event_mmap_output)

You can now use it on all perf tools, such as:

	perf record -e probe:mmap_out -aR sleep 1

[root@emilia linux]#

That created a new event, one that will take place every time the perf_event_mmap_output is called. To reduce typing I added it aliased to ‘mmap_out’.

Now lets use it, asking for callchains to be taken every time this event happens:

[root@emilia linux]# perf record -e probe:mmap_out -g tools/perf/python/twatch.py 
cpu:  7, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x7, length: 0xc6b8162f, offset: 0, filename:  }
cpu:  2, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x2, length: 0xdf243434, offset: 0, filename:  }
cpu:  3, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x3, length: 0xf9f2d7ab, offset: 0, filename:  }
cpu:  5, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0x5, length: 0xa00963c1, offset: 0, filename:  }
cpu:  1, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x1, length: 0x95744474, offset: 0, filename:  }
cpu:  0, pid: 1893, tid: 1893 { type: mmap, pid: 1893, tid: 1893, start: 0, length: 0xa0655031, offset: 0, filename:  }
cpu:  4, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x4, length: 0xb2b21883, offset: 0, filename:  }
^CTraceback (most recent call last):
  File "tools/perf/python/twatch.py", line 41, in 
    main()
[ perf record: Woken up 1 times to write data ]
  File "tools/perf/python/twatch.py", line 30, in main
    evlist.poll(timeout = -1)
[ perf record: Captured and wrote 0.012 MB perf.data (~545 samples) ]
KeyboardInterrupt
[root@emilia linux]#

And there they are, lets see what caused them?

[root@emilia linux]# perf report --stdio
# Events: 30 
#
# Overhead    Command      Shared Object                  Symbol
# ........  .........  .................  ......................
#
   100.00%  twatch.py  [kernel.kallsyms]  [k] perf_event_mmap_output
            |
            --- perf_event_mmap_output
                perf_event_mmap
               |          
               |--96.67%-- mmap_region
               |          do_mmap_pgoff
               |          |          
               |          |--93.10%-- sys_mmap_pgoff
               |          |          sys_mmap
               |          |          system_call
               |          |          __mmap
               |          |          |          
               |          |           --100.00%-- _dl_map_object
               |          |          
               |           --6.90%-- elf_map
               |                     load_elf_binary
               |                     search_binary_handler
               |                     load_script
               |                     search_binary_handler
               |                     do_execve
               |                     sys_execve
               |                     stub_execve
               |                     0x3d068a6dd7
               |          
                --3.33%-- install_special_mapping
                          arch_setup_additional_pages
                          load_elf_binary
                          search_binary_handler
                          load_script
                          search_binary_handler
                          do_execve
                          sys_execve
                          stub_execve
                          0x3d068a6dd7

#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[root@emilia linux]# 

Ok, loading ELF files, the python interpreter…

The investigation is still underway, but using the perf tools in this way sounds so cool that I thought about sharing it here 🙂

Post a comment or leave a trackback: Trackback URL.

Leave a comment