Category Archives: Kernel Hacking

dwarves on the spotlight

Recently I saw somebody boasting that Linus now uses Fedora and I don’t know why I found it silly. But then when Ingo Molnar even mentioned the name of the package where codiff is available I found that I can be silly too ;-)

tuna & the oscilloscope

Since I haven’t blogged about these toys, please take a look at Carsten’s article at OSADL and perhaps at my OLS 2008 paper too.

ait and tuna

Work is supposed to take most of your time, right? Survival, mouths to feed, all that stuff… But it can be fun, even if not _directly_ kernel related. Sure, there is life outside the kernel, hey, “kernel”… I keep listening to the head-honchos (heck, I had to use that term, it looks like spanish, the language of our capital, Buenos Aires!): “forget about the kernel, the action is somewhere else, in (l)userland!”

So here I am, in userland, playing with GUI stuff, drag and drop! Python! GTK! Wow! Sounds boring? No, I don’t think so. Discovery time is never boring. Its fun to try for hours to grok some new semantic domain. Even if you, in another life, wrote a DOS GUI system out of reading a marketing insertion on Unix Review in 1988 :-P

So what have I been doing in this strange land? Well, when you work you have to show the numbers, and explain them, and remember what happened when you switched that knob or applied that patch, no?

Damn, alchemy doesn’t requires you remember all that stuff, just taste the new stuff, if you think its not poisonous, that is.

But if you need to remember… There is something cooking for this year OLS, or to the first conference after it if it thinks I’m too bollocksish :-)

And to _show you the code_, full of python newbie mistakes… but hey, I even dared to write python bindings for such supposedly interesting stuff as ethtool and schedutils.

And finally to what uses it: AIT, because short, meaningless names are en vogue. Anyway, try tuna, a tuning application that has a cool name, one that I unfortunately didn’t came up with but fortunately was near the genius that though about it, thanks!

And thanks to whoever did the right thing and brought Evgeniy to kernelplanet, he is da guy from Russia! Get healthy and in crazy coding frenzy mode again!

libdwfl conversion

Finally I found time to convert dwarves to use the more active libdwfl library found in elfutils, thanks to Ulrich Drepper and Roland McGrath we can forget about most of the low level details related to RELA architectures and get the dwarves to work with kernel modules on architectures such as x86_64, where we need to do relocation work on the .debug_info section, now just enjoy pahole & friends working on more targets.

There is still some work to do on two of the dwarves that handle more than one target, codiff and ctracer, where dwfl_default_argp is not enough as it only handles one side of the game needed for these two dwarves, we need to open another set to compare in the case of codiff and another set for resolving the kprobes functions in the case of ctracer.

News fom dwarves land

Pahole:

Now we can also move bitfields around to combine it with other fields, moving a list of members with the same offset (the definition of a bitfield) to after other members that have holes bigger than the size of the (possibly type demoted) list of members comprising a bitfield, also there is a new pahole command line option to show all the steps involved in reorganizing a struct, –show_reorg_steps, look here for how it looks like.

Also related to this pahole news is the fact that the class__reorganize() & supporting code was moved to libdwarves.so, where its is made useful for the other dwarves, which leads us to the next dwarves news:

Ctracer:

The class tracer now is past its stone age, getting fast to an exciting iron age: printk was replaced by a not so carefully (as in SMP?!?! I only have a dual pentium 100 machine here, damn!) stolen from blktrace (I hope that the UTT effort is merged sooner than later so that I can use it), namely using debugfs + relay to, ho hum, relay the internal state of the class (a funny name for structs) being traced to userspace, ah, since I’m talking about the internal state: the way that it is collected at the probe points now is much, much improved, looking like blktrace, except for the fact that the data being collected is the subset of the members that are “reducible” to a basic type (signed and unsigned int, long, char, long long), which, for now, its just the basic types, but will be shortly augmented by a set of “reducers” for things like spinlock_t, wait_queue_head_t, etc.

Ok, lets break the paragraph a bit to get away from my problem with writing: very long paragraphs…

Back to how ctracer packs the internal state: have you seen how pahole can reorganize structs to fill holes and reduce the structure size? Yeah, we just look at all the struct members, looking for DW_TAG_basic_type members and leave just those, then call class__reorganize() and the struct gets in the best possible layout to save space at each probe point.

Ctracer also uses the same opportunity to generate a tool called ctracer2ostra.c, that is used to convert the binary traces to a colon separated list of formatted values easily parsable by ostra-cg, the callgraph tool that produces nifty buzzword soups (CSS, javascript, ajax in the future, cross-referencing with LXR, python matplotlib graphs of values that each traced class member got in the course of tracing, and plottings for the times each of the “class” “methods” took).

The new improved process for this is detailed in the README.ctracer file, but to give you a quick glance of what it involves:

rpm -ivh http://oops.ghostprotocols.net:81/acme/dwarves/rpm/libdwarves1-0-14.i386.rpm http://oops.ghostprotocols.net:81/acme/dwarves/rpm/dwarves-0-14.i386.rpm
mkdir foo
cd foo
ln -s /usr/lib/ctracer/* .
make CLASS=sock
insmod ctracer.ko
# do some networking kung foo fighting activity or change CLASS to your preferred struct (but beware, as we use do_gettimeofday (how lame) you can get some crashes if tracing something more fishy than struct sock methods)
cat /sys/kernel/debug/ctracer.o > /tmp/ctracer.log
rmmod ctracer.ko
make callgraphs
# Enjoy things like this, where I was bold enough to try this thing on my main machine while doing all sorts of stuff, believe me, it was just one or two dozen harmless oopses, kprobes trying to justify its paycheck!

And to top it all I’ve been getting help from Davi Arnaut on the global variables and DW_AT_location front, were he contributed basic support for where variables are in the computer memory pyramid (Registers? Stack?), he even wrote the first non-acme dwarf, pglobal, that shows all the global variables in your dear project binary files, now its just a matter of extending this to use libebl and get the register names for each arch to go to the next ctracer level, where we’ll stop using costly Jay-probes and get the elusive parameters directly from their hiding place!

Kill-a-hole: Reorganizing struct layouts

Say a project has this struct:

struct cheese {
char name[52];
char a;
int b;
char c;
int d;
short e;
};

And we want to see how the layout looks like more precisely:

[acme@filo examples]$ pahole swiss_cheese cheese
/* <11b> /home/acme/git/pahole/examples/swiss_cheese.c:3 */
struct cheese {
char name[52]; /* 0 52 */
char a; /* 52 1 */

/* XXX 3 bytes hole, try to pack */

int b; /* 56 4 */
char c; /* 60 1 */

/* XXX 3 bytes hole, try to pack */

/* — cacheline 1 boundary (64 bytes) — */
int d; /* 64 4 */
short int e; /* 68 2 */
}; /* size: 72, cachelines: 2 */
/* sum members: 64, holes: 2, sum holes: 6 */
/* padding: 2 */
/* last cacheline: 8 bytes */
[acme@filo examples]$

Heck, what a swiss cheese! Surely we can do better, huh? Lets ask pahole for a little help:

[acme@filo examples]$ pahole -kV swiss_cheese cheese
/* moving c(size=1) to after a(offset=52, size=1, hole=3) */
/* moving e(size=2) to after c(offset=53, size=1, hole=2) */

/* <11b> /home/acme/git/pahole/examples/swiss_cheese.c:3 */
struct cheese {
char name[52]; /* 0 52 */
char a; /* 52 1 */
char c; /* 53 1 */
short int e; /* 54 2 */
int b; /* 56 4 */
int d; /* 60 4 */
/* — cacheline 1 boundary (64 bytes) — */
}; /* size: 64, cachelines: 1 */
/* saved 8 bytes and 1 cacheline! */
[acme@filo examples]$

Much better now, no?

Ok, lets try something more interesting, like some Linux kernel structs, the output is big, so here are some links for some structs that spent some jiffies on pahole’s spa:

Type demotion of bigger than needed bitfields will help getting more saved :-)

ctracer made easy

Make sure you have the kernel-debuginfo package installed!

rpm -ivh http://oops.ghostprotocols.net:81/acme/dwarves/rpm/libdwarves1-0-9.i386.rpm http://oops.ghostprotocols.net:81/acme/dwarves/rpm/dwarves-0-9.i386.rpm
mkdir foo
cd foo
ln -s /usr/lib/ctracer/Makefile .
ln -s /usr/lib/ctracer/ctracer_jprobe.c .
make CLASS=sock
insmod ctracer.ko
# do some networking activity or just move the mouse :-)
dmesg # or tail /var/log/messages
rmmod ctracer

Now to work on kahole, kill a hole, a tool that will reorganize structs to kill holes and that will provide the bits needed for struct “views”, i.e. to specify a list of fields in a struct that are of interest for collection at each probe point and that will also be used to create a userspace utility that will read the relay channel (/sys/kernel/ctracer0) and generate what ostra-cg, the callgraph tool from the OSTRA days needs to generate things such as this session collecting struct task_struct methods calls in the Linux kernel, ah, ostra-cg also produces this methods statistics and from this page you get plottings such as this one for sched_
fork calls
, workload is forgotten right now, but as soon as I get my new test machine we’ll have more interesting graphics :-)

5th dwarf fighting for attention

After all it will be eclipsed by the 6th, and then by the long cherished 7th, the one that will trigger the 1.0 pahole (btw, the 1st dwarf!) release. So, without too much ado, ctracer, the aforementioned current dwarf, is making great inroads into being something useful to more than one person, perhaps, but anyway, I’m getting short on time, my wife is almost ready for us to enjoy the current weather here in Curitiba, after our beloved cocker spaniel got his nail cut way too short, bleeding :-(, well, continue reading from here

The 5th dwarf

Have to cook the turkey, so this one will be brief, have been working on a new tool, that ultimately will provide what ostra was capable of, ctracer, the new tool generates a kernel module that after built and loaded provides these preliminary results. Now to work on a not so lame relaying module for the trace information, such as the one in blktrace.

Pahole fighting non network bloat

Andrew Morton merged the two patches I’ve submitted for saving bytes in struct inode and struct mm_struct, great! I’ve also did some more work on pahole based on ideas by Bernhard Fischer, that also provided a patch fixing the –help command line option.

On the DCCP front we’re going thru some growing pains, with CCID3 getting bugs fixed but in the process seemingly having its performance dropped, in the next few days we’ll surely get it to a better state, but its nice to have more people involved and actively working on fixing bugs, implementing missing functionality and discussing the patches that are continuously being submitted, thanks guys!

Follow

Get every new post delivered to your Inbox.