Posts Tagged ‘kernel’

PrintK Format Specifiers

Saturday, June 5th, 2010

As a kernel developer you’ll probably find yourself treating the ‘printk’ function as a drop-in replacement for the ‘printf’ function provided by any useful C library such as uclibc or glibc – after all, its usage is virtually the same. It was for this reason that I found myself naively surprised when reading the source for the kernel’s implementation: it offers many more features than the typical C library’s implementation. As I was unable to find any useful documentation on this, I thought I’d provide a brief overview.

Let’s start with the typical ‘%p’ type format specifier – we usually use it for printing the address of a pointer. However if you take a peek at the ‘pointer’ function in lib/vsprintf.c you’ll notice that you can further specify the pointer type to print additional information. We’ll look at some examples.

printk("%pF %pf\n", ptr, ptr) will print:
module_start+0x0/0x62 [hello] module_start

So where ptr is a function pointer, the %pF and %pf format specifiers will print the symbolic name of the function with or without an offset respectively. In order to make use of this you need to ensure your kernel is compiled with CONFIG_KALLSYMS enabled, which adds a symbol table to the kernel.

How about this one:

printk("%pM %pm\n", mac, mac) will print:
2c:00:1d:00:1b:00 2c001d001b00 

So where mac refers to a MAC address, the %pM and %pm format specifiers will nicely print the MAC address with or without colons between bytes.
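To make the two output styles concrete, here is a minimal userspace sketch that imitates them. It is not the kernel’s implementation: format_mac is a hypothetical helper written purely for illustration, built on the standard sprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace imitation of the kernel's %pM / %pm output.
 * with_colons != 0 mimics %pM, with_colons == 0 mimics %pm.
 * Returns the number of characters written (excluding the NUL).
 */
size_t format_mac(char *buf, size_t len, const unsigned char mac[6],
                  int with_colons)
{
    char *p = buf;
    int i;

    /* Worst case is "xx:xx:xx:xx:xx:xx" plus the terminating NUL. */
    assert(len >= 18);

    for (i = 0; i < 6; i++) {
        p += sprintf(p, "%02x", (unsigned int)mac[i]);
        if (with_colons && i < 5)
            *p++ = ':';
    }
    *p = '\0';

    return strlen(buf);
}
```

Calling format_mac with the bytes 2c 00 1d 00 1b 00 fills the buffer with the same two strings shown above, depending on the with_colons argument.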

And finally:

printk("%pI4 %pi4\n", ip, ip) will print:
127.0.0.1 127.000.000.001

So where ip refers to an IP address, the %pI and %pi format specifiers will nicely print the IP address. The 4 suffix specifies the address is an IPv4 address – the 6 suffix for IPv6 address could also be used instead. In the case of IPv4 addresses the difference between an upper and lower case ‘I’ determines if leading zeros should be used (only in the most recent of kernels). In the case of IPv6 addresses the capitalization determines if colons are used or not.
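The IPv4 case can be imitated in userspace in the same spirit. Again this is only a sketch and not the kernel’s code: format_ipv4 is a hypothetical helper, and it takes a pointer to the four address bytes because the kernel specifiers are extensions of %p and so are handed a pointer rather than a value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace imitation of the kernel's %pI4 / %pi4 output.
 * leading_zeros == 0 mimics %pI4, leading_zeros != 0 mimics %pi4.
 * Returns the number of characters written (excluding the NUL).
 */
size_t format_ipv4(char *buf, size_t len, const unsigned char ip[4],
                   int leading_zeros)
{
    /* Worst case is "255.255.255.255" plus the terminating NUL. */
    assert(len >= 16);

    if (leading_zeros)
        snprintf(buf, len, "%03u.%03u.%03u.%03u",
                 (unsigned int)ip[0], (unsigned int)ip[1],
                 (unsigned int)ip[2], (unsigned int)ip[3]);
    else
        snprintf(buf, len, "%u.%u.%u.%u",
                 (unsigned int)ip[0], (unsigned int)ip[1],
                 (unsigned int)ip[2], (unsigned int)ip[3]);

    return strlen(buf);
}
```

With the bytes 127, 0, 0, 1 this produces 127.0.0.1 or 127.000.000.001 depending on the leading_zeros argument.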

So if you find yourself writing a network driver, debugging something with function pointers or wondering why stack traces don’t contain symbols then these format specifiers may come in useful. For more information, and the full extent of the extended format specifiers (there are more), the best place to look is the code. Happy Coding.

Init Call Mechanism in the Linux Kernel

Monday, November 17th, 2008

The Linux Kernel has for a long time (at least since v2.1.23) contained a clever and well optimised mechanism for calling initialisation code in drivers. It’s clever because its functionality is largely abstracted from the driver developer and well optimised because after initialisation, memory containing the initialisation code is released. This post explores how the mechanism works.

We’ll start by seeing how driver developers make use of this functionality; the following code has come from v2.6.27.6/drivers/net/smc911x.c and is the driver for a common Ethernet chipset.

static int __init smc911x_init(void)
{
        return platform_driver_register(&smc911x_driver);
}
...
module_init(smc911x_init);

The smc911x_init function can be considered as the entry point into the driver – of particular interest is the __init macro and the static declaration. The __init macro is used to describe the function as only being required during initialisation time. Once initialisation is performed the kernel will remove this function and release its memory. The module_init macro is used to tell the kernel where the initialisation entry point to the module lives, i.e. what function to call at ‘start of day’. In a typical driver you will often see many helper functions marked with the __init macro, but only a single module_init declaration.

Even though we are expecting the kernel to call smc911x_init at ‘start of day’ we have marked it as static and that is OK (we will see later how the function is called). This is a particular strength of the init call mechanism as it reduces the number of public symbols and reduces the coupling between driver modules and other parts of the kernel.

The optimisation provided by the init call mechanism also provides a means for recovering memory used by initialisation data. Such data can be ‘tagged’ with the __initdata macro.

With the above code in place, at an appropriate time during start-up, the kernel will call the smc911x_init function and once it has been executed its memory will be released. You can see this in the output from kernel boot (e.g. dmesg); for example an x86 machine may print the following:

Freeing unused kernel memory: 386k freed

This means that 386k of memory that previously contained initialisation code and data has now been freed.

OK, so we’ve seen how the mechanism is used; let’s now take a closer look and see how it works under the hood. A quick ‘grep’ reveals that the __init macro is defined in include/linux/init.h:

#define __init      __section(.init.text) __cold

And the __section and __cold macros are defined in the include/linux/compiler*.h files:

compiler.h:      #define __section(S)  __attribute__ ((__section__(#S)))
compiler-gcc4.h: #define __cold        __attribute__ ((cold))

And when we expand it out we get:

#define __init __attribute__((__section__(".init.text"))) __attribute__ ((cold))

Thus, when the __init macro is used a number of GCC attributes are added to the function declaration. In the case of a different compiler, the compiler.h file will ensure the macros expand out to whatever is necessary for the relevant compiler. The cold attribute is a relatively new GCC attribute and has existed since GCC 4.3. Its purpose is to mark the function as one that is rarely used, which results in the compiler optimising the function for size instead of speed. What we are really interested in here is the ‘section’ attribute. The __init macro uses this attribute to tell the compiler to put the text for this function in a special section named “.init.text”. The purpose here is to put all initialisation functions in a single ELF section so that they can be removed as a block after initialisation has been performed.
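The effect of the section attribute can be observed from userspace with a small sketch. This is not kernel code and makes several assumptions: it needs GCC or Clang targeting ELF with a GNU-compatible linker, which provides __start_ and __stop_ symbols for sections whose names are valid C identifiers. The my_init macro, the my_init_text section name and the helper functions are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace analogue of __init: custom section plus cold. */
#define my_init __attribute__((section("my_init_text"), cold))

/* Provided by GNU ld (and compatible linkers) for the section above. */
extern char __start_my_init_text[];
extern char __stop_my_init_text[];

/* Placed in my_init_text, much as an __init function lands in .init.text. */
static int my_init setup_once(void)
{
    return 42;
}

/* An ordinary function, left in the default text section for contrast. */
int regular_work(void)
{
    return 1;
}

/* Returns non-zero if fn lives between the section's start and end labels. */
int in_init_section(int (*fn)(void))
{
    uintptr_t addr = (uintptr_t)fn;

    assert(fn != 0);
    return addr >= (uintptr_t)__start_my_init_text &&
           addr < (uintptr_t)__stop_my_init_text;
}
```

Here in_init_section(setup_once) should report true and in_init_section(regular_work) false, which is the property the kernel relies on: everything tagged this way sits in one contiguous range that can be dealt with as a block.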

So what does module_init do? Its exact functionality depends on whether the module in question is built-in or compiled as a loadable module. For the purpose of this post, we’ll just be looking at the built-in modules. Back to include/linux/init.h:

#define module_init(x) __initcall(x);
#define __initcall(fn) device_initcall(fn)
#define device_initcall(fn) __define_initcall("6", fn, 6)
#define __define_initcall(level, fn, id) \
           static initcall_t __initcall_##fn##id __used \
           __attribute__ ((__section__(".initcall" level ".init"))) = fn

So another load of macros that result in yet another GCC attribute!

#define module_init(x) static initcall_t __initcall_x6 __used \
                       __attribute__ ((__section__(".initcall6.init"))) = x;

And for clarity, let’s expand out the module_init macro as seen in our ethernet driver:

static initcall_t __initcall_smc911x_init6 __used
                  __attribute__ ((__section__(".initcall6.init"))) = smc911x_init;

So module_init in the context of a built-in driver results in declaring a uniquely named function pointer that points to our point of entry. In addition the macro ensures the function pointer is located in a special section of the ELF – we’ll see why shortly.
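The two preprocessor tricks these macros rely on, token pasting with ## and concatenation of adjacent string literals, can be checked in isolation. The following is only an illustrative sketch: INITCALL_SECTION, INITCALL_NAME and STRINGIFY are hypothetical helper macros and do not exist in the kernel headers.

```c
#include <assert.h>
#include <string.h>

/* Adjacent string literals are joined by the compiler into one string. */
#define INITCALL_SECTION(level) ".initcall" level ".init"

/*
 * ## pastes tokens into a single identifier; STRINGIFY then turns that
 * identifier into a string so that it can be inspected at run time.
 */
#define STRINGIFY(x) #x
#define INITCALL_NAME(fn, id) STRINGIFY(__initcall_##fn##id)

/* Returns 1 if both expansions match what was worked out by hand. */
int check_expansion(void)
{
    assert(strcmp(INITCALL_SECTION("6"), ".initcall6.init") == 0);
    assert(strcmp(INITCALL_NAME(smc911x_init, 6),
                  "__initcall_smc911x_init6") == 0);
    return 1;
}
```

Running gcc -E over a file that uses the real macros is another way of seeing the same expansion.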

So at present we have ensured that all our initialisation code is stored in the .init.text section (with data tagged __initdata in a similar .init.data section), and that each module has a function pointer for its point of entry, which has a unique name and is also stored in a special section of the resulting ELF. In addition, at link time the include/asm-generic/vmlinux.lds.h and arch/*/kernel/vmlinux.lds.S scripts ensure that labels/symbols surround the start and end of these sections, i.e. __early_initcall_end and __initcall_end mark the start and end of the function pointers, and __init_begin and __init_end mark the start and end of the memory holding the init sections such as .init.text.

Finally we are in place to see how these functions get called and how they are eventually freed. During kernel start up a function called do_initcalls in init/main.c is called; the start of it is shown below.

static void __init do_initcalls(void)
{
        initcall_t *call;

        for (call = __early_initcall_end; call < __initcall_end; call++)
                do_one_initcall(*call);

        ...

The purpose of this loop is to execute each of the init functions as set up by the module_init macros. This is achieved with a simple for loop over an array of function pointers. Initially the ‘call’ pointer is pointed at the label at the start of our function pointers’ ELF section, and it is incremented (by the size of a function pointer, i.e. sizeof(initcall_t)) until the end of the ELF section is reached. At each step the function pointer it points to is invoked and the init function is thus executed.
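The whole mechanism can be reproduced in miniature in userspace. The sketch below is an analogue, not the kernel’s code, and it assumes GCC or Clang on an ELF platform with a GNU-compatible linker, which defines __start_my_initcalls and __stop_my_initcalls automatically because the section name is a valid C identifier (the kernel gets its labels from its own linker scripts instead). The names my_module_init, my_initcalls, driver_a_init, driver_b_init and run_initcalls are all invented for the example.

```c
#include <assert.h>
#include <stdio.h>

typedef int (*initcall_t)(void);

/* Hypothetical analogue of module_init(): store a pointer to fn in a
 * dedicated section so that it can be found later without a public symbol. */
#define my_module_init(fn) \
    static initcall_t initcall_ptr_##fn \
    __attribute__((used, section("my_initcalls"))) = fn

/* Start and end labels, generated by the linker for this section name. */
extern initcall_t __start_my_initcalls[];
extern initcall_t __stop_my_initcalls[];

static int calls_made;

/* Two static "driver" entry points; nothing references them by name. */
static int driver_a_init(void)
{
    calls_made++;
    puts("driver_a_init");
    return 0;
}

static int driver_b_init(void)
{
    calls_made++;
    puts("driver_b_init");
    return 0;
}

my_module_init(driver_a_init);
my_module_init(driver_b_init);

/* Analogue of do_initcalls(): walk the section and call each entry.
 * Returns the number of init functions that were invoked. */
int run_initcalls(void)
{
    initcall_t *call;
    int n = 0;

    for (call = __start_my_initcalls; call < __stop_my_initcalls; call++) {
        assert(*call != NULL);
        (*call)();
        n++;
    }
    return n;
}
```

Calling run_initcalls() once from main invokes both static functions, even though neither is visible outside the file. The order in which the two entries appear within the section is left to the toolchain in this sketch, whereas the kernel controls ordering between levels through the numbered .initcallN.init sections.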

Once initialisation is complete, a function found in the architecture specific code named free_initmem is used to release the memory pages taken up by the initialisation functions and data. The exact nature of the function depends on the architecture.

So in a nutshell the kernel makes clever use of GCC attributes to ensure that initialisation functions and pointers to them are stored in unique sections of the ELF. Initialisation code at kernel start up then iterates through these function pointers and executes them in turn. Finally once all init code has been executed the entire ELF section (.init.text) is freed for use!

(This article can now also be read in the Linux Gazette)

GCC Weak Symbols

Saturday, October 18th, 2008

GNU’s GCC has a useful (and perhaps not very well known) feature known as ‘weak symbols’. I first discovered this a while back when building a Linux kernel. Unbeknown to me, the Linux kernel makes great use of weak symbols, yet the compiler I used did not correctly support them. Rather than failing to build, the kernel built fine and even ran – I was instead presented with a number of interesting bugs, but more on this later.

In a nutshell weak symbols permit you to define a symbol that doesn’t need to be resolved at link time, i.e. it allows you to tell the compiler that this function may not have a body and that is OK. Furthermore, if the linker later comes across another symbol with the same name that doesn’t have the weak attribute, the weak symbol will be overridden by the stronger symbol (without a multiple definition linker error). And finally you can also use the symbol to determine, at run-time, whether such a body exists.
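Two of these behaviours can be seen in a single file. The sketch below assumes GCC or Clang on an ELF platform; optional_hook, arch_setup and run_optional_hook are hypothetical names chosen for the example. Demonstrating the override itself needs a second object file containing a non-weak arch_setup, which is left out here to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Weak declaration with no definition anywhere: the program still links,
 * and the function's address evaluates to NULL at run time. */
extern int optional_hook(void) __attribute__((weak));

/* Weak default definition: used unless some other object file supplies
 * a non-weak arch_setup(), in which case the linker picks that one. */
__attribute__((weak)) int arch_setup(void)
{
    return 0; /* generic default behaviour */
}

/* Calls the hook only if a body exists; returns -1 when it is absent. */
int run_optional_hook(void)
{
    if (optional_hook)
        return optional_hook();
    return -1;
}
```

Built on its own, arch_setup() returns the generic default of 0 and run_optional_hook() returns -1 because no body for optional_hook was linked in.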

To give you an example of its use let’s refer back to my original bug…

v2.6.27/arch/sh/kernel/cpu/clock.c
void __init __attribute__ ((weak))
arch_init_clk_ops(struct clk_ops **ops, int type)
{
}

This function is part of the architecture specific (SH) code for setting up the various clocks of the device. The function defined above is used to return a structure of clock operations (struct clk_ops) which is later used to register the clock within the kernel. As you can see the function is declared with a weak symbol via the “weak” attribute. Therefore, when built correctly, the function can be overridden.

The design of this part of the kernel is such that generic clock operations are defined in clock.c and can be later overridden via weak symbols by implementations for specific CPU subtypes – for example this function is overridden in the clock-sh7712.c file…

v2.6.27/arch/sh/kernel/cpu/clock-sh7712.c
void __init arch_init_clk_ops(struct clk_ops **ops, int idx)
{
...

The function hasn’t been defined as a weak symbol and so will override the weak symbol. In this case the function will provide the caller with the clock operations specific to the SH7712. In this manner the existing generic clock support code has been designed such that it can be easily extended to support future SH subtypes. Likewise weak symbols are used elsewhere in the kernel (since 2.4.0) for similar effect.

Whilst my version of GCC claimed to support weak symbols there was a known GCC bug that prevented this from working correctly. I found that the code would only work correctly if the weak arch_init_clk_ops function had code in its body. What was happening was that the compiler was optimising out the function altogether (with the -O2 optimisation GCC flag), which resulted in the non-weak symbol not being called. (There is a quick hack to fix this, which is to use the -fno-unit-at-a-time flag; however, this is expected to be removed from GCC in the future.)

It’s always worth looking at the “/Documentation/Changes” file included with the kernel; it contains a list of the tools required and the minimal version of each tool. Just because the kernel builds doesn’t mean that it has built in the way intended by the Linux contributors!

References:

GCC Function Attributes (gnu.org)
GCC Help Mailing List Archive – Discussing weak symbols and optimisation (gnu.org)
Further Discussion of this bug in KGDB – Here (osdir.com) and Here (lkml.org)