Thursday, January 1, 2009

Kernel basics (1): the container_of macro

Short version:
This article explains how the container_of macro works and how to use
it (based on

Long version:
The container_of macro is defined in linux/kernel.h:

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:    the pointer to the member.
 * @type:   the type of the container struct this is embedded in.
 * @member: the name of the member within the struct.
 */
#define container_of(ptr, type, member) ({                      \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);        \
    (type *)( (char *)__mptr - offsetof(type,member) );})

Some fundamentals:
1. typeof:
typeof is a GNU C extension (a keyword that works like an operator, not a
macro). When compiling in a strict ISO C mode such as -ansi or -std=c99,
spell it __typeof__ instead. Its operand is either an expression or a type.

Expression: typeof(x[0](1))
where x is an array of pointers to functions. This yields the type of the
value returned by the function. The expression itself is not evaluated, so
the function is never called.

Type: typeof(int *)
This is the type of pointers to int.
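
The two operand forms can be tried out with a small sketch like the one below. The names twice, result_t, int_ptr_t and demo are made up for this illustration, and __builtin_types_compatible_p is a GCC/Clang builtin used here only to check the deduced type at compile time. It assumes a GNU C compiler (gcc or clang) in C11 mode or later.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical helper: returns its argument doubled, as a long. */
static long twice(int v) { return 2L * v; }

/* x is an array of pointers to functions taking int and returning long. */
static long (*x[1])(int) = { twice };

/* typeof with an expression operand. The call x[0](1) is not evaluated;
 * only its type (long) is taken. */
typedef __typeof__(x[0](1)) result_t;

/* typeof with a type operand: int_ptr_t is simply int *. */
typedef __typeof__(int *) int_ptr_t;

/* Compile-time checks that the deduced types are what we expect. */
static_assert(__builtin_types_compatible_p(result_t, long),
              "result_t should be long");
static_assert(__builtin_types_compatible_p(int_ptr_t, int *),
              "int_ptr_t should be int *");

/* Uses both typedefs: stores v through an int_ptr_t, then calls the
 * function through the array and returns the result as a result_t. */
static result_t demo(int v)
{
    int storage = 0;
    int_ptr_t p = &storage;

    *p = v;
    return x[0](*p);
}
```

Calling demo(21) goes through the function pointer in x[0] and returns twice the argument.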

2. offsetof(TYPE, MEMBER)
This is an ANSI C library macro defined in stddef.h. It evaluates to the
offset (in bytes) of a given member within a struct or union type.
Typical implementations:
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE*)0)->MEMBER)

#define offsetof(TYPE, MEMBER) \
((size_t) ((char *)&((TYPE*)(0))->MEMBER - (char *)0))

gcc has a builtin offsetof. To use it when it is available, do
#undef offsetof
#ifdef __compiler_offsetof
#define offsetof(TYPE,MEMBER) __compiler_offsetof(TYPE,MEMBER)
#else
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#endif
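
As a rough check that these definitions agree, the sketch below computes the same offset three ways. struct sample and MY_OFFSETOF are hypothetical names introduced only for this example. The null-pointer form is not guaranteed by ISO C, although common compilers accept it, so it is shown purely for comparison with the standard macro.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct used only for this illustration. */
struct sample {
    char tag;       /* first member, always at offset 0 */
    int  value;     /* offset depends on the padding the compiler inserts */
    char name[8];
};

/* Null-pointer style computation, as in the typical implementations
 * quoted above. Not guaranteed by ISO C; for comparison only. */
#define MY_OFFSETOF(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

/* Offset of 'value' according to the standard macro. */
static size_t value_offset_std(void)
{
    return offsetof(struct sample, value);
}

/* Offset of 'value' according to the null-pointer trick. */
static size_t value_offset_null(void)
{
    return MY_OFFSETOF(struct sample, value);
}

/* Offset of 'value' measured on a real object: the member address
 * minus the address of the start of the struct, in bytes. */
static size_t value_offset_obj(void)
{
    struct sample s;

    return (size_t)((char *)&s.value - (char *)&s);
}
```

All three functions should return the same number on a given platform. The exact value depends on alignment, so it is not hard-coded here.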

Now, at last, we come to the container_of macro itself. It is easy to
understand. The first line declares a pointer (__mptr) to the member's type
and initializes it with ptr. So we have two pointers (__mptr and ptr) holding
the same address, and the compiler complains if ptr is not a pointer to the
member's type.

The second line finds the address of the structure that contains what ptr
points to. The offsetof macro calculates the byte offset of member from the
start of type. Subtracting that offset from the address in __mptr (cast to
char * so that the arithmetic is done in bytes) gives the address of the
containing structure, which is then cast to type *.
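
The two steps can be seen side by side in the following sketch, which uses the macro and then repeats the same arithmetic by hand. struct inner, struct outer and both function names are hypothetical. The macro is spelled with __typeof__ so that it also compiles in strict ISO modes, and it still needs GCC or Clang because of the ({ ... }) statement expression.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

#define container_of(ptr, type, member) ({                          \
    const __typeof__( ((type *)0)->member ) *__mptr = (ptr);        \
    (type *)( (char *)__mptr - offsetof(type, member) );})

/* Hypothetical types for illustration only. */
struct inner {
    int id;
};

struct outer {
    int          header;
    struct inner in;      /* embedded member, not at offset 0 */
};

/* Given only a pointer to the embedded member, recover the outer struct. */
static struct outer *outer_from_inner(struct inner *p)
{
    return container_of(p, struct outer, in);
}

/* The same computation written out step by step. */
static struct outer *outer_from_inner_manual(struct inner *p)
{
    /* first line of the macro: a type-checked copy of the pointer */
    const struct inner *mptr = p;

    /* second line: step back by the member's byte offset */
    return (struct outer *)((char *)mptr - offsetof(struct outer, in));
}
```

Passing a pointer of the wrong type, for example an int *, makes the first line of the macro fail the compiler's pointer type check. That check is the only reason __mptr exists.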

How to use it?
The macro is useful for callbacks between different software layers, as the
following test program shows.
/* test code to illustrate use of Linux kernel container_of macro
 *
 * Copyright (c) 2008 Cliff Brake, BEC Systems LLC
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* This program illustrates how the container_of macro works.
 * The container_of macro is very useful in multi-layered
 * software systems where you have progressively more detailed
 * software layers. Below is an example of a bus layer,
 * and then a device layer where a number of different
 * devices might register with the bus.
 * The device registers itself with the bus subsystem, and
 * then the bus subsystem makes a callback into the device.
 * Normally if there are multiple devices registered, the
 * bus subsystem must store and pass a device structure
 * when making callbacks. With the container_of macro, this is
 * no longer necessary, and the bus subsystem only has to
 * know about one generic device structure, and does not need visibility
 * into lots of different device structures, or do tricks
 * by casting void pointers, etc. With the container_of macro
 * we can backcast from the generic data structure to the containing
 * data structure. This forces good separation of code in that
 * the bus layer cannot modify data structures that are specific
 * to the device layer.
 */

/*
 * (from Linux kernel source)
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:    the pointer to the member.
 * @type:   the type of the container struct this is embedded in.
 * @member: the name of the member within the struct.
 */
#define container_of(ptr, type, member) ({                      \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);        \
    (type *)( (char *)__mptr - offsetof(type,member) );})

#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

/* BUS layer code */

/* generic bus device structure */
struct bus_device
{
    int general_device_param_a;
    int general_device_param_b;
    void (*device_callback)(struct bus_device * bd);
};

/* the following is a global list of
 * devices that have registered with the
 * bus subsystem. Normally this would
 * be something like a dynamic linked list.
 */

struct bus_device * bd_list[5];

/* function to register a device with the bus */
void register_with_bus(struct bus_device * bd)
{
    /* since this example only deals with one
     * device, will put it in slot 0
     */
    bd_list[0] = bd;
}

void start_bus()
{
    size_t i;
    struct bus_device * bd;

    /* make callbacks to all devices on bus */
    for (i=0;i<sizeof(bd_list)/sizeof(bd_list[0]);i++) {
        bd = bd_list[i];
        if (!bd) continue;
        /* call device callback with generic
         * bus device structure
         */
        bd->device_callback(bd);
    }
}

/* device X specific code
 * this would normally be in a different module
 */

/* structure that holds device X specific stuff, as well as
 * generic bus_device structure
 */
struct device_x
{
    int device_x_specific_param_a;
    int device_x_specific_param_b;
    struct bus_device bd;
};

void device_x_callback(struct bus_device * bd)
{
    /* if we know the structure type that contains the bus_device structure,
     * we can extract a pointer to the containing structure using the container_of
     * macro
     */

    /*                                    ptr type             member */
    struct device_x * devx = container_of(bd, struct device_x, bd);

    /* the above statement expands to
     * struct device_x * devx = (
     * {
     * const typeof( ((struct device_x *)0)->bd ) *__mptr = (bd);
     * (struct device_x *)( (char *)__mptr - ((size_t) &((struct device_x *)0)->bd) );
     * }
     * );
     */

    printf("device_x_callback called!, device_x_specific_param_a = %i\n",
        devx->device_x_specific_param_a);
}

void device_x_init()
{
    /* dynamically allocate structures */
    struct device_x * devx = malloc(sizeof(*devx));
    memset(devx, 0, sizeof(*devx));

    /* set a parameter in the device_x structure so
     * we can test for this in the callback
     */
    devx->device_x_specific_param_a = 1001;

    /* set up callback function */
    devx->bd.device_callback = device_x_callback;

    /* we register the generic bus device structure
     * as the bus layer does not need to know
     * about the device_x structure. Note, the
     * devx structure is not stored anywhere, yet
     * its location is being preserved without
     * specifically passing it to the bus
     * layer.
     */
    register_with_bus(&devx->bd);
}


int main()
{
    /* test the above system */

    /* first, initialize device_x */
    device_x_init();

    /* now, start the bus. This should make
     * a callback into the device_x
     */
    start_bus();

    return 0;
}

/* when run, this program prints:
 * device_x_callback called!, device_x_specific_param_a = 1001
 */
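
The comment in the test program says the bus layer only needs to know about one generic structure even when several device types register. The program itself has only one device, so here is a condensed sketch of that situation with two device types. It is not part of the original test code. struct device_y, bus_poll and the int-returning callback are assumptions made for this illustration (the original callback returns void), chosen so that the result can be checked without printing.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

#define container_of(ptr, type, member) ({                          \
    const __typeof__( ((type *)0)->member ) *__mptr = (ptr);        \
    (type *)( (char *)__mptr - offsetof(type, member) );})

/* Generic structure: the only type the bus layer knows about. */
struct bus_device {
    int (*device_callback)(struct bus_device *bd);
};

/* Two hypothetical device types. The embedded bus_device sits at a
 * different offset in each, which container_of accounts for. */
struct device_x {
    int               x_param;
    struct bus_device bd;
};

struct device_y {
    struct bus_device bd;
    long              y_param;
};

/* Each callback knows its own container type and backcasts to it. */
static int device_x_callback(struct bus_device *bd)
{
    return container_of(bd, struct device_x, bd)->x_param;
}

static int device_y_callback(struct bus_device *bd)
{
    return (int)container_of(bd, struct device_y, bd)->y_param;
}

/* Bus side: invokes the callback without knowing the concrete type. */
static int bus_poll(struct bus_device *bd)
{
    assert(bd && bd->device_callback);
    return bd->device_callback(bd);
}
```

bus_poll receives only a struct bus_device pointer in both cases. Each callback recovers its own device structure from that pointer, so the bus code never needs a void pointer or a device-specific type.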

