Linux: The Intrusive Linked List Implementation
A linked list is a basic data structure widely used among developers to implement stack, queue, hash table, graph, etc. It is quite extensively used in the Linux kernel to implement process and memory management.
A linked list is a basic data structure widely used among developers to implement stack, queue, hash table, graph, etc. It is quite extensively used in the Linux kernel to implement process and memory management. In this blog, I will explain why the Linux kernel implements linked lists differently and how.
A generic Linked List node contains data and reference pointers pointing to the next or previous node.
struct node {
int data;
struct node *prev;
struct node *next;
}
The head
pointer points to the start of the linked list. A typical linked list implementation requires three basic routines -
1. initialize(struct node *head);
2. struct node * add_entry(struct node *head, struct node *new_node);
3. struct node * remove_entry(struct node *head);
initialize
: To initialize a linked list head.add_entry
: Add a new node at the start of the list.remove_entry
: Remove a node from the start of the list.
The problem with this approach
Let us say I want to create another linked list struct student
and use these routines for list operations.
Will it work? The answer is - NO! Why?
We will have to re-implement all these three functions again for struct student
. The link pointers (prev
/next
) in struct node
are pointers to the node itself and can't be made to point to any other struct. In simpler words, the above initialize
, add_entry
,remove_entry
functions can't be used to create another linked list of struct student
as it only accepts struct node
pointer.
Intrusive Implementation: The Linux way
The idea is to make the link pointers independent of the struct node
it is pointing to. So what if we isolate link pointers in a separate struct list_head
and add it to struct node
.
struct list_head {
struct list_head *prev;
struct list_head *next;
};
Now let us create the actual struct node
containing data and add struct list_head
to it.
struct node {
int data;
struct list_head list;
};
Let us create a list head pointing to the start of the list and initialize it.
struct list_head list;
void initialize(struct list_head *list) {
list->next = list;
list->prev = list;
}
...
initialize(&list);
Now, let us add a new node to the list using add_entry
function.
void add_entry(struct list_head *list, struct list_head *new) {
struct list_head *next;
next = list->next;
next->prev = new;
list->next = new;
new->prev = list;
new->next = list;
}
...
struct node A;
A.data = 1;
...
add_entry(&list, &(A.list));
Let us add another struct node
.
struct node B;
B.data = 2;
add_entry(&list, &(B.list));
Here, the link
pointers in a node points to the link
pointers in the newly added node and not the node itself. This makes the linked list independent of the actual data node being added.
You might be wondering if we have a reference for the list_head
in the struct node
but not the node itself. How do we access the data part of struct node
.
Let us say you have a pointer to the struct node
, would you be able to get the pointer to list_node
contained in it?
Going ahead we will learn how to retrieve the reference to struct node
given the reference to list_head
.
Introducing OFFSET_OF
Macro
#define OFFSET_OF(Type, Field) (unsinged long long) ((&(((Type*)0)->Field)))
offset_of_list = OFFSET_OF(struct node, list);
This macro calculates the offset of a field Field
in a struct Type
. For example, let us say you have a reference to a struct node
, you can calculate the address of struct list
contained in it by &(head->list)
.
Consider head
to be a pointer to a struct node
with the starting address as 0x0
, then &(head->list)
will be the offset of the field list
.
Macro simplified
((Type *)0x0)
: Takes the integer0x0
the and casts it to the pointer of the typeType
.(((Type*)0x0)->Field)
: Dereferences that pointer to point to structure memberField
.(&(((Type*)0x0)->Field))
: Computes the address ofField
, which effectively becomes the offset of theField
as well.
Now that you can retrieve the address of a field within a struct. Can we do the reverse? Let us say we have the address of a field list
in struct node
, Can I compute the address of struct node
itself?
Yes, All we need to do is subtract the offset of the list
from the address of list
. This can be done by a beautiful macro.
The CONTAINER_OF
Macro
#define CONTAINER_OF(ptr, Type, Field) \
(Type*)((unsigned long long)(ptr) - OFFSET_OF(Type, Field))
That's how using the macros and the pointer tracking infra that Linux provides, we can access all the linked list nodes without maintaining a direct pointer to it...
Discussion