Kindle Notes & Highlights
Started reading
February 18, 2022
Despite receiving my first bug report five minutes later, I was hooked on wanting to do as much as I could to make this operating system the best it could possibly be.
The Linux kernel remains a large and complex body of code, however, and would-be kernel hackers need an entry point where they can approach the code without being overwhelmed by complexity. Often, device drivers provide that gateway.
They are distinct “black boxes” that make a particular piece of hardware respond to a well-defined internal programming interface; they hide completely the details of how the device works.
Though it may appear strange to say that a driver is “flexible,” we like this word because it emphasizes that the role of a device driver is providing mechanism, not policy.
Most programming problems can indeed be split into two parts: “what capabilities are to be provided” (the mechanism) and “how those capabilities can be used” (the policy). If the two issues are addressed by different parts of the program, or even by different programs altogether, the software package is much easier to develop and to adapt to particular needs.
More generally, the kernel’s process management activity implements the abstraction of several processes on top of a single CPU or a few of them.
The computer’s memory is a major resource, and the policy used to deal with it is a critical one for system performance.
Networking must be managed by the operating system, because most network operations are not specific to a process: incoming packets are asynchronous events. The packets must be collected, identified, and dispatched before a process takes care of them.
A module is said to belong to a specific class according to the functionality it offers.
Each module usually implements one of these types, and thus is classifiable as a char module, a block module, or a network module.
Good programmers, nonetheless, usually create a different module for each new functionality they implement, because decomposition is a key element of scalability and extendability.
Such an entity is not a device driver, in that there’s no explicit device associated with the way the information is laid down; the filesystem type is instead a software driver, because it maps the low-level data structures to high-level data structures.
Be careful with uninitialized memory; any memory obtained from the kernel should be zeroed or otherwise initialized before being made available to a user process or device.
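The zero-before-exposing rule can be illustrated with a small user-space sketch. The helper names `alloc_zeroed_buffer` and `scrub_buffer` are hypothetical and exist only for illustration; in kernel code the analogous calls would be `kzalloc`, or `memset` on memory obtained from `kmalloc`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: hand out a buffer that never carries stale data. */
unsigned char *alloc_zeroed_buffer(size_t len)
{
    /* calloc returns zero-filled memory, or NULL on failure. */
    return calloc(len, 1);
}

/* Hypothetical helper: scrub a reused buffer before handing it out again. */
void scrub_buffer(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, 0, len);
}
```

Either path guarantees that whoever receives the buffer next cannot read leftovers from a previous user.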
It is also possible, with 2.2 and later kernels, to disable the loading of kernel modules after system boot via the capability mechanism.
Under Unix, the kernel executes in the highest level (also called supervisor mode), where everything is allowed, whereas applications execute in the lowest level (the so-called user mode), where the processor regulates direct access to hardware and unauthorized access to memory.
Kernel code does not run in such a simple world, and even the simplest kernel modules must be written with the idea that many things can be happening at once.
If you do not write your code with concurrency in mind, it will be subject to catastrophic failures that can be exceedingly difficult to debug.
The result is an architecture-dependent mechanism that, usually, hides a pointer to the task_struct structure on the kernel stack.
The command name stored in current->comm is the base name of the program file (trimmed to 15 characters if need be) that is being executed by the current process.
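As a rough user-space illustration of that trimming, the sketch below derives a comm-style name from a program path. `make_comm` and `COMM_LEN` are hypothetical names; the value 16 is assumed to mirror the kernel's `TASK_COMM_LEN` (15 characters plus the terminating NUL). This is not how the kernel itself fills in `current->comm`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16   /* assumed to mirror TASK_COMM_LEN: 15 chars + NUL */

/* Hypothetical helper: copy the base name of `path` into `out`, trimmed. */
void make_comm(const char *path, char out[COMM_LEN])
{
    const char *base;

    assert(path != NULL && out != NULL);

    /* The base name is whatever follows the last '/', if there is one. */
    base = strrchr(path, '/');
    base = base ? base + 1 : path;

    /* snprintf writes at most COMM_LEN - 1 characters and NUL-terminates. */
    snprintf(out, COMM_LEN, "%s", base);
}
```

With this sketch, `/usr/bin/gnome-terminal-server` yields `gnome-terminal-`, while `/bin/ls` yields `ls` unchanged.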
The kernel, instead, has a very small stack; it can be as small as a single, 4096-byte page. Your functions must share that stack with the entire kernel-space call chain. Thus, it is never a good idea to declare large automatic variables; if you need larger structures, you should allocate them dynamically at call time.
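A user-space sketch of the same habit follows: the scratch buffer is larger than a 4096-byte stack, so it is allocated rather than declared as an automatic variable. `sum_scratch` and `SCRATCH_SIZE` are hypothetical; kernel code would use `kmalloc` and `kfree` where this sketch uses `malloc` and `free`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 8192   /* deliberately larger than a 4096-byte stack */

/*
 * Hypothetical helper: fills a scratch buffer with `fill` and returns the
 * sum of its bytes through *sum_out. Returns 0 on success, -1 on failure.
 */
int sum_scratch(unsigned char fill, unsigned long *sum_out)
{
    unsigned char *scratch;
    unsigned long sum = 0;
    size_t i;

    assert(sum_out != NULL);

    /* Avoid `unsigned char scratch[SCRATCH_SIZE];` here: allocate instead. */
    scratch = malloc(SCRATCH_SIZE);
    if (!scratch)
        return -1;

    memset(scratch, fill, SCRATCH_SIZE);
    for (i = 0; i < SCRATCH_SIZE; i++)
        sum += scratch[i];

    free(scratch);
    *sum_out = sum;
    return 0;
}
```

The cost is one extra failure path to handle, which is usually a fair trade for not overrunning a stack shared with the rest of the call chain.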
Essentially, the double underscore says to the programmer: “If you call this function, be sure you know what you are doing.”
Kernel code cannot do floating point arithmetic. Enabling floating point would require that the kernel save and restore the floating point processor’s state on each entry to, and exit from, kernel space — at least, on some architectures. Given that there really is no need for floating point in kernel code, the extra overhead is not worthwhile.
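When a fractional quantity does come up, integer fixed-point arithmetic is one common substitute. The sketch below is an assumption of this note, not something the excerpt prescribes; the `fx_` names and the 16.16 format are hypothetical choices.

```c
#include <assert.h>
#include <stdint.h>

#define FX_ONE 65536            /* 1.0 in 16.16 fixed point */

typedef int32_t fx_t;

/* Convert a whole number to fixed point. */
fx_t fx_from_int(int32_t v)
{
    return (fx_t)((int64_t)v * FX_ONE);
}

/* Build the fixed-point value num/den, truncating toward zero. */
fx_t fx_from_ratio(int32_t num, int32_t den)
{
    assert(den != 0);
    return (fx_t)(((int64_t)num * FX_ONE) / den);
}

/* Multiply two fixed-point values using a 64-bit intermediate. */
fx_t fx_mul(fx_t a, fx_t b)
{
    return (fx_t)(((int64_t)a * b) / FX_ONE);
}

/* Drop the fractional part, truncating toward zero. */
int32_t fx_to_int(fx_t v)
{
    return v / FX_ONE;
}
```

For instance, scaling 10 by 3/2 is `fx_to_int(fx_mul(fx_from_int(10), fx_from_ratio(3, 2)))`, which evaluates to 15 without touching the floating point unit. Results outside the 32-bit range are not handled in this sketch.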
The files found in the Documentation/kbuild directory in the kernel source are required reading for anybody wanting to understand all that is really going on beneath the surface.
One of the steps in the build process is to link your module against a file (called vermagic.o) from the current kernel tree; this object contains a fair amount of information about the kernel the module was built for, including the target kernel version, compiler version, and the settings of a number of important configuration variables.
Version dependency should, however, not clutter driver code with hairy #ifdef conditionals; the best way to deal with incompatibilities is by confining them to a specific header file.
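One way to picture that confinement is sketched below in user-space C. All `DEMO_` names are hypothetical stand-ins: `DEMO_KERNEL_VERSION` copies the classic form of the `KERNEL_VERSION` macro from `<linux/version.h>`, and `DEMO_VERSION_CODE` plays the role of `LINUX_VERSION_CODE` for an assumed 2.6.10 build target. The feature flag is illustrative only.

```c
#include <assert.h>

/* Stand-in for KERNEL_VERSION (classic definition). */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Stand-in for LINUX_VERSION_CODE; assumed build target is 2.6.10. */
#ifndef DEMO_VERSION_CODE
#define DEMO_VERSION_CODE DEMO_KERNEL_VERSION(2, 6, 10)
#endif

/* ---- This part would live in one compatibility header. ---- */
#if DEMO_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 6, 0)
#define DEMO_HAS_MODULE_PARAM 1
#else
#define DEMO_HAS_MODULE_PARAM 0
#endif

/* ---- Driver code tests one flag instead of repeating version math. ---- */
const char *demo_param_style(void)
{
    return DEMO_HAS_MODULE_PARAM ? "module_param" : "MODULE_PARM";
}

/* Runtime equivalent of the macro, with the byte-sized fields checked. */
unsigned int demo_version_code(unsigned int a, unsigned int b, unsigned int c)
{
    assert(b < 256 && c < 256);
    return DEMO_KERNEL_VERSION(a, b, c);
}
```

The driver proper then contains no version comparisons at all; supporting another kernel means editing only the compatibility section.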
Unlike application developers, who must link their code with precompiled libraries and stick to conventions on parameter passing, kernel developers can dedicate some processor registers to specific roles, and they have done so.
As a general rule, distributing things in source form is an easier way to make your way in the world.
For example, the video-for-linux set of drivers is split into a generic module that exports symbols used by lower-level device drivers for specific hardware.
A relatively recent convention in kernel code, however, is to put these declarations at the end of the file.
By facility, we mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.
Initialization functions should be declared static, since they are not meant to be visible outside the specific file; there is no hard rule about this, though, as no function is exported to the rest of the kernel unless explicitly requested.
Use of __init and __initdata is optional, but it is worth the trouble.
The use of module_init is mandatory. This macro adds a special section to the module’s object code stating where the module’s initialization function is to be found. Without this definition, your initialization function is never called.
Most registration functions are prefixed with register_,
If your module is built directly into the kernel, or if your kernel is configured to disallow the unloading of modules, functions marked __exit are simply discarded.
Once again, the module_exit declaration is necessary to enable the kernel to find your cleanup function.
If your module does not define a cleanup function, the kernel does not allow it to be unloaded.
Linux doesn’t keep a per-module registry of facilities that have been registered, so the module must back out of everything itself if initialization fails at some point. If you ever fail to unregister what you obtained, the kernel is left in an unstable state; it contains internal pointers to code that no longer exists.
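One common idiom for backing out is a chain of `goto` labels that undoes completed steps in reverse order. The user-space sketch below simulates it; `demo_register`, `demo_unregister`, the `registered` flags, and the `fail_at` switch are all hypothetical and stand in for real kernel registration calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical facilities: each flag records whether it is registered. */
bool registered[3];
int fail_at = -1;   /* index whose registration should fail; -1 for none */

static int demo_register(int idx)
{
    assert(idx >= 0 && idx < 3);
    if (idx == fail_at)
        return -1;              /* simulated registration failure */
    registered[idx] = true;
    return 0;
}

static void demo_unregister(int idx)
{
    assert(registered[idx]);    /* only undo what was actually done */
    registered[idx] = false;
}

/* Registers three facilities; on failure, unwinds what already succeeded. */
int demo_init(void)
{
    int err;

    err = demo_register(0);
    if (err)
        goto fail_first;
    err = demo_register(1);
    if (err)
        goto fail_second;
    err = demo_register(2);
    if (err)
        goto fail_third;
    return 0;

fail_third:
    demo_unregister(1);
fail_second:
    demo_unregister(0);
fail_first:
    return err;
}

/* Cleanup mirrors a fully successful init, in reverse order. */
void demo_exit(void)
{
    demo_unregister(2);
    demo_unregister(1);
    demo_unregister(0);
}
```

Setting `fail_at = 2` before calling `demo_init` makes the third step fail, and the function returns with nothing left registered.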
If you are not careful in how you write your initialization function, you can create situations that can compromise the stability of the system as a whole.
The first is that you should always remember that some other part of the kernel can make use of any facility you register immediately after that registration has completed. It is entirely possible, in other words, that the kernel will make calls into your module while your initialization function is still running.
parameters, the module must make them available. Parameters are declared with the module_param macro, which is defined in moduleparam.h.
If you really need a type that does not appear in the list above, there are hooks in the module code that allow you to define them; see moduleparam.h for details on how to do that.
If perm is set to 0, there is no sysfs entry at all; otherwise, it appears under /sys/module with the given set of permissions.
Note that if a parameter is changed by sysfs, the value of that parameter as seen by your module changes, but your module is not notified in any other way. You should probably not make module parameters writable, unless you are prepared to detect the change and react accordingly.
Your code should, of course, never make any assumptions about the internal organization of device numbers; it should, instead, make use of a set of macros found in <linux/kdev_t.h>.
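To show what those macros hide, here is a user-space sketch with stand-ins named `DEMO_MAJOR`, `DEMO_MINOR`, and `DEMO_MKDEV`. The split of 12 major bits and 20 minor bits is an assumption based on the 2.6-era layout; real driver code would include `<linux/kdev_t.h>` and use `MAJOR`, `MINOR`, and `MKDEV` rather than hard-coding any of this.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_dev_t;    /* stand-in for the kernel's dev_t */

#define DEMO_MINORBITS 20       /* assumed 2.6-era layout */
#define DEMO_MINORMASK ((1U << DEMO_MINORBITS) - 1)

#define DEMO_MAJOR(dev)    ((unsigned int)((dev) >> DEMO_MINORBITS))
#define DEMO_MINOR(dev)    ((unsigned int)((dev) & DEMO_MINORMASK))
#define DEMO_MKDEV(ma, mi) (((demo_dev_t)(ma) << DEMO_MINORBITS) | (mi))

/* Hypothetical checked constructor around the packing macro. */
demo_dev_t demo_make_dev(unsigned int major, unsigned int minor)
{
    assert(major < (1U << 12));         /* must fit in the major field */
    assert(minor <= DEMO_MINORMASK);    /* must fit in the minor field */
    return DEMO_MKDEV(major, minor);
}
```

Callers that only go through the macros keep working if the bit split changes again, which is the point of the advice above.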
One assumes that the wider range will be sufficient for quite some time, but the computing field is littered with erroneous assumptions of that nature.
Often, however, you will not know which major numbers your device will use; there is a constant effort within the Linux kernel development community to move over to the use of dynamically allocated device numbers. The kernel will happily allocate a major number for you on the fly, but you must request this allocation by using a different function:
The usual place to call unregister_chrdev_region would be in your module’s cleanup function.
Before a user-space program can access one of those device numbers, your driver needs to connect them to its internal functions that implement the device’s operations.
Some major device numbers are statically assigned to the most common devices. A list of those devices can be found in Documentation/devices.txt within the kernel source tree.