1. The tip tree handbook
1.1. What is the tip tree?
The tip tree is a collection of several subsystems and areas of development. The tip tree is both a direct development tree and an aggregation tree for several sub-maintainer trees. The tip tree gitweb URL is: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
The tip tree contains the following subsystems:
x86 architecture
The x86 architecture development takes place in the tip tree except for the x86 KVM and XEN specific parts which are maintained in the corresponding subsystems and routed directly to mainline from there. It’s still good practice to Cc the x86 maintainers on x86-specific KVM and XEN patches.
Some x86 subsystems have their own maintainers in addition to the overall x86 maintainers. Please Cc the overall x86 maintainers on patches touching files in arch/x86 even when they are not called out by the MAINTAINERS file.
Note that x86@kernel.org is not a mailing list. It is merely a mail alias which distributes mails to the x86 top-level maintainer team. Please always Cc the Linux Kernel mailing list (LKML) linux-kernel@vger.kernel.org, otherwise your mail ends up only in the private inboxes of the maintainers.
Scheduler
Scheduler development takes place in the -tip tree, in the sched/core branch - with occasional sub-topic trees for work-in-progress patch-sets.
Locking and atomics
Locking development (including atomics and other synchronization primitives that are connected to locking) takes place in the -tip tree, in the locking/core branch - with occasional sub-topic trees for work-in-progress patch-sets.
Generic interrupt subsystem and interrupt chip drivers:
interrupt core development happens in the irq/core branch
interrupt chip driver development also happens in the irq/core branch, but the patches are usually applied in a separate maintainer tree and then aggregated into irq/core
Time, timers, timekeeping, NOHZ and related chip drivers:
timekeeping, clocksource core, NTP and alarmtimer development happens in the timers/core branch, but patches are usually applied in a separate maintainer tree and then aggregated into timers/core
clocksource/event driver development happens in the timers/core branch, but patches are mostly applied in a separate maintainer tree and then aggregated into timers/core
Performance counters core, architecture support and tooling:
perf core and architecture support development happens in the perf/core branch
perf tooling development happens in the perf tools maintainer tree and is aggregated into the tip tree.
CPU hotplug core
RAS core
Mostly x86-specific RAS patches are collected in the tip ras/core branch.
EFI core
EFI development takes place in the efi git tree. The collected patches are aggregated in the tip efi/core branch.
RCU
RCU development happens in the linux-rcu tree. The resulting changes are aggregated into the tip core/rcu branch.
Various core code components:
debugobjects
objtool
random bits and pieces
1.2. Patch submission notes
1.2.1. Selecting the tree/branch
In general, development against the head of the tip tree master branch is fine, but for the subsystems which are maintained separately, have their own git tree and are only aggregated into the tip tree, development should take place against the relevant subsystem tree or branch.
Bug fixes which target mainline should always be applicable against the mainline kernel tree. Potential conflicts against changes which are already queued in the tip tree are handled by the maintainers.
1.2.2. Patch subject
The tip tree preferred format for patch subject prefixes is ‘subsys/component:’, e.g. ‘x86/apic:’, ‘x86/mm/fault:’, ‘sched/fair:’, ‘genirq/core:’. Please do not use file names or complete file paths as prefix. ‘git log path/to/file’ should give you a reasonable hint in most cases.
The condensed patch description in the subject line should start with an uppercase letter and should be written in imperative tone.
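As a rough illustration of the two rules above, the following sketch checks the shape of a subject line. The helper tip_subject_looks_ok() is hypothetical and not part of any kernel tooling. It only checks for a prefix without file names and for an uppercase first letter of the description. It cannot judge whether the description is written in imperative tone.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of any kernel tooling: check that a
 * subject has a 'subsys/component:' style prefix followed by a
 * description which starts with an uppercase letter.
 */
static bool tip_subject_looks_ok(const char *subject)
{
	const char *colon = strchr(subject, ':');
	const char *p;

	/* A non-empty prefix terminated by ": " is required */
	if (!colon || colon == subject || colon[1] != ' ')
		return false;

	/* Reject whitespace and file names like 'apic.c' in the prefix */
	for (p = subject; p < colon; p++) {
		if (isspace((unsigned char)*p) || *p == '.')
			return false;
	}

	/* The condensed description starts with an uppercase letter */
	return isupper((unsigned char)colon[2]) != 0;
}
```

With this sketch, 'x86/apic: Fix foo' passes, while 'arch/x86/kernel/apic/apic.c: Fix foo' and 'x86/apic: fix foo' are rejected.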
1.2.3. Changelog
The general rules about changelogs in the process documentation, see Documentation/process/, apply.
The tip tree maintainers place value on following these rules, especially on the request to write changelogs in imperative mood and not impersonating code or the execution of it. This is not just a whim of the maintainers. Changelogs written in abstract words are more precise and tend to be less confusing than those written in the form of novels.
It’s also useful to structure the changelog into several paragraphs and not lump everything together into a single one. A good structure is to explain the context, the problem and the solution in separate paragraphs and in this order.
Examples for illustration:
Example 1:
x86/intel_rdt/mbm: Fix MBM overflow handler during hot cpu

When a CPU is dying, we cancel the worker and schedule a new worker on a
different CPU on the same domain. But if the timer is already about to
expire (say 0.99s) then we essentially double the interval.

We modify the hot cpu handling to cancel the delayed work on the dying
cpu and run the worker immediately on a different cpu in same domain. We
donot flush the worker because the MBM overflow worker reschedules the
worker on same CPU and scans the domain->cpu_mask to get the domain
pointer.

Improved version:

x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug

When a CPU is dying, the overflow worker is canceled and rescheduled on a
different CPU in the same domain. But if the timer is already about to
expire this essentially doubles the interval which might result in a non
detected overflow.

Cancel the overflow worker and reschedule it immediately on a different CPU
in the same domain. The work could be flushed as well, but that would
reschedule it on the same CPU.

Example 2:

time: POSIX CPU timers: Ensure that variable is initialized

If cpu_timer_sample_group returns -EINVAL, it will not have written into
*sample. Checking for cpu_timer_sample_group's return value precludes the
potential use of an uninitialized value of now in the following block.

Given an invalid clock_idx, the previous code could otherwise overwrite
*oldval in an undefined manner. This is now prevented. We also exploit
short-circuiting of && to sample the timer only if the result will
actually be used to update *oldval.

Improved version:

posix-cpu-timers: Make set_process_cpu_timer() more robust

Because the return value of cpu_timer_sample_group() is not checked,
compilers and static checkers can legitimately warn about a potential use
of the uninitialized variable 'now'. This is not a runtime issue as all
call sites hand in valid clock ids.

Also cpu_timer_sample_group() is invoked unconditionally even when the
result is not used because *oldval is NULL.

Make the invocation conditional and check the return value.

Example 3:

The entity can also be used for other purposes.

Let's rename it to be more generic.

Improved version:

The entity can also be used for other purposes.

Rename it to be more generic.

For complex scenarios, especially race conditions and memory ordering issues, it is valuable to depict the scenario with a table which shows the parallelism and the temporal order of events. Here is an example:
CPU0                            CPU1
free_irq(X)                     interrupt X
                                spin_lock(desc->lock)
                                wake irq thread()
                                spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()
release_resources()             thread_handler()
spin_unlock(desc->lock)           access released resources.
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^
synchronize_irq()
Lockdep provides similar useful output to depict a possible deadlock scenario:
CPU0                                    CPU1
rtmutex_lock(&rcu->rt_mutex)
  spin_lock(&rcu->rt_mutex.wait_lock)
                                        local_irq_disable()
                                        spin_lock(&timer->it_lock)
                                        spin_lock(&rcu->mutex.wait_lock)
--> Interrupt
    spin_lock(&timer->it_lock)
1.2.4. Function references in changelogs
When a function is mentioned in the changelog, either the text body or the subject line, please use the format ‘function_name()’. Omitting the brackets after the function name can be ambiguous:
Subject: subsys/component: Make reservation_count static
reservation_count is only used in reservation_stats. Make it static.
The variant with brackets is more precise:
Subject: subsys/component: Make reservation_count() static
reservation_count() is only called from reservation_stats(). Make it
static.
1.2.5. Backtraces in changelogs
1.2.7. Links to documentation
Providing links to documentation in the changelog is a great help to later debugging and analysis. Unfortunately, URLs often break very quickly because companies restructure their websites frequently. Non-‘volatile’ exceptions include the Intel SDM and the AMD APM.
Therefore, for ‘volatile’ documents, please create an entry in the kernel bugzilla https://bugzilla.kernel.org and attach a copy of these documents to the bugzilla entry. Finally, provide the URL of the bugzilla entry in the changelog.
1.2.8. Patch resend or reminders
1.2.9. Merge window
Please do not expect large patch series to be handled during the merge window or even during the week before. Such patches should be submitted in mergeable state at least a week before the merge window opens. Exceptions are made for bug fixes and sometimes for small standalone drivers for new hardware or minimally invasive patches for hardware enablement.
During the merge window, the maintainers instead focus on following the upstream changes, fixing merge window fallout, collecting bug fixes, and allowing themselves a breath. Please respect that.
The release candidate -rc1 is the starting point for new patches to be applied which are targeted for the next merge window.
1.2.10. Git
The tip maintainers accept git pull requests from maintainers who provide subsystem changes for aggregation in the tip tree.
Pull requests for new patch submissions are usually not accepted and do not replace proper patch submission to the mailing list. The main reason for this is that the review workflow is email based.
If you submit a larger patch series it is helpful to provide a git branch in a private repository which allows interested people to easily pull the series for testing. The usual way to offer this is a git URL in the cover letter of the patch series.
1.2.11. Testing
Code should be tested before submitting to the tip maintainers. Anything other than minor changes should be built, booted and tested with comprehensive (and heavyweight) kernel debugging options enabled.
These debugging options can be found in kernel/configs/x86_debug.config and can be added to an existing kernel config by running:
make x86_debug.config
Some of these options are x86-specific and can be left out when testing on other architectures.
1.3. Coding style notes
1.3.2. Documenting locking requirements
Documenting locking requirements is a good thing, but comments are not necessarily the best choice. Instead of writing:
/* Caller must hold foo->lock */
void func(struct foo *foo)
{
	...
}
Please use:
void func(struct foo *foo)
{
	lockdep_assert_held(&foo->lock);
	...
}
In PROVE_LOCKING kernels, lockdep_assert_held() emits a warning if the caller doesn’t hold the lock. Comments can’t do that.
1.3.3. Bracket rules
Brackets should be omitted only if the statement which follows ‘if’, ‘for’, ‘while’ etc. is truly a single line:
if (foo)
	do_something();
The following is not considered to be a single line statement even though C does not require brackets:
for (i = 0; i < end; i++)
	if (foo[i])
		do_something(foo[i]);
Adding brackets around the outer loop enhances the reading flow:
for (i = 0; i < end; i++) {
	if (foo[i])
		do_something(foo[i]);
}
1.3.4. Variable declarations
The preferred ordering of variable declarations at the beginning of a function is reverse fir tree order:
struct long_struct_name *descriptive_name;
unsigned long foo, bar;
unsigned int tmp;
int ret;
The above is faster to parse than the reverse ordering:
int ret;
unsigned int tmp;
unsigned long foo, bar;
struct long_struct_name *descriptive_name;
And even more so than random ordering:
unsigned long foo, bar;
int ret;
struct long_struct_name *descriptive_name;
unsigned int tmp;
Also please try to aggregate variables of the same type into a single line. There is no point in wasting screen space:
unsigned long a;
unsigned long b;
unsigned long c;
unsigned long d;
It’s really sufficient to do:
unsigned long a, b, c, d;
Please also refrain from introducing line splits in variable declarations:
struct long_struct_name *descriptive_name = container_of(bar,
                                                  struct long_struct_name,
                                                  member);
struct foobar foo;
It’s way better to move the initialization to a separate line after the declarations:
struct long_struct_name *descriptive_name;
struct foobar foo;

descriptive_name = container_of(bar, struct long_struct_name, member);
1.3.5. Variable types
Please use the proper u8, u16, u32, u64 types for variables which are meant to describe hardware or are used as arguments for functions which access hardware. These types clearly define the bit width and avoid truncation, expansion and 32/64-bit confusion.
u64 is also recommended in code which would become ambiguous for 32-bit kernels when ‘unsigned long’ would be used instead. While in such situations ‘unsigned long long’ could be used as well, u64 is shorter and also clearly shows that the operation is required to be 64 bits wide independent of the target CPU.
Please use ‘unsigned int’ instead of ‘unsigned’.
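A minimal sketch of why the fixed-width types matter, written for userspace so it can be compiled standalone. The typedefs stand in for the kernel's u32 and u64, and counter_combine() is a hypothetical helper which assembles a 64-bit hardware counter from two 32-bit register reads:

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's fixed-width types */
typedef uint32_t u32;
typedef uint64_t u64;

/*
 * Hypothetical example: combine the two 32-bit halves of a 64-bit
 * hardware counter. With 'unsigned long' instead of u64 the shift by 32
 * would be undefined where 'unsigned long' is only 32 bits wide.
 */
static u64 counter_combine(u32 hi, u32 lo)
{
	return ((u64)hi << 32) | lo;
}
```

The u64 return type makes it obvious that the operation is 64 bits wide on every target, which is the point made above.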
1.3.6. Constants
Please do not use literal (hexa)decimal numbers in code or initializers. Either use proper defines which have descriptive names or consider using an enum.
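As an illustration, the following sketch replaces magic numbers with descriptive defines and an enum. The register layout and all FOO_ and foo_ names are made up for this example and do not describe real hardware:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t u32;	/* Userspace stand-in for the kernel type */

/* Hypothetical control register bits, for illustration only */
#define FOO_CTRL_ENABLE		0x00000001U
#define FOO_CTRL_IRQ_MASKED	0x00000002U

enum foo_mode {
	FOO_MODE_IDLE,
	FOO_MODE_ACTIVE,
};

/* Reads better than 'return ctrl & 0x1;' */
static bool foo_is_enabled(u32 ctrl)
{
	return ctrl & FOO_CTRL_ENABLE;
}

/* Reads better than 'return (ctrl & 0x1) ? 1 : 0;' */
static enum foo_mode foo_get_mode(u32 ctrl)
{
	return foo_is_enabled(ctrl) ? FOO_MODE_ACTIVE : FOO_MODE_IDLE;
}
```

A reader can understand both functions without looking up what bit 0 of the register means.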
1.3.7. Struct declarations and initializers
Struct declarations should align the struct member names in a tabular fashion:
struct bar_order {
	unsigned int	guest_id;
	int		ordered_item;
	struct menu	*menu;
};
Please avoid documenting struct members within the declaration, because this often results in strangely formatted comments and the struct members become obfuscated:
struct bar_order {
	unsigned int	guest_id; /* Unique guest id */
	int		ordered_item;
	/* Pointer to a menu instance which contains all the drinks */
	struct menu	*menu;
};
Instead, please consider using the kernel-doc format in a comment preceding the struct declaration, which is easier to read and has the added advantage of including the information in the kernel documentation, for example, as follows:
/**
 * struct bar_order - Description of a bar order
 * @guest_id:		Unique guest id
 * @ordered_item:	The item number from the menu
 * @menu:		Pointer to the menu from which the item
 *			was ordered
 *
 * Supplementary information for using the struct.
 *
 * Note, that the struct member descriptors above are arranged
 * in a tabular fashion.
 */
struct bar_order {
	unsigned int	guest_id;
	int		ordered_item;
	struct menu	*menu;
};
Static struct initializers must use C99 initializers and should also be aligned in a tabular fashion:
static struct foo statfoo = {
	.a		= 0,
	.plain_integer	= CONSTANT_DEFINE_OR_ENUM,
	.bar		= &statbar,
};
Note that while C99 syntax allows the omission of the final comma, we recommend the use of a comma on the last line because it makes reordering and addition of new lines easier, and makes such future patches slightly easier to read as well.
1.3.8. Line breaks
Restricting line length to 80 characters makes deeply indented code hard to read. Consider breaking out code into helper functions to avoid excessive line breaking.
The 80 character rule is not a strict rule, so please use common sense when breaking lines. Especially format strings should never be broken up.
When splitting function declarations or function calls, please align the first argument in the second line with the first argument in the first line:
static int long_function_name(struct foobar *barfoo, unsigned int id,
			      unsigned int offset)
{
	if (!id) {
		ret = longer_function_name(barfoo, DEFAULT_BARFOO_ID,
					   offset);
	...
1.3.9. Namespaces
Function/variable namespaces improve readability and allow easy grepping. These namespaces are string prefixes for globally visible function and variable names, including inlines. These prefixes should combine the subsystem and the component name such as ‘x86_comp_’, ‘sched_’, ‘irq_’, and ‘mutex_’.
This also includes static file scope functions that are immediately put into globally visible driver templates - it’s useful for those symbols to carry a good prefix as well, for backtrace readability.
Namespace prefixes may be omitted for local static functions and variables. Truly local functions, only called by other local functions, can have shorter descriptive names - our primary concern is greppability and backtrace readability.
Please note that ‘xxx_vendor_’ and ‘vendor_xxx_’ prefixes are not helpful for static functions in vendor-specific files. After all, it is already clear that the code is vendor-specific. In addition, vendor names should only be used for truly vendor-specific functionality.
As always apply common sense and aim for consistency and readability.
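The prefix rules can be sketched as follows. The subsystem ‘foo’, its component ‘cache’ and both functions are hypothetical and only serve to show which symbols carry the prefix:

```c
/* Hypothetical subsystem 'foo', component 'cache' */
struct foo_cache {
	unsigned int	nr_entries;
};

/* Truly local helper: a short descriptive name is sufficient */
static unsigned int grow(unsigned int nr)
{
	return nr ? nr * 2 : 1;
}

/* Globally visible: carries the 'foo_cache_' prefix for grepping */
void foo_cache_expand(struct foo_cache *cache)
{
	cache->nr_entries = grow(cache->nr_entries);
}
```

Grepping for ‘foo_cache_’ finds every globally visible entry point of the component, while grow() stays short because it is only ever called from within this file.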
1.4. Commit notifications
The tip tree is monitored by a bot for new commits. The bot sends an email for each new commit to a dedicated mailing list (linux-tip-commits@vger.kernel.org) and Cc’s all people who are mentioned in one of the commit tags. It uses the email message ID from the Link tag at the end of the tag list to set the In-Reply-To email header so the message is properly threaded with the patch submission email.
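The lookup of the message ID can be pictured roughly as follows. This is not the bot's code. It is a sketch which assumes that the Link tag has the form ‘Link: <URL>’ and that the message ID is the part of the URL after the final ‘/’. The function name last_link_msgid() is made up for this example:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the bot's implementation: find the last line
 * starting with "Link:" in a commit message and return a pointer to the
 * message ID, assumed to be the URL part after the final '/'. The length
 * of the ID is stored in *len. Returns NULL if no usable tag is found.
 */
static const char *last_link_msgid(const char *msg, size_t *len)
{
	const char *tag = NULL, *p = msg, *end, *id;

	/* Remember the last line which starts with "Link:" */
	while (p && *p) {
		if (!strncmp(p, "Link:", 5))
			tag = p;
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	if (!tag)
		return NULL;

	/* Walk back from the end of the line to the final '/' */
	end = tag + strcspn(tag, "\n");
	for (id = end; id > tag && id[-1] != '/'; id--)
		;
	if (id == tag)
		return NULL;

	*len = end - id;
	return id;
}
```

The returned string would then be wrapped in angle brackets to form the In-Reply-To header value. How the real bot handles malformed or missing Link tags is not covered by this sketch.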
The tip maintainers and submaintainers try to reply to the submitter when merging a patch, but they sometimes forget or it does not fit the workflow of the moment. While the bot message is purely mechanical, it also implies a ‘Thank you! Applied.’.
1.3.1. Comment style
Sentences in comments start with an uppercase letter.
Single line comments:
Multi-line comments:
No tail comments:
Comment the important things:
Function documentation comments: