A debugging session

The following is the start of a gdb session in which the kernel runs under gdb from the moment it boots. It breaks at the top of start_kernel() and steps one line at a time through the initial kernel startup.
              
GNU gdb 4.17.0.11 with Linux support
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb)  att 1
Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 1
0x1009f791 in __kill ()
(gdb)  b start_kernel
Breakpoint 1 at 0x100ddf83: file init/main.c, line 515.
(gdb)  c
Continuing.

Breakpoint 1, start_kernel () at init/main.c:515
515             printk(linux_banner);
(gdb)  n
516             setup_arch(&command_line);
(gdb) 
517             printk("Kernel command line: %s\n", saved_command_line);
(gdb) 
518             parse_options(command_line);
(gdb) 
519             trap_init();
(gdb) 
520             init_IRQ();
(gdb) 
521             sched_init();
(gdb) 
522             time_init();
(gdb) 
523             softirq_init();
(gdb) 
530             console_init();

            
This is tiring, so I just let it continue booting.
              
(gdb)  c
Continuing.

            
It's booted, so I ^C it to see what it thinks is up.
              
Program received signal SIGINT, Interrupt.
0x100a4bc1 in __libc_nanosleep ()
(gdb)  bt
#0  0x100a4bc1 in __libc_nanosleep ()
#1  0x100a4b7d in __sleep (seconds=10) at ../sysdeps/unix/sysv/linux/sleep.c:78
#2  0x10095fbf in do_idle () at process_kern.c:424
#3  0x10096052 in cpu_idle () at process_kern.c:450
#4  0x100de0a4 in start_kernel () at init/main.c:593
#5  0x10098df2 in start_kernel_proc (unused=0x0) at um_arch.c:72
#6  0x1009858f in signal_tramp (arg=0x10098db8) at trap_user.c:50
(gdb) 

            
It's busy sleeping in the idle loop. I'll set a breakpoint in the scheduler and pick it up on the next context switch.
              
(gdb)  b schedule
Breakpoint 2 at 0x10004acd: file sched.c, line 496.
(gdb)  c
Continuing.

Breakpoint 2, schedule () at sched.c:496
496             if (!current->active_mm) BUG();
(gdb)  bt
#0  schedule () at sched.c:496
#1  0x10095fb3 in do_idle () at process_kern.c:421
#2  0x10096052 in cpu_idle () at process_kern.c:450
#3  0x100de0a4 in start_kernel () at init/main.c:593
#4  0x10098df2 in start_kernel_proc (unused=0x0) at um_arch.c:72
#5  0x1009858f in signal_tramp (arg=0x10098db8) at trap_user.c:50

            
Here we are in the scheduler. I'll 'next' through the first few lines of the scheduler, get bored, and set a breakpoint in the SIGIO interrupt handler.
              
(gdb)  n
497             if (tq_scheduler)
(gdb) 
501             prev = current;
(gdb) 
502             this_cpu = prev->processor;
(gdb) 
504             if (in_interrupt())
(gdb) 
510             if (softirq_state[this_cpu].active & softirq_state[this_cpu].mask)
(gdb) 
518             sched_data = & aligned_data[this_cpu].schedule_data;
(gdb) 
520             spin_lock_irq(&runqueue_lock);
(gdb)  b sigio_handler
Breakpoint 3 at 0x10094fdc: file irq_user.c, line 36.
(gdb)  c
Continuing.

Breakpoint 2, schedule () at sched.c:496
496             if (!current->active_mm) BUG();

            
Oops, the scheduler breakpoint fired first: that process scheduled back to the idle thread. I'll get rid of that breakpoint and continue again.
              
(gdb)  d 2
(gdb)  c

            
Well, the SIGIO handler is waiting for something and nothing is happening by itself. So, I'll type something at one of the login prompts to wake it up.
              
Breakpoint 3, sigio_handler (sig=29) at irq_user.c:36
36              user_mode = set_user_thread(NULL, 0, 0);
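
The breakpoint fires with sig=29, which is SIGIO on i386 Linux. The transcript does not show how the kernel arranges for that signal to arrive, so the following is only a minimal sketch of the generic host-side mechanism, assuming a Linux host: a process can ask to be sent SIGIO whenever a file descriptor has activity. The names enable_sigio, on_sigio and sigio_seen are hypothetical and do not come from the UML source.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Number of SIGIO deliveries seen so far. */
static volatile sig_atomic_t sigio_count;

/* Hypothetical handler: only counts deliveries. */
static void on_sigio(int sig)
{
        (void)sig;
        sigio_count++;
}

/*
 * Hypothetical helper (an assumption, not UML code): ask the host kernel
 * to send SIGIO to this process whenever fd has activity.
 * Returns 0 on success, -1 on failure.
 */
int enable_sigio(int fd)
{
        struct sigaction sa;
        int flags;

        assert(fd >= 0);

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_sigio;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGIO, &sa, NULL) < 0)
                return -1;

        /* Direct the signal for this descriptor at our own process. */
        if (fcntl(fd, F_SETOWN, getpid()) < 0)
                return -1;

        /* Turn on signal-driven I/O without disturbing the other flags. */
        flags = fcntl(fd, F_GETFL);
        if (flags < 0)
                return -1;
        return fcntl(fd, F_SETFL, flags | O_ASYNC) < 0 ? -1 : 0;
}

/* Report how many SIGIO signals have been delivered. */
int sigio_seen(void)
{
        return sigio_count;
}
```

Calling enable_sigio() on the read end of a pipe and then writing a byte to the other end should make the counter go up, much as typing at a login prompt wakes the handler in the session above.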

            
That did the trick. I'll climb down the call chain into the actual driver interrupt handler, starting with a breakpoint in do_IRQ.
              
(gdb)  l
31              struct irq_fd *irq_fd;
32              struct timeval tv;
33              fd_set fds;
34              int i, n, user_mode;
35
36              user_mode = set_user_thread(NULL, 0, 0);
37              if(user_mode){
38                      fill_in_regs(process_state(NULL, NULL, NULL), &sig + 1);
39                      change_sig(SIGUSR1, 1);
40              }
(gdb)  l
41              fds = active_fd_mask;
42              tv.tv_sec = 0;
43              tv.tv_usec = 0;
44              if((n = select(max_fd + 1, &fds, NULL, NULL, &tv)) < 0){
45                      printk("sigio_handler : select returned %d, "
46                             "errno = %d\n", n, errno);
47                      return;
48              }
49              for(i=0;i<=max_fd;i++){
50                      if(FD_ISSET(i, &fds)) FD_CLR(i, &active_fd_mask);
(gdb)  l
51              }
52              for(irq_fd=active_fds;irq_fd != NULL;irq_fd = irq_fd->next){
53                      if(FD_ISSET(irq_fd->fd, &fds)) do_IRQ(irq_fd->irq, user_mode);
54              }
55              if(user_mode){
56                      interrupt_end();
57                      change_sig(SIGUSR1, 0);
58              }
59              set_user_thread(NULL, user_mode, 0);
60      }
(gdb)  b do_IRQ
Breakpoint 4 at 0x10094960: file irq.c, line 266.
(gdb)  c
Continuing.
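
Lines 41 to 48 of the listing above poll the registered descriptors with a zero timeout, so select() reports what is ready without ever blocking the handler. A minimal standalone sketch of that pattern follows. The helper name poll_ready_fds is hypothetical, and the sketch leaves out everything else the real handler does.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the UML source): report which descriptors
 * in `watched` are readable right now, without blocking.
 * Returns the number of ready descriptors, or -1 on error.
 */
int poll_ready_fds(const fd_set *watched, int max_fd, fd_set *ready)
{
        struct timeval tv;

        assert(watched != NULL && ready != NULL);

        *ready = *watched;      /* select() overwrites the set it is given */
        tv.tv_sec = 0;          /* zero timeout: return immediately */
        tv.tv_usec = 0;
        return select(max_fd + 1, ready, NULL, NULL, &tv);
}
```

The caller keeps its own copy of the watched set because select() modifies the set passed to it, which is why the listing copies active_fd_mask into fds first.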

            
From here, I'll go into handle_IRQ_event.
              
Breakpoint 4, do_IRQ (irq=2, user_mode=0) at irq.c:266
266             irq_desc_t *desc = irq_desc + irq;
(gdb)  n
271             regs.user_mode = user_mode;
(gdb) 
272             kstat.irqs[cpu][irq]++;
(gdb) 
274             desc->handler->ack(irq);
(gdb) 
279             status = desc->status & ~(IRQ_REPLAY | IRQ_WAITING);
(gdb) 
280             status |= IRQ_PENDING; /* we _want_ to handle it */
(gdb) 
286             action = NULL;
(gdb) 
287             if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
(gdb)  l
282             /*
283              * If the IRQ is disabled for whatever reason, we cannot
284              * use the action we have.
285              */
286             action = NULL;
287             if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
288                     action = desc->action;
289                     status &= ~IRQ_PENDING; /* we commit to handling */
290                     status |= IRQ_INPROGRESS; /* we are handling it */
291             }
(gdb)  l
292             desc->status = status;
293
294             /*
295              * If there is no IRQ handler or it was disabled, exit early.
296                Since we set PENDING, if another processor is handling
297                a different instance of this same irq, the other processor
298                will take care of it.
299              */
300             if (!action)
301                     goto out;
(gdb)  l
302
303             /*
304              * Edge triggered interrupts need to remember
305              * pending events.
306              * This applies to any hw interrupts that allow a second
307              * instance of the same irq to arrive while we are in do_IRQ
308              * or in the handler. But the code here only handles the _second_
309              * instance of the irq, not the third or fourth. So it is mostly
310              * useful for irq hardware that does not mask cleanly in an
311              * SMP environment.
(gdb)  l
312              */
313             for (;;) {
314                     spin_unlock(&desc->lock);
315                     handle_IRQ_event(irq, &regs, action);
316                     spin_lock(&desc->lock);
317
318                     if (!(desc->status & IRQ_PENDING))
319                             break;
320                     desc->status &= ~IRQ_PENDING;
321             }
(gdb)  b 315
Breakpoint 5 at 0x100949b7: file irq.c, line 315.
(gdb)  c
Continuing.
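
The status handling in lines 279 to 321 of the listing above is easier to follow in isolation. The sketch below is a simplified model of it and not the kernel code: the flag values, struct irq_line, dispatch_irq and reraise_once are all assumptions made for illustration, and locking and the ack call are left out. Clearing IRQ_INPROGRESS after the loop is likewise an assumption about code the listing does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values: assumptions, not the kernel's definitions. */
#define IRQ_INPROGRESS 1u
#define IRQ_DISABLED   2u
#define IRQ_PENDING    4u

/* Hypothetical stand-in for an IRQ descriptor. */
struct irq_line {
        unsigned int status;
        void (*handler)(struct irq_line *line);
        int handled;            /* how many times the handler has run */
};

/* Returns the number of times the handler was run. */
int dispatch_irq(struct irq_line *line)
{
        void (*handler)(struct irq_line *) = NULL;
        unsigned int status;
        int runs = 0;

        assert(line != NULL);

        status = line->status | IRQ_PENDING;    /* we want to handle it */
        if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) {
                handler = line->handler;
                status &= ~IRQ_PENDING;         /* we commit to handling */
                status |= IRQ_INPROGRESS;       /* we are handling it */
        }
        line->status = status;

        /* Disabled or already in progress: leave it marked pending. */
        if (handler == NULL)
                return 0;

        for (;;) {
                handler(line);
                runs++;
                /* Run again if another instance arrived in the meantime. */
                if (!(line->status & IRQ_PENDING))
                        break;
                line->status &= ~IRQ_PENDING;
        }
        line->status &= ~IRQ_INPROGRESS;
        return runs;
}

/* Example handler: simulates a second interrupt arriving during the first run. */
void reraise_once(struct irq_line *line)
{
        if (line->handled++ == 0)
                line->status |= IRQ_PENDING;
}
```

With reraise_once as the handler, a single call to dispatch_irq() runs the handler twice. On a disabled line it returns 0 and leaves IRQ_PENDING set.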

            
Next, I'll step into handle_IRQ_event and stop just before entering the driver.
              
Breakpoint 5, do_IRQ (irq=2, user_mode=0) at irq.c:315
315                     handle_IRQ_event(irq, &regs, action);
(gdb)  s
handle_IRQ_event (irq=2, regs=0x10113c40, action=0x50fef380) at irq.c:141
141             irq_enter(cpu, irq);
(gdb)  l
136                          struct irqaction * action)
137     {
138             int status;
139             int cpu = smp_processor_id();
140
141             irq_enter(cpu, irq);
142
143             status = 1;     /* Force the "do bottom halves" bit */
144
145             if (!(action->flags & SA_INTERRUPT))
(gdb)  l
146                     __sti();
147
148             do {
149                     status |= action->flags;
150                     action->handler(irq, action->dev_id, regs);
151                     action = action->next;
152             } while (action);
153             if (status & SA_SAMPLE_RANDOM)
154                     add_interrupt_randomness(irq);
155             __cli();
(gdb)  l
156
157             irq_exit(cpu, irq);
158
159             return status;
160     }
161
162     /*
163      * Generic enable/disable code: this just calls
164      * down into the PIC-specific version for the actual
165      * hardware disable after having gotten the irq
(gdb)  b 150
Breakpoint 6 at 0x10094813: file irq.c, line 150.
(gdb)  c
Continuing.

Breakpoint 6, handle_IRQ_event (irq=2, regs=0x10113c40, action=0x50fef380)
    at irq.c:150
150                     action->handler(irq, action->dev_id, regs);
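
The do/while loop at lines 148 to 152 of the listing walks a linked list of actions, so several handlers can share one IRQ. The sketch below models just that walk. struct action, run_actions and count_call are hypothetical names, and the sketch omits the interrupt enabling and disabling, the forced "do bottom halves" bit and the randomness sampling that surround the loop in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the action structure in the listing. */
struct action {
        void (*handler)(int irq, void *dev_id);
        unsigned long flags;
        void *dev_id;
        struct action *next;
};

/*
 * Call every handler registered on the chain, in order, and return the
 * OR of all their flags.
 */
unsigned long run_actions(int irq, struct action *action)
{
        unsigned long status = 0;

        assert(action != NULL);         /* the loop assumes at least one entry */

        do {
                status |= action->flags;
                action->handler(irq, action->dev_id);
                action = action->next;
        } while (action);

        return status;
}

/* Example handler: counts its calls through the int that dev_id points to. */
void count_call(int irq, void *dev_id)
{
        (void)irq;
        ++*(int *)dev_id;
}
```

Chaining two actions whose dev_id pointers refer to separate counters shows that each handler runs exactly once per call and that the returned status combines both sets of flags.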

            
One more step lands in the console driver's interrupt handler, con_handler. I think I've made whatever point I was making, so I'll delete all the breakpoints and let the kernel run so I can log in and halt it.
              
(gdb)  s
con_handler (irq=2, dev=0x10120000, unused=0x10113c40) at stdio_console.c:41
41              stdio_rcv_proc(term->fd);
(gdb) 
(gdb)  d
Delete all breakpoints? (y or n) y
(gdb)  c
Continuing.

            