- Documentation for KMSGDUMP v0.4 -
[Sun Sep 19 19:30:32 CEST 1999] - Willy Tarreau

1. What is KMSGDUMP ?
~~~~~~~~~~~~~~~~~~~~~

KMSGDUMP is an extension to the Linux kernel which lets the user at the
console dump the last kernel messages onto a floppy diskette, instead of
copying them by hand with pen and paper when the system is stuck.

Only 3.5", 1.44 MB diskettes are supported by default. Other capacities
might work, provided you change the geometry in the file "kmsgdump.h".

2. How does it work ?
~~~~~~~~~~~~~~~~~~~~~

There are two ways of getting a dump :
  - by pressing SysRQ+D (RightAlt - PrintScrn - D together) ;
  - after a kernel panic has occurred, a dump may be generated
    automatically.

Before anything else, you MUST KNOW that, to maximize the chances of
completing the dump successfully, the CPU is switched back to real mode
and disk accesses are made through the BIOS. This ensures that the dump
still has a chance to work even if kernel memory is badly corrupted, but
it also implies that once a dump has occurred, it is IMPOSSIBLE TO
CONTINUE TO WORK WITH THE CURRENT KERNEL. You will have to REBOOT. So
when your kernel still responds, you had better get a similar dump by
entering one of the following commands :

    # dmesg > /dev/fd0                  ( for RAW mode )
or
    # dmesg | mwrite a:messages.txt     ( for FAT mode )

Second, be aware that FLOPPY CONTENTS WILL BE LOST AFTER A DUMP. Even if
there are cases in which you can dump at the end of a diskette without
losing the beginning, assume that by default the beginning of the
diskette will be ERASED and that you won't be able to recover what was
on it. You have been warned.

3. Modes of operation
~~~~~~~~~~~~~~~~~~~~~

There are two modes of operation : manual and automatic.

Manual mode (or interactive mode) is always entered if you hit SysRQ+D.
It is also entered during a kernel panic if the current mode is set to
"manual". This mode is recommended for a developer's workstation, or for
a kernel running under an emulator such as vmware.
It's recommended to disable interactive mode on servers which may crash
when nobody is around to reboot them.

Automatic mode can only be entered during a kernel panic, and only if it
was configured beforehand.

Sometimes the system is in such a bad state that even kmsgdump can cause
recursive crashes (this has been reported to me once). For this reason
I've added a checkpoint mechanism to the code : every small part of the
code is checkpointed, and if a crash occurs again, the part that crashed
is not executed a second time, to prevent loops. This gives a better
chance of reaching the reset routine which will, in the worst case,
reboot the system rather than let it loop indefinitely.

3.1. Manual mode
~~~~~~~~~~~~~~~~

In manual mode, the screen is initialized to color 80x25 mode (bios
mode 3) with a blue background.

[Note: some people asked me to set other colors to avoid confusion with
another OS' BSOD, but I couldn't find good combinations. Even though
I've received an interesting comment about how to choose colors that are
readable on any color or monochrome display, I'm still waiting for
suggestions, and for the moment we'll say that these are the colors of
Midnight Commander and call this "BSOL" (blue screen of life) because
this one is interactive.]

The screen is divided into two portions. The upper one displays the
current status (kernel version, drive unit, printer, format...), and the
lower one the messages captured before switching to real mode.

The internal speaker beeps if no key has been hit within 3 seconds. This
is simply to get someone's attention, mainly in cases where no monitor
is connected to the PC.

Key presses are not case-sensitive. Keys used are :

  Up arrow   : scroll messages towards the beginning
  Down arrow : scroll messages towards the end
  B : immediately reBoot the system
  D : Dump messages onto the selected floppy with the selected format.
      Warning: no check is done beforehand, and the floppy will simply
      be overwritten by the messages.
  F : select Format, by switching between RAW and FAT12
  H : immediately Halt the system.
  I : display Information, a short help about the keys.
  P : Print messages on the currently selected printer. If you press
      this key by accident, wait about one minute for the bios routine
      to time out, and you'll hear the beeps again, meaning that the
      keyboard is usable again.
  T : select next available prinTer. The system tests whether a printer
      is connected at the other end of the cable, and skips the empty
      ports.
  U : change drive Unit. Although dumping to hard disks is possible,
      they are never offered in the interface, to avoid dramatic
      mistakes.

Other keys are simply ignored.

After a successful dump or print, 3 quick beeps are played. In case of
an error, only one beep is played. This is important if you work blindly
with a keyboard and no monitor.

3.2. Automatic mode
~~~~~~~~~~~~~~~~~~~

Automated operation is performed by the system only when a kernel panic
occurs. In this case, the system waits for the "panic_timeout" delay to
give you a few seconds to try SysRQ actions (sync, unmount filesystems,
...). This delay is configurable by writing a number of seconds to
"/proc/sys/kernel/panic". After that, the system is switched to real
mode and, depending on the mode of operation chosen, enters either the
interactive mode (see above) or the automatic mode, which is described
here.

3.2.1. Start of operation
~~~~~~~~~~~~~~~~~~~~~~~~~

Some checks are performed. First, the system checks whether the dump
feature is enabled. If not, operation ends (see below). If dump is
enabled, and if the "safe" flag is set, the diskette is verified to be a
real "KMSGDUMP" diskette and not some other one (read section 4 to know
how to prepare a secure diskette for KMSGDUMP). If the diskette is not a
valid one, operation ends. If the diskette is valid, or if the check has
been disabled, the dump is performed with the current parameters (unit,
format...).

3.2.2. End of operation
~~~~~~~~~~~~~~~~~~~~~~~

After completion of an automatic dump, or when a dump is aborted, the
system can either halt or reboot. With redundant servers, you may prefer
to halt a buggy system, because another one keeps the service running.
In other cases, you may prefer to reboot in order to restart services
quickly. This is also configurable (read section 4).

4. How a crash can be prepared
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1. Kernel options
~~~~~~~~~~~~~~~~~~~

First, choose the kernel compilation options which best match your
situation. This may seem obvious, but you can reduce the risk of a crash
by not enabling drivers designed for hardware you don't have. Especially
on servers, use only a reduced feature set, because you know exactly
what you need (eg: don't enable NTFS and QNX filesystems if you don't
need them).

Configure KMSGDUMP options to match your needs. Don't ask for an
automatic dump if you don't have a floppy drive. In this case, you might
prefer to enable interactive mode to display messages on the screen and
possibly print them.

When you use SCSI hard disks, you can sometimes reduce the reset time to
help the system recover faster. Eg: on my system, I have an AHA2940UW
which waits 15 seconds by default. All peripherals still work well with
1 second, so 14 seconds are saved.

If you have changed your messages buffer size (which is 16 kB by
default), you should adjust the size accordingly in
"include/asm/kmsgdump.h", parameter LOG_BUF_LEN. Some people needed
32 kB. But you shouldn't exceed 60 kB since the dump is done in real
mode (16 bits).

4.2. Configure KMSGDUMP
~~~~~~~~~~~~~~~~~~~~~~~

If your kernel supports SYSCTL, you can adjust KMSGDUMP parameters by
writing a string to /proc/sys/kernel/kmsgdump. This string is a
concatenation of flags. Most of them are plain booleans. For each
boolean, a complementary flag exists to avoid any ambiguous
interpretation.
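Since the control file only exists on kernels built with both SYSCTL and
KMSGDUMP, a script may want to check for it before writing. The
following is only a minimal sketch : the function name "kmsgdump_set"
and the "KMSGDUMP_CTL" override variable are hypothetical names
introduced for illustration and are not part of KMSGDUMP.

```shell
# Hypothetical helper, not part of KMSGDUMP itself.
# KMSGDUMP_CTL may be overridden (e.g. for testing); by default it
# points to the control file named in this document.
kmsgdump_set() {
    ctl=${KMSGDUMP_CTL:-/proc/sys/kernel/kmsgdump}

    # Refuse an empty flag string.
    if [ -z "$1" ]; then
        echo "kmsgdump_set: no flags given" >&2
        return 1
    fi

    # The file is missing or read-only when the kernel lacks support
    # or when the caller is not allowed to write to it.
    if [ ! -w "$ctl" ]; then
        echo "kmsgdump_set: $ctl is not writable" >&2
        return 1
    fi

    # Write the flag string exactly as "echo" would.
    printf '%s\n' "$1" > "$ctl"
}
```

It would be called with one flag string built from the flags listed in
this section, for example : kmsgdump_set "$flags"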
For the moment, the flags are :

  Name  Description       Default  Complement  Notes
  F     FAT mode          Yes      R
  R     Raw mode                   F
  A     Automatic mode    Yes      I
  I     Interactive mode           A
  B     Boot after dump   Yes      H           only used in automatic mode
  H     Halt after dump            B           only used in automatic mode
  S     Safe mode         Yes      O           only used in automatic mode
  O     Overwrite disk             S           only used in automatic mode
  E     Enable dumping    Yes      D           only used in automatic mode
  D     Disable dumping            E           only used in automatic mode
  Txxx  Track xxx         0        (N/A)       first track is 0 by default
  Uxxx  Unit xxx          0        (N/A)       bios drive is 0 (A:) by default

Note: default means "default if none specified".

Example: with the following command, a kernel panic will generate a dump
in FAT mode after verifying that the disk has been prepared for a dump,
and the system will then reboot :

    # echo "FABSE" > /proc/sys/kernel/kmsgdump

This one asks for a raw dump at the end of the diskette in drive B,
without checking the diskette first, followed by a halt :

    # echo "RAHOET79U1" > /proc/sys/kernel/kmsgdump

And this one asks for a quick reboot without any dump :

    # echo "DB" > /proc/sys/kernel/kmsgdump

4.3. Prepare a disk for kmsgdump
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If safe mode is set, then before an automatic dump the system reads the
beginning of the floppy in the drive and looks for the word "KMSGDUMP"
at offset 3 of the first sector. This is the label of the diskette. The
dump is only performed if this word is found as-is. So if you enable
safe mode, don't forget to prepare your diskettes with the following
command, provided your diskette is in drive A :

    # echo "012KMSGDUMP" > /dev/fd0

Please note that when the dump is performed in FAT mode, this word is
written to the same place. This has two side effects :
  - a diskette on which a dump has been done in FAT mode is re-usable
    without intervention.
  - you can prepare a diskette by entering kmsgdump (SysRQ+D) and doing
    a FAT mode dump.

On the other hand, when a RAW dump is done at the beginning of the disk,
the disk cannot be used again as a "safe kmsgdump disk".
Moreover, leaving it in the drive when rebooting will cause the system
to hang if the bios tries to boot from the floppy first.

4.4. Prepare the PC for a crash
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because you'll have to leave a diskette in a drive, you may have to set
up your bios to boot first from the hard disk, or from anything but the
floppy, because the bios will not find a bootable system on this floppy.

The problem is with older systems on which the boot sequence cannot be
changed. For this reason, when a diskette is formatted in FAT mode, a
small piece of code is inserted in the boot sector which tries to
redirect the boot to the first hard disk seen by the bios. This is
*generally* the bootable disk, but it may not be the right one on some
systems, so you may have to run some tests before relying on this
option.

If your system is a server, you may reduce the time the bios spends
testing the PC to ensure a quick reboot. On some systems, you can turn
on the option "Quick power-on self test", and disable testing of memory
above 1MB.

5. Reading the messages back
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

5.1. FAT-formatted disks
~~~~~~~~~~~~~~~~~~~~~~~~

If the disk has been formatted as FAT12, you'll find on it a file named
"MESSAGES.TXT" which contains the whole messages buffer. If the buffer
is not full, the end of the file is filled with zeroes, so it's better
to delete them using "tr" under linux.

  - under Linux, either mount the disk :

        # mount -rt msdos /dev/fd0 /mnt
        # cat /mnt/messages.txt | tr -d '\000'
        # umount /mnt

    or read it using mtools :

        # mtype a:messages.txt | tr -d '\000'

  - under DOS, you can simply run EDIT :

        C:\> edit a:messages.txt

  - under Windows, you can open the file with Wordpad. Avoid using
    Notepad since it doesn't handle lines ending with a linefeed only.

5.2. RAW disks
~~~~~~~~~~~~~~

Raw disks are readable under linux with the "dd" utility. By default,
the dump is written starting at the first sector of the disk.
Example with 16 kB messages :

    # dd if=/dev/fd0 bs=512 count=32 | tr -d '\000'

If you specified "T79" in the parameters to dump on track 79 of the
disk, you have to do some calculations :

A 1.44 MB disk has 18 sectors/track, 2 heads and 512 bytes/sector, so a
track holds 18*2*512 = 18432 bytes. You'll have to skip 18432 bytes for
each unwanted track. But you can also count in kilobytes : since a track
is exactly 18 kilobytes, skip the number of tracks times 18 kilobytes :

    # expr 79 \* 18
    1422
    # dd if=/dev/fd0 bs=1024 skip=1422 count=16 | tr -d '\000'

The default dd utility reads all data from the start of the disk, so
this can take quite long. There are other implementations on the net
which do an "lseek" before the first read.

6. Other speed improvements
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is some advice to make a system reboot faster.

When a file server crashes, FSCK may run for a long time. There are good
docs about how to dramatically reduce FSCK time, but at least consider
these methods :

- in /etc/fstab, set the sixth field (fs_passno) to 1 for the root fs,
  and 2 for every other fs. FSCK will then work out what it can check in
  parallel depending on hardware dependencies. In the best case, you can
  divide the total time by the number of physical disks. (man fstab and
  man fsck for more info).

- when possible, mount filesystems read-only. On an anonymous FTP
  server, for example, it's not always necessary to mount everything RW.
  So before copying files onto an fs, remount it RW :

      # mount -wo remount /mount/point

  At the end, remount it RO :

      # mount -ro remount /mount/point

- change the number of bytes per inode and the block size when
  formatting your FS. I personally use 16384 bytes/inode, a block size
  of 4096 bytes, and the sparse flag set (reduces the number of
  superblocks).
  This makes me waste about 1% of the space, but total mount time is
  about 1 second for a total of 8 FS's, 11 gigs on 5 separate disks, and
  the total FSCK time after an unclean power-off is less than 3 minutes.

And of course, don't start services you don't need ! Sendmail alone can
take a long time to start if it cannot resolve the domain name.

7. For more information and/or suggestions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For more information, you can email me at :

    willy@meta-x.org

( be patient, I read my mail when I can, and can't always reply.
  I'm used to "tail -1000 $MAIL|less" or "less +G $MAIL" )

For suggestions, you can either email them to me, or share them with the
Linux Kernel Mailing List :

    linux-kernel@vger.rutgers.edu

Enjoy using it,
Willy