The patch below further optimizes the x86 context-switch code. On P6-class CPUs it takes 36 cycles to reload %fs and %gs, yet both registers are zero in most cases (exceptions are LinuxThreads, Wine, DOSEMU, etc.). It's much cheaper to check whether any non-zero %fs or %gs selector is involved, and to do the segment reload only in that case.

When applied to 2.5.3-pre5, the patch achieves a 2.5% improvement in 2-task lat_ctx context-switch performance. (I've also added unlikely() constructs to the context-switch slow paths. The performance improvement was measured on a kernel compiled with a 2.x gcc compiler, which does not make use of the unlikely() extension yet.)

	Ingo

--- linux/arch/i386/kernel/process.c.orig	Sun Jan 27 15:33:29 2002
+++ linux/arch/i386/kernel/process.c	Sun Jan 27 16:07:45 2002
@@ -689,15 +689,17 @@
 	asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs));
 
 	/*
-	 * Restore %fs and %gs.
+	 * Restore %fs and %gs if needed.
 	 */
-	loadsegment(fs, next->fs);
-	loadsegment(gs, next->gs);
+	if (unlikely(prev->fs | prev->gs | next->fs | next->gs)) {
+		loadsegment(fs, next->fs);
+		loadsegment(gs, next->gs);
+	}
 
 	/*
 	 * Now maybe reload the debug registers
 	 */
-	if (next->debugreg[7]){
+	if (unlikely(next->debugreg[7])) {
 		loaddebug(next, 0);
 		loaddebug(next, 1);
 		loaddebug(next, 2);
@@ -707,7 +709,7 @@
 		loaddebug(next, 7);
 	}
 
-	if (prev->ioperm || next->ioperm) {
+	if (unlikely(prev->ioperm || next->ioperm)) {
 		if (next->ioperm) {
 			/*
 			 * 4 cachelines copy ... not good, but not that
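
[Editor's note: for readers unfamiliar with unlikely(), the sketch below shows how such a branch-prediction hint is typically built on gcc's __builtin_expect() and how it applies to the selector check in the hunk above. The struct and function names here are made up for illustration (they are not the kernel's thread_struct or switch_to()), and the compiler-version guard is a simplified stand-in for the kernel's actual check.]

/* sketch.c -- illustrative only, not kernel code */
#include <stdio.h>

/* __builtin_expect() is not available in older gcc; fall back to a no-op.
 * (Simplified guard; the real kernel header checks the exact gcc version.) */
#if defined(__GNUC__) && (__GNUC__ >= 3)
#define unlikely(x)	__builtin_expect(!!(x), 0)
#else
#define unlikely(x)	(x)
#endif

struct thread_sketch {		/* hypothetical stand-in for thread_struct */
	unsigned long fs, gs;
};

static void switch_segments(struct thread_sketch *prev, struct thread_sketch *next)
{
	/*
	 * Only take the expensive segment-reload path when at least one of
	 * the four selectors is non-zero; the common case skips it entirely.
	 */
	if (unlikely(prev->fs | prev->gs | next->fs | next->gs))
		printf("slow path: reloading %%fs=%lu %%gs=%lu\n", next->fs, next->gs);
}

int main(void)
{
	struct thread_sketch a = { 0, 0 }, b = { 0, 0 };

	switch_segments(&a, &b);	/* common case: nothing to do */
	b.gs = 3;
	switch_segments(&a, &b);	/* slow path: reload happens */
	return 0;
}

The bitwise OR folds the four selector comparisons into a single test-and-branch, and the hint marks that branch as almost never taken, so the common case costs one predicted-not-taken branch instead of two segment reloads.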