ChangeSet@1.1136.78.2 2003-12-07 12:43:34-02:00 wli at holomorphy.com
[PATCH] out_of_memory() locking

On Sun, Nov 30, 2003 at 08:18:02AM -0800, William Lee Irwin III wrote:
> (1) the timestamps/etc. weren't locked, and when cpus raced, it caused
>     false OOM kills
> (2) the mm could go away while scanning the tasklist, causing the thing
>     to try to kill kernel threads
> Here's a preliminary backport (please do _NOT_ apply until I or someone
> tests it) for you to comment on. Basically, do you want (1) and (2)
> split out, is the basic thing okay, etc.?

out_of_memory()'s operational variables are not locked, and can be
reset by multiple cpus simultaneously, causing false OOM kills. This
patch adds an oom_lock to out_of_memory() to protect its operational
variables.

-- wli

--- linux-2.4.23/mm/oom_kill.c.orig	Tue Dec 9 00:20:47 2003
+++ linux-2.4.23/mm/oom_kill.c	Tue Dec 9 00:24:20 2003
@@ -202,6 +202,11 @@
  */
 void out_of_memory(void)
 {
+	/*
+	 * oom_lock protects out_of_memory()'s static variables.
+	 * It's a global lock; this is not performance-critical.
+	 */
+	static spinlock_t oom_lock = SPIN_LOCK_UNLOCKED;
 	static unsigned long first, last, count, lastkill;
 	unsigned long now, since;

@@ -211,6 +216,7 @@
 	if (nr_swap_pages > 0)
 		return;

+	spin_lock(&oom_lock);
 	now = jiffies;
 	since = now - last;
 	last = now;
@@ -229,14 +235,14 @@
 	 */
 	since = now - first;
 	if (since < HZ)
-		return;
+		goto out_unlock;

 	/*
 	 * If we have gotten only a few failures,
 	 * we're not really oom.
 	 */
 	if (++count < 10)
-		return;
+		goto out_unlock;

 	/*
 	 * If we just killed a process, wait a while
@@ -245,17 +251,25 @@
 	 */
 	since = now - lastkill;
 	if (since < HZ*5)
-		return;
+		goto out_unlock;

 	/*
 	 * Ok, really out of memory. Kill something.
 	 */
 	lastkill = now;
+
+	/* oom_kill() can sleep */
+	spin_unlock(&oom_lock);
 	oom_kill();
+	spin_lock(&oom_lock);

 reset:
-	first = now;
+	if (first < now)
+		first = now;
 	count = 0;
+
+out_unlock:
+	spin_unlock(&oom_lock);
 }

 #endif /* Unused file */
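
For readers who want to see the locking shape outside of diff context,
here is a rough userspace sketch of the same pattern: take the lock
before touching the shared counters, leave through a single unlock
label, and drop the lock around the call that may sleep. It is not
kernel code. A pthread mutex stands in for the spinlock, and
TICKS_PER_SEC, note_failure(), fake_kill() and kill_count are made-up
names for illustration. The checks that fall between the hunks above
are not visible in the diff and are left out rather than guessed at.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for HZ; not a kernel constant. */
#define TICKS_PER_SEC 100UL

/* One global lock guards every shared variable below. */
static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long first, count, lastkill;
static unsigned long kill_count;

/* Hypothetical stand-in for oom_kill(); assume it may block. */
static void fake_kill(void)
{
	kill_count++;
}

/* Record one allocation failure at time `now`; returns true if it "killed". */
bool note_failure(unsigned long now)
{
	bool killed = false;

	pthread_mutex_lock(&oom_lock);

	/* Failures have to persist for at least a second. */
	if (now - first < TICKS_PER_SEC)
		goto out_unlock;

	/* Only a few failures so far: not really out of memory. */
	if (++count < 10)
		goto out_unlock;

	/* Killed something recently: give it time to take effect. */
	if (now - lastkill < 5 * TICKS_PER_SEC)
		goto out_unlock;

	lastkill = now;

	/* The kill may sleep, so do not hold the lock across it. */
	pthread_mutex_unlock(&oom_lock);
	fake_kill();
	killed = true;
	pthread_mutex_lock(&oom_lock);

	/*
	 * Another thread may have moved `first` forward while the lock
	 * was dropped; never move it backwards.
	 */
	if (first < now)
		first = now;
	count = 0;

out_unlock:
	pthread_mutex_unlock(&oom_lock);
	return killed;
}
```

Because the lock is released around fake_kill(), the state read before
the call cannot be trusted afterwards. That is why the sketch, like the
patch, only advances first when it is older than now instead of
assigning it unconditionally.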