The initial patch for 2.4-ac was from David Gibson; I rediffed it and
ported it to recent 2.4-vanilla equipped with the aa-VM. It was necessary
to use a dedicated flag (PG_fs_1) to tag the pages used by blocks on the
ramfs, because in some situations it was not possible to distinguish a
ramfs page from any other one, most notably in removepage().

Another change concerns the default ramfs size, which has been lowered
from RAM/2 to RAM/4 because it was possible to eat all memory by
mmap()ing the whole FS in a single process.

At least the following associations of flags and actions have been
identified:

removepage() : needs PG_fs_1 because ref and !ref have two actions
    0x00001 -> ignore
    0x00005 -> ignore
    0x0000D -> ignore
    0x10001 -> free
    0x10005 -> free

alloc_page() : (!dirty&!launder&lru&!uptodate) => alloc()
    0x1000D -> ignore
    0x00041 -> alloc
    0x0004D -> alloc
    0x180CD -> ignore
    0x1005D -> ignore
    0x000C9 -> alloc
    0x000D9 -> ignore
      || ||_ uptodate | referenced | error | locked
      || |__ active | lru | unused | dirty
      ||____ launder | reserved | arch_1 | checked
      |_____ fs_1

Maybe there is a way to mark the pages during alloc_page() with a flag
which will remain till removepage(), but I have not found one.

Note: the following call can be used anywhere to debug page usage:

  printk(KERN_DEBUG "%20s: page=%p flags=%08lx\n", __FUNCTION__, page, page->flags);

Willy Tarreau

--- linux-2.4.32/fs/ramfs/inode.c	Sat Dec  6 08:14:48 2003
+++ linux-2.4.32-ramfs/fs/ramfs/inode.c	Mon Jan 16 02:17:09 2006
@@ -5,6 +5,8 @@
  *               2000 Transmeta Corp.
  *
  * Usage limits added by David Gibson, Linuxcare Australia.
+ * Some 2.4-VM compatibility fixes, and a few improvements
+ * by Willy Tarreau, EXOSEC - 2005/11/20, 2006/01/16.
  * This file is released under the GPL.
  */
 
@@ -29,8 +31,18 @@
 #include
 #include
 #include
+#include
+#include
 
 #include
+#include
+
+#if PAGE_CACHE_SIZE % 1024
+#error Oh no, PAGE_CACHE_SIZE is not divisible by 1k! I cannot cope.
+#endif
+
+#define IBLOCKS_PER_PAGE  (PAGE_CACHE_SIZE / 512)
+#define K_PER_PAGE (PAGE_CACHE_SIZE / 1024)
 
 /* some random number */
 #define RAMFS_MAGIC	0x858458f6
@@ -40,8 +52,150 @@
 static struct file_operations ramfs_file_operations;
 static struct inode_operations ramfs_dir_inode_operations;
 
+/*
+ * ramfs super-block data in memory
+ */
+struct ramfs_sb_info {
+	/* Prevent races accessing the used block
+	 * counts. Conceptually, this could probably be a semaphore,
+	 * but the only thing we do while holding the lock is
+	 * arithmetic, so there's no point */
+	spinlock_t ramfs_lock;
+
+	/* It is important that at least the free counts below be
+	   signed. free_XXX may become negative if a limit is changed
+	   downwards (by a remount) below the current usage. */
+
+	/* maximum number of pages in a file */
+	long max_file_pages;
+
+	/* max total number of data pages */
+	long max_pages;
+	/* free_pages = max_pages - total number of pages currently in use */
+	long free_pages;
+
+	/* max number of inodes */
+	long max_inodes;
+	/* free_inodes = max_inodes - total number of inodes currently in use */
+	long free_inodes;
+
+	/* max number of dentries */
+	long max_dentries;
+	/* free_dentries = max_dentries - total number of dentries in use */
+	long free_dentries;
+};
+
+#define RAMFS_SB(sb) ((struct ramfs_sb_info *)((sb)->u.generic_sbp))
+
+/*
+ * Resource limit helper functions
+ */
+
+static inline void lock_rsb(struct ramfs_sb_info *rsb)
+{
+	spin_lock(&(rsb->ramfs_lock));
+}
+
+static inline void unlock_rsb(struct ramfs_sb_info *rsb)
+{
+	spin_unlock(&(rsb->ramfs_lock));
+}
+
+/* Decrements the free inode count and returns true, or returns false
+ * if there are no free inodes */
+static int ramfs_alloc_inode(struct super_block *sb)
+{
+	struct ramfs_sb_info *rsb = RAMFS_SB(sb);
+	int ret = 1;
+
+	lock_rsb(rsb);
+	if (!rsb->max_inodes || rsb->free_inodes > 0)
+		rsb->free_inodes--;
+	else
+		ret = 0;
+	unlock_rsb(rsb);
+
+	return ret;
+}
+
+/* Increments the free inode count */
+static void ramfs_dealloc_inode(struct super_block *sb)
+{
+	struct ramfs_sb_info *rsb = RAMFS_SB(sb);
+
+	lock_rsb(rsb);
+	rsb->free_inodes++;
+	unlock_rsb(rsb);
+}
+
+/* Decrements the free dentry count and returns true, or returns false
+ * if there are no free dentries */
+static int ramfs_alloc_dentry(struct super_block *sb)
+{
+	struct ramfs_sb_info *rsb = RAMFS_SB(sb);
+	int ret = 1;
+
+	lock_rsb(rsb);
+	if (!rsb->max_dentries || rsb->free_dentries > 0)
+		rsb->free_dentries--;
+	else
+		ret = 0;
+	unlock_rsb(rsb);
+
+	return ret;
+}
+
+/* Increments the free dentry count */
+static void ramfs_dealloc_dentry(struct super_block *sb)
+{
+	struct ramfs_sb_info *rsb = RAMFS_SB(sb);
+
+	lock_rsb(rsb);
+	rsb->free_dentries++;
+	unlock_rsb(rsb);
+}
+
+/* If the given page can be added to the given inode for ramfs, return
+ * true and update the filesystem's free page count and the inode's
+ * i_blocks field. Always returns true if the page is already used by
+ * ramfs (ie. PG_fs_1 is set). We use the PG_fs_1 flag here because
+ * it's the only one which will always remain till ramfs_removepage() */
+int ramfs_alloc_page(struct inode *inode, struct page *page)
+{
+	int ret = 1;
+
+	/* do we already own this page ? */
+	if (!test_bit(PG_fs_1, &page->flags)) {
+		struct ramfs_sb_info *rsb = RAMFS_SB(inode->i_sb);
+		lock_rsb(rsb);
+		if ( (rsb->free_pages > 0) &&
+		     ( !rsb->max_file_pages ||
+		       (inode->i_data.nrpages <= rsb->max_file_pages) ) ) {
+			inode->i_blocks += IBLOCKS_PER_PAGE;
+			rsb->free_pages--;
+			set_bit(PG_fs_1, &page->flags); /* we own it now */
+		} else {
+			ret = 0; /* will become ENOSPC */
+		}
+		unlock_rsb(rsb);
+	}
+
+	return ret;
+}
+
 static int ramfs_statfs(struct super_block *sb, struct statfs *buf)
 {
+	struct ramfs_sb_info *rsb = RAMFS_SB(sb);
+
+	lock_rsb(rsb);
+	buf->f_blocks = rsb->max_pages;
+	buf->f_files = rsb->max_inodes;
+
+	buf->f_bfree = rsb->free_pages;
+	buf->f_bavail = buf->f_bfree;
+	buf->f_ffree = rsb->free_inodes;
+	unlock_rsb(rsb);
+
 	buf->f_type = RAMFS_MAGIC;
 	buf->f_bsize = PAGE_CACHE_SIZE;
 	buf->f_namelen = NAME_MAX;
@@ -76,9 +230,37 @@
 	return 0;
 }
 
+static int ramfs_writepage(struct page *page)
+{
+	struct inode *inode = (struct inode *)page->mapping->host;
+
+	if (PageLaunder(page)) {
+		activate_page(page);
+		SetPageReferenced(page);
+	}
+
+	if (! ramfs_alloc_page(inode, page)) {
+		UnlockPage(page);
+		return -ENOSPC;
+	}
+
+	/* Set the page dirty again, unlock */
+	SetPageDirty(page);
+	UnlockPage(page);
+	return 0;
+}
+
 static int ramfs_prepare_write(struct file *file, struct page *page, unsigned offset, unsigned to)
 {
-	void *addr = kmap(page);
+	struct inode *inode = (struct inode *)page->mapping->host;
+	void *addr;
+
+	if (! ramfs_alloc_page(inode, page)) {
+		ClearPageUptodate(page);
+		return -ENOSPC;
+	}
+
+	addr = (void *) kmap(page);
 	if (!Page_Uptodate(page)) {
 		memset(addr, 0, PAGE_CACHE_SIZE);
 		flush_dcache_page(page);
@@ -99,9 +281,33 @@
 	return 0;
 }
 
+static void ramfs_removepage(struct page *page)
+{
+	struct inode *inode = (struct inode *)page->mapping->host;
+
+	/* Did we own this page ? */
+	if (test_bit(PG_fs_1, &page->flags)) {
+		struct ramfs_sb_info *rsb = RAMFS_SB(inode->i_sb);
+		lock_rsb(rsb);
+		if (rsb->free_pages >= rsb->max_pages) {
+			printk(KERN_ERR "ramfs: Error in page deallocation, free_pages (%ld) > max_pages (%ld)\n", rsb->free_pages, rsb->max_pages);
+		} else {
+			rsb->free_pages++;
+			inode->i_blocks -= IBLOCKS_PER_PAGE;
+		}
+		unlock_rsb(rsb);
+		clear_bit(PG_fs_1, &page->flags);
+	}
+}
+
 struct inode *ramfs_get_inode(struct super_block *sb, int mode, int dev)
 {
-	struct inode * inode = new_inode(sb);
+	struct inode * inode;
+
+	if (! ramfs_alloc_inode(sb))
+		return NULL;
+
+	inode = new_inode(sb);
 
 	if (inode) {
 		inode->i_mode = mode;
@@ -127,18 +333,27 @@
 			inode->i_op = &page_symlink_inode_operations;
 			break;
 		}
-	}
+	} else
+		ramfs_dealloc_inode(sb);
+
 	return inode;
 }
 
 /*
- * File creation. Allocate an inode, and we're done..
+ * File creation. Allocate an inode, update free inode and dentry counts
+ * and we're done..
  */
 static int ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, int dev)
 {
-	struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
+	struct super_block *sb = dir->i_sb;
+	struct inode * inode;
 	int error = -ENOSPC;
 
+	if (! ramfs_alloc_dentry(sb))
+		return error;
+
+	inode = ramfs_get_inode(dir->i_sb, mode, dev);
+
 	if (inode) {
 		if (dir->i_mode & S_ISGID) {
 			inode->i_gid = dir->i_gid;
@@ -148,6 +363,8 @@
 		d_instantiate(dentry, inode);
 		dget(dentry);	/* Extra count - pin the dentry in core */
 		error = 0;
+	} else {
+		ramfs_dealloc_dentry(sb);
 	}
 	return error;
 }
@@ -167,11 +384,15 @@
  */
 static int ramfs_link(struct dentry *old_dentry, struct inode * dir, struct dentry * dentry)
 {
+	struct super_block *sb = dir->i_sb;
 	struct inode *inode = old_dentry->d_inode;
 
 	if (S_ISDIR(inode->i_mode))
 		return -EPERM;
 
+	if (! ramfs_alloc_dentry(sb))
+		return -ENOSPC;
+
 	inode->i_nlink++;
 	atomic_inc(&inode->i_count);	/* New dentry reference */
 	dget(dentry);	/* Extra pinning count for the created dentry */
@@ -218,6 +439,7 @@
  */
 static int ramfs_unlink(struct inode * dir, struct dentry *dentry)
 {
+	struct super_block *sb = dir->i_sb;
 	int retval = -ENOTEMPTY;
 
 	if (ramfs_empty(dentry)) {
@@ -225,6 +447,9 @@
 
 		inode->i_nlink--;
 		dput(dentry);	/* Undo the count from "create" - this does all the work */
+
+		ramfs_dealloc_dentry(sb);
+
 		retval = 0;
 	}
 	return retval;
@@ -240,6 +465,8 @@
  */
 static int ramfs_rename(struct inode * old_dir, struct dentry *old_dentry, struct inode * new_dir,struct dentry *new_dentry)
 {
+	struct super_block *sb = new_dir->i_sb;
+
 	int error = -ENOTEMPTY;
 
 	if (ramfs_empty(new_dentry)) {
@@ -247,6 +474,7 @@
 		if (inode) {
 			inode->i_nlink--;
 			dput(new_dentry);
+			ramfs_dealloc_dentry(sb);
 		}
 		error = 0;
 	}
@@ -271,11 +499,191 @@
 	return 0;
 }
 
+static void ramfs_delete_inode(struct inode *inode)
+{
+	ramfs_dealloc_inode(inode->i_sb);
+
+	clear_inode(inode);
+}
+
+static void ramfs_put_super(struct super_block *sb)
+{
+	kfree(sb->u.generic_sbp);
+}
+
+struct ramfs_params {
+	long pages;
+	long filepages;
+	long inodes;
+	long dentries;
+};
+
+static int parse_options(char * options, struct ramfs_params *p)
+{
+	char save = 0, *savep = NULL, *optname, *value;
+
+	p->pages = -1;
+	p->filepages = -1;
+	p->inodes = -1;
+	p->dentries = -1;
+
+	for (optname = strtok(options,","); optname;
+	     optname = strtok(NULL,",")) {
+		if ((value = strchr(optname,'=')) != NULL) {
+			save = *value;
+			savep = value;
+			*value++ = 0;
+		}
+
+		if (!strcmp(optname, "maxfilesize") && value) {
+			p->filepages = memparse(value, &value)
+				/ K_PER_PAGE;
+			if (*value)
+				return -EINVAL;
+		} else if (!strcmp(optname, "maxsize") && value) {
+			p->pages = memparse(value, &value) // size in blocks
+				/ K_PER_PAGE;
+			if (*value)
+				return -EINVAL;
+		/* we also accept tmpfs-like 'size' option */
+		} else if (!strcmp(optname, "size") && value) {
+			unsigned long long size;
+			size = memparse(value, &value);
+			if (*value == '%') {
+				struct sysinfo si;
+				si_meminfo(&si);
+				size <<= PAGE_SHIFT;
+				size *= si.totalram;
+				do_div(size, 100);
+				value++;
+			}
+			if (*value)
+				return -EINVAL;
+			p->pages = size >> PAGE_SHIFT;
+		} else if ((!strcmp(optname, "nr_inodes") ||
+			    !strcmp(optname, "maxinodes")) && value) {
+			p->inodes = memparse(value, &value);
+			if (*value)
+				return -EINVAL;
+		} else if (!strcmp(optname, "maxdentries") && value) {
+			p->dentries = memparse(value, &value);
+			if (*value)
+				return -EINVAL;
+		}
+
+		if (optname != options)
+			*(optname-1) = ',';
+		if (value)
+			*savep = save;
+	}
+
+	return 0;
+}
+
+static void init_limits(struct ramfs_sb_info *rsb, struct ramfs_params *p)
+{
+	struct sysinfo si;
+
+	si_meminfo(&si);
+
+	/* By default we set the limits to be:
+	   - Allow this ramfs to take up to 25% of all available RAM
+	   - No limit on filesize (except no file may be bigger than
+	     the total max size, obviously)
+	   - dentries limited to one per 4k of data space
+	   - No limit to the number of inodes (except that there
+	     are never more inodes than dentries).
+	*/
+	rsb->max_pages = (si.totalram / 4);
+
+	if (p->pages >= 0)
+		rsb->max_pages = p->pages;
+
+	rsb->max_file_pages = 0;
+	if (p->filepages >= 0)
+		rsb->max_file_pages = p->filepages;
+
+	rsb->max_dentries = rsb->max_pages * K_PER_PAGE / 4;
+	if (p->dentries >= 0)
+		rsb->max_dentries = p->dentries;
+
+	rsb->max_inodes = 0;
+	if (p->inodes >= 0)
+		rsb->max_inodes = p->inodes;
+
+	rsb->free_pages = rsb->max_pages;
+	rsb->free_inodes = rsb->max_inodes;
+	rsb->free_dentries = rsb->max_dentries;
+
+	return;
+}
+
+/* reset_limits is called during a remount to change the usage limits.
+
+   This will succeed, even if the new limits are lower than current
+   usage. This is the intended behaviour - new allocations will fail
+   until usage falls below the new limit */
+static void reset_limits(struct ramfs_sb_info *rsb, struct ramfs_params *p)
+{
+	lock_rsb(rsb);
+
+	if (p->pages >= 0) {
+		int used_pages = rsb->max_pages - rsb->free_pages;
+
+		rsb->max_pages = p->pages;
+		rsb->free_pages = rsb->max_pages - used_pages;
+	}
+
+	if (p->filepages >= 0) {
+		rsb->max_file_pages = p->filepages;
+	}
+
+
+	if (p->dentries >= 0) {
+		int used_dentries = rsb->max_dentries - rsb->free_dentries;
+
+		rsb->max_dentries = p->dentries;
+		rsb->free_dentries = rsb->max_dentries - used_dentries;
+	}
+
+	if (p->inodes >= 0) {
+		int used_inodes = rsb->max_inodes - rsb->free_inodes;
+
+		rsb->max_inodes = p->inodes;
+		rsb->free_inodes = rsb->max_inodes - used_inodes;
+	}
+
+	unlock_rsb(rsb);
+}
+
+static int ramfs_remount(struct super_block * sb, int * flags, char * data)
+{
+	struct ramfs_params params;
+	struct ramfs_sb_info * rsb = RAMFS_SB(sb);
+
+	if (parse_options((char *)data, &params) != 0)
+		return -EINVAL;
+
+	reset_limits(rsb, &params);
+
+#if CONFIG_DEBUG_RAMFS
+	printk(KERN_DEBUG "ramfs: remounted with options: %s\n",
+	       data ? (char *)data : "" );
+	printk(KERN_DEBUG "ramfs: max_pages=%ld max_file_pages=%ld "
+	       "max_inodes=%ld max_dentries=%ld\n",
+	       rsb->max_pages, rsb->max_file_pages,
+	       rsb->max_inodes, rsb->max_dentries);
+#endif /* CONFIG_DEBUG_RAMFS */
+
+	return 0;
+}
+
 static struct address_space_operations ramfs_aops = {
 	readpage:	ramfs_readpage,
-	writepage:	fail_writepage,
+	writepage:	ramfs_writepage,
 	prepare_write:	ramfs_prepare_write,
-	commit_write:	ramfs_commit_write
+	commit_write:	ramfs_commit_write,
+	removepage:	ramfs_removepage,
 };
 
 static struct file_operations ramfs_file_operations = {
@@ -300,17 +708,37 @@
 static struct super_operations ramfs_ops = {
 	statfs:		ramfs_statfs,
 	put_inode:	force_delete,
+	delete_inode:	ramfs_delete_inode,
+	put_super:	ramfs_put_super,
+	remount_fs:	ramfs_remount,
 };
 
+/*
+ * Initialisation
+ */
+
 static struct super_block *ramfs_read_super(struct super_block * sb, void * data, int silent)
 {
 	struct inode * inode;
 	struct dentry * root;
+	struct ramfs_sb_info * rsb;
+	struct ramfs_params params;
 
 	sb->s_blocksize = PAGE_CACHE_SIZE;
 	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
 	sb->s_magic = RAMFS_MAGIC;
 	sb->s_op = &ramfs_ops;
+
+	sb->u.generic_sbp = kmalloc(sizeof(struct ramfs_sb_info), GFP_KERNEL);
+	rsb = RAMFS_SB(sb);
+
+	spin_lock_init(&rsb->ramfs_lock);
+
+	if (parse_options((char *)data, &params) != 0)
+		return NULL;
+
+	init_limits(rsb, &params);
+
 	inode = ramfs_get_inode(sb, S_IFDIR | 0755, 0);
 	if (!inode)
 		return NULL;
@@ -321,6 +749,15 @@
 		return NULL;
 	}
 	sb->s_root = root;
+
+#if CONFIG_DEBUG_RAMFS
+	printk(KERN_DEBUG "ramfs: mounted with options: %s\n",
+	       data ? (char *)data : "" );
+	printk(KERN_DEBUG "ramfs: max_pages=%ld max_file_pages=%ld "
+	       "max_inodes=%ld max_dentries=%ld\n",
+	       rsb->max_pages, rsb->max_file_pages,
+	       rsb->max_inodes, rsb->max_dentries);
+#endif /* CONFIG_DEBUG_RAMFS */
 	return sb;
 }
 
--- linux-2.4.32/Documentation/filesystems/ramfs.txt	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.4.32-ramfs/Documentation/filesystems/ramfs.txt	2005-11-19 18:55:12.000000000 +0100
@@ -0,0 +1,47 @@
+	ramfs - An automatically resizing memory based filesystem
+
+
+  Ramfs is a file system which keeps all files in RAM. It allows read
+  and write access. In contrast to RAM disks, which get allocated a
+  fixed amount of RAM, ramfs grows and shrinks to accommodate the
+  files it contains.
+
+  You can mount the ramfs with:
+      mount -t ramfs none /mnt/wherever
+
+  Then just create and use files. When the filesystem is unmounted, all
+  its contents are lost.
+
+  NOTE! This filesystem is probably most useful not as a real
+  filesystem, but as an example of how virtual filesystems can be
+  written.
+
+Resource limits:
+
+By default a ramfs will be limited to using a quarter of (physical) memory
+for storing file contents, a bit over that when the metadata is
+included. The resource usage limits of ramfs can be controlled with
+the following mount options:
+
+	maxsize=NNN
+		Sets the maximum allowed memory usage of the
+filesystem to NNN kilobytes. This will be rounded down to a multiple
+of the page size. The default is a quarter of physical memory. NB. unlike
+most of the other limits, setting this to zero does *not* mean no
+limit, but will actually limit the size of the filesystem data to zero
+pages. There might be a use for this in some perverse situation.
+
+	maxfilesize=NNN
+		Sets the maximum size of a single file on the
+filesystem to NNN kilobytes. This will be rounded down to a multiple
+of the page size. If NNN=0 there is no limit. The default is no limit.
+
+	maxdentries=NNN
+		Sets the maximum number of directory entries (hard
+links) on the filesystem to NNN. If NNN=0 there is no limit. By
+default this is set to maxsize/4.
+
+	maxinodes=NNN
+		Sets the maximum number of inodes (i.e. distinct
+files) on the filesystem to NNN. If NNN=0 there is no limit. The
+default is no limit (but there can never be more inodes than dentries).
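
Addendum: the page->flags values listed in the note at the top of this mail, and those printed by the debug printk() given there, are easier to read once split into flag names. The following is only a userspace sketch, not part of the patch. It assumes the bit layout implied by the legend in that note (each legend row read from the highest bit of the hex digit down to the lowest, so locked is bit 0 and fs_1 is bit 16). The helper name decode_page_flags() and the flag_names table are made up for illustration, and bits 8-11 are not named in the legend, so they are left undecoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Bit positions as read from the legend in the note above (assumption:
 * each legend row lists bit 3 down to bit 0 of its hex digit). */
struct flag_name {
	unsigned bit;
	const char *name;
};

static const struct flag_name flag_names[] = {
	{  0, "locked" },  {  1, "error" },  {  2, "referenced" }, {  3, "uptodate" },
	{  4, "dirty" },   {  5, "unused" }, {  6, "lru" },        {  7, "active" },
	{ 12, "checked" }, { 13, "arch_1" }, { 14, "reserved" },   { 15, "launder" },
	{ 16, "fs_1" },
};

/* Hypothetical helper: writes a '|'-separated list of the flag names
 * set in 'flags' into 'buf' (at most 'len' bytes) and returns 'buf'. */
static char *decode_page_flags(unsigned long flags, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		int n;

		if (!(flags & (1UL << flag_names[i].bit)))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name and stop */
			break;
		}
		used += (size_t)n;
	}
	return buf;
}
```

For example, decode_page_flags(0x10005, buf, sizeof(buf)) gives "locked|referenced|fs_1", which is one of the "free" cases listed for removepage().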