Subj : Re: Block based virtual harddisk library
To   : comp.programming
From : moi
Date : Tue Sep 20 2005 02:26 am

M.Barren@gmail.com wrote:
> Hi,
>
> I'm trying to implement a library that will provide API to access a
> file as if it is a harddisk with a fixed capacity of 281474976710656
> (max. of 48bit uint) blocks of specific size (eg. 100B-4KB). The access
> to the virtual disk is block based (not byte based).
>
> Since the size of this virtual disk is beyond the physical limit of my
> disk, I can only afford to save the useful data (eg. non-zero) to an
> actual file (container) on disk.
>
> **The only thing that I'm particularly trying to achieve is to make it
> as storage efficient as possible. At this point, fragmentation (of
> consecutive blocks in the virtual disk) is not a concern.**
>

I have been lurking on this thread for a few days before reacting.
A few observations:

* There are actually two "address spaces" involved (disk block number
  plus virtual block number). You want a translation
  virtual->physical, and you will probably need the inverse
  operation, too.

* Both address spaces will need their own allocation logic (plus the
  coupling between them) and will need some kind of free-list
  management (and recovery, eventually).

* You seem to be preoccupied with adding extra semantics to null/zero,
  in order to save (quite) a few blocks. IMHO you should keep this kind
  of trick out of the design until you know it will really fit in.
  (The unix-style file-with-holes-in-it is a nice hack, but it still
  relies on zero meaning NULL.)

* As others have noted, your case very much resembles a vm system
  (plus filesystem). As a thought-experiment, you could take a mmap'ed
  file on a 64bit machine. You only need to add persistence (between
  reboots). For the rest: it is almost all there.

* DBMSes have no problems with keys bigger than the native machine
  addressability. (Having a CHAR(6) key is at least _possible_.)

* Another thought-experiment: consider mapping your 48-bit address
  0x0123456789ab to a pathname
      /0/1/2/3/4/5/6/7/8/9/a/b
  or
      /01/23/45/67/89/ab
  on a unix-style filesystem, the file 'b' or 'ab' containing your
  basic 4k or 8k block. Forget about performance, just use it to build
  your API, and let the OS take care of LRU buffering and name caching.
  The number of available blocks/inodes *will* become a problem, but it
  is just for testing. A few million will do.
  (You will of course need some "mkdir -p" like code to construct the
  nonexistent paths, but that is trivial.)
  Afterwards, you might want to roll your own fs.

HTH,
AvK
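
For what it's worth, the pathname thought-experiment could be sketched
roughly like this in Python. This is only an illustration under
assumptions: it uses the two-hex-digits-per-directory layout
(/01/23/45/67/89/ab), and the names block_path, write_block and
read_block, as well as the 4096-byte BLOCK_SIZE, are made up here and
do not come from the thread. A block that was never written is
reported as None, so no extra meaning is attached to all-zero data.

```python
import os

ADDRESS_BITS = 48      # size of the virtual block address space
BLOCK_SIZE = 4096      # assumed block size; the thread mentions 100B-4KB


def block_path(root, block_no):
    """Map a 48-bit block number to root/01/23/45/67/89/ab."""
    if not 0 <= block_no < (1 << ADDRESS_BITS):
        raise ValueError("block number outside the 48-bit range")
    digits = format(block_no, "012x")          # 12 hex digits, zero padded
    parts = [digits[i:i + 2] for i in range(0, 12, 2)]
    return os.path.join(root, *parts)


def write_block(root, block_no, data):
    """Store one block, creating missing directories ("mkdir -p")."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    path = block_path(root, block_no)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def read_block(root, block_no):
    """Return the block's bytes, or None if it was never written."""
    try:
        with open(block_path(root, block_no), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None
```

Calling write_block(root, 0x0123456789ab, data) creates the five
directory levels on demand and leaves the block in the file 'ab';
buffering and name lookup are left to the OS, as suggested above.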