Subj : Re: Block based virtual harddisk library
To   : comp.programming
From : M.Barren
Date : Tue Sep 20 2005 12:25 am


moi wrote:
> I have been lurking on this thread a few days before reacting.
> A few observations:
>
> * there are actually two "address spaces" involved (disk block
>   number plus virtual block number). You want a translation
>   virtual->physical, and you probably will need the inverse operation,
>   too.

Correct. But the only uses of the inverse mapping that I can think of are
reordering the physical blocks to improve performance (which is NOT my top
priority) and/or compacting the blocks to save space (which IS my top
priority), which again involves moving blocks around.
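
To make the compaction case concrete, here is a rough sketch in Python of a
two-way map. The class and method names (BlockMap, allocate, release,
compact) are made up for illustration and are not part of any existing
library. It only plans the moves; copying the block data is left out.

```python
class BlockMap:
    """Hypothetical two-way map between virtual and physical block numbers."""

    def __init__(self):
        self.v2p = {}  # virtual block -> physical block
        self.p2v = {}  # physical block -> virtual block (the inverse)

    def allocate(self, vblock):
        """Give vblock the lowest free physical block and return it."""
        if vblock in self.v2p:
            return self.v2p[vblock]
        pblock = 0
        while pblock in self.p2v:
            pblock += 1
        self.v2p[vblock] = pblock
        self.p2v[pblock] = vblock
        return pblock

    def release(self, vblock):
        """Drop the mapping for vblock, leaving a hole in the physical space."""
        pblock = self.v2p.pop(vblock)
        del self.p2v[pblock]

    def compact(self):
        """Move the highest physical blocks into the lowest holes.

        Returns a list of (old_physical, new_physical) moves. A real
        implementation would copy the data for each move and then
        truncate the container.
        """
        moves = []
        hole = 0
        while self.p2v:
            last = max(self.p2v)
            while hole in self.p2v:
                hole += 1
            if hole > last:
                break  # no hole below the last block: already compact
            vblock = self.p2v.pop(last)  # the inverse map tells us the owner
            self.p2v[hole] = vblock
            self.v2p[vblock] = hole
            moves.append((last, hole))
        return moves
```

Without p2v, finding the owner of the last physical block would mean
scanning the whole forward table, which is why compaction is the place
where the inverse mapping pays off.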

> * Both address-spaces will need their own allocation logic (plus the
> coupling between them) and will need some kind of free-list management
>    (and recovery, eventually)
>
> * you seem to be preoccupied with adding extra semantics to null/zero,
> in order to save (quite) a few blocks. IMHO you should keep this kind of
> tricks out of the design, until you know it will really fit in. (the
> unix-style file-with-holes-in-it is a nice hack, but it still relies on
> zero meaning NULL)

Well, since "The access to the virtual disk is block based (not byte
based).", I can (and probably will) avoid adding those semantics and
instead give the user the ability to mark any block as
'USED' or 'UNUSED'. Initially, all the blocks will be UNUSED.
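
One way the USED/UNUSED flag could be stored is a plain bitmap, one bit per
virtual block. This is only a sketch in Python under that assumption; the
name UsageBitmap and the bit layout are hypothetical, and a bitmap this
simple only suits a disk with a modest, fixed number of blocks.

```python
class UsageBitmap:
    """One bit per virtual block: 1 = USED, 0 = UNUSED (assumed layout)."""

    def __init__(self, block_count):
        self.block_count = block_count
        # A zero-filled bytearray means every block starts out UNUSED.
        self.bits = bytearray((block_count + 7) // 8)

    def _check(self, block):
        if not 0 <= block < self.block_count:
            raise IndexError("block number out of range")

    def is_used(self, block):
        self._check(block)
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def set_block(self, block, used):
        self._check(block)
        if used:
            self.bits[block // 8] |= 1 << (block % 8)
        else:
            self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def used_blocks(self):
        """Count the USED blocks by counting set bits."""
        return sum(bin(byte).count("1") for byte in self.bits)
```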

> * as others have noted, your case very much resembles a vm system (plus
>   filesystem)
> As a thought-experiment, you could take a mmap'ed file on a 64bit
> machine. You only need to add persistence (between reboots). For the
> rest: it is almost all there.

I should spend more time reading about virtual memory. If you know of
any specific place on the net with information on this topic, I'd be
quite happy to know where.

> * DBMSes have no problems with keys bigger than the native machine
> addressability. (having a CHAR(6) key is at least _possible_ )
>
> * another thought-experiment: consider mapping your 48 bits address
> 0x0123456789ab to a pathname
> /0/1/2/3/4/5/6/7/8/9/a/b   or
> /01/23/45/67/89/ab
> on a unix-style filesystem, the file 'b' or 'ab' containing your basic
> 4k or 8k block. Forget about performance, just use it to build your API,
> and let the OS take care of LRU buffering and name caching. Number of
> available blocks/inodes/blocks *will* become a problem, but it is just
> for testing. A few million will do.
> (you will of course need some "mkdir -p" like code to construct the
> nonexistent paths, but that is trivial)
> Afterwards, you could want to roll your own fs.

Very helpful.
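
As I understand the pathname idea, it would look roughly like the Python
sketch below. The function names (block_path, write_block, read_block) and
the root directory argument are my own placeholders; os.makedirs with
exist_ok=True stands in for the "mkdir -p" step.

```python
import os


def block_path(root, address, digits_per_level=2):
    """Map a 48-bit block address to a nested path under root.

    0x0123456789ab becomes root/01/23/45/67/89/ab with two hex digits
    per level, or root/0/1/2/.../a/b with one digit per level.
    """
    if not 0 <= address < 1 << 48:
        raise ValueError("address does not fit in 48 bits")
    if digits_per_level not in (1, 2, 3, 4, 6):
        raise ValueError("digits_per_level must divide 12 evenly")
    hex_digits = format(address, "012x")  # 48 bits = 12 hex digits
    parts = [hex_digits[i:i + digits_per_level]
             for i in range(0, 12, digits_per_level)]
    return os.path.join(root, *parts)


def write_block(root, address, data):
    """Store one block, creating missing directories like 'mkdir -p'."""
    path = block_path(root, address)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def read_block(root, address):
    """Return the stored block; raises FileNotFoundError if never written."""
    with open(block_path(root, address), "rb") as f:
        return f.read()
```
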
Well, the API is quite simple. It has the following interface:

Main:
Open(container file)
Close()
IsUsed(block number)
SetBlock(block number, used/unused)
Write(block number, data)
Read(block number)

Trivial:
UsedBlocks()
SizeOnDisk()
....

Since the operation of these functions is directly connected to the
internal workings of the library, there won't be much else to build
apart from the core functionality.

Well, since keeping everything in one file would be very difficult to
handle, I might have to keep my mapping tables in one file and the
data blocks in another.
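
To show how the interface and the two-file layout might fit together, here
is a rough Python sketch. Everything specific in it is an assumption made
for illustration: the 4096-byte block size, the ".map" and ".dat" suffixes,
the table format (pairs of 64-bit integers), and the choice to raise
KeyError when an UNUSED block is read.

```python
import os
import struct

BLOCK_SIZE = 4096             # assumed block size
ENTRY = struct.Struct("<QQ")  # one (virtual, physical) pair per table entry


class VirtualDisk:
    """Sketch: <container>.map holds the table, <container>.dat the blocks."""

    def __init__(self, container):  # plays the role of Open(container file)
        self.map_path = container + ".map"
        data_path = container + ".dat"
        self.table = {}  # virtual block -> physical block
        if os.path.exists(self.map_path):
            with open(self.map_path, "rb") as f:
                self.table = dict(ENTRY.iter_unpack(f.read()))
        mode = "r+b" if os.path.exists(data_path) else "w+b"
        self.data = open(data_path, mode)
        # Physical slots below the highest one in use that nobody owns are free.
        self.next_physical = max(self.table.values(), default=-1) + 1
        self.free = sorted(set(range(self.next_physical))
                           - set(self.table.values()))

    def is_used(self, block):
        return block in self.table

    def set_block(self, block, used):
        if used and block not in self.table:
            if self.free:
                self.table[block] = self.free.pop()  # reuse a released slot
            else:
                self.table[block] = self.next_physical
                self.next_physical += 1
        elif not used and block in self.table:
            self.free.append(self.table.pop(block))

    def write(self, block, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.set_block(block, True)  # assumption: writing marks the block USED
        self.data.seek(self.table[block] * BLOCK_SIZE)
        self.data.write(data)

    def read(self, block):
        if block not in self.table:
            raise KeyError("block is UNUSED")
        self.data.seek(self.table[block] * BLOCK_SIZE)
        # Pad with zeros if the slot lies past the end of the data file.
        return self.data.read(BLOCK_SIZE).ljust(BLOCK_SIZE, b"\0")

    def used_blocks(self):
        return len(self.table)

    def close(self):
        # Rewrite the whole table on close; no crash recovery is attempted.
        with open(self.map_path, "wb") as f:
            for pair in self.table.items():
                f.write(ENTRY.pack(*pair))
        self.data.close()
```

The sketch keeps the whole table in memory and only saves it on close, so
it ignores recovery entirely. It also does not clear a reused slot, so a
block marked USED again without a write would return stale data.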

I appreciate all your comments. So far, they've helped me a lot.

Michael
