by: burt rosenberg at: university of miami
date: nov 2020
NAME
LevelDB FUSE filesystem for Linux
SYNOPSIS
leveldbd [-v -s _l0_size_] _keystore_dir_ _tablestore_dir_
DESCRIPTION
Uses FUSE to attach a key-value filesystem implemented using the
ideas of the LevelDB datastore that runs Google's BigTable.
OPTIONS
-s the size of the l0 store, in number of key-value pairs
-v verbose output
HISTORY
New for csc421-201, november 2019
BUGS
US Patent US4905110 A
LevelDB
LevelDB is a family of files (in the well directory) that store the key-value pairs.
The files are mostly immutable (once created they never change), except some are
append-only (which shares with immutable a robustness to concurrent access).
The files all share a common format. They are a sequence of fixed size (64 bytes)
records, with a key (first 16 bytes) then a value (48 bytes).
The key and value are placed into their respective fields, with null ('\0')
padding on the right. The null byte is not allowed in the key or value itself.
It is like a C string, however, the key or value can fill their entire field, and
then will not have a null termination byte.
There are three levels of files,
Level 0: key-value pairs are appended to the end of the currently open level 0 file.
Only appends are allowed, so key-value pair deletion is accomplished with appending
a tombstone key-value pair for that key.
Level 1: A level 1 file is created from a closed level 0 file by properly gleaning
duplicate key-value pairs in the level 0 file then sorting it by the key. The file is
then never changed. It is immutable, is searched using binary search, and can have
tombstone entries, but never a duplicate key.
Level 2: There is a single level 2 file that is created by merging current level 1 files
and the current level 2 file. Duplicates between the level 2 and level 1 files are
gleaned. Tombstones are not allowed in a level 2 file.
Project: Level 1
In the previous project, the Level 0 table was implemented, and once full, the behavior was
left unspecified. The leveldbd.c and leveldb.h files are changed from the previous project.
Copy from class to your folder the proj7 directory, then from your proj6 file copy over
your leveldb-sub.c.
Important note: For testing, the size of the level0 can be set by a run-time option.
While the true leveldb would have multiple level 0 and level 1 tables, we will only have
a single table of each level. As described previously, new key-value pairs are appended
onto the end of the level0 tables. Searching the level0 from newest to oldest, reads on the
leveldb store report the newest binding of a key to a value.
This is durable, but slow, as an O(n) linear search is used. So at a certain size the level0
table is closed, gleaned, and sorted by key, to make a level1 table that can be binary searched.
In a full implementation, the datastore would be non-stop. A new level0 table would be
opened, and the old level0 still searched, but after the new level0 table.
In the meanwhile, the closed level0 will be turned into a level1, and when done, a switch-over
will occur that replaces the closed level0 with its level1 version.
We however will make sure our filesystem is single-threaded, and that thread will accomplish
the level1 creation, then the zeroing of the level0, while making new key-value pairs wait.
L0 to L1 migrations.
Just as the level0 table is a file, named l0.dat; the level1 table is a file, named l1.dat.
The rules are,
The current level0 is migrated to a level1.
If there was an existing level1, it is overwritten.
Once the level0 is migrated, it is zeroed.
To make the level1, you must create an algorithm that does the following:
Searching in the L0/L1 tables
Search first in the level0, from newest to oldest. If nothing is found, search in the level1 table.
You must use binary search in the level1 table (for full credit). If it is not found in the level1 table
as well, then the key is not in the datastore.
Hint: While it was fun and easy to use memory maps for the level0, this does not seem to be
as important for level1. The records are fixed sized for a reason — that you can use pread or
lseek with an easily computed offset.
Extra credit: Level 2 tables
The last level is level2. It is searched when a key is not found in either the level0 or level1.
Like the level1, it is sorted by key, for binary search, and each key appears at most once.
Unlike the level2, there are no tombstone entries in the table. (Note: this means that every
value found in the level2 is non-empty.)
Since we only have one table of each level, and level1 to level2 migration is called when a level0
to level1 migration is needed. First the level1 is merged with an existing level2 (if any) and made
the new level2. Then the level1 is zero'ed and a level0 to level1 migration follows.