The traditional file system provides the service of data persistence. However, scalable datastores need to be distributed so that many processes can access the data at higher access rates. They need to be tolerant to failure, and to this end, the data is replicated.
Distributed access, replicated data, and fault tolerence make consistency of data a challenge. The LevelDB is a key–value store that addresses this challenge.
A key–value store is a collection pairs, a key with a value. The functions on the store are,
The LevelDB store achieves consistency and resilience by storing data in either immutable or append only files. An immutable file never changes, so one either has a copy of the data, or one does not. And having a copy supersedes not having a copy. An append only file is slightly more complicated in that there could be two copies of the file, but one is always a prefix of the other, if not the same.
Combining immutable and append-only stores for key-value stores, no fact is ever wrong. The collection of all facts grows, and is at any moment is a subset of all that facts at any future time.
Updates replace old immutable file with new ones in a manner that is atomic, consistent, efficient, and concurrent. The datastore will never be unavailable, and its files can be copied or backed up at any point, as they never change and are never wrong.
The level 1 log structured table (by time) is eventually gleaned of repeated keys and sorted (by key) to make a level 2 table that replaces the level 1 table. Eventually the level 2 table is merged into the level 3 table, which is the final location for the data. Tombstone marks, which indicated deleted keys, are not needed in the level 3 table.
Note 1: Use strncpy to deal with the key and the value. Just be prepared with buffers that are 17 and 41 bytes long for the key and the value, respectively, and to put a null in the last byte (conditionally or otherwise).
Note 2: To time stamp, kv_pair->timestamp = time(NULL) ; is a hint.
#define KEY_SIZE 16 #define TIMESTAMP_SIZE 8 #define VALUE_SIZE 40 #define LEVEL0_N_PAIRS 256 struct KV_PAIR { char key[KEY_SIZE] ; time_t timestamp ; // char timestamp[TIMESTAMP_SIZE] char value[VALUE_SIZE] ; } ; struct KV_PAIR * kv_pair = (struct KV_PAIR *) leveldb_g.l0_mm ;The level 0 table contians LEVEL0_N_PAIRS number of these structs, forming a fixed record file, that is also an in-memory array, using memory-mapped io. The function mmap_level0_file in levedldb.c sets up the mapping using the mmap system call. The result is that the virtual memory system is recruited for file I/O, by assigning the files as the backing store for a region of virtual memory.
The file has the fixed record structure of an array of structs, and can be used as such. Here is an example of finding the end of the level 0 file, that is, the first empty key.
int find_next_index() { int i ; struct KV_PAIR * kv_pair = (struct KV_PAIR *) leveldb_g.l0_mm ; int found = -1 ; for (i=0;i<LEVEL0_N_PAIRS;i++) { if (kv_pair[i].key[0]=='\0') { found = i ; break ; } } return found ; }
NAME LevelDB filesystem using FUSE SYNOPSIS leveldbd [-v] _keystore_dir_ _tablestore_dir_ DESCRIPTION Uses FUSE to attach a key-value filesystem implemented using the ideas of the LevelDB datastore that runs Google's BigTable. OPTIONS -v verbose output HISTORY New for csc421-201, november 2019 BUGS
Please run at least three windows.
To make things simpler, truncate silently user input when necessary. Truncate keys longer than 16 characters to 16 characters, and values longer than 40 characters to 40 characters. It might be easier to use strncpy and strnlen, and represent strings as a fixed length, not necessary null terminated, but if so, null padded as well. See strncpy.
Note that this implementation does not distinguish between a deleted key and key with an empty value. It even does not distinguish between a key with an empty value and a key that never existed, except that a key that was deleted has a deletion time stamp, and a key that never was has time stamp the current time, when listed.
Also note, the listing of the directory never shows keys (this is hard to implement).
The implementation must be by appends to the key-value store, as this pages has described. The behavior is important but correctness will also be checked by comparing your level 0 file against a reference file.
$ ls -l keystore/ total 0 -rwxrwxrwx 1 root root 0 Oct 28 02:33 leveldb $ echo apple > keystore/a $ cat keystore/a apple $ ls -l keystore/a -rwxrwxrwx 1 root root 6 Oct 28 02:33 keystore/a $ echo banana > keystore/b $ cat keystore/b banana $ ls -l keystore/b -rwxrwxrwx 1 root root 7 Oct 28 02:36 keystore/b $ echo artist > keystore/a $ cat keystore/a artist $ ls -l keystore/a -rwxrwxrwx 1 root root 7 Oct 28 02:38 keystore/a $ rm keystore/b $ cat keystore/b $ ls -l keystore/b -rwxrwxrwx 1 root root 0 Oct 28 02:39 keystore/b $ ls -l keystore/ total 0 -rwxrwxrwx 1 root root 0 Oct 28 02:45 leveldb $ make level-dump hexdump -C kv-tables/l0.dat 00000000 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |a...............| 00000010 2c 30 1f 67 00 00 00 00 00 00 00 00 00 00 00 00 |,0.g............| 00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000040 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |a...............| 00000050 2c 30 1f 67 00 00 00 00 61 70 70 6c 65 0a 00 00 |,0.g....apple...| 00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000080 62 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |b...............| 00000090 e5 30 1f 67 00 00 00 00 00 00 00 00 00 00 00 00 |.0.g............| 000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 000000c0 62 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |b...............| 000000d0 e5 30 1f 67 00 00 00 00 62 61 6e 61 6e 61 0a 00 |.0.g....banana..| 000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000100 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |a...............| 00000110 6a 31 1f 67 00 00 00 00 00 00 00 00 00 00 00 00 |j1.g............| 00000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000140 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |a...............| 00000150 6a 31 1f 67 00 00 00 00 61 72 74 69 73 74 0a 00 |j1.g....artist..| 00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000180 62 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |b...............| 00000190 8b 31 1f 67 00 00 00 00 00 00 00 00 00 00 00 00 |.1.g............| 000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00004000
author: burton rosenberg
created: 10 nov 2019
update: 27 oct 2024