This is in-progress. Please check back as the documentation is updated
You are to write a simple file system which will use a regular file as the "virtual disk". The structure of the file system is based on FAT, simplified, and called mini-FAT. In order that you can work as a file system driver but not in the kernel, Linux has a technology called FUSE — File system in User space. Built into the kernel, FUSE will redirect file system calls to a program you write and run as a background job as a regular user program. Your program will take those redirected calls and act upon the data in the "virtual disk" file to carry out the request.
The project therefore requires that you also learn about FUSE. There is some documentation, and a sample "Hello World" filesystem, that does nothing at all except pretend to have a single file whose contents is the string "Hello World". In fact, there is no file system at all, just a bunch of functions that are connected to file system calls by the FUSE mechanism, and these functions return the string "Hello World" when the file helloworld.txt is read, and pretty much ignore any other file system operation.
The Mini-FAT filesystem is composed of a sequence of 1024 byte clusters. We fix the size of the file system as 1024 clusters. The filesystem has three sections:
The superblock is empty and can contain all zeros.
The FAT table is an array of 32 bit integers. The following values are reserved and have special meaning in the FAT table:
Files and directories are contained in the data section in a chain of clusters represented by FAT[x]=y when cluster y follows cluster x in the chain of clusters. The starting cluster for a file or directory is given in the directory entry for the file or directory, with the exception of the root directory, which is contained in no directory.
Files have lengths, given in the file length field of the directory entry for the file. Directory lengths are always a multiple of cluster size, and the length is inferred by the length of the chain of clusters, and the file length field in the directory entry is ignored.
Files are either regular files or directories. Directories are a sequence of 32-byte directory structures as described in the FAT documentation, with the differences:
Like the true FAT file system, filenames are 8+3 style and blank padded on the right. The file system is case-insensitive, but filenames are stored all upper case. Other than for the first directory entry, which contains ".." followed by 9 blanks (0x20), the only characters allowed in a filename are "A" through "Z" and "0" through "9" and "~". The "~" character is reserved to the file system and may substitute for characters other than in the permitted set. (Note that the FAT documentation talks only of what can be stored in the directory structure and is silent on what characters can be in a filename.)
The FAT file system is "little endian", but we are fortunately writing for a little endian architecture and do not need to reverse the byte order going to and from the disk image. The program class/endian.c tests for endian-ness and also the length of various data types. It is convenient to work with 32 bit integers, however, the C language specification does not guarantee any particular bit lengths except that byte is 8 bits, and each "larger" type has at least as many bits as does a "smaller" type.
The greatest of care assumes nothing, and types such as u_int32 are introduced for working programmers that need to know how many bits are in their integer types. Aside from that, types long and longlong have been introduced, and there are several models as to which of these maps to 64 bits and which to 32 bits.
dd if=/dev/zero of=mydisk.dmg bs=1024 count=1024
To view the contents of your "virtual disk", use the command:
hexdump -C mydisk.dmgSee man hexdump.
To read and write to the virtual disk, you can use fopen to open the file, and lseek to move to the cluster (or byte) you wish to read or write. As an option, you can try "memory-mapped I/O". See man mmap, using a zero value for both the desired mapping address and the MAP_FIXED flag. This will, for instance, allow you to connect an array of type uint_32 fat[1024] to the 1024-th byte of the mapping, and thereby allow access to the FAT table using simple array syntax. (uint_32 is whatever typedef you need to force the use a 32-bit integer — C language does not assure any specific bit-size for short, int, or double!)
Read the documentation on FUSE
The class/proj4 directory has a helloworld_filesystem.c and various make targets that should implement the hello-world file system on your Ubuntu installation. You will need the FUSE developer tools installed, and there is a make target to help you with the install. Compile and run. Then ls the fusemount directory and cat the fusemount/hello file. If this all works, then you have accomplished this step.
Test Driven Development
I will try to make TDD files available, and a test driver, to help you target what a basic implementation should accomplish. To be done ...Hints
The longest journey begins with the first step. It is often hard to figure out how to begin an implementation. The idea is to begin by implementing as little as possible, compiling it, and testing it. It is often necessary to implement "provisional code" that you know will be thrown away later in the development process.As a possible plan for implementation:
Update history: