Operating Systems

DM510, Spring 2018

Daniel Merkle

DM510 Required Assignment 4: File Systems with FUSE

In the future RAM may become so cheap that the entire contents of hard drives is kept in main memory such that only writes requires actually transferring data to the hard drive; after the initial read of the contents of the hard drive, everything else is done using only main memory.

Looking at the evolution of hard drives the only thing for which there has been almost no improvement is the disk seek time, i.e., the time required to reposition the hard disk head. Data transfer rates have and will probably continue to increase during the next many years.

A natural consequence of the above is to look at the entire hard drive as a circular log. All writes are initially buffered in memory. Later, either periodically or when enough data is available, all the buffered data is written to the disk in a single segment at the end of the log.

For more details, you can have a look at a few pages in the following book. (You can find a pdf version of these pages via the Blackboard system, where the slides are stored).

Andrew S. Tanenbaum and Albert S. Woodhull. Operating Systems: Design and Implementation. Prentice Hall, 1997. Section 5.3.6, pages 432-4.

Further details can, e.g., be found in Chapter 4 of the book from A. Tanenbaum: Modern Operating Systems, 2009.

In this assignment, your mandatory task is to implement a file-system using FUSE. By doing this you will hopefully learn a lot about how file systems are implemented.

Implementing the log-structured file system features as described by Tanenbaum and Woodhull is voluntary.

Attacking the Problem

You will not need User Mode Linux for this assignment.

FUSE (Filesystem in USErspace) is an easy way to create a file system running in userspace for Linux. For detailed information how FUSE works please refer to the homepage of FUSE. Download the files lfs.c and Makefile, and create an executable with the name lfs. The following sample session shows how this very simple sample file system is mounted (via the command lfs) and unmounted (via the command fusermount -u). Please use a subdirectory of the /tmp directory as a mount-point (otherwise the NFS system might disallow you to unmount the directory).

 
$ mkdir /tmp/lfs-mountpoint
$ chmod 755 /tmp/lfs-mountpoint/
$ ./lfs /tmp/lfs-mountpoint/
$ cd /tmp/lfs-mountpoint/
/tmp/lfs-mountpoint$ ls
hello
/tmp/lfs-mountpoint$ cd ..
$ fusermount -u /tmp/lfs-mountpoint/
 

Based on the sample file system you should now implement your filesystem, and voluntarily even your Log-Structured File System.

Requirements and Hints for Design and Implementation

Your file system should implement the following parts of a file system using some appropriate API.

  • Directories: List, create, and delete files. Create and delete directories. Omit rename and link. You are allowed to restrict the length of filenames, if this would make the implementation easier.
  • Files: Open, close. Sequential read/write of streams of bytes.
  • Inodes: Size, access and modification time stamps. Omit owner, permissions, reference count, etc. You only have to implement one level of indirect data pointers.

The actual file system should be implemented using a normal semi-huge file in the file system of the computer you use, i.e. if your filesystem writes to the harddisk, this semi-huge file in the file system is changed. Everything must of course be able to run on the Linux machines at IMADA.

Even though the entire philosophy of LFS is rooted in the idea of having the entire contents of the file system in main memory, you do not need to implement it like this. In general your main focus should be on actually making your file system work, rather than optimizing it for efficiency.

Not mandatory: In the LFS implementation of Tanenbaum and Woodhull they use a cleaner thread which runs periodically. You do not need to use a special thread (but of course you can implement this feature, if you want to). Instead you can do the cleaning in connection with file system operations.

Note that in the description above and in the pages from Tanenbaum and Woodhull, a lot of details are left out. You have to decide for yourself what is appropriate and what to do.

Report / Submission

The report must be short and precise (maximum 10 pages plus source code). Remember that you must demonstrate that you have understood the problems and solutions.

It must include the following:

  • A small introduction.
  • A description of what you have done and the choices you have made.
  • Discuss if and how your filesystem implementation could recover from a power failure and how you would guarantee that recovery is always possible, even if yet a power failure happens while you are recovering from the previous failure. Note: You do not have to implement this.
  • A description of the tests you have made and the motivation for these. The tests must be able to verify that your solution works correctly, but you do not have to test all possible error scenarios. Remember it is better to document a bug rather than just ignoring it.
  • A small conclusion.

Furthermore, your report has to be supplemented again by a recorded desktop session, in which you show that your implementation is working correctly. During the session you should also perform all or some of the tests that you described in your report. In your report you should refer to the tests you performed by giving the precise time in the video file. The recorded session should be maximally 5 minutes long and have a maximal size of 20MB. If one of these constraints is not fulfilled the video file will be considered as not submitted.

For recording the desktop session you should again use the tool gtk-recordMyDesktop. Do not forget to change the "Video Quality" to 10 (using the slider), and the "Frames per Second" to be recorded to 5 (Advanced Setting / Performance) to reduce the file size. Sound recording is not required.

For submitting the report, the sources, and the desktop session, proceed as follows: Create the following directory structure

assignment4/
assignment4/report/
assignment4/sources/
assignment4/video/

Put your report, sources, and video in the corresponding directory. The sources should include the files you changed and those files you added. The report direcory and the sources directory should not contain more than 20MB of data. Similar to the size of the video file: If this constraint is not fulfilled the submission will be not considered.

Package and compress with the directory with zip -r assignment4 assignment4. Submit the created zip file in blackboard.

Additionally, in contrast to the first required assignment, you shall hand in a printout of your report and of your source code. (My mailbox, my office, directly to TA, as you wish.)

The strict deadline for submission (electronically) is 22.05.2018, 09:00. (It is fine to deliver the printouts until 16:00 to either me or one of the TAs)

Further References

A recent post might help you to get going. Note, that for this post an implementation of a Simple and Stupid Filesystem is provided.

If you want more information, you can have a look at the following. Note: this is not necessary, if you only want to do the assignment.

  • Mendel Rosenblum and John K. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems volume 10:1, February 1992, pages 26 - 52, February 1992. The paper is available from the ACM digital library. Note: to get the actual PDF-file may require you to be at the university. The paper describes LFS in more detail.
  • JFFS2: The Journalling Flash File System, version 2 is a variant of LFS used for flash devices. This file system is included in any newer standard Linux kernel.

Frequently Asked Questions (FAQ)

  • Are we allowed to work in groups?
  • For everything except the report and recording the desktop session, you are allowed to work in groups of at most three people. Write the members of your group on the front page of your report. The report you have to write on your own. Remember that the three required assignments are part of your final exam, i.e., you have to fully understand all parts of what you have done.

Design by 1234.info | Modified by Daniel Merkle | CSS 2.0