The projects are done in groups of 1-2 students and require the Operating Systems course as a prerequisite.
Data Deduplication: Saving Storage Space without Losing a Bit
Data deduplication is one of the most effective ways to reduce the size of data stored in large-scale systems, and is widely used today. The process of deduplication consists of identifying duplicate chunks of data in different files, storing a single copy of each unique chunk, and replacing the duplicate chunks with pointers to this copy.
The goal of this project is to evaluate new techniques for data chunking and duplicate identification, and their effectiveness in improving data compression and overall data reduction in the system. Details will be posted soon.
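The deduplication process described above can be illustrated with a minimal sketch. It assumes fixed-size chunking and SHA-256 fingerprints, which are illustrative choices rather than the techniques the project will evaluate. The names `CHUNK_SIZE`, `deduplicate`, and `restore` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (illustrative choice)


def deduplicate(files, chunk_size=CHUNK_SIZE):
    """Split each file into fixed-size chunks and keep one copy per unique chunk.

    files: dict mapping a file name to its contents (bytes).
    Returns (store, recipes):
      store   maps a chunk fingerprint to the single stored copy of that chunk
      recipes maps a file name to the ordered list of fingerprints ("pointers")
    """
    store = {}
    recipes = {}
    for name, data in files.items():
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only the first time its fingerprint is seen.
            store.setdefault(fingerprint, chunk)
            # The file itself keeps only a pointer to the stored copy.
            recipe.append(fingerprint)
        recipes[name] = recipe
    return store, recipes


def restore(store, recipe):
    """Rebuild a file's contents by following its list of chunk pointers."""
    return b"".join(store[fingerprint] for fingerprint in recipe)
```

Calling `deduplicate` on a dictionary of file contents returns the unique chunks and a per-file recipe, and `restore` rebuilds any file bit for bit from its recipe. Comparing the total size of the stored chunks with the total size of the input files gives a rough measure of the space saved.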
SSDPlayer: It's Not Where Your Data Is, It's How It Got There
This project is based on SSDPlayer: a visualization tool that lets system designers view how the layout of data on the flash media changes over time under the implemented management policy. The goal of this project is to enhance the tool with additional capabilities, so that more complex policies and behaviors can be visualized and analyzed.
Understanding and Optimizing SSD Performance
The goal of this project is to focus on a specific aspect of SSD performance and evaluate the benefit of solutions proposed in the literature. The students will implement a new policy or method within the existing, widely used DiskSim simulator, and demonstrate its benefits and limitations through simulations.
Behind the Scenes of Flash-Based SSD Performance
The goal of this project is to design and perform a set of experiments that provide a deep understanding of a certain flash property, by using an existing setup: the Jasmine OpenSSD Platform (a widely used tool for analyzing SSD performance and characteristics) and the SigNAS Board (an analysis platform for low level flash properties and phenomena).
Show Me the Data: Analysis of Production Workloads for System Design
The project will consist of processing and analyzing I/O traces from production and research servers, and validating some commonly used assumptions relevant to large-scale system design. The students will review the relevant literature about each assumption, determine its implications for the specified system, and evaluate its validity on the tested data.
More projects may become available throughout the semester; contact me for details.