internal:projects:flash_profile:experiments

List of experiments done (to-do?) for the flash profiling project.

Seq. Write w/ delays

This experiment issues sequential write requests separated by a configurable inter-request delay. The output is the throughput and standard deviation of the runs. More information can be found in: sequential_write.

Insights
  1. What write rate is best? By delaying the writes, we give the SSD time to complete its internal operations and give the OS more opportunity to merge requests.
  2. We can expose the SSD's internals. First, we can estimate the time the SSD needs to merge writes into their fixed location (this may vary depending on the mapping and the internal write method). Second, we can see whether the SSD uses a timer to start the cleaning operation once the device is considered idle.
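
A minimal sketch of this experiment, assuming Linux, a scratch device at /dev/sdb (the run is destructive), and arbitrary request counts, sizes, and delays:

  # Sequential synchronous writes separated by a configurable delay; throughput
  # and latency standard deviation are reported per delay setting.
  import mmap, os, statistics, time

  DEV = "/dev/sdb"                 # assumption: scratch SSD under test
  REQ_SIZE = 4096                  # bytes per write
  NUM_REQS = 1024                  # writes per delay setting
  DELAYS_US = [0, 50, 100, 500, 1000, 5000]

  buf = mmap.mmap(-1, REQ_SIZE)    # page-aligned buffer, required for O_DIRECT
  buf.write(b"\xa5" * REQ_SIZE)

  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)
  for delay_us in DELAYS_US:
      latencies = []
      offset = 0
      for _ in range(NUM_REQS):
          t0 = time.perf_counter()
          os.pwrite(fd, buf, offset)          # sequential, synchronous write
          latencies.append(time.perf_counter() - t0)
          offset += REQ_SIZE
          time.sleep(delay_us / 1e6)          # inter-request delay
      # throughput over service time only (the injected delays are excluded)
      mbps = REQ_SIZE * NUM_REQS / sum(latencies) / 2**20
      print(f"delay={delay_us}us throughput={mbps:.1f} MB/s "
            f"stdev={statistics.stdev(latencies) * 1e6:.0f} us")
  os.close(fd)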

Segment Simulator

This experiment partitions the SSD into multiple segments of a given size. Incoming write requests are then written to the current “active” segment in a log-like manner, i.e., each new request is written to the first available space. When the segment is full, a new segment is selected and the operation continues. For more information and results: segment_simulator.

Insights
  1. This experiment finds the best segment size to be used for allocation in SSDs. It can be used in the design/configuration of file systems and swap systems.
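
A minimal sketch of the segment simulator with one fixed segment size; the real experiment sweeps the segment size and compares throughput (device path and all parameters here are assumptions):

  # Write requests log-style into an "active" segment; when the segment fills,
  # pick a new one and continue.
  import mmap, os, random, time

  DEV = "/dev/sdb"                  # assumption: scratch SSD under test
  SEGMENT_SIZE = 4 * 2**20          # candidate segment size (4 MiB)
  NUM_SEGMENTS = 64
  REQ_SIZE = 4096
  NUM_REQS = 8192

  buf = mmap.mmap(-1, REQ_SIZE)
  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)

  active = random.randrange(NUM_SEGMENTS)   # current active segment
  fill = 0                                  # first available space inside it
  t0 = time.perf_counter()
  for _ in range(NUM_REQS):
      if fill + REQ_SIZE > SEGMENT_SIZE:    # segment full: select a new one
          active = random.randrange(NUM_SEGMENTS)
          fill = 0
      os.pwrite(fd, buf, active * SEGMENT_SIZE + fill)   # log-like append
      fill += REQ_SIZE
  elapsed = time.perf_counter() - t0
  print(f"segment={SEGMENT_SIZE // 2**20} MiB "
        f"throughput={REQ_SIZE * NUM_REQS / elapsed / 2**20:.1f} MB/s")
  os.close(fd)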

Partition Simulator

One idea is to partition the SSD into big segments and run one “Segment Simulator” in each of them. More information: partition_simulator. This experiment can give more interesting results on high-end SSDs supporting NCQ. We can also take one big segment and partition it into several pieces; each of these pieces is then used as a segment.

Insights
  1. What is the maximum parallelism that can be used in the SSD?
  2. Is this dependent on the workload? Should we find the best number of partitions and segment size for a given workload?
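
A minimal sketch of the partition simulator, assuming one log-style writer thread per partition so that requests from different partitions can overlap in the device queue (all names and sizes are assumptions):

  # Concurrent log-style writers, one per partition, to exercise NCQ.
  import mmap, os, threading, time

  DEV = "/dev/sdb"                  # assumption: scratch SSD under test
  NUM_PARTITIONS = 4
  PARTITION_SIZE = 256 * 2**20      # each partition owns its own LBA range
  SEGMENT_SIZE = 4 * 2**20
  REQ_SIZE = 4096
  REQS_PER_PARTITION = 2048

  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)

  def writer(part):
      buf = mmap.mmap(-1, REQ_SIZE)
      base = part * PARTITION_SIZE
      off = 0
      for _ in range(REQS_PER_PARTITION):
          if off + REQ_SIZE > SEGMENT_SIZE:   # wrap within the first segment
              off = 0
          os.pwrite(fd, buf, base + off)      # log-like append in this partition
          off += REQ_SIZE

  threads = [threading.Thread(target=writer, args=(p,)) for p in range(NUM_PARTITIONS)]
  t0 = time.perf_counter()
  for t in threads:
      t.start()
  for t in threads:
      t.join()
  elapsed = time.perf_counter() - t0
  total = REQ_SIZE * REQS_PER_PARTITION * NUM_PARTITIONS
  print(f"partitions={NUM_PARTITIONS} throughput={total / elapsed / 2**20:.1f} MB/s")
  os.close(fd)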

Full-stripe size

Given that the SSD is composed of several flash chips exposed as a RAID-0, we can design an experiment that finds the full-stripe size. The idea is to send requests whose size equals and is aligned to the assumed full-stripe. When the request size hits the full-stripe size, the throughput is highest, since the full parallelism of the device is used. More information: pattern_size.

Insights
  1. Expose mapping information given what we now know about the flash chips.
  2. Give a model that describes the internals of the SSD's read operation.
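
A minimal sketch of the full-stripe probe, using aligned random reads so that OS read-ahead does not blur the comparison (device path, probed span, and candidate sizes are assumptions):

  # Aligned reads of an assumed full-stripe size; throughput should peak once
  # a single request spans all flash chips.
  import mmap, os, random, time

  DEV = "/dev/sdb"                          # assumption: SSD under test
  SIZES = [4096 * 2**i for i in range(8)]   # candidates: 4 KiB .. 512 KiB
  NUM_REQS = 512
  DEV_SPAN = 4 * 2**30                      # probe within the first 4 GiB

  fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
  for size in SIZES:
      buf = mmap.mmap(-1, size)             # page-aligned buffer for O_DIRECT
      slots = DEV_SPAN // size
      t0 = time.perf_counter()
      for _ in range(NUM_REQS):
          off = random.randrange(slots) * size   # aligned to the assumed stripe
          os.preadv(fd, [buf], off)
      elapsed = time.perf_counter() - t0
      print(f"size={size // 1024} KiB "
            f"throughput={size * NUM_REQS / elapsed / 2**20:.1f} MB/s")
  os.close(fd)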

Logical block size

Since the SSD is basically a RAID-0 of flash chips that share a common erase-unit size, a logical block size should exist. This logical block size is defined as the RAID-0 stripe over the erase units of all the flash chips in the SSD. When requests completely overwrite a logical block, the device achieves maximum efficiency in terms of cleaning overhead, since only switch merges are performed. To find the logical block size, it is enough to send random requests of an assumed logical block size. When the throughput stabilizes, the result is the smallest assumed logical block size that reaches the maximum throughput. In the experiments, this is called log_block.
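
A minimal sketch of the log_block experiment: aligned random overwrites of each candidate size, where the smallest candidate that reaches the peak throughput is taken as the logical block size (device path and parameters are assumptions):

  # Random, aligned, full overwrites of an assumed logical block size.
  import mmap, os, random, time

  DEV = "/dev/sdb"                                      # assumption: scratch SSD under test
  CANDIDATES = [256 * 2**10 * 2**i for i in range(6)]   # 256 KiB .. 8 MiB
  NUM_REQS = 256
  DEV_SPAN = 8 * 2**30

  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)
  for size in CANDIDATES:
      buf = mmap.mmap(-1, size)
      slots = DEV_SPAN // size
      t0 = time.perf_counter()
      for _ in range(NUM_REQS):
          off = random.randrange(slots) * size  # completely overwrite one candidate block
          os.pwrite(fd, buf, off)
      elapsed = time.perf_counter() - t0
      print(f"block={size // 2**10} KiB "
            f"throughput={size * NUM_REQS / elapsed / 2**20:.1f} MB/s")
  os.close(fd)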

Device overhead

Each time we send a request to an SSD, we incur a constant overhead consisting of the SATA bus transfer (data + command). By using the read cache in the SSD, we can measure this overhead. First we send a random read request to some lba on the device; then we send a second request to the same lba. Assuming the lba was cached by the SSD, the second request will have lower latency. The overhead can be taken to be the response time of the second request, and the difference between the two is the time taken by the media (flash chips) to serve the request.

Insights
  1. This experiment was run with sequential requests of 512 B. We should also run it with larger request sizes; this will also check the part of the SATA bus overhead that is proportional to the amount of data transferred.
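
A minimal sketch of the overhead probe, assuming the device caches the first read so the second read to the same lba reflects only the fixed per-request cost; it also sweeps request sizes as suggested above (all parameters are assumptions):

  # Read the same lba twice; the second (cached) latency approximates the
  # constant overhead, and the difference approximates the flash media time.
  import mmap, os, random, time

  DEV = "/dev/sdb"                  # assumption: SSD under test
  SIZES = [512, 4096, 65536]        # 512 B as in the original run, plus larger sizes
  DEV_SPAN = 8 * 2**30              # assumes a 512 B logical sector for the 512 B O_DIRECT read

  fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
  for size in SIZES:
      buf = mmap.mmap(-1, size)
      off = random.randrange(DEV_SPAN // size) * size
      t0 = time.perf_counter()
      os.preadv(fd, [buf], off)     # first read: served from flash
      first = time.perf_counter() - t0
      t0 = time.perf_counter()
      os.preadv(fd, [buf], off)     # second read: hopefully served from the SSD cache
      second = time.perf_counter() - t0
      print(f"size={size}B overhead~{second * 1e6:.0f}us "
            f"media~{(first - second) * 1e6:.0f}us")
  os.close(fd)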

Random/Sequential writes

As part of the research on how the SSD works internally, we ran a series of sequential and random writes varying the request sizes. The sequential writes were issued synchronously, since merging has to be avoided.
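
A minimal sketch of this sweep, using synchronous O_DIRECT writes so that neither the page cache nor the I/O scheduler can merge requests (device path, sizes, and counts are assumptions):

  # Sequential vs. random synchronous writes over a range of request sizes.
  import mmap, os, random, time

  DEV = "/dev/sdb"                  # assumption: scratch SSD under test
  SIZES = [4096, 16384, 65536, 262144]
  NUM_REQS = 512
  DEV_SPAN = 8 * 2**30

  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)
  for size in SIZES:
      buf = mmap.mmap(-1, size)
      for pattern in ("seq", "rand"):
          t0 = time.perf_counter()
          for i in range(NUM_REQS):
              off = (i * size if pattern == "seq"
                     else random.randrange(DEV_SPAN // size) * size)
              os.pwrite(fd, buf, off)
          elapsed = time.perf_counter() - t0
          print(f"{pattern} size={size // 1024} KiB "
                f"throughput={size * NUM_REQS / elapsed / 2**20:.1f} MB/s")
  os.close(fd)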

Same-location write

In order to stress the cleaning mechanism of the SSD, we send requests to the same lba repeatedly. This experiment forces the SSD to continuously perform merges and erasures, which leads to high response times. The results of this experiment should be compared with those of random writes; they will probably look alike, or may reveal important insights about SSDs.
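
A minimal sketch of the same-location stress, logging per-request latency so that cleaning and merge spikes become visible (device path, lba, and counts are assumptions):

  # Hammer a single lba with synchronous writes and log each response time.
  import mmap, os, time

  DEV = "/dev/sdb"                  # assumption: scratch SSD under test
  REQ_SIZE = 4096
  LBA_OFFSET = 1 * 2**30            # arbitrary fixed location
  NUM_REQS = 10000

  buf = mmap.mmap(-1, REQ_SIZE)
  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)
  for i in range(NUM_REQS):
      t0 = time.perf_counter()
      os.pwrite(fd, buf, LBA_OFFSET)            # same lba every time
      print(i, f"{(time.perf_counter() - t0) * 1e6:.0f}us")
  os.close(fd)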

Requests rate

This experiment varies the rate at which requests are sent to the device. The idea is to explore the variance in the response time of SSDs, which can reveal important characteristics of the SSD's cleaning mechanism. It exposes, for instance, whether the SSD performs cleaning based on a timer or on something else (it could be a combination of several policies).
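
A minimal sketch of the rate sweep, assuming random synchronous writes paced to a target IOPS rate while the mean and variance of the response time are recorded (all parameters are assumptions):

  # Pace random writes to a target rate and report latency mean and stdev.
  import mmap, os, random, statistics, time

  DEV = "/dev/sdb"                  # assumption: scratch SSD under test
  REQ_SIZE = 4096
  NUM_REQS = 1000
  DEV_SPAN = 8 * 2**30
  RATES_IOPS = [50, 200, 1000, 5000]

  buf = mmap.mmap(-1, REQ_SIZE)
  fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT | os.O_SYNC)
  for rate in RATES_IOPS:
      period = 1.0 / rate
      lat = []
      for _ in range(NUM_REQS):
          off = random.randrange(DEV_SPAN // REQ_SIZE) * REQ_SIZE
          t0 = time.perf_counter()
          os.pwrite(fd, buf, off)
          lat.append(time.perf_counter() - t0)
          time.sleep(max(0.0, period - lat[-1]))   # keep the target arrival rate
      print(f"rate={rate} IOPS mean={statistics.mean(lat) * 1e6:.0f}us "
            f"stdev={statistics.stdev(lat) * 1e6:.0f}us")
  os.close(fd)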

Prefetching

In this experiment we check whether the device performs prefetching or not. As explained in the SIGMETRICS paper, we generate random and sequential read workloads and compare them. If the response time of the sequential workload is considerably lower than that of the random reads, some kind of prefetching is happening. We also need to look at the absolute values to check whether the best response times are lower than the time to read from flash. This also confirms the existence of flash.
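
A minimal sketch of the prefetching check, using O_DIRECT reads so OS read-ahead is out of the way and only the device's own prefetcher can help the sequential stream (device path and parameters are assumptions):

  # Compare per-request latency for sequential vs. random reads.
  import mmap, os, random, statistics, time

  DEV = "/dev/sdb"                  # assumption: SSD under test
  REQ_SIZE = 4096
  NUM_REQS = 2000
  DEV_SPAN = 8 * 2**30

  buf = mmap.mmap(-1, REQ_SIZE)
  fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
  for pattern in ("seq", "rand"):
      lat = []
      for i in range(NUM_REQS):
          off = (i * REQ_SIZE if pattern == "seq"
                 else random.randrange(DEV_SPAN // REQ_SIZE) * REQ_SIZE)
          t0 = time.perf_counter()
          os.preadv(fd, [buf], off)
          lat.append(time.perf_counter() - t0)
      # the minimum hints at whether the best case beats a raw flash read
      print(f"{pattern} mean={statistics.mean(lat) * 1e6:.0f}us "
            f"min={min(lat) * 1e6:.0f}us")
  os.close(fd)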

Logging

In the SIGMETRICS paper the authors claim that, by disabling the cache and running write-intensive workloads, they can tell whether the device uses logging for writes. This can be done by checking the differences between the random and sequential curves.

Mingled read/write

The SIGMETRICS paper also shows some results on how requests can interfere with each other. We should perform the same experiments and then re-order the requests so that reads and writes are each performed together. This can give an idea for a better I/O scheduler for flash.
