==+== FAST '09 Paper Review Form

==-== Set the paper number and fill out lettered sections A through G.

==-== DO NOT CHANGE LINES THAT START WITH “==+==”!

==+== RAJU MUST REPLACE THIS LINE BEFORE UPLOADING

==+== Begin Review

==+== Paper #30

==-== Replace '000000' with the actual paper number.

==+== Review Readiness

==-== Enter “Ready” here if the review is ready for others to see:

Ready

==+== A. Overall merit

==-== Enter a number from 1 to 5.

==-== Choices: 1. Reject

==-== 2. Weak reject

==-== 3. Weak accept

==-== 4. Accept

==-== 5. Strong accept

2

==+== B. Novelty

==-== Enter a number from 1 to 5.

==-== Choices: 1. Published before

==-== 2. Done before (not necessarily published)

==-== 3. Incremental improvement

==-== 4. New contribution

==-== 5. Surprisingly new contribution

3

==+== C. Longevity

==-== How important will this work be over time?

==-== Enter a number from 1 to 5.

==-== Choices: 1. Not important now or later

==-== 2. Low importance

==-== 3. Average importance

==-== 4. Important

==-== 5. Exciting

2

==+== D. Reviewer expertise

==-== Enter a number from 1 to 4.

==-== Choices: 1. No familiarity

==-== 2. Some familiarity

==-== 3. Knowledgeable

==-== 4. Expert

4

==+== E. Paper summary

The paper presents a new hybrid storage system that uses flash-based SSDs as a complement to conventional HDDs. It proposes an approach for optimally selecting the SSD size in such a storage system, given the HDD size, the storage workload characteristics, and the desired lifetime and throughput specified by the system administrator. As a second component, the paper proposes a dynamic controller that adjusts the workload directed to the SSD so as to meet the target lifetime if the workload deviates from its expected characteristics.

==+== F. Comments for author

* The motivation of the paper is quite good. It includes some initial graphs showing the differences in characteristics of HDDs and SSDs which, though well known, do help motivate the problem.

* Below I list some key high-level points the authors need to address:

- The authors should look at Managed Flash Technology (MFT) [1], which proposes a simple approach to convert all random writes to the SSD into sequential ones, thus substantially simplifying the problem of lifetime and performance modeling. MFT is arguably more effective than, and eliminates the need for, techniques such as "fragmentation busting" and "wear-leveling write regulation" as proposed by the authors (see the sketch just below).
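  To make the point concrete, here is a minimal Python sketch of the general random-to-sequential remapping idea as I understand it. This is my own illustration, not MFT's actual implementation; the class name and the block-device interface are hypothetical.

    # Generic log-structured remapping layer: random logical writes become a
    # purely sequential physical write stream on the SSD. (Illustration only.)
    class LogStructuredRemapper:
        def __init__(self, num_physical_blocks):
            self.mapping = {}        # logical block number -> physical block number
            self.write_head = 0      # next physical block in the sequential log
            self.capacity = num_physical_blocks

        def write(self, logical_block, data, device):
            # Always append at the head of the log, regardless of the logical
            # address, so the SSD only ever sees sequential writes.
            physical_block = self.write_head
            device.write(physical_block, data)
            self.mapping[logical_block] = physical_block
            self.write_head = (self.write_head + 1) % self.capacity
            # Reclaiming superseded physical blocks (cleaning) is omitted here.

        def read(self, logical_block, device):
            return device.read(self.mapping[logical_block])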

- The proposed model is, I believe, being used incorrectly. The authors use linear regression to predict performance and lifetime given the workload characteristics of the mixed store, but they treat the entire workload as if it were directed to the SSD alone. In steady state, the workload seen by the SSD is necessarily distinct from the overall workload handled by the mixed store, since the HDD is also in use. Further, the workload to the SSD is dynamic, depending on the state of both devices (requests are diverted to the device that is predicted to respond faster). Thus, the linear model estimates performance from a workload that is a superset of the SSD workload, which can lead to (perhaps substantial) underestimation. Modeling must inevitably address the performance impact on both devices at once, and capacity planning might be better approached holistically. The reformulated problem would identify both a disk size and an SSD size that allow the hybrid store to provide the best possible combination of IOs-per-second/$ and mixed-store-lifetime/$ (see the sketch below). This could then be retrofitted into actual storage-system provisioning, which would deploy as many hybrid stores as required to support the target IOs-per-second. Given that workload characteristics are more or less static in terms of relative read/write ratios, request sizes, etc., scaling up (to handle a greater IOs-per-second target) in this environment would simply mean deploying additional units of such hybrid stores.
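  A rough sketch of the reformulated provisioning I have in mind follows. Here predicted_iops and predicted_lifetime are placeholders standing in for (corrected) models of the whole mixed store, and the candidate sizes, per-GB costs, and weighting are hypothetical inputs, not values from the paper.

    import math

    def best_hybrid_store(hdd_sizes_gb, ssd_sizes_gb, workload,
                          hdd_cost_per_gb, ssd_cost_per_gb,
                          predicted_iops, predicted_lifetime, weight=0.5):
        # Search candidate (HDD size, SSD size) pairs for the single hybrid store
        # giving the best weighted combination of IOPS/$ and lifetime/$.
        best, best_score = None, -math.inf
        for hdd_gb in hdd_sizes_gb:
            for ssd_gb in ssd_sizes_gb:
                cost = hdd_gb * hdd_cost_per_gb + ssd_gb * ssd_cost_per_gb
                iops = predicted_iops(hdd_gb, ssd_gb, workload)      # model of the mixed store
                life = predicted_lifetime(hdd_gb, ssd_gb, workload)  # model of the mixed store
                score = weight * (iops / cost) + (1 - weight) * (life / cost)
                if score > best_score:
                    best, best_score = (hdd_gb, ssd_gb, iops), score
        return best

    def stores_needed(target_iops, iops_per_store):
        # Scaling up then just means deploying additional identical hybrid stores.
        return math.ceil(target_iops / iops_per_store)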

- The authors propose changes to both the I/O device driver inside the operating system and the SSD controller, and moreover require that these implementations communicate. Given that existing SSDs are overwhelmingly SATA-based, the authors should clearly outline the protocol changes that would be required and discuss the feasibility of such changes. Comparing these two implementations to the "back-end and front-end drivers" in Xen is not apt, since the Xen drivers are both in software and at the same level of the software stack; consequently, communication between Xen drivers is extremely straightforward.

- The authors mention that they choose an SSD lifetime equal to the lifetime of the HDD. This decision seems unfounded, especially considering that the economics of the two devices are entirely independent. A more fundamental approach to choosing SSD lifetimes is needed.

- The definitions of the independent variables are not provided. How do you define, for the workloads, (a) the "read/write ratio" (in terms of number of requests or of data size?), (b) sequentiality, and (c) working-set size?

- The garbage collector is modeled under the assumption that GC runs under certain conditions that are visible only inside the SSD. However, garbage collection inside the SSD can run arbitrarily, including during SSD idle time. Thus, predictions of fragmentation made from outside the SSD may be arbitrarily incorrect.

- The paper could be greatly improved with better explanations of the figures and graphs. Some figures are barely explained, while others are not even referenced; the same is true of the equations.

- What do you mean when you say “a long activity of small random (write) requests results in developing fragmentation”? What fragmentation are you referring to?

- Fragmentation busting, as proposed, schedules the flushing of fragmented data from flash to disk. How does this help? Cleaning still needs to be performed to reclaim free space, so it is unclear how fragmentation busting helps at all.

* Detailed comments:

- It would be good if you provided the ranges of the correlation coefficients for pairs of independent variables for each model you propose.

- What was the statistical criterion used to find the linear model unsatisfactory?

- Pg. 8, 2nd col.: I believe you mean to say "reasonable accuracy with log-linear models" rather than "linear models".

- Table 4. What was the size of the HDD used?

- What were the parameters & coefficients estimated for the TPC-H workload? Details are needed.

- Pg. 8: please introduce "dynamism-aware data partitioning" before using it.

- What do you mean exactly when you say the "SSD model needs to incorporate longer history"? The focus must be on steady-state behavior.

- One general comment about the experiments is the use of average response time as the performance metric. This is not the best metric, since it does not take into account the amount of work each request did, i.e., the number of blocks each request read or wrote. In my view, it is better to use throughput (MB/s), which accounts for the amount of work performed in a period of time (see the toy illustration below).
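  A toy illustration of the difference; the request fields below are hypothetical and not taken from the paper.

    def avg_response_time_ms(requests):
        # Ignores how much data each request moved.
        return sum(r["latency_ms"] for r in requests) / len(requests)

    def throughput_mb_per_s(requests, elapsed_s):
        # Accounts for the work actually performed over the measurement interval.
        total_bytes = sum(r["size_bytes"] for r in requests)
        return total_bytes / (1024 * 1024) / elapsed_s

    # Two runs can have identical average response times yet very different
    # throughput if one serves 4 KB requests and the other 256 KB requests.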

- In the implementation of MixDC, migration is used in favor of replication. A discussion of the pros and cons of each approach is required: why not replicate everything on both devices, especially since disk space is cheap?

- In Section 5.2, the paper explains that MixedStore improves response time by 71% compared to the disk alone. What would motivate an administrator not to use the SSD alone? How does performance compare against that configuration?

- In Section 5.3, the paper assumes that fragmentation busting occurs in the background without affecting the system's normal operation. This seems unlikely for busy workloads; it is worth exploring the impact on foreground activity.

- In Table 1, there is an unfair comparison between the SSD and the HDD. In many cases, the response time of the disk can be less than 5.5 ms; for sequential workloads in particular it is substantially lower. Additionally, the response time for writes on the SSD could be greater than 200 microseconds.

- What do you mean by “install probes” in section 4.2?

* Minor comments: Typos / missing-refs / etc.

- GC = Garbage Collector: please define the acronym at first use.

- Please provide references for the SSD and HDD parameters used (Figure 1 and Table 1).

- The Cello99 workload is shown in Table 2 but is not discussed in the paper.

- Missing period: Section 2.2 line 10; footnote 4; Section 5.4 paragraph 2 line 2; Section 5.5 line 5.

- Section 3, line 6: "the" misplaced.

- Section 3, paragraph 3, line 7: missing reference "[]".

- Caption of Figure 4: missing closing parenthesis ")".

- Section 3, paragraph 4, line 13: "s" misplaced.

- Section 4.3, paragraph 5, line 2: "n" should be set in math mode.

- Figure 6 is not referenced.

- The legend in Figure 7 is not clear.

- Section 5.1, paragraph 2, last line: "the" misplaced.

- Figure 8: there are two subfigures labeled "(b)".

- Section 5.3, "2000 to 8000": it is not clear what this refers to.

- Section 5.5, line 1: missing parenthesis ")".

- Section 5.5: "…which like the write regulator monitors…" would read better as "…which, like the write regulator, monitors…".

* Missing related work:

[1] Managed Flash Technology. http://www.easyco.com/zx1285301141358249458/mft/index.htm

[2] Yiming Hu and Qing Yang. DCD - Disk Caching Disk: A New Approach for Boosting I/O Performance. In Proceedings of the 23rd International Symposium on Computer Architecture, 1996.

[3] Mentioned but missing reference, ZFS L2ARC: http://opensolaris.org/os/community/arc/caselog/2007/618/

==+== G. Comments for PC (hidden from authors)

==+== End Review