==+== FAST '09 Paper Review Form

==-== Set the paper number and fill out lettered sections A through G.

==-== DO NOT CHANGE LINES THAT START WITH "==+=="!

==+== RAJU MUST REPLACE THIS LINE BEFORE UPLOADING

==+== Begin Review

==+== Paper #93

==-== Replace '000000' with the actual paper number.

==+== Review Readiness

==-== Enter "Ready" here if the review is ready for others to see:

Ready

==+== A. Overall merit

==-== Enter a number from 1 to 5.

==-== Choices: 1. Reject

==-==          2. Weak reject

==-==          3. Weak accept

==-==          4. Accept

==-==          5. Strong accept


1


==+== B. Novelty

==-== Enter a number from 1 to 5.

==-== Choices: 1. Published before

==-==          2. Done before (not necessarily published)

==-==          3. Incremental improvement

==-==          4. New contribution

==-==          5. Surprisingly new contribution

1

==+== C. Longevity

==-==    How important will this work be over time?

==-== Enter a number from 1 to 5.

==-== Choices: 1. Not important now or later

==-==          2. Low importance

==-==          3. Average importance

==-==          4. Important

==-==          5. Exciting

2

==+== D. Reviewer expertise

==-== Enter a number from 1 to 4.

==-== Choices: 1. No familiarity

==-==          2. Some familiarity

==-==          3. Knowledgeable

==-==          4. Expert

4

==+== E. Paper summary

This paper proposes a new method of data allocation that the authors claim improves consistency and performance in file systems. The authors propose to replace the current one-to-one mapping inside storage systems with a one-to-two mapping, so that one logical block is potentially mapped to two locations on disk. This increases the flexibility of the system by providing a choice in the allocation of space on HDDs. A new sub-system called _Kero_ is then proposed that makes use of this idea. Kero keeps complete control over one small partition on the disk and maintains a map table that helps in finding a block in either the FS space or the Kero space. One difference from the journal in a journaling file system is that the Kero space is accessible for foreground reads.
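
To check my reading of the design, the one-to-two mapping can be sketched roughly as follows. This is only my interpretation: the class name `KeroMap`, its methods, the `use_kero` flag standing in for the paper's unspecified decision rule, and the simple bump allocator are all hypothetical and do not come from the paper. I also assume that an in-place update invalidates any older copy in the Kero space.

```python
class KeroMap:
    """Toy model of a one-to-two block mapping (reviewer's reading, not the paper's code)."""

    def __init__(self):
        # logical block number -> block number inside the Kero partition
        self.table = {}
        # next unused block in the Kero partition (hypothetical bump allocator)
        self.next_free = 0

    def write(self, logical, use_kero):
        """Return the (space, block) pair chosen for this update."""
        if use_kero:
            # Redirect the update to a fresh block in the Kero space.
            self.table[logical] = self.next_free
            self.next_free += 1
            return ("kero", self.table[logical])
        # Update in place; assume any older Kero copy becomes stale.
        self.table.pop(logical, None)
        return ("fs", logical)

    def read(self, logical):
        """Foreground reads consult the map first, then fall back to the FS location."""
        if logical in self.table:
            return ("kero", self.table[logical])
        return ("fs", logical)
```

Even in this toy form, the map is the only record of where the current copy of a block lives, which is why its persistence matters so much for the comments below.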

==+== F. Comments for author

1. Conceptual comments:

The paper starts by comparing the consistency of journaling with that of copy-on-write mechanisms. I think the authors need to explain further why this comparison makes sense. Journaling file systems use a specific process (atomic transactions) to ensure that specific file system structures are consistent on disk at all times. Copy-on-write is a mechanism that does not overwrite existing blocks, but it does not provide consistency by itself. It must be combined with other mechanisms (e.g., ordered writes) to ensure consistency. Making this comparison is therefore misleading, and it leads me to suspect that the problem with this paper may be more fundamental: a lack of understanding of what file system consistency means. Other statements in the paper (mentioned below) seem to strengthen this suspicion.

* There are significant parts of the paper where the concepts, as presented, are not straightforward to understand. I would suggest adding figures with examples that illustrate the statements the paper makes. For instance, I still do not understand how this system maintains consistency without knowing the semantics of the blocks and without enforcing a write ordering between them: "Because the Kero prototype system monitors all the data updates at block level, enforcing ordered updates between metadata and data is not necessary". I think this deserves further explanation. An example where ext3 in write-back mode (where no ordering is enforced) fails and Kero succeeds would greatly improve the readability of the paper.

* As a general comment, I think the paper could be greatly improved by revisiting the figures and making them more explanatory. For example, Figures 1, 2, 3, and 4 do not explain the semantics of the arrows.

* The idea of having a separate partition to log some data seems very similar to a journaling file system. I think the paper would be better appreciated if the authors clearly highlighted the differences between the Kero sub-system and a normal journaling file system. By the same argument, how can Kero eliminate the multiple-writes requirement for logged blocks? Cleaning must be performed at some point, as is done in journaling file systems. Further, the paper lacks an explanation of how cleaning in Kero is performed. What policies are used? How often does cleaning occur? These are very important questions, since the Kero space can become quite fragmented without cleaning.


* One of the claims in the paper is that Kero can be used with _any_ file system. This claim is not clear. Do the authors mean that Kero can be used with LFS as well? In my understanding, Kero can be used with static-layout file systems like ext2 or ReiserFS without journaling.

* One central part of the Kero sub-system is the mapping information. The paper mentions that this map must be persistent in order to keep the file system consistent. However, the paper does not state how this is done. This is a very important task, since the map must be updated each time a block is written into the Kero space. Probably this is done in the "log record" but, once again, it is not explained.

* In Section 6.2, the authors calculate the memory space used for the mapping table. They express their calculations in terms of the page cache size. I would suggest that these overheads are better expressed in terms of the size of the Kero space.

* The garbage collection takes all the valid blocks in the log and writes them to the file system. I would expect this to incur a lot of small random operations. However, the paper mentions that "the overhead of this process is very small". It would be good to have numbers in the evaluation section that support this statement.

* In the evaluation, there are some questions whose answers could improve the paper:

- What is the impact of cleaning?
- What is the impact of the Kero space size?
- How much is this space used? Hit-ratio? This answer will depend on the workload. If a read-only workload is used, then the usage will be 0.

* In the concurrency experiment, what each of those threads does needs to be specified.
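
To illustrate the Section 6.2 comment above, the map-table overhead could be reported as a function of the Kero space size along the following lines. This is a back-of-the-envelope sketch only: the 4 KiB block size and the 16-byte map entry are my assumptions, not numbers from the paper, and the function name `map_table_bytes` is hypothetical.

```python
def map_table_bytes(kero_space_bytes, block_size=4096, entry_bytes=16):
    """Upper bound on map-table memory if every Kero block holds a mapped entry.

    block_size and entry_bytes are assumed values, not taken from the paper.
    """
    entries = kero_space_bytes // block_size
    return entries * entry_bytes


GIB = 1024 ** 3
MIB = 1024 ** 2

for size_gib in (1, 4, 16):
    overhead = map_table_bytes(size_gib * GIB)
    print(f"{size_gib} GiB Kero space -> {overhead / MIB:.0f} MiB of map table")
```

Under these assumptions the overhead grows linearly with the Kero space size, independent of how large the page cache happens to be, which is why I find it the more natural unit.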

2. Troublesome statements:

The following statements are troublesome and indicate that the authors either lack a basic understanding of file system consistency or need to do a much better job of communicating what they want to say. The most critical are listed first:

* "If there is an update request, then based on a decision making rule in specific implementation, we either choose to update the unit in the S_logical or we create a new unit in the S_Kero. We can see that the object is written once to the one of the mapped units, hence unlike object updates it does not have the “multiple writes” problem, and also avoids the meta updates in coupled address update."

- Coupled address updates do not tell the block layer the original location of a block, so Kero cannot eliminate the metadata updates that the file system generates to keep track of where new versions of data (or metadata) are created. If it did, the file system would be left in an inconsistent state.


* "The atomic block update assumption is common in file systems, and therefore crash recovery is unnecessary."

- Metadata structures in file systems must be kept synchronized at all times; the atomicity of I/O operations (although even this is in contention these days with torn writes, etc.) is vastly insufficient to eliminate consistency requirements.


* "Because the Kero prototype system monitors all the data updates at block level, enforcing ordered updates between metadata and data is not necessary any more."

- For the file system these structures still need to be consistent at all times. How this statement can even be remotely true beats me.


* "Upon an object update, the logical address of the object is kept unchanged while all changes to physical locations are handled transparently to upper layers by 1-n mapping scheme. As a result, there will be no recursive updates."

- In a copy-on-write file system, the block layer does not know the original location of the object being modified, so it cannot create a correct map entry. That location is managed by the file system, to which the block layer is oblivious.

* "In our new decoupled address update mechanism, we aim to exploit one-to-one mapping to create one-to-many mappings. It is anticipated that this will allow us to change the unique address property, and as a result the recursive updates can be avoided."

- The authors seem to imply that the file system must be modified to eliminate recursive updates, but they do not suggest any changes to the file system at all.


* "These infrastructures expand the horizon of traditional file systems consistency semantics among metadata, to consistency between data and data, among multiple file operations, multiple files, and even different file systems."

- What is consistency between data and data, or among multiple files? These are all managed by applications and should not be a concern of the operating system.
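
To make concrete why I doubt the claim that ordered updates between metadata and data become unnecessary, here is a toy crash model. It is a sketch under my own assumptions, not a model of Kero or of ext3 internals: the disk is a dictionary, a file is a single inode pointer plus one data block, and the names `crash_after` and `file_contents` are hypothetical.

```python
def crash_after(ops, n_applied):
    """Apply the first n_applied writes to a toy disk, then 'crash'."""
    disk = {
        "inode_ptr": "old_block",   # metadata: which block the file points to
        "old_block": "old data",
        "new_block": "garbage",     # freshly allocated block, contents undefined
    }
    for key, value in ops[:n_applied]:
        disk[key] = value
    return disk


def file_contents(disk):
    """What a reader sees after the crash: follow the inode pointer."""
    return disk[disk["inode_ptr"]]


data_write = ("new_block", "new data")
meta_write = ("inode_ptr", "new_block")

ordered = [data_write, meta_write]     # data reaches disk before metadata
unordered = [meta_write, data_write]   # metadata may reach disk first

for name, ops in (("ordered", ordered), ("unordered", unordered)):
    # Crash after 0, 1, or 2 of the writes have reached the disk.
    print(name, [file_contents(crash_after(ops, n)) for n in range(3)])
```

With the ordered sequence every crash point leaves the file holding either the old or the new data. With the unordered sequence, a crash after the first write leaves the inode pointing at a block whose contents were never written. The paper needs to explain which mechanism in Kero rules out this second case.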


3. Minor comments: Typos / missing-refs / etc.

"an block" -> "a block"

The y-axes of Figures 9 and 10 are not consistent.

Several sentences need to be rephrased for clarity. Currently the prose is very unclear, with many hard-to-parse sentences.


==+== G. Comments for PC (hidden from authors)

[Enter your comments here]

==+== End Review