Monday, November 24, 2008

Log-structured file system

Paper: The Design and Implementation of a Log-Structured File System.
Authors: Mendel Rosenblum and John K. Ousterhout.

Liked:
• The log structure of LFS simplifies the crash-recovery process, since no complex data structures such as trees or free-block lists need to be repaired after a crash.
• Buffering file-system changes in memory and then writing them to disk sequentially removes most of the seek time and hence improves the file system's bandwidth utilization (a small sketch of this idea follows the list).
• The use of a bimodal segment distribution, treating hot and cold segments differently according to the age of their data, is well justified: cleaning a segment at high utilization implies moving more live data, which raises the write cost. Since the desired behavior is low write cost together with high disk utilization, the cleaner should clean cold segments at high utilization and hot segments at low utilization, which keeps the overhead of copying live data small (see the cost-benefit sketch below).
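
A minimal sketch of the write-buffering idea mentioned above, assuming a fixed segment size and an append-only log file; the names and sizes here are illustrative, not taken from the paper:

    # LFS-style write buffering: changes accumulate in an in-memory segment
    # and are flushed to disk in one large sequential write, avoiding the
    # per-write seeks of an update-in-place file system.
    SEGMENT_SIZE = 512 * 1024  # illustrative segment size (bytes)

    class SegmentBuffer:
        def __init__(self, log_file):
            self.log = log_file   # file object opened in binary append mode
            self.pending = []     # buffered blocks waiting to be flushed
            self.size = 0

        def write(self, block: bytes):
            self.pending.append(block)
            self.size += len(block)
            if self.size >= SEGMENT_SIZE:
                self.flush()      # one sequential write instead of many seeks

        def flush(self):
            if self.pending:
                self.log.write(b"".join(self.pending))
                self.pending, self.size = [], 0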
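
And a sketch of the cost-benefit cleaning policy that produces this hot/cold behavior; the benefit-to-cost ratio (1 - u) * age / (1 + u) is the one from the paper, while the segment representation is my own simplification:

    # Cost-benefit segment selection: cold segments (old data) are cleaned
    # at a higher utilization than hot segments, which is what yields the
    # bimodal segment distribution.
    def cost_benefit(u, age):
        # benefit: free space gained (1 - u), weighted by how long it is
        # likely to stay free (age); cost: read segment (1) + write live (u)
        return (1 - u) * age / (1 + u)

    def pick_segments_to_clean(segments, n):
        # segments: list of (utilization, age) pairs; clean the n best ratios
        return sorted(segments, key=lambda s: cost_benefit(*s), reverse=True)[:n]

    # A cold segment at high utilization can still outrank a hot one:
    print(cost_benefit(0.8, 100))  # cold, highly utilized -> ~11.1
    print(cost_benefit(0.2, 1))    # hot, lightly utilized -> ~0.67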


Disliked:
• The log-structured approach assumes that plenty of resources are available, for example a memory large enough that all writes can be buffered and most reads served from the cache. This is not cost-effective, and performance is questionable under a heavy read load. The approach also relies on NVRAM for better crash recovery. Moreover, if read requests are not sequential, the impact on performance would be visible; the approach offers little benefit for workloads where most of the requests are reads.
• Functioning of this approach also requires a large amount of free space. What if we have a storage system where data keeps growing? Even if free space remains, once it is fragmented across segments, segment cleaning becomes a problem.
• Performance can also be limited by the cache-replacement policy, i.e., the algorithm that decides what to keep in memory and what to evict at any point in time.

• The write-cost function is not clearly defined. In one place the authors claim that a write cost of 1.0 means new data can be written at full bandwidth with no cleaning overhead, whereas by the formula write cost = 2 / (1 - u) a value of 1.0 is never possible: even at u = 0 the write cost bottoms out at 2. That would imply that even when a segment about to be cleaned contains no live data, at best 50% of the bandwidth can be used for writing new data. Yet the paper also states (at the end of page 6) that the write cost is 1.0 when u = 0. Presumably the formula assumes the cleaner always reads a segment before reusing it, while a segment with u = 0 need not be read at all, but this special case is not reflected in the formula itself.
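
A worked version of the calculation, with the u = 0 special case handled the way the paper's text (rather than its formula) suggests; the function name is mine:

    def write_cost(u):
        # Per the formula: read the whole segment (1), write back the live
        # fraction (u), and write new data (1 - u), all normalized by the
        # (1 - u) of new data written: (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u)
        if u == 0:
            return 1.0  # empty segment: reused without reading, no overhead
        return 2.0 / (1.0 - u)

    for u in (0, 0.2, 0.5, 0.8):
        print(f"u = {u}: write cost = {write_cost(u):.2f}")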

Details will be provided later; right now I don't have any time.
