DSM/SSM Quantitative Analysis

Participants

Project Goals

Workload Consolidation Model

Quantitative Analysis of DSM vs. SSM

Task List

Design Granularity of Next Rounds of Tests
Plan for the Single LUN Model I/O Tests

Points for Discussion at Next Meeting

Journal Entry 6:26 PM 4/8/2010
Experimentation Notes
Multi-Page Wiki Structure

Model Notes (Top Down)

BASIL Model

http://www.vmware.com/files/pdf/partners/academic/BASIL-gulati.pdf

Experimentation Notes (Bottom Up)

General Thoughts

Given the number of variables in play with a given workload, here are some thoughts on how to run the bottom-up testing.

For DSM, RAID'ed, Cached, and “Think Timed” I/O, the runs will probably need to be executed several times. The final result will be the average of the runs, and the standard deviation must also be calculated.
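
As a minimal sketch of the aggregation described above, assuming each run yields a single IOPS figure, the mean and sample standard deviation could be computed as follows. The function name and the sample values are hypothetical placeholders, not measured data.

```python
import statistics


def summarize_runs(samples):
    """Return (mean, sample standard deviation) for repeated runs of one test."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical IOPS figures from five repeated runs (placeholders, not real results).
iops_runs = [100.0, 102.0, 98.0, 101.0, 99.0]
mean_iops, stdev_iops = summarize_runs(iops_runs)
print(f"mean={mean_iops:.1f} IOPS, stdev={stdev_iops:.2f}")
```

The sample (n-1) standard deviation is used here because a handful of runs is a sample of the system's behavior rather than the whole population.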

Single LUN Model I/O

Take workloads whose individual performance metrics on a DSM are known and run them together, consolidated, on the same DSM. Observe the results. This should be representative of pure consolidation, in which only the workload parameters vary.

DSM/SSM I/O

Take workloads whose individual performance metrics on a DSM are known and run them together, consolidated, on an SSM consisting of the aggregate disks of the two DSMs. Observe the results. This should be representative of consolidation in which both the workload parameters and the underlying disks vary.

Direct I/O

For both Single LUN Model I/O and DSM/SSM I/O, disable all caching in the stack.

Read Cached I/O

For both Single LUN Model I/O and DSM/SSM I/O, enable read caching in the stack.

Write Cached I/O

For both Single LUN Model I/O and DSM/SSM I/O, enable write caching in the stack.

Fully Cached I/O

For both Single LUN Model I/O and DSM/SSM I/O, enable read and write caching in the stack.

RAID'ed I/O

For both Single LUN Model I/O and DSM/SSM I/O, enable various RAID levels on the underlying disk subsystem. This should be tested with the read cached, write cached, and fully cached I/O tests.
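
To get a feel for how many runs the variables above imply, the combinations can be enumerated. This is only a sketch: the notes say "various RAID levels" without naming them, so the RAID entries below are placeholder assumptions, and the full cross product shown is a superset of the combinations the notes explicitly call for.

```python
from itertools import product

MODELS = ["Single LUN Model I/O", "DSM/SSM I/O"]
CACHE_MODES = ["direct", "read cached", "write cached", "fully cached"]
# Assumption: placeholder RAID levels; the notes do not name specific ones.
RAID_LEVELS = ["none", "RAID 0", "RAID 5"]


def build_test_matrix(models, cache_modes, raid_levels):
    """Return one dict per (model, cache mode, RAID level) combination."""
    return [
        {"model": m, "cache": c, "raid": r}
        for m, c, r in product(models, cache_modes, raid_levels)
    ]


matrix = build_test_matrix(MODELS, CACHE_MODES, RAID_LEVELS)
print(len(matrix), "configurations before repeating runs or varying think time")
```

Each configuration would still need to be repeated several times and, later, crossed with think time settings, so the matrix grows quickly.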

"Think Timed" I/O

For Single LUN Model I/O, DSM/SSM I/O, and all of the other variables, think time modeling will be necessary. The worst case, in which all workloads run with no think time, will reveal the weak points of the model and the system; however, it is not a realistic scenario.
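
One simple way to reason about think time is the interactive response time law for a closed system, X = N / (R + Z), where N is the number of workers each keeping one I/O outstanding, R is the per-I/O latency, and Z is the think time. The sketch below applies it under the simplifying assumption that latency stays fixed as load changes, which a real device will not honor; the function name and the numbers are hypothetical.

```python
def closed_loop_iops(workers, latency_s, think_time_s):
    """Throughput of a closed system via X = N / (R + Z).

    Assumes each worker issues one I/O, waits latency_s for it to complete,
    then idles for think_time_s before issuing the next one.
    """
    if workers < 1 or latency_s <= 0 or think_time_s < 0:
        raise ValueError("workers >= 1, latency_s > 0, think_time_s >= 0 required")
    return workers / (latency_s + think_time_s)


# Hypothetical workload: 8 workers, 5 ms per I/O.
no_think = closed_loop_iops(8, 0.005, 0.0)       # worst case, no think time
with_think = closed_loop_iops(8, 0.005, 0.015)   # 15 ms think time per I/O
print(f"no think time: {no_think:.0f} IOPS, with think time: {with_think:.0f} IOPS")
```

With these placeholder numbers, adding think time cuts the offered load to a quarter of the worst case, which illustrates why the zero-think-time runs stress the system far more than a realistic workload would.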

Paper/Topics Structure

Introduction
Assumptions
Questions To Be Addressed by the Work

Single Workload Model Questions

     What is the expected performance of a given single workload, with known parameters, 
     given a device configuration?

     What device configuration and spindle allocation is needed to achieve a certain level of 
     performance for a given single workload with known parameters?

   Device Questions

     What will be the impact upon current workloads attached to a device/LUN if additional 
     spindles are added?

        This question's answer should take into account spindle rotational latency and seek times.  
        Ideally, it should support a heterogeneous mixture of spindles.

     What will be the impact upon current workloads attached to a device/LUN if the spindle 
     RAID level is changed?

   Consolidated Workload Model Questions

     What will be the impact upon a current workload with a known device configuration should a 
     new workload, with known parameters, be consolidated onto the existing device with the existing
     workload?

        In short, how does consolidation of workloads affect the existing performance characteristics?
       
     What amount of spindle is necessary to accommodate a new workload that is to be consolidated with 
     an existing workload while maintaining given performance requirements?

Background, Context, & Motivation

 Small to Large Enterprises, Probably Not Scientific Computing
   Monolithic Scale Up Arrays and Scale Up Brick Storage
   Agility of Provisioning (Capacity vs. Performance)

Workload Characteristics

   Sequentiality vs. Randomness
        That which you think is sequential is really not.
        Consider multiuser scenarios (Streaming media to multiple users)
        Consider consolidation scenarios (VMware, multi-database services, etc.)
        Hence, single instance sequential * multiple users == random
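
The point about multiple sequential users can be illustrated with a toy trace. The sketch below assumes the device sees the streams in strict round-robin order and that each stream starts at its own widely spaced offset; both are simplifying assumptions, and the function names are hypothetical.

```python
def interleave_streams(num_streams, blocks_per_stream, stream_spacing):
    """Round-robin merge of sequential streams, each starting at its own offset."""
    return [
        s * stream_spacing + i
        for i in range(blocks_per_stream)
        for s in range(num_streams)
    ]


def sequential_fraction(addresses):
    """Fraction of requests whose block address directly follows the previous one."""
    if len(addresses) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(addresses, addresses[1:]) if cur == prev + 1)
    return hits / (len(addresses) - 1)


for users in (1, 2, 8):
    trace = interleave_streams(users, blocks_per_stream=1000, stream_spacing=1_000_000)
    print(users, "stream(s): sequential fraction =", sequential_fraction(trace))
```

Under these assumptions a single stream is fully sequential, while two or more interleaved streams leave no back-to-back sequential requests at the device, even though each user believes it is reading sequentially. A real I/O scheduler may batch or reorder requests and recover some of that sequentiality.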

Workload Consolidation Model

DSM/SSM

Experimental Methodology

   Exhaustive microbenchmarks

   Block Traces

Mitigating Factors

   I/O Scheduling
        Proportional Share Scheduling for Priority Control
   Block Reorganization
   Block Replication

References and Terminology

References

Terminology

DSM - Dedicated Spindle Model

SSM - Shared Spindle Model

Meetings

08/16/10: Paper layout discussion and writing tasks; partial discussion of new data from Mike
08/09/10: Analysis of IOMeter and Windows timing/scheduling behavior and brief paper discussion
07/20/10: Revisiting project goals (with Mike), thoughts on IORate influence and zones of latency values, background section ideas
07/08/10: Boundaries for I/O Rate, Questions for the Paper, Paper Structure, and Presenting the Model
06/28/10: Revised methodology and Plans for the paper
06/14/10: Analyzing anomalous results from Controller 1 experiments
06/11/10: A recap of current issues; potential directions moving forward include new data with controller replacement and fewer drives, and revisiting some old data as well
05/10/10: Single workload model parameters, debugging performance with read caching
05/03/10: Paper outline and content, think time modeling, OIO and IOPS for consolidated workload modeling whiteboard
04/19/10: Robert's Storage Systems class final presentation
04/16/10: Decoding first round of numbers
04/12/10: Workload volume footprint, multiple volumes and influence of I/O rate, DB model
04/06/10: Experimental testbed details and troubleshooting
03/15/10: Review of previous meeting, experimental design, model discussion