Trace Analysis and Replay
Participants
- Daniel Campello
- Humberto Chacon
- Christopher Kerutt
- Hector Lopez
- Steven Lyons
- Jason Liu
- Raju Rangaswami
Project Goals
Analyze all kinds of storage access traces, including block, file, cache, and syscall levels.
Replay syscall traces
The initial part of the project, through December 2013, involved building a trace utility (based on strace) to record the file-system-oriented system calls (I/O from and to the HDD) issued by a single process. The goal is to create a trace that can be used to evaluate caching efficiency.
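A minimal sketch of how such a recording step could be driven (the wrapper, its output file name, and the syscall filter below are illustrative assumptions, not the project's actual tracing script):

  # record_trace.py -- minimal sketch: record file-system-oriented syscalls of one process
  import subprocess
  import sys

  def record(cmd, outfile="app.strace"):
      # -f  : follow child processes/threads
      # -tt : absolute timestamps with microsecond resolution
      # -T  : time spent in each system call
      # -e trace=desc,file : descriptor- and file-name-related system calls
      #                      (open, read, write, lseek, close, stat, ...)
      strace_cmd = ["strace", "-f", "-tt", "-T",
                    "-e", "trace=desc,file",
                    "-o", outfile] + cmd
      return subprocess.call(strace_cmd)

  if __name__ == "__main__":
      # Example: python record_trace.py /bin/cp bigfile /tmp/copy
      sys.exit(record(sys.argv[1:]))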
Reading List
-
- Home-user applications (monolithically developed, heavy API use, interactive): the iBench suite, including iWork and iLife
- Uses DTrace (system calls, stack traces, in-kernel functions such as page-ins/page-outs); AppleScript for repeatable, automated runs
- A file is not a file: documents are organized into complex directory trees
- Sequential access is not sequential: pure sequential access is rare (metadata and headers are accessed more often, and out of sequence)
- Many auxiliary files; writes are forced (fsync ‘misuse’); renaming is popular; heavy use of multithreading to hide I/O latency
-
- Measurements at the disk or RAID-controller level, via a SCSI/IDE analyzer attached to the I/O bus that intercepts electrical signals (in 2004)
- Application-dependent variability: r/w ratio, access pattern, write traffic
- Environment-dependent variability: request arrival rate, service time, response time, sequentiality, idle time
- Burstiness is consistent through all workloads
- Measurement environments (not enough information provided due to NDAs) include: enterprise (HPC systems, RAID, for web, database, and email); desktops (PCs, single disk, single-user applications); consumer electronics (personal video recorder, mp3 player, game console, digital camera)
- Major findings:
- Disks are idle (high percentage bus idle), although idle interval is environment dependent
- Average response time is only a few milliseconds
- Access patterns are more random for enterprise than for desktop; CE is highly sequential (video recording); uses a ‘degree of sequentiality’ metric (see the sketch after this list)
- Request size varies but variability is low
- Use ‘rewrite distance’ (my term) for write lifetime; it’s application dependent
- Inter-arrival time varies greatly; it shows long-range dependence (measured via the Hurst parameter)
- Seek distances exhibit “extreme long-range dependence”; locality is an inherent characteristic of disk-drive workloads (?)
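One way to make the ‘degree of sequentiality’ notion above concrete is to count how many requests begin exactly where the previous request ended. A minimal sketch, assuming a simplified block-trace format of (start LBA, length in sectors) tuples (not the format used in the paper):

  def degree_of_sequentiality(requests):
      # requests: iterable of (start_lba, length_in_sectors) in arrival order
      sequential = 0
      total = 0
      prev_end = None
      for start, length in requests:
          if prev_end is not None:
              total += 1
              if start == prev_end:
                  sequential += 1
          prev_end = start + length
      return sequential / total if total else 0.0

  # Example: the second request continues the first; the third seeks away.
  print(degree_of_sequentiality([(100, 8), (108, 8), (5000, 8)]))  # 0.5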
-
- Two workloads: EECS and CAMPUS
- EECS is a research workload over home directories; dominated by metadata requests (for cache consistency), with a read/write ratio of less than 1.
- The CAMPUS workload is almost entirely email; files can be categorized by file name, with predictable size, lifespan, and access pattern.
- Issues with NFS tracing: hidden file system operations (never-accessed files, no on-disk layout, file hierarchy only indirectly learnable, no internal server state); mismatched NFS interface (no open/close); client-side caching; lost NFS packets; network reordering
- Detecting “runs”: contiguous accesses to a file, with blocks rounded up to 8 KB and gaps no larger than 30 seconds; accesses also need to be sorted within a reorder window of a few milliseconds (see the sketch after this list)
- Runs are classified as entire/sequential/random (both traces mostly sequential sub-runs separated by short seeks), read-only/write-only/read-write (big difference between the two traces)
- Lifespan of blocks: over half of EECS blocks die in less than 1 second (log or index files), and few live more than a day; CAMPUS blocks live longer (due to email client behavior)
- EECS problems found: users store temporary files (web page caches, dot files, applet files) in their home directories
- CAMPUS problems found: email behavior (flat user inbox file, lock files)
- Large variations of workload over time (diurnal pattern) observed
- Sequentiality metric (delta-consecutive vs. reorder window)?
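A minimal sketch of the run-detection idea described above, assuming simplified trace records of (timestamp, file_id, offset in bytes); the reorder window is noted but only approximated by a global sort:

  from collections import defaultdict

  BLOCK = 8 * 1024   # accesses are rounded to 8 KB blocks
  MAX_GAP = 30.0     # seconds of inactivity that ends a run

  def detect_runs(records):
      # records: list of (timestamp, file_id, offset_bytes) tuples (assumed format)
      per_file = defaultdict(list)
      for ts, fid, off in records:
          per_file[fid].append((ts, off // BLOCK))

      runs = defaultdict(list)
      for fid, accesses in per_file.items():
          # A faithful implementation would only re-sort accesses whose
          # timestamps differ by less than a few milliseconds (the reorder
          # window); a global sort is used here for brevity.
          accesses.sort()
          current, last_ts = [], None
          for ts, block in accesses:
              if last_ts is not None and ts - last_ts > MAX_GAP:
                  runs[fid].append(current)
                  current = []
              current.append(block)
              last_ts = ts
          if current:
              runs[fid].append(current)
      return runs

Each run can then be classified (entire/sequential/random) by checking whether its block numbers cover the whole file or increase monotonically.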
-
- nfsdump is used to gather traces at the NFS server (like tcpdump, using libpcap) and outputs human-readable text
- nfsscan does the data processing and outputs one or more tables (containing the total number of NFS operations, the time and latency of these operations, and information about the files accessed)
- A set of utility tools helps dissect the nfsscan output and prepare it for gnuplot
- One can obtain the following information: workload intensity over time, read/write, data/metadata, broken down overall, per client, per user, per directory, and per file (see the sketch below)
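A minimal sketch of the kind of post-processing these tools enable, assuming a simplified whitespace-separated table with columns timestamp, client, operation (an illustrative stand-in, not the actual nfsscan output format):

  from collections import Counter, defaultdict

  def intensity_over_time(table_path, bucket=3600):
      # Returns {bucket_start_time: Counter({client: op_count})}
      buckets = defaultdict(Counter)
      with open(table_path) as f:
          for line in f:
              if not line.strip() or line.startswith("#"):
                  continue
              ts, client, op = line.split()[:3]
              # op is unused here; it could be used to split read/write/metadata
              start = int(float(ts)) // bucket * bucket
              buckets[start][client] += 1
      return buckets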
-
- Collected traces from 4 environments: instructional lab, research lab, web (all HP-UX), and an NT cluster.
- Histogram of file system events: read dominates; web has significantly more reads; high number of file stat calls
- Block lifetime: some traces show a bimodal distribution; most blocks die due to overwrites, and there is a high degree of locality in overwritten files; average block lifetime is longer than previously estimated (relevant to write delay); see the sketch after this list
- Caching effect: small write buffer is sufficient even for write delays up to a day; small cache sufficient to absorb most read traffic; cache effect on reads or writes varies…
- File size distribution, file access patterns: mostly reads, mostly sequential
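A minimal sketch of the block-lifetime computation discussed above, assuming a simplified, time-ordered event stream of (timestamp, op, block_id) with op in {"write", "delete"} (not the traced format); a block ‘dies’ when it is overwritten or deleted:

  def block_lifetimes(events):
      birth = {}        # block_id -> time of the write that created its data
      lifetimes = []
      for ts, op, block in events:
          if op == "write":
              if block in birth:                 # overwrite kills the old data
                  lifetimes.append(ts - birth[block])
              birth[block] = ts
          elif op == "delete" and block in birth:
              lifetimes.append(ts - birth[block])
              del birth[block]
      return lifetimes

  # Example: block 7 written at t=0, overwritten at t=0.4, deleted at t=90.
  print(block_lifetimes([(0.0, "write", 7), (0.4, "write", 7), (90.0, "delete", 7)]))
  # -> [0.4, 89.6]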
-
- File I/O traces of VAX/VMS systems collected at 8 sites
- System characteristics: #files, file size distribution, % active files/data, #IOs, % control vs. access ops, file creation/deletion ratios
- File access characteristics: #opens/active file (distribution), #reads/active file (distribution), #writes/active file (distribution), open-to-close, close-to-open timing, read/write activities within open to close intervals
- Process access characteristics: #users, #processes, process lifetime, #open files/process, #file ops/process, inter-open time distribution
- File sharing: “simultaneous sharing” is low; “sequential sharing” (files not necessarily open at the same time) is 2-4 times more common (further classified into read-only, write-only, and read-and-write).
- Workload analysis: relatively stable behavior observed for IO operations (intensity, file control ops, read/write)
-
- Sprite distributed file system, 40 diskless workstations, 4 file servers
- File system activities are collected at server end (some require kernel modifications)
- Measurements/statistics on caching collected through counters
- Two main changes from '85 study: larger files, higher intensity (more burstiness)
-
- Instrumented BSD Unix (4.2), timeshared VAX-11/780s at UC Berkeley EECS
- Record only user file system activities (open, close, seek, unlink, truncate, exec)
- Reads and writes are not recorded (no location or timing for individual disk I/Os, no information on disk accesses due to paging)
- Main results: low IO activity; most file accesses are sequential; short file open time; short file lifetime; caching can have significant effect
Meetings
- 07/13/15: Updates on module; kprobe takes over
- 07/02/15: (Hector) Updates on module; search for missing records; one bug found
- 06/30/15: Updates from Humberto on ARTC and Hector on
- 06/23/15: (Hector) Updates on module; search for missing records continues
- 06/23/15: (Humberto) Trace discrepancies with ARTc
- 06/19/15: (Hector) Initial investigation of lost trace records
- 06/16/15: (Hector) Various updates
- 06/16/15: (Humberto) ARTc port v1 complete
- 06/09/15: (Humberto) Updates on ARTc port
- 05/22/15: (Humberto) Initial port to ARTc replayer
- 05/22/15: (Hector) Current status of replayer (single-threaded)
- 12/23/13: Handle multi-page requests; Understanding the cache simulator code
- 11/25/13: Next step: MRC construction using an LRU cache simulator (see the sketch after this list)
- 10/28/13: Final remaining tasks for user-level page cache simulation
- 10/14/13: First cut – need to work on bug and sendfile implementation
- 09/30/13: Problems with script
- 09/16/13: First MRC, discussion of project directories
- 08/21/13: Implementation – first version
- 08/07/13: Next steps – data generation
- 07/31/13: Updates on strace (-e) options, recording page IDs, and other status
- 07/24/13: Review of initial document, scrum tasks, and overview of strace based scripting and expected output
- 07/17/13: Project introduction
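For reference, a minimal sketch of the MRC-construction step mentioned in the 11/25/13 entry: replay a page-ID trace through an LRU cache simulator at several cache sizes and record the miss ratio at each size. The trace format and cache sizes below are assumptions; a real tool would typically use a single-pass stack-distance algorithm instead of one simulation per size:

  from collections import OrderedDict

  def miss_ratio_curve(trace, cache_sizes):
      # trace: sequence of page IDs (assumed input format)
      curve = {}
      for size in cache_sizes:
          cache = OrderedDict()
          misses = 0
          for page in trace:
              if page in cache:
                  cache.move_to_end(page)        # hit: mark most recently used
              else:
                  misses += 1
                  if len(cache) >= size:
                      cache.popitem(last=False)  # evict least recently used
                  cache[page] = True
          curve[size] = misses / len(trace)
      return curve

  # Example: a small looping access pattern.
  print(miss_ratio_curve([1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4], [1, 2, 4, 8]))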