Use intrinsic data duplication in storage systems to improve I/O performance.
In this project we explore the premise that intrinsic data duplication in storage systems can be used to improve I/O performance.
I/O deduplication comprises three key techniques: (i) content-based caching, which uses the popularity of “data content” rather than “data location” of I/O accesses in making caching decisions; (ii) dynamic replica retrieval, which upon a cache miss dynamically chooses the replica to retrieve so as to minimize disk head movement; and (iii) popular content duplication, which creates copies of popular files on disk so that optimization (ii) becomes more effective.
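As a rough illustration of technique (i), the sketch below keys an LRU cache by content hash instead of by sector, so two locations holding identical data share one cache entry. The class name `ContentCache`, its methods, and the use of MD5 with LRU eviction are assumptions made for this example, not the project's implementation; the location map is left unbounded for brevity.

```python
import hashlib
from collections import OrderedDict


class ContentCache:
    """Illustrative LRU cache keyed by content hash rather than disk location."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.by_hash = OrderedDict()   # content hash -> block data, oldest first
        self.location_to_hash = {}     # sector -> hash of the data stored there

    def insert(self, sector, data):
        digest = hashlib.md5(data).hexdigest()
        self.location_to_hash[sector] = digest
        self.by_hash[digest] = data
        self.by_hash.move_to_end(digest)       # mark content as most recently used
        while len(self.by_hash) > self.capacity:
            self.by_hash.popitem(last=False)   # evict least recently used content

    def lookup(self, sector):
        digest = self.location_to_hash.get(sector)
        if digest is None or digest not in self.by_hash:
            return None                        # cache miss
        self.by_hash.move_to_end(digest)
        return self.by_hash[digest]
```

With this layout, inserting the same 4096-byte block at sectors 10 and 99 consumes a single entry, and a later read of either sector hits.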
A fourth optimization, still under consideration, is optional replica updates: for a write to duplicate content, dynamically choose between updating the target location (as requested) and registering persistent copy-on-write metadata.
Effect of the head position optimization:
The solid line shows the execution time for 8192 requests distributed across a 500 GB disk. Each request is 4096 B in size and its content has 1 to 1000 copies (x axis).
The dotted line shows the execution time for 8192 requests distributed across a 500GB/x disk, that is, a disk x times smaller.
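The head position optimization can be sketched as picking, for each request, the replica nearest to the current head position. The code below is a simplified model that assumes seek cost is the absolute sector distance and ignores rotational latency; the function names are hypothetical.

```python
def pick_replica(head, replicas):
    """Return the replica sector closest to the current head position."""
    return min(replicas, key=lambda sector: abs(sector - head))


def total_seek_distance(requests, head=0):
    """Serve each request from its nearest replica and sum head movement in sectors.

    `requests` is a list of replica lists, one list of candidate sectors per request.
    """
    total = 0
    for replicas in requests:
        target = pick_replica(head, replicas)
        total += abs(target - head)
        head = target          # the head now rests on the sector just read
    return total
```

For example, the requests `[[100, 900], [950, 120], [500]]` cost 500 sectors of head movement when replicas are available, against 1400 when only the first copy of each block exists.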
Static similarity:
Web traces
Webmail:
bash-3.00$ wc -l webmail-hda-hashes-blocks
2621433 webmail-hda-hashes-blocks
bash-3.00$ wc -l webmail-hda-hashes-blocks-freqs
1679913 webmail-hda-hashes-blocks-freqs
bash-3.00$ tail webmail-hda-hashes-blocks-freqs
190 1bf5da25d44d5e211da6d05a71aa8bb5
287 edc44bbcad017816f872db4dcbe0ef2e
291 34b96409e3df0a7de5e829b1c4bb3000
401 115d93fc589fab0673448f2fe76e9b09
408 a12b4a9b75f829fd55107cfbd7981279
413 01718ffb8ce64b88a42d84fefc703145
419 ecb4f1d9d1677005bdaaa2e2b5b9464e
421 941d368d2ca84b91a6a3709d0117db8a
879 48859c5c4d9612a8fa65b9c239511b61
612994 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (2621433 - 612994) / (1679913 - 1) = 1.195562029
Online:
bash-3.00$ wc -l online-hda-hashes-blocks
2621433 online-hda-hashes-blocks
bash-3.00$ wc -l online-hda-hashes-blocks-freqs
1672880 online-hda-hashes-blocks-freqs
bash-3.00$ tail online-hda-hashes-blocks-freqs
201 5a91379cfb04d88cbc0529b005263b92
220 1bf5da25d44d5e211da6d05a71aa8bb5
331 34b96409e3df0a7de5e829b1c4bb3000
483 115d93fc589fab0673448f2fe76e9b09
493 a12b4a9b75f829fd55107cfbd7981279
501 01718ffb8ce64b88a42d84fefc703145
502 ecb4f1d9d1677005bdaaa2e2b5b9464e
503 941d368d2ca84b91a6a3709d0117db8a
998 48859c5c4d9612a8fa65b9c239511b61
580305 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (2621433 - 580305) / (1672880 - 1) = 1.220128892
webmail+online:
rkoll001@leopard:~ 82% wc -l online-webmail-hda-hashes-blocks
5242866 online-webmail-hda-hashes-blocks
rkoll001@leopard:~ 83% wc -l online-webmail-hda-hashes-blocks-freqs
1960763 online-webmail-hda-hashes-blocks-freqs
rkoll001@leopard:~ 84% tail online-webmail-hda-hashes-blocks-freqs
387 5a91379cfb04d88cbc0529b005263b92
410 1bf5da25d44d5e211da6d05a71aa8bb5
622 34b96409e3df0a7de5e829b1c4bb3000
884 115d93fc589fab0673448f2fe76e9b09
901 a12b4a9b75f829fd55107cfbd7981279
914 01718ffb8ce64b88a42d84fefc703145
921 ecb4f1d9d1677005bdaaa2e2b5b9464e
924 941d368d2ca84b91a6a3709d0117db8a
1877 48859c5c4d9612a8fa65b9c239511b61
1193299 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (5242866 - 1193299) / (1960763 - 1) = 2.065302673
Lab traces
mad-max
ricardo@mad-max:~/research/vcache/data/mad-max$ wc -l sda.hash.blocks.sorted
39062042 sda.hash.blocks.sorted
ricardo@mad-max:~/research/vcache/data/mad-max$ wc -l sda.hash.blocks.freqs
27431635 sda.hash.blocks.freqs
ricardo@mad-max:~/research/vcache/data/mad-max$ tail sda.hash.blocks.freqs
56906 2f7cf239e69c81d559fd8c7340c0cc5a
56915 68032d6801c04eee872fa4fe57d8a115
56927 36967e11b7887d32d8e6a4adc8d71414
56932 8e7974c25680dec98bfd3cfedc48e918
56972 e8df727d1312a825f5e02cf8d95a7309
57060 a43f839498484cf668095c11038733e0
57281 95c898b9208b7895b1f9b3313802d865
64334 711a5c482f6d8f2dc849b60145092f92
486646 d4478f77dc66cde39a432db0299f6d71
1727055 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (39062042 - 1727055) / (27431635 - 1) = 1.361019435
madmax + ikki + apu + topgun
rkoll001@puppy:~ 262% wc -l homes.sorted.2
186024571 homes.sorted.2
rkoll001@puppy:~ 261% wc -l homes.freq.2
62273294 homes.freq.2
rkoll001@puppy:~ 260% tail homes.freq.sorted.2
56972 e8df727d1312a825f5e02cf8d95a7309
57060 a43f839498484cf668095c11038733e0
57296 95c898b9208b7895b1f9b3313802d865
64341 711a5c482f6d8f2dc849b60145092f92
79652 8e0bebb0539a580d44a9cfd65214ffe8
427475 675a226b6d15cbadacce60c04e10dca1
486646 d4478f77dc66cde39a432db0299f6d71
769700 54f1565f8a686e900ad84ad5ad00aeb3
779708 bc8b26a149ad79305567f89e9c5353bd
97863369 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (186024571 - 97863369) / (62273294 - 1) = 1.41571447
Mail traces
cheetah:
rkoll001@leopard:~ 33% wc -l cheetah-sda-hashes-blocks
73103754 cheetah-sda-hashes-blocks
rkoll001@leopard:~ 34% wc -l cheetah-sda-hashes-blocks-freqs
27675205 cheetah-sda-hashes-blocks-freqs
rkoll001@leopard:~ 32% tail cheetah-sda-hashes-blocks-freqs
1617 f991a8fd929ee7b267aba02880e694a1
2503 85e3fd60029277c87f74d1fa4d6ca22e
2815 52849183a60a723e83315dda0b23775e
2901 82b1b5c8ddaa19dd84fb4f572b0ccc4d
3195 0ac632fa7ec32e2259a49a0a7c708f94
3489 48859c5c4d9612a8fa65b9c239511b61
3889 70284059c0b91af0f85761e87941acd0
3922 d128857459c72b774379585e9eae2590
6634 173c963d3665445f1c863fa5a30a3918
40358276 3df1244f6143869f52abf2a1d73d0c0f
static_similarity = (73103754 - 40358276) / (27675205 - 1) = 1.183206382
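The static_similarity lines above all follow one pattern: total blocks (`wc -l` of the hash list) minus the copies of the zero block (the last line of `tail`), divided by the number of distinct hashes (`wc -l` of the freqs file) minus one for the zero block itself. The sketch below recomputes the values from the counts in these notes; the function and dictionary names are chosen for this example only.

```python
def static_similarity(total_blocks, zero_blocks, unique_hashes):
    """Average number of copies per distinct non-zero block."""
    return (total_blocks - zero_blocks) / (unique_hashes - 1)


# (total blocks, zero-block copies, distinct hashes), copied from the notes above
TRACES = {
    "webmail": (2621433, 612994, 1679913),
    "online": (2621433, 580305, 1672880),
    "webmail+online": (5242866, 1193299, 1960763),
    "mad-max": (39062042, 1727055, 27431635),
    "madmax+ikki+apu+topgun": (186024571, 97863369, 62273294),
    "cheetah": (73103754, 40358276, 27675205),
}

if __name__ == "__main__":
    for name, counts in TRACES.items():
        print(f"{name}: {static_similarity(*counts):.9f}")
```

Excluding the zero block matters because it alone accounts for a large share of each disk (over half of the blocks on cheetah) and would otherwise dominate the average.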
Content analysis
madmax
rkoll001@puppy:~ 360% tail madmax.sda.hash.blocks.freq
56906 2f7cf239e69c81d559fd8c7340c0cc5a
56915 68032d6801c04eee872fa4fe57d8a115
56927 36967e11b7887d32d8e6a4adc8d71414
56932 8e7974c25680dec98bfd3cfedc48e918
56972 e8df727d1312a825f5e02cf8d95a7309
57060 a43f839498484cf668095c11038733e0
57281 95c898b9208b7895b1f9b3313802d865
64334 711a5c482f6d8f2dc849b60145092f92
486646 d4478f77dc66cde39a432db0299f6d71 (free block hashes)
1727055 3df1244f6143869f52abf2a1d73d0c0f (free block 000000000000)
d4478f77dc66cde39a432db0299f6d71
rkoll001@puppy:~ 357% grep -n d4478f77dc66cde39a432db0299f6d71 madmax.sda.hash.blocks | head
3960654:d4478f77dc66cde39a432db0299f6d71
3961833:d4478f77dc66cde39a432db0299f6d71
..
rkoll001@puppy:~ 359% ./collection_static/find-block madmax 3960654
0: [63,78140159] 3960654 31685287
ricardo@mad-max:~/research/vcache/code/collection_static$ sudo ./hashblock /dev/sda 31685287 8
1 73f3618c9fa1b8a070273df01b8b4153 3030300a202020202020312032393338
2 8a788c54488001a0006d8558f511fc64 20203120323933383033393036206266
3 cffa939ad8a6028d12d8441bf4fd11dc 30333931322062663631396561633063
4 78737061c7e6d64ade158a4143878321 36313965616330636466336636386434
5 f7a88a7d2791fad80803fc5738074593 64663366363864343936656139333434
6 00978ee1d510b78df39c45430af1916d 39366561393334343133376538622030
7 f6fb9be5993f18a21cc4262785f40b9a 31333765386220303030303030303030
8 a545b72922179e75b7a98dc8b8da0b9e 30303030303030303030303030303030
Printing the content as text (rather than hex) shows that the block holds a listing of sector hashes:
1 73f3618c9fa1b8a070273df01b8b4153 000
1 293803900 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803901 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803902 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803903 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803904 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803905 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
/dev/sda
2 8a788c54488001a0006d8558f511fc64 1 293803906 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803907 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
1 293803908 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
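The hex column printed by hashblock appears to be the first 16 bytes of each sector (an assumption based on the output above). Decoding it is a quick way to check what a popular block contains without re-reading the disk. A minimal sketch, with a hypothetical helper name:

```python
def decode_prefix(hex_prefix):
    """Decode a hex-encoded sector prefix into ASCII text."""
    return bytes.fromhex(hex_prefix).decode("ascii")


if __name__ == "__main__":
    # Prefix of sector 1 of the madmax block at 31685287, taken from the hashblock output.
    print(repr(decode_prefix("3030300a202020202020312032393338")))
    # Prefix of sector 1 of the topgun block at 4640703.
    print(repr(decode_prefix("4c655a65624a3653357e715076244324")))
```

The first prefix decodes to `'000\n      1 2938'`, the start of a hash listing line, and the second to `'LeZebJ6S5~qPv$C$'`, matching the readable random characters written by postmark.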
topgun
rkoll001@puppy:~/home-traces 330% tail ../topgun.sda.hash.blocks.freq
1472 c2507c3aaf41c99d38c33d60bc90111c
1503 940d36a6a9373e53bf4483a5076e6dd3
1782 8a08d49bc4fada337a91d1808de03ba8
2165 501ff97e8c3ea1371701688f5f0bfda0
2272 48859c5c4d9612a8fa65b9c239511b61
7066 2853de6d55536980d9a530ee6c8e9375
427475 675a226b6d15cbadacce60c04e10dca1 (free block, postmark readable random chars)
769700 54f1565f8a686e900ad84ad5ad00aeb3 (free block, postmark readable random chars)
779708 bc8b26a149ad79305567f89e9c5353bd (free block, postmark readable random chars)
31794051 3df1244f6143869f52abf2a1d73d0c0f (free block, 000000000000000000000...)
675a226b6d15cbadacce60c04e10dca1
rkoll001@puppy:~/home-traces 346% grep -n 675a226b6d15cbadacce60c04e10dca1 ../topgun.sda.hash.blocks | head
580082:675a226b6d15cbadacce60c04e10dca1
580086:675a226b6d15cbadacce60c04e10dca1
580090:675a226b6d15cbadacce60c04e10dca1
580094:675a226b6d15cbadacce60c04e10dca1
580098:675a226b6d15cbadacce60c04e10dca1
580102:675a226b6d15cbadacce60c04e10dca1
580109:675a226b6d15cbadacce60c04e10dca1
580116:675a226b6d15cbadacce60c04e10dca1
580126:675a226b6d15cbadacce60c04e10dca1
580130:675a226b6d15cbadacce60c04e10dca1
rkoll001@puppy:~/home-traces 347% ../collection_static/find-block topgun 580082
2: [4353615,297314954] 35889 4640719
root@topgun:~/ricardo_stuff/vcache/code/collection_static# ./hashblock /dev/sda 4640719 8
1 026bbcf1c5427e0cc8e075268170e421 232c406b2c5060294c5f5a4f52513033
2 2ee2a9d275179b265f43988773fd041f 646257495556427c286a2050545f6935
3 dbf1c9e824b0470d72a7efca3136f356 525856417a5e5d2439734144787c6f38
4 0cfaaca8826da327dd2084b1697f5c3b 5c235f2a35413c33742d6e773f395b29
5 9549e813aa8c48beb743105cd232ae7c 50506647553d50243a4b4c6775224a49
6 51fe121098dd40e702fd1ce8e0bf3d09 26277a4a3b395478242e5b2e4d313244
7 2177aa1b9cc6689a15f698d1a4eafe42 30787d33343353243621356d6367242c
8 831e583c5371980234949344e49bdcce 7b455e7055763722602e41487d584b20
54f1565f8a686e900ad84ad5ad00aeb3
rkoll001@puppy:~/home-traces 334% grep -n 54f1565f8a686e900ad84ad5ad00aeb3 ../topgun.sda.hash.blocks | head
580080:54f1565f8a686e900ad84ad5ad00aeb3
580084:54f1565f8a686e900ad84ad5ad00aeb3
580088:54f1565f8a686e900ad84ad5ad00aeb3
580092:54f1565f8a686e900ad84ad5ad00aeb3
580096:54f1565f8a686e900ad84ad5ad00aeb3
580100:54f1565f8a686e900ad84ad5ad00aeb3
580104:54f1565f8a686e900ad84ad5ad00aeb3
580107:54f1565f8a686e900ad84ad5ad00aeb3
580111:54f1565f8a686e900ad84ad5ad00aeb3
580114:54f1565f8a686e900ad84ad5ad00aeb3
rkoll001@puppy:~/home-traces 335% ../collection_static/find-block topgun 580080
2: [4353615,297314954] 287088 4640703
root@topgun:~/ricardo_stuff/vcache/code/collection_static# ./hashblock /dev/sda 4640703 8
1 281599a2e1c3e086a00828201cfe82bd 4c655a65624a3653357e715076244324
2 4fef6272f5e1f4ebfe114bf23371ed6b 245c226e7d54676e34403f265e3d3d25
3 c69559f6765f83ada5b31ee37dec6cc7 4963446e5822776d63775c367b4f2d24
4 6a38f6888c631da78ec2c4ae841b2fff 2e40324f30435e3e525743494d384d22
5 fc96ec96664b296c06b1a9999920bf15 655e77343c2c7b275c5e6b3337792e2f
6 ba2ffaea52e538a4b6bc7ca1b0025f3c 49777d2b3b5b3e30302f7a2a3e524b7d
7 71a489f7a9bc010aecdc1b1f4db8a50b 293e7c27696f2424265270537a443559
8 ebed780c41eafb8c564c29e222f5c76d 5b7c5a5a6c362f2f4470207e6d557526
Printing the content as text (rather than hex) shows random but readable text. This is typical of the temporary files created by postmark. In this case postmark was run with the same random seed, which explains the identical content.
root@topgun:~/ricardo_stuff/vcache/code/collection_static# ./hashblock /dev/sda 4640703 8 1 281599a2e1c3e086a00828201cfe82bd LeZebJ6S5~qPv$C$py}hmKd H58nM4~.}pv}Q7wSt'&R0{oSe8eY_i&z}T'VGt*I)RJ>[Aq!ZAVvdBF"@ l|}B-:7>bL8C0Gw>7r~U'GmbpL}waz00{sz?>X^j?b zo&/ShA+HN3^F.2J<> N7#i(~ek~G<SO6e8Vd~@U:mvb0!_\kj_Pbt_]Azjez5WdWQ \A6j@<}4kf^JcF":i=L3NF9EO4rC0r4ATX&hR?90x]p4a7`yJ8H"tf!mB"=yeqYgHRD{2mz#&NR9n*xo#A>;hX&Ab\d\!8Vt"e.8z2Z2AsB@G2{:D*MrEt%@1+?udy{r>v)(YPq2LgVGG'Z^5H==V@5*{d,OgE<S'=GOZoyu0X/:bMJ#l7":<;EvxuF@_=Z\COCLesV9}dzV~dTh\+Ft>Ac;a?em<9A]dJX9S!'x%J#6BH;^CJm"k+).wIYEC4#]ry!Y~m,7/0)I>I=Fs1r)PD<JKs[Ldzx:-u=pc9v?x^!,_e3JQ3h[U<koS-kMwFgI/dev/sda 2 4fef6272f5e1f4ebfe114bf23371ed6b $\"n}Tgn4@?&^==%&-:Kx-hVlMr}<mEcR'ZAG#oyBBreDOxf:!P_MLENUV@-m,0=Q`tNTi8*?Hx-vJlT|:TN-7&h$u5y%]TI1:s"k1r+MW`(Hxy'>fSstZ@1b.V^Ky1wT|5DBpeHBeu'mWA@f(uL?qYDXgh>_qe@u \V)ul3"J(euB;8}zT()Jg%<&]nkSe"rFIytn>K1c[u:^P_.zpzu,6LDn/!Mp[D77Q%^n*Q{/7aMFGJXA"JIND$:B'SQIrM_[~~!BRt?LU3un|9hksjqWlAS]@t5(9}A{F8{Mi1YS,_{~WI516phB_kS/p,#wWgbv afC`2@@{? ({z(R%|gbhZX-w.ai2`@Jzu3Z[Z/|@>7NCh;.*qUnpK_Z:-+PQ8U $.d>N,%kks-F<kcAWiX8.,)\/;lPng<Zm<+wwBxU1XQ6sttt~G(.A%o"QH9$HrcNMp-i#o"aYO`SNxiO}GqV82d'EsDs6;l-p \9oR>&ad7)-*+qL<h`:|@KjHeSza/dev/sda
bc8b26a149ad79305567f89e9c5353bd
rkoll001@puppy:~/home-traces 339% grep -n bc8b26a149ad79305567f89e9c5353bd ../topgun.sda.hash.blocks | head
580081:bc8b26a149ad79305567f89e9c5353bd
580085:bc8b26a149ad79305567f89e9c5353bd
580089:bc8b26a149ad79305567f89e9c5353bd
580093:bc8b26a149ad79305567f89e9c5353bd
580097:bc8b26a149ad79305567f89e9c5353bd
580101:bc8b26a149ad79305567f89e9c5353bd
580105:bc8b26a149ad79305567f89e9c5353bd
580108:bc8b26a149ad79305567f89e9c5353bd
580112:bc8b26a149ad79305567f89e9c5353bd
580115:bc8b26a149ad79305567f89e9c5353bd
rkoll001@puppy:~/home-traces 340% ../collection_static/find-block topgun 580081
2: [4353615,297314954] 35888 4640711
root@topgun:~/ricardo_stuff/vcache/code/collection_static# ./hashblock /dev/sda 4640711 8
1 434acb2d12010f6b33cf28bb721c107c 4b444962693e254c5e75252a2444272a
2 28d1e037fcf37d68dd14edc830dc16b3 3629673f6b385767516e257636526921
3 9a2ec9e2ea991177d8c01dc1b8083c8a 3c2861294d55232e454c62526b225029
4 01ea271001bca0d9d1c55922684373a2 79777c7128493b322821347d5d422133
5 31c4db8d8d3dddd12d03d447d04dc0e2 423e4f27573c35212f3e733b5b2a2c30
6 4cceb92ad6b1e850bd26caf5dd115c75 7e282b25213f44737439385a40684c36
7 79a2e4df733beeafbdc8c80e309a7fcb 3067464f725d79255f68473641622e46
8 13f345b7f8ecef9e05be5783d2c7c8f3 2a582c68263e5c325b7c2e3f3d5a535d
rkoll001@puppy:~/home-traces 329% grep 434acb2d12010f6b33cf28bb721c107c ../topgun-hashes | head
4640711 434acb2d12010f6b33cf28bb721c107c 4b444962693e254c5e75252a2444272a
4640743 434acb2d12010f6b33cf28bb721c107c 4b444962693e254c5e75252a2444272a
3df1244f6143869f52abf2a1d73d0c0f
rkoll001@puppy:~/home-traces 343% grep -n 3df1244f6143869f52abf2a1d73d0c0f ../topgun.sda.hash.blocks | head
2:3df1244f6143869f52abf2a1d73d0c0f
3:3df1244f6143869f52abf2a1d73d0c0f
rkoll001@puppy:~/home-traces 345% ../collection_static/find-block topgun 2
0: [63,144584] 2 71
root@topgun:~/ricardo_stuff/vcache/code/collection_static# ./hashblock /dev/sda 71 8
1 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
2 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
3 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
4 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
5 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
6 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
7 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
8 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
Trace Static similarity:
TODO: add a per-process analysis: which processes issue the most requests to blocks that have many copies?
webmail + online
rkoll001@wolf:~ 209% awk 'BEGIN {s = 0; i = 0;} {s += $2; i++;} END { print s/i}' final-traces/2111.web.read.blocks.copies
1.60422
Interestingly, 1.60 is lower than the disk-level static similarity of about 2.07 computed above for webmail + online. Another day (12/11), with blocks above a given number of copies excluded, gives:
rkoll001@puppy:~ 42% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 10) {s += $2; i++;} } END { print s/i}' final-traces/1211.web.read.blocks.copies
1.83932
rkoll001@puppy:~ 43% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 100) {s += $2; i++;} } END { print s/i}' final-traces/1211.web.read.blocks.copies
1.93666
rkoll001@puppy:~ 44% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 1000) {s += $2; i++;} } END { print s/i}' final-traces/1211.web.read.blocks.copies
1.93666
Without removing extreme values, the mean would have been:
rkoll001@wolf:~ 230% awk 'BEGIN {s = 0; i = 0;} {s += $2; i++;} END { print s/i}' final-traces/1211.web.read.blocks.copies
721.38
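The awk one-liners above average the copies column, optionally skipping rows at or above a cutoff. A Python equivalent is sketched below to show how a handful of accesses to one heavily duplicated block can swamp the mean. The sample values are synthetic, except for 1193299, which is the zero-block copy count reported for webmail + online.

```python
def mean_copies(copies, limit=None):
    """Mean number of copies per accessed block, ignoring rows with copies >= limit."""
    kept = [c for c in copies if limit is None or c < limit]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    sample = [1, 2, 1, 3, 1193299]     # synthetic accesses; the last one hits the zero block
    print(mean_copies(sample, limit=1000))
    print(mean_copies(sample))
```

With the cutoff the sample mean is 1.75; without it, the single zero-block access pushes the mean above 238000.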
Furthermore, these reads go to the zero block:
rkoll001@wolf:~ 231% grep 1193299 final-traces/1211.web.read.blocks.copies
3df1244f6143869f52abf2a1d73d0c0f 1193299 13758159
They are issued first by sshd:
rkoll001@wolf:~ 245% grep 2736191 mail-traces/1211.online.0
3395788967958299 R 31238 sshd 3 0 2736191 8192 268435465 0 0 0 0 bf619eac0cdf3f68d496ea9344137e8bbf619eac0c
and then by swapper:
rkoll001@wolf:~ 252% grep 13758159 mail-traces/1211.webmail.0
3422630657163567 R 0 swapper 3 0 13758159 81920 805306377 0 0 0 0 bf619eac0cdf3f68d496ea9344137e8bbf619eac0c
Just to confirm, it is indeed the zero block:
7415 bf619eac0cdf3f68d496ea9344137e8b 00000000000000000000000000000000
Trace static similarity for a week of traces:
rkoll001@wolf:~ 338% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 1000) {s += $2; i++;} } END { print s/i}' final-traces/1111-1711.web.read.blocks.copies
1.99529
cheetah
rkoll001@puppy:~ 45% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 10) {s += $2; i++;} } END { print s/i}' final-traces/1211.cheetah.read.blocks.copies
1.1387
rkoll001@puppy:~ 46% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 100) {s += $2; i++;} } END { print s/i}' final-traces/1211.cheetah.read.blocks.copies
1.23882
rkoll001@puppy:~ 47% awk 'BEGIN {s = 0; i = 0;} {if ($2 < 1000) {s += $2; i++;} } END { print s/i}' final-traces/1211.cheetah.read.blocks.copies
2.47925
Including the extreme values (that is, the accesses to the zero page), it would be:
rkoll001@puppy:~ 48% awk 'BEGIN {s = 0; i = 0;} {s += $2; i++;} END { print s/i}' final-traces/1211.cheetah.read.blocks.copies
1.22369e+06
The number of accesses to the zero page is:
rkoll001@puppy:~ 51% grep 3df1244f6143869f52abf2a1d73d0c0f final-traces/1211.cheetah.read.blocks.copies | wc -l
83856
out of a total number of block accesses of:
rkoll001@puppy:~ 52% wc -l final-traces/1211.cheetah.read.blocks.copies
2765642 final-traces/1211.cheetah.read.blocks.copies
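These two counts are enough to explain the unfiltered mean of 1.22369e+06. The worked check below assumes that each zero-page row carries the copy count 40358276 seen for that hash in cheetah-sda-hashes-blocks-freqs; the copies file itself is not shown in these notes, so treat that as an assumption.

```python
ZERO_ROWS = 83856          # accesses to the zero page (grep | wc -l above)
TOTAL_ROWS = 2765642       # all block accesses (wc -l above)
ZERO_COPIES = 40358276     # assumed copies per zero-page row, from the cheetah freqs tail


def zero_page_contribution(zero_rows, zero_copies, total_rows):
    """Part of the mean copies-per-access that comes from zero-page rows alone."""
    return zero_rows * zero_copies / total_rows


if __name__ == "__main__":
    share = ZERO_ROWS / TOTAL_ROWS
    print(f"zero-page share of accesses: {share:.4f}")
    print(f"contribution to the mean: {zero_page_contribution(ZERO_ROWS, ZERO_COPIES, TOTAL_ROWS):.0f}")
```

About 3% of the accesses contribute roughly 1.2237 million to the mean, which is essentially the whole unfiltered value, so the filtered averages are the meaningful ones.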
For a week of traces:
bash-3.2$ awk 'BEGIN {s = 0; i = 0;} {if ($3 < 10) {s += $3; i++;} } END { print s/i}' final-traces/1111-1711.mail.read.blocks.copies
1.16444
bash-3.2$ awk 'BEGIN {s = 0; i = 0;} {if ($3 < 100) {s += $3; i++;} } END { print s/i}' final-traces/1111-1711.mail.read.blocks.copies
1.27565
bash-3.2$ awk 'BEGIN {s = 0; i = 0;} {if ($3 < 1000) {s += $3; i++;} } END { print s/i}' final-traces/1111-1711.mail.read.blocks.copies
2.26263
bash-3.2$ awk 'BEGIN {s = 0; i = 0;} {{s += $3; i++;} } END { print s/i}' final-traces/1111-1711.mail.read.blocks.copies
1.2478e+06
For a month of traces:
rkoll001@puppy:~ 16% awk 'BEGIN {s = 0; i = 0;} {if ($3 < 10) {s += $3; i++;} } END { print s/i}' final-traces/0111-3011.mail.read.blocks.copies
1.13465
rkoll001@puppy:~ 17% awk 'BEGIN {s = 0; i = 0;} {if ($3 < 100) {s += $3; i++;} } END { print s/i}' final-traces/0111-3011.mail.read.blocks.copies
1.25955
rkoll001@puppy:~ 19% awk 'BEGIN {s = 0; i = 0;} {if ($3 < 1000) {s += $3; i++;} } END { print s/i}' final-traces/0111-3011.mail.read.blocks.copies
2.32995
Dynamic Similarity
Output format of the scripts used below:
% .. | ruby dynamic-similarity-blocks.rb
(dynamic similarity) (number of hits)
Online (alone)
reads
rkoll001@wolf:~ 38% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-blocks.rb
205706 780
rkoll001@wolf:~ 42% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-sectors.rb
216239 742
writes
rkoll001@wolf:~ 35% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-blocks.rb
2765 85517
rkoll001@wolf:~ 45% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-sectors.rb
35596 931259
reads + writes
rkoll001@wolf:~ 50% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-blocks.rb
4599 86297
rkoll001@wolf:~ 48% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby dynamic-similarity-sectors.rb
35740 932001
Online + Webmail
day reads
rkoll001@wolf:~ 13% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks| ruby dynamic-similarity-sectors.rb ;
521599 1655
rkoll001@wolf:~ 14% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks| ruby dynamic-similarity-blocks.rb
72254 56428
day writes
rkoll001@wolf:~ 17% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-sectors.rb
71541 1832590
rkoll001@wolf:~ 18% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-blocks.rb
7168 320962
day reads + writes
rkoll001@wolf:~ 20% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-sectors.rb
71947 1834245
rkoll001@wolf:~ 21% cat final-traces/1211.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-blocks.rb
16900 377390
week read
rkoll001@wolf:~ 65% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-sectors.rb
928238 1062015
rkoll001@wolf:~ 66% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-blocks.rb
583077 1144024
week write
rkoll001@wolf:~ 68% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-sectors.rb
93025 4764854
rkoll001@wolf:~ 69% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-blocks.rb
5552 1910869
week read + write
rkoll001@wolf:~ 61% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-sectors.rb
245253 5826869
rkoll001@wolf:~ 62% cat final-traces/1111-1711.web.traces | ./collection_static/trace-extract-blocks | ruby /tmp/dynamic-similarity-blocks.rb
221829 3054893
Cheetah
read
rkoll001@puppy:~ 11% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-sectors.rb
2587904 1162035
rkoll001@puppy:~ 12% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-blocks.rb
2347697 1260203
writes
rkoll001@puppy:~ 15% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-sectors.rb
537507 18828186
rkoll001@puppy:~ 16% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-blocks.rb
199689 17409314
read + writes
rkoll001@puppy:~ 43% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-sectors.rb
656697 19990221
rkoll001@puppy:~ 42% cat final-traces/1211.mail.traces.blocks | ruby dynamic-similarity-blocks.rb
344681 18669517
blocks requested for read: 3454044 (14 GB)
blocks requested for read and write: 22972351
Cache sizes and LRU vs. ARC
small = 10000 entries
grande (large) = 100000 entries
root@armageddon:/home/ric/module# wc -l /tmp/small-* /tmp/grande-*
10005 /tmp/small-arc
7516 /tmp/small-lru
74229 /tmp/grande-arc
7978 /tmp/grande-lru
total number of requests 34707