Recent Developments in the CernVM-FS Server Backend
1 Recent Developments in the CernVM-FS Server Backend
René Meusel (with Jakob Blomer, Gerardo Ganis, Predrag Buncic, Seppo Heikkila)
ACAT, Prague, 4th of September
2 Outline
1. Usage Statistics and Adoption
2. New Challenges and Features
3. File System History
4. Garbage Collection
5. Smart Stratum1 Servers
3 What is CernVM-FS?
- Scalable software distribution system: infrequent atomic updates in a central location; read-only access on the clients
- HTTP-based global data transfer: minimal protocol requirements; aggressive hierarchical cache strategy
- Assumption: coherent working set on physically close nodes (cf. software vs. data distribution)
- Accessible through a mounted file system (POSIX): FUSE module, NFS-exported FUSE volume, or Parrot
4 Who Uses CernVM-FS?
- All LHC experiments
- CernVM 3: an operating system delivered through CernVM-FS
- Others beyond the HEP community: Human Brain Project, BioMed, VLEMED, …
- Stratum0s at CERN, RAL, NIKHEF, Fermilab, DESY, …
5 Repository Statistics (files and volume as saved in the CernVM-FS catalogs; based on the latest revision only, no history involved; effective August 2014)

Repository            Files        Referenced Objects   Volume    ø File Size
atlas.cern.ch         34'500'000   3'700'…              … TiB     66.2 KiB
cms.cern.ch           30'600'…     …                    … TiB     33.1 KiB
lhcb.cern.ch          …'000        4'600'…              … TiB     41.9 KiB
alice.cern.ch         5'900'…      …                    … TiB     90.7 KiB
ams.cern.ch           2'900'000    1'900'…              … TiB     0.7 MiB
alice-ocdb.cern.ch    700'…        …                    … TiB     0.2 MiB
atlas-condb.cern.ch   8'000        7'…                  … TiB     60.8 MiB

The repositories range from mainly software, through software plus conditions data, to pure conditions data. The actual number of stored objects is smaller than the referenced-object count, since objects are compressed and de-duplicated.
6 New Challenges
7 New Challenges
- Large files (> 200 MiB)
- Long-term data preservation
- Rapidly changing repository content
- Increasing configuration distribution effort
- Instant repository update propagation
8 CernVM-FS Repository: From a POSIX File System to Content-Addressable Objects
9 From POSIX to Blob-Objects: an example POSIX directory tree, /release, with x86, x86_64 and armv7 subtrees holding configuration files (default.conf, specific.conf), tools (fancy_tool, do_magic.sh, fancy_tool_debug) and top-level files (run.sh, setup.py, foo.bar).
10 From POSIX to Blob-Objects: each regular file becomes a data object addressed by its content hash (the individual hashes on the slide are not reproduced).
11-12 From POSIX to Blob-Objects: the directory structure is captured in file catalogs, a root catalog for /release with sub-catalogs for /release/x86/config, /release/x86_64 and /release/armv7.
13 From POSIX to Blob-Objects: the catalogs themselves are stored as content-addressed objects; the root catalog references its sub-catalogs by hash.
14 From POSIX to Blob-Objects: the repository thus reduces to a flat set of hash-addressed objects, data objects and catalogs alike.
15 From POSIX to Blob-Objects: a manifest file, .cvmfspublished, points at the current root catalog.
16 From POSIX to Blob-Objects: catalog references tie manifest, catalogs and data objects into one object graph.
17 From POSIX to Blob-Objects
- Merkle tree: only .cvmfspublished needs to be signed
- Content-addressable storage: file de-duplication; trivial file integrity checks
- Flat namespace: perfect for HTTP caching; minimal storage API requirements (PUT, GET, [DELETE])
(Diagram: .cvmfspublished and the hash-addressed objects in the Stratum0 backend storage.)
18-19 The same slide, with Stratum1 servers added that replicate from the Stratum0 backend storage.
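The de-duplication and integrity checks listed above fall out of content addressing for free. A minimal sketch in Python (illustrative only, not CernVM-FS code; it mimics the zlib-compressed, SHA-1-addressed objects of the backend):

```python
import hashlib
import zlib

class ContentAddressableStore:
    """Toy content-addressed object store in the spirit of the
    CernVM-FS backend (illustrative, not the real implementation)."""

    def __init__(self):
        self._objects = {}  # content hash -> compressed blob

    def put(self, content: bytes) -> str:
        # The object's name *is* the hash of its content,
        # so identical files de-duplicate automatically.
        digest = hashlib.sha1(content).hexdigest()
        self._objects[digest] = zlib.compress(content)
        return digest

    def get(self, digest: str) -> bytes:
        content = zlib.decompress(self._objects[digest])
        # Integrity check is trivial: re-hash and compare.
        if hashlib.sha1(content).hexdigest() != digest:
            raise IOError("corrupted object: " + digest)
        return content

store = ContentAddressableStore()
a = store.put(b"fancy_tool binary")
b = store.put(b"fancy_tool binary")   # same content, same object
c = store.put(b"do_magic.sh script")
print(a == b, len(store._objects))    # True 2
```

Three files go in, but only two objects are stored; any corrupted download is detected by re-hashing, which is why a flat HTTP-cacheable namespace is safe.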
20 Repository Storage
- RESTful backend storage: PUT, GET, (DELETE)
- POSIX-compliant file systems
- Key-value stores (using the S3 API; Seppo Heikkila)
- Pluggable storage connector implementation, to facilitate implementations for native key-value store APIs such as Basho Riak, Ceph, Amazon Dynamo, …
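The pluggable connector can be pictured as a narrow interface that each backend (POSIX directory, S3 bucket, native key-value store) implements; the class and method names below are invented for illustration and do not mirror the actual CernVM-FS classes:

```python
from abc import ABC, abstractmethod

class StorageConnector(ABC):
    """Hypothetical storage-connector interface: a backend only
    has to provide PUT and GET; DELETE is optional."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    def delete(self, key: str) -> None:
        # Optional: insert-only backends may leave this unimplemented.
        raise NotImplementedError

class InMemoryConnector(StorageConnector):
    """Stand-in for a POSIX directory or an S3 bucket."""
    def __init__(self):
        self._data = {}
    def put(self, key, data):
        self._data[key] = data
    def get(self, key):
        return self._data[key]
    def delete(self, key):
        del self._data[key]

backend = InMemoryConnector()
backend.put("data/038f625d", b"compressed catalog blob")
print(backend.get("data/038f625d"))
```

Keeping the interface this small is what makes S3-compatible key-value stores viable backends alongside plain file systems.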
21 File System History: Named Repository Snapshots
22 File System History
- Client: mount historic revisions of a repository; use legacy software in its contemporary environment; long-term data preservation
- Stratum0: roll back to previous revisions; undo, i.e. overwrite broken revisions with a previous version
23 File System History: object graph of a single published revision (.cvmfspublished manifest, root catalog, catalogs, data objects; the individual hashes on the slides are not reproduced).
24-26 Further revisions are published: each adds its own root catalog and new objects, while unchanged objects stay shared between revisions.
27-28 A history index object is added and referenced from .cvmfspublished alongside the current root catalog.
29 The history index names snapshots, e.g. a tag v3.0 pointing at the root catalog of its revision.
30-33 A client selects a named snapshot through its configuration, e.g. in atlas.cern.ch.conf: CVMFS_REPOSITORY_TAG=v3.0
34 The client resolves the tag via the history index, then mounts and pins the corresponding root catalog.
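The tag machinery above reduces to a mapping from snapshot names to root-catalog hashes. A toy model (hashes shortened, tag v3.1 and all method names invented; the real history index is a database object stored inside the repository, not a Python dict):

```python
class HistoryIndex:
    """Toy history index: named tags point at the root catalogs
    of published revisions (names and hashes invented)."""

    def __init__(self):
        self._tags = {}      # tag name -> root catalog hash
        self.current = None  # root catalog served via .cvmfspublished

    def publish(self, root_hash, tag=None):
        self.current = root_hash
        if tag is not None:
            self._tags[tag] = root_hash

    def lookup(self, tag):
        # What CVMFS_REPOSITORY_TAG=v3.0 triggers on the client:
        # resolve the tag, then mount that root catalog read-only.
        return self._tags[tag]

    def rollback(self, tag):
        # Stratum0-side undo: re-point the repository
        # at an older named snapshot.
        self.current = self._tags[tag]

history = HistoryIndex()
history.publish("038f625d", tag="v3.0")
history.publish("612c764a", tag="v3.1")
history.rollback("v3.0")
print(history.current)  # 038f625d
```

Because old root catalogs and their objects stay in the content-addressed store, both historic mounts and rollbacks are pure pointer updates.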
35 Garbage Collection: Permanently Removing Overwritten or Deleted Files in Volatile Repositories
36 Garbage Collection
- The CernVM-FS backend was initially designed as insert-only
- New use case: the LHC experiments' nightly build repositories
  - high update rate (up to twice a day)
  - large volume of newly staged files (10-100 GiB)
  - short-lived revisions (stay online at most two weeks)
- Insert-only quickly fills up the backend storage
37 LHCb Nightly Builds: chart of regular files, referenced objects and stored objects (y-axis up to 16M in steps of 4M) over revisions 1 to 54, illustrating how stored objects accumulate under insert-only semantics.
38-40 Garbage Collection: schematic repository (not an actual repository) with tagged revisions and shared objects, as in the history section.
41-47 Removing the content of revision v3.0: starting from its root catalog, find the objects that are referenced nowhere else; objects shared with other revisions are kept.
48-50 The v3.0 tag and the objects only it referenced are removed.
51-56 The same procedure then removes another revision (root catalog a3767…) and its exclusive objects.
57 Currently this is a stop-the-world GC: no concurrent publish.
58 Garbage Collection
- Mark-and-sweep implementation, a two-stage approach:
  1. traverse the preserved catalogs and log the referenced objects
  2. traverse the condemned catalogs and match against the log
- Requires a full walk of the repository's catalog graph: a time-consuming task
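The two stages can be sketched directly; catalog contents here are toy data, and sub-catalog nesting is flattened for brevity:

```python
def garbage_collect(catalogs, preserved, condemned):
    """Sketch of the two-stage mark-and-sweep described above.
    `catalogs` maps a catalog hash to the set of object hashes it
    references (sub-catalogs flattened in for simplicity)."""
    # Stage 1: walk the preserved catalogs, log every referenced object.
    marked = set()
    for cat in preserved:
        marked.update(catalogs[cat])
        marked.add(cat)
    # Stage 2: walk the condemned catalogs; anything absent from the
    # log is referenced nowhere else and can be deleted.
    garbage = set()
    for cat in condemned:
        garbage.update(h for h in catalogs[cat] if h not in marked)
        if cat not in marked:
            garbage.add(cat)
    return garbage

catalogs = {
    "cat_v30": {"obj_a", "obj_b"},
    "cat_v31": {"obj_b", "obj_c"},   # obj_b shared between revisions
}
print(sorted(garbage_collect(catalogs, ["cat_v31"], ["cat_v30"])))
# ['cat_v30', 'obj_a'] -- obj_b survives, v3.1 still needs it
```

The full catalog-graph walk in stage 1 is exactly what makes the current implementation slow and incompatible with concurrent publishing.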
59 Smart Stratum1 Servers: Automatic Stratum1 Ordering, Push Replication to Stratum1 Servers
60 Smart Stratum1 Servers
- Equip Stratum1 servers with a RESTful API
- Automatic Stratum1 ordering on the client side, based on a GeoIP database to determine the closest replicas (to come in CernVM-FS; Dave Dykstra, Fermilab/OSG)
- Triggered (instant) replication of new revisions, with the repository's private key used for authentication
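The client-side ordering amounts to sorting the configured Stratum1 mirrors by distance to the client. A sketch with invented hostnames and coordinates (the real service resolves positions through a GeoIP database behind the Stratum1's RESTful API):

```python
from math import radians, sin, cos, asin, sqrt

def distance_km(a, b):
    # Great-circle (haversine) distance between (lat, lon) pairs.
    lat1, lon1, lat2, lon2 = map(radians, a + b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(h))

def order_stratum1(client_pos, mirrors):
    """Return mirror hostnames, nearest first."""
    return sorted(mirrors, key=lambda m: distance_km(client_pos, mirrors[m]))

mirrors = {  # hostnames and coordinates are made up for the example
    "s1-geneva.example.org":  (46.2, 6.1),
    "s1-chicago.example.org": (41.9, -87.6),
    "s1-oxford.example.org":  (51.7, -1.3),
}
print(order_stratum1((50.1, 14.4), mirrors))  # client near Prague
```

A client near Prague would try the Geneva mirror first, then Oxford, then Chicago; failover simply walks down the ordered list.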
61 Wrap-up
62 New Challenges
- Large files (> 200 MiB): file chunking
- Long-term data preservation: named repository snapshots
- Rapidly changing repository content: garbage collection
- Increasing configuration distribution effort: central bootstrapping configuration repository
- Instant repository update propagation: push replication from Stratum0 to Stratum1
63 Main New Features
- Alternative storage backends based on key-value stores through the S3 API (Seppo Heikkila)
- Named snapshots and history for long-term software accessibility and error recovery
- Garbage collection for rapidly changing repositories (expected in an upcoming CernVM-FS release)
- Transactional repository updates for better publishing performance and a robust backend
64 Other New Features
- Automatic ordering of Stratum1 mirrors based on GeoIP location (Dave Dykstra, OSG)
- Configuration bootstrap repositories to facilitate public key and configuration distribution (not yet released)
- Chunking of large files for better cache exploitation and traffic efficiency
- Fully parallel file processing to speed up snapshot publishing
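Chunking splits a large file into pieces that are hashed and stored like any other object, so unchanged leading chunks of a grown file are re-used by the cache. A simplified fixed-size sketch (real chunk sizes are configurable and far larger than the 4-byte toy value here):

```python
import hashlib

def chunk_hashes(data: bytes, size: int = 4) -> list:
    """Split `data` into fixed-size chunks and address each chunk
    by its content hash (toy chunk size for illustration)."""
    return [hashlib.sha1(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

original = b"AAAABBBBCCCC"
extended = b"AAAABBBBCCCCDDDD"  # the same file with data appended
old, new = chunk_hashes(original), chunk_hashes(extended)
print(new[:3] == old)  # True: only the appended chunk must be stored
```

Clients likewise fetch only the chunks they actually read, which is what improves cache exploitation and traffic efficiency for > 200 MiB files.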
66 Backup Slides
67 Repository Growth: chart of data volume, referenced objects and directory entries for the example repository atlas.cern.ch over roughly two years of monthly samples; the repository size approximately doubled in that time. Maximal values: data 2.1 TiB, entries 48.0 M, objects ~3.8 M.
68-71 What is CernVM-FS? Diagram build: a Stratum0 server; Stratum1 servers replicating from it; clients with local caches fetching from the Stratum1 layer.
72 Example deployment with sites at DESY, RAL, CERN and Fermilab (schematic; does not reflect the real setup).
73 Stratum 0: Release Manager Machine
74 Updating a Repository Stratum0 (backend storage) 43
75 Updating a Repository CVMFS Revision X (read only) Stratum0 (backend storage) 43
76 Updating a Repository Union File System (writable) CVMFS Revision X (read only) Stratum0 (backend storage) 43
78 Updating a Repository Union File System (writable) CVMFS Revision X (read only) Stratum0 (backend storage) Synchronisation 43
79 Updating a Repository CVMFS Revision X + 1 (read only) Stratum0 (backend storage) 44
80 Updating a Repository Stratum1 CVMFS Revision X + 1 (read only) Stratum0 (backend storage) Stratum1 44
82 Updating a Repository Union File System (writable) CVMFS Revision X + 1 (read only) Stratum0 (backend storage) 45
84 Updating a Repository Union File System (writable) Discard CVMFS Revision X + 1 (read only) Stratum0 (backend storage) 45
85 Updating a Repository CVMFS Revision X + 1 (read only) Stratum0 (backend storage) 46
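The publish cycle built up above can be modelled as a tiny transaction: the current revision stays mounted read-only, a writable union layer captures all changes, and synchronisation merges the layer into revision X+1 (or the layer is simply discarded). This is a conceptual sketch, not the server code; paths and contents are invented.

```python
class Transaction:
    # Toy model of the publish cycle: a writable overlay on top of a
    # read-only revision, merged on publish or thrown away on abort
    def __init__(self, revision):
        self.base = revision   # read-only CVMFS revision X
        self.overlay = {}      # writable union file system layer
        self.deleted = set()

    def write(self, path, content):
        self.overlay[path] = content
        self.deleted.discard(path)

    def delete(self, path):
        self.overlay.pop(path, None)
        self.deleted.add(path)

    def publish(self):
        # Synchronisation: overlay changes become revision X+1;
        # revision X itself is never modified
        new_rev = {p: c for p, c in self.base.items() if p not in self.deleted}
        new_rev.update(self.overlay)
        return new_rev

rev_x = {"/sw/run.sh": b"v1"}
txn = Transaction(rev_x)
txn.write("/sw/run.sh", b"v2")
rev_x1 = txn.publish()
print(rev_x)    # {'/sw/run.sh': b'v1'} -- old revision untouched
print(rev_x1)   # {'/sw/run.sh': b'v2'}
```

Keeping old revisions immutable is what makes the atomic switch to revision X+1, and the discard path, cheap and safe.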
Published version: R Meusel et al 2015, Recent Developments in the CernVM-File System Server Backend, J. Phys.: Conf. Ser. 608 012031 (open access).