Why are storage systems relevant?
- Cornerstone for data management infrastructures and systems:
- Cloud, HPC, IoT, …
- Databases, Analytics, Machine Learning, Deep Learning, …
- Crucial to ensure data persistence and availability;
- Performance is key:
- slow data storage and retrieval translates into slow applications and services.
Storage Workloads
Archival
- Data is stored for archival purposes. Useful for digital information that is rarely accessed but may be relevant in the future:
- Throughput is favored over latency:
- large amounts of data must be written/read efficiently;
- write-once data typically;
- archival files are usually append-only.
- Sequential workloads:
- archival files are written and read sequentially.
- Example of an archival service: Amazon Glacier.
Backup
- Backups of fresh data. Useful for digital information that is still in use and may be accessed frequently in the near future:
- Throughput is still favored over latency:
- large amounts of data must be written/read efficiently.
- Sequential workloads mainly:
- sometimes one may want only to retrieve/update specific parts of backup files.
- Data can now be updated, typically in a sporadic fashion:
- in some cases, only diffs (modified data) are stored across backups of the same data (see the sketch after this list).
- Example of a backup service: Amazon S3.
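To make the diff-based idea concrete, here is a minimal Python sketch (the block size and helper names are ours, not any particular product's): only blocks whose hashes changed since the previous backup are stored.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size

def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of a snapshot."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def diff_backup(previous: bytes, current: bytes) -> dict[int, bytes]:
    """Return only the blocks that changed since the last backup."""
    old, new = block_hashes(previous), block_hashes(current)
    changed = {}
    for idx in range(len(new)):
        if idx >= len(old) or old[idx] != new[idx]:
            start = idx * BLOCK_SIZE
            changed[idx] = current[start:start + BLOCK_SIZE]
    return changed

v1 = b"A" * 8192                             # first full backup
v2 = b"B" * 4096 + b"A" * 4096 + b"C" * 100  # second snapshot
print(sorted(diff_backup(v1, v2)))           # [0, 2]: only two blocks stored
```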
Primary Storage (not only RAM!)
- Storage support for databases, data analytics, AI frameworks, VMs, …
- High throughput and/or low latency are now desirable:
- large amounts of data may be written/read (throughput);
- small sized writes/reads must be done efficiently (latency).
- Sequential and random workloads:
- the content of files may be partially accessed and out of order.
- Data and metadata intensive workloads:
- frequent access to the content of files (data) but also to different files (metadata).
- Data is expected to be updated frequently.
- Example of a primary data service: Amazon EBS.
Storage Media
- Tape:
- used for archival data;
- reliable and cheap;
- no support for random accesses or in-place updates.
- HDD:
- used for archival, backup, and (still in some cases) primary data;
- still cheap, with support for random accesses and in-place updates.
- SSD:
- used for backup and primary data;
- more expensive than HDDs but faster, especially for random accesses.
- Persistent Memory:
- used for primary storage (mainly used as a cache);
- speed closer to RAM for sequential and random workloads, but expensive.
- RAM:
- used for primary storage (used as a cache);
- volatile: when the computer reboots, data is lost…
Storage Interfaces
- Block Device:
- data is managed as a set of blocks (closest abstraction to the disk);
- e.g., Linux block device, Amazon EBS, …
- File System:
- data is managed as a hierarchy of files and directories (the abstraction that most users rely on in their personal computers);
- e.g., Ext4, HDFS, Ceph, …
- Object Storage:
- data is managed as objects (e.g., each file is mapped to a key-value pair, where the key is a unique file identifier and the value is the file’s content);
- e.g., Amazon S3, Ceph, …
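As an illustration of the object interface, a toy in-memory object store (a sketch of the semantics, not any real service's API): keys are opaque identifiers in a flat namespace, and values are whole objects.

```python
# Toy object store: each object is a key-value pair where the key is a
# unique identifier and the value is the object's (file's) content.
class ObjectStore:
    def __init__(self):
        self._objects: dict[str, bytes] = {}  # flat namespace, no hierarchy

    def put(self, key: str, value: bytes) -> None:
        self._objects[key] = value  # whole-object writes, no in-place updates

    def get(self, key: str) -> bytes:
        return self._objects[key]

store = ObjectStore()
store.put("backups/2024/db.dump", b"...")  # the "path" is just an opaque key
print(store.get("backups/2024/db.dump"))
```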
Storage Scope
From local…
- The Operating System (OS) mediates I/O requests from applications to the local disk(s);
- Applications can interact directly with the block device layer, or …
- With the file system:
- the virtual file system provides a common interface and abstraction;
- different file system implementations must follow the VFS abstraction, while specializing it for their needs.
- The OS page cache holds content of files in memory for quicker access.
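A minimal sketch of the VFS idea (assuming nothing about Linux internals, all names illustrative): a common abstract interface that each concrete file system implementation must follow while specializing it internally.

```python
from abc import ABC, abstractmethod

# VFS-like abstraction: one common interface, many implementations.
class VirtualFileSystem(ABC):
    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

class InMemoryFS(VirtualFileSystem):
    """Toy implementation; a real one (e.g., Ext4) talks to block devices."""
    def __init__(self):
        self._files: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data

fs: VirtualFileSystem = InMemoryFS()  # callers only see the VFS interface
fs.write("/tmp/note.txt", b"hello")
print(fs.read("/tmp/note.txt"))
```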
To remote…
- Storage is provided across the network:
- block devices;
- file systems.
- Client-server architecture:
- client I/O requests are intercepted and sent over the network;
- at the remote machine, requests are forwarded to the server component, which stores the data on its local disk.
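A bare-bones sketch of this architecture in Python (framing, error handling, and the protocol are simplified; real remote block devices and file systems use proper protocols):

```python
import os
import socket
import tempfile
import threading

# Toy remote-storage server: receive a client's write over the network and
# persist it to a local file (standing in for the server's local disk).
def serve(listener: socket.socket, path: str) -> None:
    conn, _ = listener.accept()
    with conn, open(path, "wb") as disk:
        while chunk := conn.recv(4096):
            disk.write(chunk)          # forward the request to local storage

path = os.path.join(tempfile.mkdtemp(), "blob")
listener = socket.socket()
listener.bind(("127.0.0.1", 0))        # ephemeral port for the example
listener.listen(1)
t = threading.Thread(target=serve, args=(listener, path))
t.start()

# Client side: the I/O request is intercepted and sent over the network.
client = socket.socket()
client.connect(listener.getsockname())
client.sendall(b"remote write")
client.close()

t.join()
print(open(path, "rb").read())         # b'remote write'
```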
To distributed… (data center)
- Large-scale:
- tens to hundreds of nodes storing data;
- Examples: HDFS, Ceph, Lustre, …
- No single point of failure:
- data distributed (replicated) across “data” nodes;
- metadata (location of files, permissions, etc.) managed by independent “meta” nodes;
- clients contact “meta” nodes to get the location of their data; the retrieval/update of such data is done directly through the “data” nodes (see the sketch after this list).
- Manager-worker design optimized for stable churn (i.e., failure of servers at a small and stable rate).
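A toy sketch of this read/write protocol (class and node names are illustrative, not HDFS's or Ceph's actual APIs): the meta node only resolves locations, keeping it off the data path.

```python
# Manager-worker layout: a "meta" node maps file names to the "data" nodes
# holding their (replicated) content; clients then talk to data nodes directly.
class MetaNode:
    def __init__(self):
        self.locations: dict[str, list[str]] = {}  # file -> data node ids

class DataNode:
    def __init__(self):
        self.blocks: dict[str, bytes] = {}

meta = MetaNode()
data_nodes = {"dn1": DataNode(), "dn2": DataNode()}

# Write: ask the meta node where to place the file, then write the replicas.
meta.locations["/logs/app.log"] = ["dn1", "dn2"]   # replication factor 2
for node_id in meta.locations["/logs/app.log"]:
    data_nodes[node_id].blocks["/logs/app.log"] = b"log entry"

# Read: resolve the location once, then fetch directly from a data node.
replicas = meta.locations["/logs/app.log"]
print(data_nodes[replicas[0]].blocks["/logs/app.log"])
```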
To highly distributed… (peer-to-peer)
- Very large-scale:
- thousands to millions of nodes;
- Examples: CFS, Napster, …
- No single point of failure:
- data and metadata distributed (replicated) across several nodes (no specialized nodes);
- clients can interact with (make requests to) any node;
- difficult to maintain a consistent data view across all nodes (ensuring that clients connecting to different nodes at the same time observe the same files and content).
- Peer-to-Peer design optimized for high churn (i.e., nodes are expected to fail and rejoin the system frequently).
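Peer-to-peer systems such as CFS locate data with a distributed hash table (CFS builds on Chord); the sketch below shows the underlying consistent-hashing idea in simplified form: nodes and keys share one hash ring, so when a node joins or leaves, only the keys in its ring segment move.

```python
import hashlib
from bisect import bisect

def h(key: str) -> int:
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

# Consistent hashing: each key is owned by the first node clockwise from
# its position on the ring, so churn only remaps a small segment of keys.
class Ring:
    def __init__(self, nodes: list[str]):
        self.ring = sorted((h(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        idx = bisect(self.ring, (h(key), "")) % len(self.ring)
        return self.ring[idx][1]

ring = Ring([f"node{i}" for i in range(4)])
print(ring.owner("/music/song.mp3"))  # any peer can perform this lookup
```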
Storage Features
Data availability
- RAID (Redundant Array of Inexpensive Disks): data is replicated across multiple local disks in a single server for availability and load balancing purposes;
- Replication: data is replicated across several servers for availability and load balancing purposes;
- Erasure Codes: data is broken into fragments, with redundant information, that are then spread across several servers for availability and load balancing purposes.
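A minimal sketch of the idea behind parity-based redundancy, using a single XOR parity fragment (real erasure codes, e.g. Reed-Solomon, tolerate more simultaneous failures):

```python
# Two data fragments plus one XOR parity fragment, spread across three
# servers; any single fragment can be lost and rebuilt from the other two.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

d1, d2 = b"hello wo", b"rld!    "   # two equally sized data fragments
parity = xor(d1, d2)                # redundant information (third server)

# The server holding d1 fails: reconstruct it from the surviving fragments.
recovered = xor(parity, d2)
assert recovered == d1
print(recovered + d2)
```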
Performance Optimizations
- Data locality: push computation close to the devices and/or servers holding data:
- storage and processing co-location at the same server/device, which means faster access to data!
- Caching: keep data closer to the client and/or accessible from a faster source (e.g., RAM):
- avoids waiting for data to be written/read from local or remote storage;
- Example: file system page cache.
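A minimal LRU (least-recently-used) cache sketch, the eviction policy commonly associated with page caches (simplified; a real page cache is far more involved):

```python
from collections import OrderedDict

# Recently used blocks stay in (fast) memory, so repeated reads avoid the
# slower local or remote storage underneath.
class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks: OrderedDict[str, bytes] = OrderedDict()

    def get(self, key: str) -> bytes | None:
        if key in self.blocks:
            self.blocks.move_to_end(key)       # mark as most recently used
            return self.blocks[key]
        return None                            # cache miss: go to storage

    def put(self, key: str, value: bytes) -> None:
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict least recently used

cache = LRUCache(capacity=2)
cache.put("blk1", b"a"); cache.put("blk2", b"b"); cache.put("blk3", b"c")
print(cache.get("blk1"))  # None: evicted, must be fetched from storage
```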
Space Efficiency
- Compression: removes redundant content inside and across files:
- usually used as a static technique (i.e., for sets of files that are not going to be further updated).
- Deduplication: eliminates redundant copies within a storage system:
- used as a dynamic technique (i.e., some files/blocks are expected to be updated in the future).
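A sketch of content-addressed deduplication (names are ours): blocks are stored under the hash of their content, so identical blocks across files are kept only once.

```python
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks: dict[str, bytes] = {}     # hash -> unique block
        self.files: dict[str, list[str]] = {}  # file -> its block hashes

    def write(self, name: str, blocks: list[bytes]) -> None:
        refs = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # store only if new
            refs.append(digest)
        self.files[name] = refs

store = DedupStore()
store.write("a.txt", [b"common block", b"unique to a"])
store.write("b.txt", [b"common block", b"unique to b"])
print(len(store.blocks))  # 3, not 4: the shared block is stored once
```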
Security
- Data encryption: privacy protection for sensitive data:
- Encryption at rest: data is encrypted before being stored persistently;
- Encryption in transit: data is encrypted at the client premises before being sent through the network.
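A minimal sketch of encryption at rest, assuming the third-party cryptography package is available; key management is elided. Encryption in transit would apply the same step at the client before sending data over the network.

```python
from cryptography.fernet import Fernet  # third-party "cryptography" package

key = Fernet.generate_key()   # in practice, kept in a key manager
f = Fernet(key)

# Encrypt before persisting: the ciphertext is what actually hits the disk.
ciphertext = f.encrypt(b"sensitive record")
assert f.decrypt(ciphertext) == b"sensitive record"
print(ciphertext[:16])
```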
- Access Control: avoid unauthorized access to users’ data.
Complex and Monolithic Storage Solutions
Modern infrastructures
- The I/O stack of data centers is long and composed of many components:
- applications, local file systems, VMs, remote storage, disks, …
- each providing a fixed combination of storage features, however …
- The best combination of features to apply varies across applications:
- small files versus large files;
- storage access patterns;
- sensitive vs non-sensitive information.
Software-Defined Storage
SDS
- Main principles:
- I/O flows (data plane) are separated from the control flow (control plane);
- the control plane ensures global control of I/O flows (logically centralized).
Data Plane
- Layered design organized into stages;
- Each stage handles requests along the I/O path and provides different features;
- Programmable and extensible design, i.e., stages can be extended with new features.
Control Plane
- Global visibility of applications, stages, and infrastructure resources:
- the brain of the system that holistically coordinates data plane stages.
- Distributed for scalability and availability purposes;
- Configures and tunes data plane stages to enforce I/O policies:
- Quality of Service: I/O fairness or prioritization, etc.
- Transformations: Encryption, compression, etc.
- Policies are defined through Control Applications.
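A toy sketch of how a control application's policy could be pushed down to data plane stages (all names are illustrative; real SDS control planes differ): the logically centralized controller has global visibility of the stages and tunes them to enforce a QoS policy.

```python
# A rate-limiting stage in the data plane, tuned by the control plane.
class RateLimitStage:
    def __init__(self):
        self.limit_mbps: float | None = None

# Logically centralized controller with global visibility of all stages.
class ControlPlane:
    def __init__(self, stages: dict[str, RateLimitStage]):
        self.stages = stages

    def enforce_policy(self, app: str, limit_mbps: float) -> None:
        self.stages[app].limit_mbps = limit_mbps   # configure the stage

stages = {"analytics": RateLimitStage(), "backup": RateLimitStage()}
ctrl = ControlPlane(stages)
# The control application defines the policy: prioritize analytics I/O.
ctrl.enforce_policy("analytics", limit_mbps=800.0)
ctrl.enforce_policy("backup", limit_mbps=100.0)
print(stages["backup"].limit_mbps)
```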