Unified Storage for the Cloud Means Higher-Level Interfaces
In common use, the term “unified storage” means providing block-level and file-level access to the same storage system with a single management and control interface. Traditionally, block-level access is via Fibre Channel or iSCSI, and file-level access is via the NFS or CIFS protocols.
Recently, storage vendors have also begun adding _object_-level storage, where the objects are entities with metadata such as type and access control policies. Applications read and write objects over HTTP using REST or SOAP, and use them directly at the application level. The most popular API is Amazon’s S3 (Simple Storage Service). With the higher abstraction level of objects, the underlying implementation (e.g., number of parts, tiered storage, etc.) is hidden even more than with block- and file-level interfaces. EMC, Hitachi Data Systems, and NetApp, among others, provide unified storage systems.
Cloud storage brings some different requirements for a unified storage system. Most notably, not only the data but also the applications and compute resources are in the cloud. Instead of pulling data from the cloud, processing it, and pushing it back to the cloud, the paradigm is to have compute resources in the cloud read and write data directly, local to the cloud. The data is never moved out of the cloud unless absolutely needed.
This type of usage pushes the need for even higher-level interfaces to data, such as SQL, MapReduce, and ETL. Unified storage for the cloud needs to do more than provide multi-protocol access to data that can be managed with one system. In addition to access, there must be the ability to process the data, e.g., by running an SQL query. Applications can then use the functionality of the cloud storage/compute system directly, instead of treating the cloud as a dumb storage system and pulling and pushing data between the storage system and separate compute resources.
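The difference between the two patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for a cloud storage system that can also run queries, and the `events` table and the helper names `make_store`, `total_by_pulling`, and `total_by_query` are hypothetical, not part of any product mentioned here.

```python
import sqlite3


def make_store():
    """Stand-in for a cloud storage system that can also run SQL (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (owner TEXT, size INTEGER)")
    rows = [("alice", 100), ("bob", 250), ("alice", 50), ("carol", 75)]
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    return conn


def total_by_pulling(conn, owner):
    """Dumb-storage pattern: transfer every row, then filter and sum on the client."""
    rows = conn.execute("SELECT owner, size FROM events").fetchall()
    total = sum(size for row_owner, size in rows if row_owner == owner)
    return total, len(rows)  # result, number of rows transferred


def total_by_query(conn, owner):
    """Higher-level pattern: the storage side runs the query and returns one row."""
    rows = conn.execute(
        "SELECT COALESCE(SUM(size), 0) FROM events WHERE owner = ?", (owner,)
    ).fetchall()
    return rows[0][0], len(rows)  # result, number of rows transferred
```

Both functions compute the same total, but the first moves the whole table to the caller while the second moves a single row. With real data volumes, that difference in transferred data is the argument for putting the query interface next to the storage.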
Unified storage at the block and file levels is still required and important because of the need to integrate with compute nodes that have a block-level or file-level interface. There is also the cloud data bootstrap problem: how to get the data into the cloud in the first place, which can be done efficiently by block-level transfer.
The power of a unified cloud storage system is in the network effect of having a single management interface that allows managing users, multiple tenants, storage and resource quotas, security, access management with ACLs (access control lists) for sharing, and other functions. With the single management interface, the system administrator can effectively control a backend storage system that serves a wide variety of users, from those using a small slice of the functionality (e.g., object storage only) to those using a wide scope of data access and processing.
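As a rough sketch of what a single management interface covers, the following Python models tenants, storage quotas, and per-object read ACLs in one place. The `Tenant` and `StorageAdmin` classes and their methods are hypothetical names made up for this illustration; a real system would also handle overwrites, deletes, authentication, and persistence.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    name: str
    quota_bytes: int
    used_bytes: int = 0
    # ACL: object key -> set of user names allowed to read that object
    acls: dict = field(default_factory=dict)


class StorageAdmin:
    """One control point for tenants, quotas, and ACLs (illustrative only)."""

    def __init__(self):
        self.tenants = {}

    def add_tenant(self, name, quota_bytes):
        self.tenants[name] = Tenant(name, quota_bytes)

    def put(self, tenant, key, size, readers):
        """Record a new object if it fits the tenant's quota; overwrites are not handled."""
        t = self.tenants[tenant]
        if t.used_bytes + size > t.quota_bytes:
            return False  # quota exceeded, write rejected
        t.used_bytes += size
        t.acls[key] = set(readers)
        return True

    def can_read(self, tenant, key, user):
        """A user may read an object only if listed in that object's ACL."""
        return user in self.tenants[tenant].acls.get(key, set())
```

The same quota and ACL checks would apply whether the object arrived through an object, file, or block interface, which is the point of keeping management in one place.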
By Gary Ogasawara
Gary Ogasawara is the VP Engineering at Gemini Mobile Technologies. He has worked on large-scale mail systems for service providers and other high-performance, high-volume software systems. Gemini’s Cloudian™ product is an S3-compatible storage software package.