EasyHSM
| Developer(s) | General Atomics |
|---|---|
| Initial release | February 1, 2017 |
| Stable release | 1.0 / February 1, 2017 |
| Preview release | 1.1 / March 1, 2017 |
| Written in | C |
| Engine | |
| Operating system | Linux |
| Platform | X86-64 |
| Type | Hierarchical Storage Management Software |
| License | Proprietary commercial software |
| Website | www |
EasyHSM is hierarchical storage management (HSM) software for IBM Spectrum Scale[1] (formerly known as GPFS) that migrates infrequently accessed data from high-speed Tier 1 storage to Tier 2 storage, either on-premises or in the cloud. Applications and users continue to see a standard POSIX file system (accessed through GPFS, NFS, or SMB) no matter where the data is stored, with no change to user or program access.
EasyHSM coordinates transparent migration of file contents in the GPFS namespace from GPFS-controlled flash or disk onto external storage and back. File contents are always served from Tier 1 storage; migrated data is brought back there when needed. Moreover, access to migrated files is still regulated by normal GPFS file permissions (i.e., ACLs).
EasyHSM was designed to support multiple Tier 2 storage technologies. It currently supports NFS storage systems, S3 object stores (both Amazon S3 and other S3-compatible solutions), and Google Cloud Storage.
Software design
EasyHSM is designed to work as part of the GPFS DMAPI infrastructure.
The core functionality of EasyHSM is the DMAPI-based daemon.
- When an application tries to access a file that is currently not on Tier 1 storage, GPFS intercepts the request, sends a DMAPI message to the EasyHSM daemon, and waits for an answer. On receiving the DMAPI message, the EasyHSM daemon retrieves the data from the appropriate Tier 2 storage, and notifies GPFS that the data is ready. At this point GPFS can now serve the data to the requesting application.
- Note that no DMAPI traffic is generated for access to files already on Tier 1 storage.
- Similarly, when a file is being deleted in GPFS, a DMAPI message is sent to EasyHSM, so the copy in external storage can be deleted.
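The exchange described in the list above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDaemon`, `ToyFileSystem`, and their methods are hypothetical names invented for illustration, plain dictionaries stand in for the storage tiers, and direct method calls stand in for DMAPI messages. It does not use the real DMAPI interface or reflect how EasyHSM is implemented.

```python
class ToyDaemon:
    """Hypothetical stand-in for the HSM daemon (not EasyHSM's real interface)."""

    def __init__(self, tier2):
        self.tier2 = tier2      # path -> bytes held on external storage
        self.events_seen = 0    # number of events delivered to the daemon

    def on_read(self, path):
        # Fetch the data from Tier 2 so the file system can serve it.
        self.events_seen += 1
        return self.tier2[path]

    def on_destroy(self, path):
        # The file is being deleted: drop the external copy as well.
        self.events_seen += 1
        self.tier2.pop(path, None)


class ToyFileSystem:
    """Hypothetical model of the file system side of the exchange."""

    def __init__(self, daemon):
        self.daemon = daemon
        self.tier1 = {}         # path -> bytes resident on Tier 1
        self.stubs = set()      # paths whose data lives only on Tier 2

    def read(self, path):
        if path in self.stubs:
            # Data is not resident: ask the daemon and wait for the answer.
            self.tier1[path] = self.daemon.on_read(path)
            self.stubs.discard(path)
        # Resident files are served without involving the daemon at all.
        return self.tier1[path]

    def unlink(self, path):
        # Simplification: this model notifies the daemon on every delete.
        self.daemon.on_destroy(path)
        self.tier1.pop(path, None)
        self.stubs.discard(path)
```

In this model, reading a resident file leaves `events_seen` unchanged, while the first read of a stub increments it once and later reads of the same file do not.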
Additionally, EasyHSM provides command line tools for explicit movement of data between Tier 1 and Tier 2 storage. These tools will notify the GPFS DMAPI layer that the data location for the affected file has changed, thus either enabling or disabling DMAPI messages on access.
Lifecycle of a file in EasyHSM
The lifecycle of a file managed by EasyHSM is described below:
- All files managed by EasyHSM are born as normal GPFS files, with their contents held on GPFS-managed disk.
- At some point in time, a copy of the file is made on external storage; this process is known as pre-migrate. At this point the contents of the file exist as two identical copies, one on GPFS-owned disks and one on external storage. Any access to the file content is served directly from GPFS-owned disks.
- Sometime later, the data on GPFS-owned disks is deleted, but from the user's point of view the file is still the same; this process is known as migration, and the file in this state is known as an HSM stub. The contents of the file still exist in external storage.
- When a user needs the contents of the file, the contents of the file are copied from the external storage onto the GPFS-owned disks; this process is known as a recall. This process can happen either:
- Transparently: An HSM daemon monitors access to stubbed files, intercepts read requests and delays them until the data is again on GPFS-owned disks.
- Explicitly: An administrator can explicitly recall the contents of select files to GPFS-owned disks, in expectation of user access in the near future. Note: This approach is slightly more efficient.
- If a file is modified after being pre-migrated, its contents are marked as dirty, since the data on GPFS-owned disks and on external storage differ. The file has to be pre-migrated again in order for the two copies to be back in sync. The copy on GPFS-owned disks is always the authoritative one.
- If the file in the GPFS namespace is deleted, the copy in external storage is deleted, too. EasyHSM allows an HSM file to be marked as archival, in which case the external copy is preserved when the HSM stub is removed.
- If so desired, the link between the file in GPFS and the external copy can be broken; this process is known as unmigrate. If needed, the contents of the file are first recalled onto GPFS-owned disks. The same considerations as with the file removal above apply to the external copy.
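The lifecycle above can be read as a small state machine. The following is a minimal sketch, not EasyHSM's implementation: the state and action names are labels chosen here for illustration, and two transitions are assumptions that the text does not spell out (a recalled file is treated as pre-migrated because both copies exist again, and modifying a stub is treated as a transparent recall followed by the change).

```python
from enum import Enum


class State(Enum):
    RESIDENT = "resident"        # data only on GPFS-owned disks
    PREMIGRATED = "premigrated"  # identical copies on both tiers
    MIGRATED = "migrated"        # HSM stub, data only on external storage
    DIRTY = "dirty"              # modified after pre-migration


TRANSITIONS = {
    (State.RESIDENT, "premigrate"): State.PREMIGRATED,
    (State.DIRTY, "premigrate"): State.PREMIGRATED,
    (State.PREMIGRATED, "migrate"): State.MIGRATED,
    (State.MIGRATED, "recall"): State.PREMIGRATED,   # assumption, see lead-in
    (State.RESIDENT, "modify"): State.RESIDENT,
    (State.PREMIGRATED, "modify"): State.DIRTY,
    (State.DIRTY, "modify"): State.DIRTY,
    (State.MIGRATED, "modify"): State.DIRTY,         # assumption, see lead-in
    (State.PREMIGRATED, "unmigrate"): State.RESIDENT,
    (State.DIRTY, "unmigrate"): State.RESIDENT,
    (State.MIGRATED, "unmigrate"): State.RESIDENT,   # recalled first if needed
}


def step(state, action):
    """Return the state that follows `action`, or raise if it is not modelled."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"{action!r} is not modelled from state {state.value}"
        ) from None


def run(actions, start=State.RESIDENT):
    """Apply a sequence of actions to a newly created (resident) file."""
    state = start
    for action in actions:
        state = step(state, action)
    return state
```

For example, `run(["premigrate", "modify"])` ends in `State.DIRTY`, and a further `"premigrate"` brings the two copies back in sync. File deletion and the archival flag are left out of this sketch.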
Setup and administration
EasyHSM requires a one-time setup procedure to select and configure the external storage to use as the second tier. From that point on, EasyHSM runs invisibly in the background.
(Pre-)migration can be handled either manually or through IBM Spectrum Scale's ILM Policy Engine,[2] the latter being much more convenient.
Deleted files in Tier 1 are automatically deleted in Tier 2 storage, unless marked archival, and changed files in Tier 1 and Tier 2 storage are synchronized during pre-migration.
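To make the idea of selecting infrequently accessed data concrete, here is a minimal sketch of an age-based selection rule. It is not the ILM Policy Engine, which uses its own rule language, and it is not an EasyHSM tool: the function name `select_cold_files` and its parameters are hypothetical, and the sketch simply walks a directory tree and compares each file's last access time against a cutoff.

```python
import os
import time


def select_cold_files(root, min_idle_days, now=None):
    """Return sorted paths under `root` not accessed for `min_idle_days` days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # 86400 seconds per day
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_atime is the last access time reported by the file system.
            if os.stat(path).st_atime <= cutoff:
                cold.append(path)
    return sorted(cold)
```

A manual workflow could feed such a list to the explicit migration tools, whereas a policy engine evaluates comparable conditions inside the file system. Note that access times are only meaningful if the file system is mounted so that they are updated.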
Scalable multi-node parallel transfer
EasyHSM migration and recall can be distributed over multiple nodes in order to increase throughput.
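The source does not say how EasyHSM divides work between nodes. As one possible illustration, the sketch below uses a greedy largest-first heuristic that always hands the next file to the least-loaded node; the function name `assign_to_nodes` and the node names are hypothetical.

```python
import heapq


def assign_to_nodes(file_sizes, nodes):
    """Greedily assign files (largest first) to the currently least-loaded node.

    file_sizes: dict mapping path -> size in bytes
    nodes:      list of node names
    Returns a dict mapping node name -> list of assigned paths.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    heap = [(0, node) for node in nodes]  # (bytes assigned so far, node)
    heapq.heapify(heap)
    plan = {node: [] for node in nodes}
    # Sort by descending size; ties are broken by path for determinism.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        plan[node].append(path)
        heapq.heappush(heap, (load + size, node))
    return plan
```

Balancing by bytes rather than by file count keeps one node from being stuck with all the large files, which matters when throughput is the goal.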
Reverse migration
In order to facilitate the retirement of legacy storage systems, EasyHSM can work together with Nirvana to create HSM stubs in a target GPFS filesystem, pointing to the original files on NFS legacy storage, while preserving file ownership and ACLs.
Nirvana is responsible for scanning the external storage system. EasyHSM tools then take that information and create the HSM stubs in GPFS. No data movement happens at this point, but all the data in external storage is immediately available to GPFS applications and users.
Users and administrators can then use EasyHSM's explicit recall, or on-use recall, to migrate the actual data from the external storage to GPFS. EasyHSM can also be used to monitor how much data has been migrated and how much still exists only on external storage.
Note that after the HSM-migration, the external storage should be considered read-only. Changes to data and/or metadata in external storage are not supported.
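The reverse-migration workflow can be sketched as follows: scan records become stubs without any data movement, and progress is then tracked by total size. This is an illustrative model only. The record fields (`path`, `size`, `owner`) and the names `Entry`, `create_stubs`, and `progress` are assumptions, not the actual output format of Nirvana or the interface of the EasyHSM tools.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    size: int               # bytes, as reported by the scan
    owner: str              # ownership is preserved on the stub
    recalled: bool = False  # True once the data is on GPFS-owned disks


def create_stubs(scan_records):
    """Create one stub per scanned file; no file data is copied here."""
    return {
        r["path"]: Entry(r["path"], r["size"], r["owner"])
        for r in scan_records
    }


def progress(stubs):
    """Return (bytes already recalled, bytes still only on external storage)."""
    total = sum(e.size for e in stubs.values())
    done = sum(e.size for e in stubs.values() if e.recalled)
    return done, total - done
```

Immediately after `create_stubs`, `progress` reports zero bytes recalled, and the first figure grows as files are recalled explicitly or on use.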
Main competitors
The main competitor is the Transparent Cloud Tiering feature in GPFS. The main differentiators of EasyHSM are the ability to use non-cloud Tier 2 storage resources, the absence of a reconciliation[3] process, and a simplified setup.
References
This article "EasyHSM" is from Wikipedia. The list of its authors can be seen in its historical. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
