Manifest-based snapshots in distributed computing environments
Inventors
Hsieh, Jonathan Ming-Cyn • Bertozzi, Matteo
Assignees
Publication Number
US-12007846-B2
Publication Date
2024-06-11
Expiration Date
2034-10-29
Abstract
Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which the data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. Additionally, a log roll approach to creating the snapshot is also disclosed in which log files are marked. The replaying of log entries can reduce the probability of causal inconsistency in the snapshot.
Core Innovation
The invention provides scalable architectures, systems, and services for creating manifest-based snapshots in distributed computing environments, particularly in Hadoop-based cloud computing platforms. Upon receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes storing the data object and creates a snapshot manifest representing the snapshot. This manifest includes a file listing multiple file names and reference information for locating these files within the distributed database system.
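As an illustration of the manifest described above, the following is a minimal sketch of a single file that lists file names together with reference information for locating them. The class names, field names, JSON encoding, and path-style references are all assumptions made for this example; the patent summary does not specify a format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class FileEntry:
    """One file listed in the manifest (hypothetical layout)."""
    name: str        # file name as listed in the manifest
    reference: str   # information used to locate the file in the cluster


@dataclass
class SnapshotManifest:
    """A single file describing a snapshot: names plus locating references."""
    snapshot_name: str
    data_object: str                      # for example, a table name
    files: list = field(default_factory=list)

    def add_file(self, name: str, reference: str) -> None:
        self.files.append(FileEntry(name, reference))

    def to_json(self) -> str:
        # asdict() recurses into the nested FileEntry dataclasses.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Usage: build a manifest for a hypothetical table and serialize it.
manifest = SnapshotManifest("snap-001", "orders")
manifest.add_file("hfile-a", "region-1/cf/hfile-a")
manifest.add_file("hfile-b", "region-2/cf/hfile-b")
serialized = manifest.to_json()
```

Because the manifest only records names and references, creating it does not require copying the underlying data files.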
The snapshot creation can be performed without disrupting ongoing I/O operations, enabling an online mode where various region servers operate as directed by the master node. Additionally, a log roll approach is employed whereby log files are marked, ensuring causal consistency by replaying log entries during snapshot creation and restoration.
The problem addressed is the inefficiency and high latency of existing backup methods, such as batch MapReduce jobs for exporting, copying, and importing tables in distributed systems like HBase and Hadoop. These prior approaches impose substantial workload impact and latency. Therefore, a need exists for a more efficient approach that allows snapshot creation, cloning, exporting, and restoration with minimal impact on existing workloads.
Claims Coverage
The independent claims cover methods, systems, and storage media for manifest-based snapshot management in distributed computing platforms. The inventive features extracted from those claims are summarized below.
Creation of snapshot manifest portions by region servers
Each region server associated with a data node receives a request from a master node to create a corresponding portion of a snapshot manifest. Each portion corresponds to a partition of a data object stored locally and includes a file listing multiple file names and reference information for locating these files. The master node combines these portions to form the full snapshot manifest.
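The division of work in this feature can be sketched as follows: each region server builds a portion covering only its local partition, and the master combines the portions. This is a simplified in-process model under stated assumptions; the class names, method names, and dictionary layout are hypothetical, and a real system would exchange these requests over the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestPortion:
    region: str    # partition of the data object held by one region server
    files: tuple   # (file name, reference) pairs for that partition


class RegionServer:
    def __init__(self, region, local_files):
        self.region = region
        self.local_files = dict(local_files)   # file name -> reference

    def create_portion(self):
        # List only the files of the locally stored partition.
        return ManifestPortion(self.region, tuple(sorted(self.local_files.items())))


class Master:
    def __init__(self, region_servers):
        self.region_servers = list(region_servers)

    def create_snapshot_manifest(self, snapshot_name):
        # Request a portion from every region server, then combine them.
        portions = [rs.create_portion() for rs in self.region_servers]
        return {
            "snapshot": snapshot_name,
            "regions": {p.region: list(p.files) for p in portions},
        }


# Usage: two region servers, each holding one partition of the table.
servers = [
    RegionServer("region-1", {"hfile-a": "region-1/cf/hfile-a"}),
    RegionServer("region-2", {"hfile-b": "region-2/cf/hfile-b"}),
]
manifest = Master(servers).create_snapshot_manifest("snap-001")
```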
Flushing memory before snapshot portion creation
Region servers flush their memory prior to creating the snapshot manifest portion, ensuring data consistency for the snapshot.
Dropping markers in log files to indicate snapshot positions
The snapshot portion creation includes dropping markers in log files to indicate relevant positions, supporting causal consistency.
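The two preceding features, flushing memory and dropping a log marker, can be sketched together. The model below is a toy region server with an in-memory store, a list of flushed files, and a log. The names `memstore`, `store_files`, and `SNAPSHOT_MARKER` are assumptions for illustration, as is the order of the two steps.

```python
class RegionServer:
    def __init__(self):
        self.memstore = {}      # in-memory edits not yet written to files
        self.store_files = []   # flushed files, modeled as plain dicts
        self.log = []           # log entries in arrival order

    def put(self, key, value):
        self.log.append(("PUT", key, value))
        self.memstore[key] = value

    def flush(self):
        # Persist the in-memory edits as a new file and empty the memory store.
        if self.memstore:
            self.store_files.append(dict(self.memstore))
            self.memstore.clear()

    def create_snapshot_portion(self, snapshot_name):
        self.flush()                                        # flush before snapshotting
        self.log.append(("SNAPSHOT_MARKER", snapshot_name)) # mark the log position
        marker_position = len(self.log) - 1
        # The portion refers to the files that exist at this moment.
        return {"files": list(range(len(self.store_files))),
                "log_position": marker_position}


# Usage: two writes before the snapshot, one after it.
rs = RegionServer()
rs.put("a", 1)
rs.put("b", 2)
portion = rs.create_snapshot_portion("snap-001")
rs.put("c", 3)
```

In this sketch the write of `c` lands after the marker, so it is not part of the snapshot portion, and writes continue without interruption.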
Copying data object partitions using links in cloning
Partitions of the data object are copied into new directories on data nodes when cloning tables based on snapshots. These copies consist of links to original partitions rather than actual data, maintaining validity even if original files are moved.
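One way such links could remain valid after the original files move is to resolve each link against more than one candidate location. That resolution scheme is an assumption made for this sketch, not something the summary states, and the `Link` class, the directory names, and the dictionary standing in for a file system are all hypothetical.

```python
class Link:
    """Reference to a file by name rather than a copy of its bytes."""

    def __init__(self, file_name, candidate_dirs):
        self.file_name = file_name
        self.candidate_dirs = list(candidate_dirs)

    def resolve(self, fs):
        # Try each known location in order; the file may have been moved.
        for directory in self.candidate_dirs:
            path = f"{directory}/{self.file_name}"
            if path in fs:
                return fs[path]
        raise FileNotFoundError(self.file_name)


def clone_partition(manifest_files, source_dir, archive_dir):
    # The clone directory holds links only; no data is duplicated.
    return {name: Link(name, [source_dir, archive_dir]) for name in manifest_files}


# Usage: a dict of path -> bytes stands in for the distributed file system.
fs = {"tables/orders/region-1/hfile-a": b"row data"}
clone = clone_partition(["hfile-a"], "tables/orders/region-1", "archive/orders/region-1")
before = clone["hfile-a"].resolve(fs)

# Simulate the original file being moved to the archive directory.
fs["archive/orders/region-1/hfile-a"] = fs.pop("tables/orders/region-1/hfile-a")
after = clone["hfile-a"].resolve(fs)
```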
Avoiding data modification except during merges or splits
Modification of data object partitions is avoided except when merging or splitting files, preserving snapshot integrity.
Restoration of data objects to previous states using snapshots
Region servers receive requests to restore data objects to prior states based on snapshots by copying the current partitions into archive directories and updating the data partitions to previous states.
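The restore step can be sketched for a single partition as two phases: copy the current files into an archive directory, then update the partition so that it holds exactly the files the snapshot lists. The function name, the directory layout, and the use of a dictionary as a file system are assumptions for illustration only.

```python
def restore_partition(fs, table_dir, archive_dir, snapshot_file_names):
    """Restore one partition of an in-memory file system (dict path -> bytes)."""
    wanted = set(snapshot_file_names)
    current = {p.rsplit("/", 1)[1] for p in fs if p.startswith(table_dir + "/")}

    # 1. Copy the current partition into the archive directory.
    for name in current:
        fs[f"{archive_dir}/{name}"] = fs[f"{table_dir}/{name}"]

    # 2. Update the partition to the prior state: drop files the snapshot
    #    does not list and bring back listed files from the archive.
    for name in current - wanted:
        del fs[f"{table_dir}/{name}"]
    for name in wanted - current:
        fs[f"{table_dir}/{name}"] = fs[f"{archive_dir}/{name}"]
    return fs


# Usage: the snapshot listed hfile-a and hfile-b; since then hfile-a was
# archived and hfile-c was written.
table_dir = "tables/orders/region-1"
archive_dir = "archive/orders/region-1"
fs = {
    f"{table_dir}/hfile-b": b"kept",
    f"{table_dir}/hfile-c": b"newer",
    f"{archive_dir}/hfile-a": b"old",
}
restore_partition(fs, table_dir, archive_dir, ["hfile-a", "hfile-b"])
live_files = sorted(p for p in fs if p.startswith(table_dir + "/"))
```

Nothing is deleted outright in this sketch: files written after the snapshot stay reachable in the archive directory.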
Adding log entries in response to detected causal inconsistencies
Upon detection of causal inconsistencies, region servers add entries to log files indicating memory flushes to ensure consistency during snapshot operations.
Distributed computer system architecture for snapshot management
The system comprises a master node and multiple slave nodes, where the slave nodes implement region servers linked to data nodes. On request from the master node, the slave nodes create snapshot manifest portions and transmit them back to the master node, which aggregates them into the snapshot manifest.
Computer-readable storage medium executing snapshot manifest operations
Instructions stored in a non-transitory storage medium cause a system to implement region servers, receive snapshot portion creation requests from a master node, create these portions listing filenames and references, and transmit portions back for manifest formation.
The claims collectively cover a distributed approach to creating, managing, cloning, and restoring manifest-based snapshots through coordination between master nodes and region servers, including techniques ensuring causal consistency and low-latency snapshot operations in distributed computing environments.
Stated Advantages
Facilitates efficient and effective data management in distributed, cloud computing environments.
Allows creation, maintenance, and usage of snapshots to export, clone, restore, and perform other data operations with minimal impact on existing workloads.
Enables snapshot creation without disrupting I/O operations by supporting online snapshot mode.
Reduces utilization of the name node during snapshot and restore operations through manifest files listing file locations.
Ensures causal consistency in snapshots through log roll approaches and log file markers.
Improves backup, cloning, exporting, and restoration performance compared to existing high-latency MapReduce-based approaches.
Documented Applications
Backup of large tables in distributed databases by creating point-in-time snapshots.
Exporting data between source and target clusters efficiently without high-latency table manipulation commands.
Cloning tables for data replication or testing by using snapshot manifests and linking file references.
Restoring tables to previous states after errors or data loss by replaying logs and updating data partitions per snapshots.
Managing distributed cloud-computing platforms based on Hadoop, especially HBase environments.