Manifest-based snapshots in distributed computing environments

Inventors

Hsieh, Jonathan Ming-Cyn; Bertozzi, Matteo

Assignees

Cloudera Inc

Publication Number

US-11768739-B2

Publication Date

2023-09-26

Expiration Date

2034-10-29


Abstract

Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which the data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file that lists multiple file names and reference information for locating those files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. Additionally, a log roll approach to creating the snapshot is disclosed in which log files are marked. Replaying the log entries can reduce the probability of causal inconsistency in the snapshot.

Core Innovation

The invention provides scalable architectures, systems, and services for creating manifest-based snapshots in distributed computing environments. In essence, a master node, upon receiving a request to snapshot a data object, identifies multiple slave nodes where the data is stored and creates a snapshot manifest representing the snapshot. This snapshot manifest includes a file listing multiple file names and reference information for locating these files in the distributed database system. This snapshot creation can occur without disrupting I/O operations, allowing an online mode of creation directed by the master node to various region servers.
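As an illustration only, a manifest of this kind can be sketched in Python. This is not the patented implementation: the class names `FileRef` and `SnapshotManifest`, the field names, the JSON encoding, and the path layout are all assumptions made for the example. The summary above only states that the manifest is a file listing file names plus reference information for locating them.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class FileRef:
    # Hypothetical fields: a file name plus where to find it.
    name: str
    region: str
    path: str


@dataclass
class SnapshotManifest:
    snapshot_name: str
    table: str
    files: List[FileRef] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize the whole manifest as the contents of a single file.
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "SnapshotManifest":
        raw = json.loads(text)
        files = [FileRef(**f) for f in raw["files"]]
        return SnapshotManifest(raw["snapshot_name"], raw["table"], files)

    def locate(self, name: str) -> str:
        # Resolve a listed file name to its storage path.
        for ref in self.files:
            if ref.name == name:
                return ref.path
        raise KeyError(name)
```

Because every file reference lives in one manifest, a reader can locate all snapshot files from a single read instead of listing many directories, which is consistent with the stated goal of reducing name node utilization.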

Further, the invention discloses a log roll approach to creating snapshots in which log files are marked and log entries are replayed to ensure causal consistency in the snapshot. This matters most for online snapshot creation, where data is distributed and region servers capture their state at slightly different times.

The invention addresses the inefficiency and high latency of backing up or cloning data in distributed database systems such as HBase on Hadoop. Traditional approaches require executing multiple MapReduce jobs or table manipulation commands that substantially impact workloads. A more efficient mechanism is needed to manage snapshot creation, export, cloning, and restoration with minimal impact on live operations while maintaining causal consistency.

Claims Coverage

The patent includes three independent claims: one method claim (a computer-implemented method) and two apparatus claims (a computer system and a non-transitory computer-readable storage medium), all relating to manifest-based snapshot creation and table cloning in distributed computing platforms.

Cloning a table based on a snapshot

Accessing a snapshot manifest that represents a snapshot of a data object stored in a distributed computing platform, where each data node stores a partition of the data object; then cloning the table by creating a copy of the table metadata and copying relevant partitions into a new directory as links to the actual data partitions that remain valid even if the original data partitions are moved.
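The link behavior described above can be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the claimed method: the `Store` class, the `/data` and `/archive` path layout, and the function names are hypothetical, and copying the table metadata is omitted. The point it shows is that a link records a file's identity rather than one fixed path, so it still resolves after the file is moved.

```python
from typing import Dict, List, Set


class Store:
    """In-memory stand-in for a distributed file system (assumption)."""

    def __init__(self) -> None:
        self.paths: Set[str] = set()

    def move(self, src: str, dst: str) -> None:
        self.paths.remove(src)
        self.paths.add(dst)


def resolve_link(store: Store, table: str, region: str, name: str) -> str:
    # A link records the file's identity, not one fixed path, so it is
    # resolved against each location the file may live in.
    candidates = [
        f"/data/{table}/{region}/{name}",
        f"/archive/{table}/{region}/{name}",
    ]
    for path in candidates:
        if path in store.paths:
            return path
    raise FileNotFoundError(name)


def clone_table(manifest: List[dict], new_table: str) -> Dict[str, dict]:
    # Copy only references: each entry in the clone is a link record
    # pointing back at the source table's file; no data bytes are copied.
    return {
        f"/data/{new_table}/{ref['region']}/{ref['name']}": dict(ref)
        for ref in manifest
    }
```

Cloning this way costs one small record per file regardless of file size, which is why it avoids the heavy copy jobs the summary describes.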

Creating snapshot manifest via combined region server responses

Combining responses from region servers associated with the partitions of the data object to create the snapshot manifest, supporting online creation with continued client I/O operations or offline creation that disables access and examines the namespace to determine table partitions.
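A rough sketch of the combining step, assuming each region server reports the list of file names it holds for the table. The function name `build_manifest`, the response shape, and the rule of failing when a region does not respond are assumptions for illustration; the summary does not specify them.

```python
from typing import Dict, List


def build_manifest(snapshot: str, responses: Dict[str, List[str]],
                   expected_regions: List[str]) -> dict:
    # Refuse to build a manifest if any expected region did not report.
    missing = [r for r in expected_regions if r not in responses]
    if missing:
        raise RuntimeError(f"no response from regions: {missing}")
    # Flatten the per-region responses into one listing, in a stable order.
    files = [
        {"region": region, "name": name}
        for region in sorted(responses)
        for name in responses[region]
    ]
    return {"snapshot": snapshot, "files": files}
```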

Maintaining snapshot validity by archiving and updating references

Creating archived copies of data files (responses) before merging or splitting partitions and updating snapshot references to point to their archived copies to preserve snapshot integrity.
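The archive-then-repoint idea can be sketched like this. It is a simplified model, not the patented mechanism: the sets standing in for live and archive storage, the `/archive` prefix, and the function name are hypothetical.

```python
from typing import Dict, List, Set


def archive_and_repoint(live: Set[str], archive: Set[str],
                        snapshot_refs: Dict[str, str],
                        doomed: List[str]) -> None:
    for path in doomed:
        archived = "/archive" + path
        archive.add(archived)   # keep a copy before the original goes away
        live.discard(path)      # the merge or split may now drop the original
        # Point every snapshot reference to the archived copy.
        for name, ref in list(snapshot_refs.items()):
            if ref == path:
                snapshot_refs[name] = archived
```

After the call, the snapshot never references a path that a merge or split has removed, so the snapshot stays readable.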

Restoring and rolling back tables with causal consistency

Restoring or rolling back a table to a previous state based on the snapshot, including clearing the table and replaying log entries to account for all relevant data, thereby reducing causal inconsistency.
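A minimal sketch of restore with log replay, assuming log entries carry a sequence id and that the snapshot marker is expressed as a sequence id. The `LogEntry` shape, the `marker_seq` parameter, and the use of a plain dictionary as the table are assumptions for the example, not details from the patent.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, str]  # (sequence id, row key, value); assumed shape


def restore_table(table: Dict[str, str], snapshot_rows: Dict[str, str],
                  log: List[LogEntry], marker_seq: int) -> None:
    table.clear()                # drop the current contents
    table.update(snapshot_rows)  # load the data captured in files
    # Replay edits that existed only in the log when the snapshot was
    # taken, in sequence order, ignoring anything after the marker.
    for seq, row, value in sorted(log):
        if seq <= marker_seq:
            table[row] = value
```

Replaying up to the marker means writes that had been logged but not yet flushed to data files are still reflected in the restored table.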

Detecting and resolving causal inconsistency

Detecting cases where data input into the platform is unaccounted for in the snapshot and resolving this by requesting respective region servers to roll or mark logs to ensure snapshot consistency.
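One way to picture the detection step, offered as an assumption rather than the claimed method: compare, per region, the highest sequence id the region has acknowledged with the highest sequence id the snapshot captured. Regions where the snapshot is behind are the ones that would be asked to roll or mark their logs. The sequence-id bookkeeping and the function name are hypothetical.

```python
from typing import Dict, List


def regions_needing_log_roll(acked: Dict[str, int],
                             captured: Dict[str, int]) -> List[str]:
    # A region is behind when the highest write it acknowledged is newer
    # than the highest write the snapshot captured for it.
    return sorted(
        region for region, seq in acked.items()
        if captured.get(region, -1) < seq
    )
```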

Backing up data efficiently without table manipulation commands

Performing backup via a MapReduce job based on the snapshot manifest, which bypasses typical table manipulation commands and reduces latency.

The inventive features cover the method and system of creating manifest-based snapshots through coordinated master and slave node operations, cloning tables using snapshot manifests with linked partitions that remain consistent despite data movement, online and offline snapshot creation modes, managing data consistency through archiving and log replay, and efficient backup without disruptive table commands.

Stated Advantages

Enables efficient and effective data management for distributed cloud environments by facilitating export, cloning, restoration, and other data operations with minimal impact on existing workloads.

Allows snapshot creation without disrupting I/O operations by supporting online snapshot creation directed by the master node to various region servers.

Reduces utilization of name nodes during snapshot and restore operations by using manifest files that list the relevant files.

Ensures causal consistency in snapshots via a log roll approach where markers in log files and replaying log entries ensure all related data is accounted for during restoration.

Provides low-latency data backup, cloning, and restoration by bypassing normal table manipulation commands and employing manifest-based snapshots.

Documented Applications

Creating, maintaining, and using snapshots in HBase-based Hadoop clusters for backup, cloning, restoration, and exporting data to improve data quality and availability.

Bootstrapping data replication and recovery from user errors in distributed databases using manifest-based snapshots.

Supporting snapshot creation both in offline mode (disabling table access) and online mode (allowing continued I/O operations) in distributed cloud-computing platforms like Hadoop.

Using snapshot manifests for efficient copying of data between source and target clusters in distributed computing environments without executing multiple MapReduce jobs with high latency.
