System and method for hybrid kernel and user-space checkpointing using a character device
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
A system, method, and computer readable medium for hybrid kernel-mode and user-mode checkpointing of multi-process applications using a character device. The computer readable medium includes computer-executable instructions for execution by a processing system. A multi-process application runs on primary hosts and is checkpointed by a checkpointer comprised of a kernel-mode checkpointer module and one or more user-space interceptors providing barrier synchronization, checkpointing thread, resource flushing, and an application virtualization space. Checkpoints may be written to storage and the application restored from said stored checkpoint at a later time. Checkpointing is transparent to the application and requires no modification to the application, operating system, networking stack or libraries. In an alternate embodiment the kernel-mode checkpointer is built into the kernel.
Core Innovation
The invention provides a hybrid kernel-mode and user-space checkpointing approach for multi-process applications. A kernel module performs transparent checkpoint capture by capturing memory-page and kernel-state information during application execution, while user-space interceptors coordinate execution and checkpoint behavior through barrier synchronization and other controls.
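As an illustration of barrier synchronization only, the following user-space sketch makes every worker thread stop at a barrier before a checkpoint step runs, then releases them together. It is not the patented interceptor mechanism. The function name `run_with_barrier` and the log strings are hypothetical, and threads stand in for application processes.

```python
import threading


def run_with_barrier(num_workers):
    """Run workers that all stop at a barrier before a checkpoint is taken."""
    log = []
    log_lock = threading.Lock()

    def take_checkpoint():
        # Barrier action: runs once, in one thread, after every worker has
        # arrived and before any of them is released.
        log.append("checkpoint")

    barrier = threading.Barrier(num_workers, action=take_checkpoint)

    def worker(idx):
        with log_lock:
            log.append(f"arrived-{idx}")
        barrier.wait()  # synchronization point: block until all have arrived
        with log_lock:
            log.append(f"resumed-{idx}")

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(run_with_barrier(4))
```

In this sketch the checkpoint entry always falls between the last arrival and the first resumption, which is the ordering a barrier is meant to guarantee.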
A character-device checkpointer interface is used for checkpoint creation. Its read function returns the memory pages used by the applications. The checkpointer reads those memory locations to create checkpoints, and the captured memory pages and kernel state are written to storage as checkpoint data for later restore.
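The page-by-page read behavior can be modeled in user space. The sketch below is a simplified model rather than kernel code: `PageDevice`, `write_checkpoint`, and the 4096-byte `PAGE_SIZE` are assumptions introduced for illustration. Each read returns one page and then moves the device pointer to the next page, and the reader loops until the device reports no more data.

```python
import io

PAGE_SIZE = 4096  # assumed page size for this sketch


class PageDevice:
    """Hypothetical model of a device whose read returns one page and then
    advances the device pointer to the next page."""

    def __init__(self, memory):
        self._memory = memory
        self._page_index = 0  # the "device pointer", counted in pages

    def read(self):
        start = self._page_index * PAGE_SIZE
        page = self._memory[start:start + PAGE_SIZE]
        if page:
            self._page_index += 1  # forward the pointer after a read
        return page  # empty bytes signals end of data


def write_checkpoint(device, storage):
    """Copy every page from the device to storage; return the page count."""
    pages = 0
    while True:
        page = device.read()
        if not page:
            break
        storage.write(page)
        pages += 1
    return pages


if __name__ == "__main__":
    memory = bytes(range(256)) * 40  # 10240 bytes of stand-in memory
    storage = io.BytesIO()
    print(write_checkpoint(PageDevice(memory), storage))
```

Because the device advances its own pointer, the caller needs no offset bookkeeping; it simply reads until an empty result comes back.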
An Application Virtualization Space (AVS) virtualizes operating system constructs such as PIDs, TIDs, and resource identities across restore. During restore, the initial process recreates the process hierarchy and remaps the AVS resources so that checkpointed applications, including shared/global state and process relationships, are restored while maintaining transparent operation without modifying the application, OS, networking stack, or libraries.
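The remapping idea can be sketched as a two-way table between the identifiers an application sees and the identifiers the operating system assigns. This is a minimal illustration under stated assumptions: the class `VirtualPidSpace` and its methods are hypothetical names, and only PIDs are modeled, not TIDs or other resources.

```python
class VirtualPidSpace:
    """Hypothetical table mapping stable virtual PIDs to real PIDs."""

    def __init__(self):
        self._virt_to_real = {}
        self._real_to_virt = {}
        self._next_virtual = 1

    def register(self, real_pid):
        """Assign a new virtual PID to a real PID and return it."""
        virtual_pid = self._next_virtual
        self._next_virtual += 1
        self._virt_to_real[virtual_pid] = real_pid
        self._real_to_virt[real_pid] = virtual_pid
        return virtual_pid

    def to_real(self, virtual_pid):
        return self._virt_to_real[virtual_pid]

    def to_virtual(self, real_pid):
        return self._real_to_virt[real_pid]

    def remap(self, virtual_pid, new_real_pid):
        """After restore, point an existing virtual PID at a new real PID."""
        old_real = self._virt_to_real[virtual_pid]
        del self._real_to_virt[old_real]
        self._virt_to_real[virtual_pid] = new_real_pid
        self._real_to_virt[new_real_pid] = virtual_pid


if __name__ == "__main__":
    avs = VirtualPidSpace()
    vpid = avs.register(4321)  # PID before the checkpoint
    avs.remap(vpid, 9876)      # PID assigned to the recreated process
    print(vpid, avs.to_real(vpid))
```

The application keeps using the same virtual PID before and after restore, while the table absorbs the change in the real PID.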
Claims Coverage
This patent contains four independent claims, centered on synchronization points that coordinate or pause application execution and a checkpointer that creates checkpoints by reading memory pages used by the applications, with a character-device implementation and specific read-function behaviors.
Synchronization point coordinating execution and checkpointer reading memory locations for checkpoints
The system includes one or more instructions comprising a synchronization point for coordinating execution of one or more applications at the synchronization point, and one or more instructions comprising a checkpointer configured to read one or more memory locations used by the one or more applications to create one or more checkpoints, wherein the checkpointer comprises instructions for a read function to include memory pages used by the one or more applications.
Checkpointer device read that forwards device pointer to next page after read
The system includes one or more instructions comprising a checkpointer device configured to read one or more memory locations used by the one or more applications to create one or more checkpoints, wherein the checkpointer device comprises instructions for a read function to include memory pages used by the one or more applications, and wherein the checkpointer device comprises instructions for the CPUs to forward a device pointer to a next page after a read.
Synchronization point for pausing execution with a character device checkpointer
The system includes one or more instructions comprising a synchronization point for pausing execution of the one or more applications at the synchronization point, and one or more instructions comprising a checkpointer device configured to read one or more memory pages and create one or more checkpoints by reading memory pages used by the one or more applications, wherein the checkpointer device is a character device, wherein the checkpointer character device comprises instructions for a read function to include memory pages used by the one or more applications, and wherein the checkpointer character device comprises instructions for the CPUs to forward the checkpointer character device pointer to a next page after a read.
Synchronization point pausing and triggering pause plus character-device checkpointer read
The system includes one or more instructions comprising a synchronization point for pausing execution of the one or more applications at the synchronization point and triggering the one or more applications to pause execution at the synchronization point, and a checkpointer character device comprising instructions for the CPUs for a read function to include memory pages used by the one or more applications, wherein the checkpointer character device comprises instructions for the CPUs to forward a device pointer to a next page after a read.
Across the independent claims, the core coverage is the combination of a synchronization point that coordinates and/or pauses application execution with a checkpointer that creates checkpoints by reading memory pages used by the applications. The checkpointer is implemented as, or accessed through, a character device whose read function includes those memory pages and forwards a device pointer to the next page after a read.
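The pause-and-trigger behavior named in the claims can be illustrated with a small user-space model. It is a sketch only: `SyncPoint`, `demo_pause`, and their methods are hypothetical names, and threads stand in for applications. A controller triggers a pause, each worker blocks the next time it reaches the synchronization point, and the controller proceeds once all workers are paused.

```python
import threading
import time


class SyncPoint:
    """Hypothetical synchronization point that can pause all parties."""

    def __init__(self, parties):
        self._cond = threading.Condition()
        self._pause_requested = False
        self._paused = 0
        self._parties = parties

    def trigger_pause(self):
        with self._cond:
            self._pause_requested = True

    def wait_all_paused(self, timeout=None):
        with self._cond:
            return self._cond.wait_for(
                lambda: self._paused == self._parties, timeout)

    def resume(self):
        with self._cond:
            self._pause_requested = False
            self._cond.notify_all()

    def check(self):
        """Called by application threads at the synchronization point."""
        with self._cond:
            if not self._pause_requested:
                return
            self._paused += 1
            self._cond.notify_all()  # let the controller re-check the count
            self._cond.wait_for(lambda: not self._pause_requested)
            self._paused -= 1


def demo_pause(parties=3):
    sync = SyncPoint(parties)
    done = threading.Event()
    counters = [0] * parties

    def worker(i):
        while not done.is_set():
            counters[i] += 1  # stand-in for application work
            sync.check()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(parties)]
    for t in threads:
        t.start()
    sync.trigger_pause()
    all_paused = sync.wait_all_paused(timeout=5)
    snapshot = list(counters)  # state is stable while workers are paused
    time.sleep(0.05)
    stable = snapshot == list(counters)
    done.set()
    sync.resume()
    for t in threads:
        t.join()
    return all_paused, stable
```

While the workers are held at the synchronization point, their state does not change, which is what makes it safe to read for a checkpoint in this model.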
Stated Advantages
Transparent operation without modifying the application, OS, networking stack, or libraries.
Documented Applications
Checkpointing and later restoring multi-process applications, including maintaining process hierarchy and shared/global state across restore.
Deployment architectures including primary and backup servers, with possible remote storage/network scenarios.