Utilization-aware resource scheduling in a distributed computing cluster

Inventors

Kambatla, Karthik

Assignees

Cloudera Inc

Publication Number

US-11099892-B2

Publication Date

2021-08-24

Expiration Date

2037-05-15


Abstract

Embodiments are disclosed for a utilization-aware approach to cluster scheduling that addresses resource fragmentation and improves cluster utilization and job throughput. In some embodiments, a resource manager at a master node considers the actual usage of running tasks and schedules opportunistic work on underutilized worker nodes. The resource manager monitors resource usage on these nodes and preempts opportunistic containers if the resulting over-subscription becomes untenable. In doing so, the resource manager puts otherwise wasted resources to use while minimizing adverse effects on regularly scheduled tasks.

Core Innovation

The invention disclosed is a utilization-aware approach to resource scheduling in distributed computing clusters, specifically addressing the problem of resource fragmentation and under-utilization caused by over-allocation in existing cluster schedulers. The resource manager at a master node monitors actual resource usage of running tasks and opportunistically schedules additional work on underutilized worker nodes through the allocation of opportunistic containers. These containers utilize underused, previously allocated resources while minimizing adverse effects on regularly scheduled tasks.

The disclosed techniques, generally referred to as utilization-based incremental scheduling (UBIS), enable oversubscription of cluster resources by opportunistically allocating slack resources that are not fully utilized by regular containers. The system enforces a hierarchy where regular containers are first-tier with guaranteed resources, and opportunistic containers are second-tier, subject to preemption if resource usage crosses defined thresholds. UBIS includes adjustable parameters to control resource over-allocation and preemption, allowing the system to balance improved utilization with performance stability.

Claims Coverage

The patent includes multiple independent claims focusing on methods and systems for recouping previously allocated computing resources in a distributed computing cluster by opportunistically scheduling tasks using underutilized resources and managing preemption based on resource usage thresholds.

Opportunistic resource allocation based on actual utilization and thresholds

Allocating opportunistic second-tier resources at a worker node to process tasks when actual computing resource utilization is below a first threshold, where these resources include underutilized computing resources previously allocated and guaranteed to first-tier containers. The first and second thresholds are defined via the parameters Talloc and Tpreempt, which relate to the node's resource capacity.
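The allocation condition can be illustrated with a minimal Python sketch. It assumes Talloc is expressed as a fraction of node capacity and that the amount available for opportunistic work is the gap between that threshold and measured usage; the summary states only that allocation happens when utilization is below the first threshold, so the function name `opportunistic_headroom` and the headroom formula are illustrative assumptions rather than the patented method.

```python
def opportunistic_headroom(capacity: float, actual_usage: float, t_alloc: float) -> float:
    """Return how many resource units could be offered opportunistically.

    Assumptions (not from the source): t_alloc is a fraction of node
    capacity in [0, 1], and headroom is the gap between the threshold
    (t_alloc * capacity) and the measured usage.
    """
    if not 0.0 <= t_alloc <= 1.0:
        raise ValueError("t_alloc must be a fraction between 0 and 1")
    threshold = t_alloc * capacity
    # No headroom once measured usage reaches the allocation threshold.
    return max(0.0, threshold - actual_usage)


# Example: a 16-unit node using 5 units, with Talloc at 75% of capacity.
headroom = opportunistic_headroom(capacity=16.0, actual_usage=5.0, t_alloc=0.75)
```

Under these assumptions, a lower Talloc makes the scheduler more conservative, because less of the measured slack is offered to opportunistic containers.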

Preemptive deallocation of opportunistic resources upon threshold breach

De-allocating opportunistic second-tier resources when actual resource utilization rises above a second threshold to guarantee resources for first-tier containers, wherein deallocation can be initiated by either the master node or the worker node itself.
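A rough Python sketch of the preemption side follows. It assumes Tpreempt is a fraction of node capacity and, as a purely hypothetical policy, preempts the most recently started opportunistic containers first until usage is no longer above the threshold. The summary does not say which containers are chosen, so the ordering and the name `select_preemptions` are assumptions.

```python
from typing import List, Tuple


def select_preemptions(
    capacity: float,
    actual_usage: float,
    t_preempt: float,
    opportunistic: List[Tuple[str, float]],
) -> List[str]:
    """Pick opportunistic containers to preempt on one node.

    opportunistic: (container_id, measured_usage) pairs in start order.
    Assumption: newest containers are preempted first; the source does
    not specify a victim-selection policy.
    """
    limit = t_preempt * capacity
    usage = actual_usage
    victims: List[str] = []
    for container_id, container_usage in reversed(opportunistic):
        if usage <= limit:
            break  # usage is no longer above the preemption threshold
        victims.append(container_id)
        usage -= container_usage
    return victims


# Example: 16-unit node at 15 units of usage, Tpreempt at 75% (12 units).
victims = select_preemptions(16.0, 15.0, 0.75, [("a", 2.0), ("b", 2.0), ("c", 1.0)])
```

Because the check runs on node-local usage figures, the same routine could in principle run on either the master node or the worker node, which matches the claim that either side may initiate deallocation.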

Task opt-in and opt-out controls for opportunistic resource utilization

Determining whether a task allows the use of opportunistic second-tier resources, including receiving indications in task requests to permit or disallow processing with opportunistic resources.
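The opt-in and opt-out control can be sketched as a flag carried on the task request. The `TaskRequest` type, its `allow_opportunistic` field, and the three placement outcomes below are hypothetical names chosen for illustration; the summary confirms only that a request can indicate whether opportunistic resources may be used.

```python
from dataclasses import dataclass


@dataclass
class TaskRequest:
    task_id: str
    demand: float                     # requested resource units
    allow_opportunistic: bool = True  # hypothetical opt-in flag in the request


def choose_placement(request: TaskRequest, guaranteed_free: float,
                     opportunistic_free: float) -> str:
    """Decide how a task could be placed on a node (illustrative only)."""
    if request.demand <= guaranteed_free:
        return "regular"        # fits in unallocated, guaranteed capacity
    if request.allow_opportunistic and request.demand <= opportunistic_free:
        return "opportunistic"  # runs on slack, subject to preemption
    return "wait"               # opted out, or not enough slack


# A task that opts out waits for guaranteed capacity even when slack exists.
decision = choose_placement(TaskRequest("t3", 2.0, allow_opportunistic=False), 1.0, 3.0)
```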

Dynamic and node-specific threshold management

Setting and dynamically adjusting the first and second thresholds (Talloc and Tpreempt) based on information received from worker nodes, and allowing thresholds to vary per worker node and resource type.
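One possible way to hold thresholds that vary per worker node and per resource type is a table with cluster-wide defaults and per-(node, resource) overrides, sketched below in Python. The class names, the fractional representation, and the validation rule that Talloc does not exceed Tpreempt are assumptions made for illustration. The summary does not describe how adjustments are computed, so `update` simply records a new pair.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Thresholds:
    t_alloc: float    # assumed to be a fraction of node capacity
    t_preempt: float  # assumed to be a fraction of node capacity

    def __post_init__(self) -> None:
        # Assumption: allocation threshold should not exceed preemption threshold.
        if not 0.0 <= self.t_alloc <= self.t_preempt <= 1.0:
            raise ValueError("expected 0 <= t_alloc <= t_preempt <= 1")


@dataclass
class ThresholdTable:
    default: Thresholds
    overrides: Dict[Tuple[str, str], Thresholds] = field(default_factory=dict)

    def get(self, node: str, resource: str) -> Thresholds:
        """Return the node- and resource-specific pair, else the default."""
        return self.overrides.get((node, resource), self.default)

    def update(self, node: str, resource: str, t_alloc: float, t_preempt: float) -> None:
        """Record adjusted thresholds, e.g. after new node reports arrive."""
        self.overrides[(node, resource)] = Thresholds(t_alloc, t_preempt)


table = ThresholdTable(default=Thresholds(0.5, 0.75))
table.update("node-1", "cpu", 0.25, 0.5)  # tighten CPU thresholds on one node
```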

Multi-tier opportunistic resource scheduling

Allocating opportunistic resources with more than two priority tiers (e.g., third-tier opportunistic containers) under similar threshold-based allocation and preemption conditions.
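With more than two tiers, a natural reading is that lower-priority tiers are preempted before higher-priority ones. The short Python sketch below orders preemption candidates that way. The tier numbering (1 for regular, 2 and above for opportunistic) and the lowest-priority-first policy are assumptions; the summary says only that additional tiers follow similar threshold-based conditions.

```python
from typing import List, Tuple


def preemption_order(containers: List[Tuple[str, int]]) -> List[str]:
    """Order preemptible containers, lowest priority (highest tier number) first.

    containers: (container_id, tier) pairs; tier 1 is regular and is
    never returned, since its resources are guaranteed.
    """
    preemptible = [(cid, tier) for cid, tier in containers if tier > 1]
    # list.sort is stable, so containers within a tier keep their input order.
    preemptible.sort(key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in preemptible]


order = preemption_order([("r1", 1), ("o2", 2), ("o3", 3), ("o2b", 2)])
```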

Applicability to diverse computing resources

Applying the resource allocation and preemption methods to multiple computing resource types, such as processing, memory, data storage, input/output, or network resources.

System implementation for resource recoupment using utilization-informed scheduling

A system comprising a processor and memory storing instructions to receive utilization information from worker nodes and allocate opportunistic second-tier resources according to utilization thresholds, with monitoring via periodic heartbeat signals to determine preemption timing.
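The heartbeat-driven monitoring loop can be sketched as a handler that classifies each report against the two thresholds. The `Heartbeat` fields, the action names, and the treatment of thresholds as fractions of capacity are illustrative assumptions and not the actual message format of any scheduler.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    node: str
    capacity: float      # total resource units on the node
    actual_usage: float  # measured usage reported by the worker


def on_heartbeat(hb: Heartbeat, t_alloc: float, t_preempt: float) -> str:
    """Classify one utilization report (hypothetical action names)."""
    utilization = hb.actual_usage / hb.capacity
    if utilization > t_preempt:
        return "preempt"   # over-subscription is untenable: free opportunistic resources
    if utilization < t_alloc:
        return "allocate"  # slack exists: opportunistic work may be scheduled
    return "hold"          # between thresholds: neither allocate nor preempt


action = on_heartbeat(Heartbeat("node-1", 16.0, 4.0), t_alloc=0.5, t_preempt=0.75)
```

Keeping a gap between the two thresholds in this sketch gives a band in which the scheduler neither adds nor removes opportunistic work, which would damp oscillation between allocation and preemption.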

The independent claims collectively cover a utilization-aware resource scheduling framework that dynamically allocates and preempts opportunistic resources in a distributed computing cluster based on monitored resource utilization, task choices, and configurable thresholds, enhancing resource utilization while preserving performance guarantees for primary tasks.

Stated Advantages

Improves cluster utilization and job throughput by exploiting underutilized resources through opportunistic scheduling.

Minimizes the adverse effects on regularly scheduled tasks by using a priority hierarchy and preemptive measures.

Allows configurable control over resource oversubscription and preemption aggressiveness through adjustable parameters (Talloc and Tpreempt).

Enables fair sharing of resources among users while allowing tasks to opt out of opportunistic scheduling if necessary.

Documented Applications

Improving resource allocation in distributed computing clusters implementing data-centric programming models such as Apache Hadoop MapReduce and Apache Spark.

Integration in cluster schedulers like Apache Hadoop YARN to allocate containerized resources more efficiently based on actual usage.

Application in heterogeneous resource environments including CPU, memory, storage, network bandwidth, GPUs, and I/O resources.

Supporting batch processing and real-time ad hoc queries on big data via unified distributed computing platforms.
