Cyber vaccine and predictive-malware-defense methods and systems

Inventors

Howard, Michael; Pfeffer, Avi; Dalal, Mukesh; Reposa, Michael

Assignees

Charles River Analytics Inc

Publication Number

US-10848519-B2

Publication Date

2020-11-24

Expiration Date

2038-10-12


Abstract

Methods and systems for Predictive Malware Defense (PMD) are described. The systems and methods can utilize advanced machine-learning (ML) techniques to generate malware defenses preemptively. Embodiments of PMD can utilize models, which are trained on features extracted from malware families, to predict possible courses of malware evolution. PMD captures these predicted future evolutions in signatures of as yet unseen malware variants to function as a malware vaccine. These signatures of predicted future malware “evolutions” can be added to the training set of a machine-learning (ML) based malware detection and/or mitigation system so that it can detect these new variants as they arrive.

Core Innovation

The invention provides systems and methods for predictive malware defense (PMD) that utilize advanced machine-learning techniques to predict and defend against malware attacks before they occur. These systems analyze features extracted from malware families to model and predict the evolution patterns of malware, generating signatures of future, as-yet-unseen malware variants. By anticipating how malware could evolve, these predicted signatures serve as a type of cyber vaccine, enabling detection and mitigation of new threats ahead of their actual deployment by attackers.

The core PMD process involves several phases: feature extraction and reduction from both benign and malicious events, clustering the signatures into families, learning the evolution patterns of those families, and predicting future malicious event signatures. These predicted signatures are then incorporated into the training datasets of machine-learning-based malware classifiers, thereby enhancing the system's ability to recognize and classify both existing and future malware variants, often before such variants are released in the wild.
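As a rough illustration of the prediction phase, the sketch below extrapolates one step past the newest member of a malware family. The source does not say which model learns the evolution patterns, so the linear trend used here is only a stand-in. The function name `predict_next_signature` and the toy two-dimensional signatures are assumptions made for the example.

```python
from typing import List

Signature = List[float]


def predict_next_signature(family: List[Signature]) -> Signature:
    """Extrapolate one step past the newest signature in a family.

    `family` is assumed to be ordered oldest to newest. The mean
    per-feature change between consecutive members is added to the
    newest signature (a simple linear trend, chosen for illustration).
    """
    if len(family) < 2:
        raise ValueError("need at least two signatures to estimate a trend")
    steps = len(family) - 1
    dims = len(family[0])
    # The mean of consecutive differences telescopes to (last - first) / steps.
    mean_delta = [(family[-1][d] - family[0][d]) / steps for d in range(dims)]
    return [family[-1][d] + mean_delta[d] for d in range(dims)]


# Hypothetical family of three observed variants, oldest first.
family = [[1.0, 0.0], [2.0, 0.5], [3.0, 1.0]]
predicted = predict_next_signature(family)
print(predicted)  # [4.0, 1.5]
```

The predicted signature would then be labeled malicious and added to the classifier's training set alongside the observed ones.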

The PMD approach addresses the challenge that most existing malware defense systems are reactive, responding to threats only after damage has already occurred. Standard signature-based antivirus solutions struggle with rapidly evolving malware, and while current machine-learning models can detect some previously unseen variants, they do not attempt to model the evolutionary pathways of malware development. This invention explicitly models those pathways, allowing defenders to preemptively fortify their systems against likely future threats.

Claims Coverage

The patent includes one main independent claim, which defines a system for predictive malware defense composed of several inventive features.

Predictive malware defense system with learning and prediction phases

The system comprises a processor and a memory storing instructions which, when executed, cause the system to:

1. Perform feature reduction training on a collection of benign and malicious events to produce a feature reducer.

2. Perform evolution pattern training by extracting features from a collection of malicious events, converting them into signatures, clustering them into families, and generating an evolution pattern predictor.

3. Perform evolution pattern prediction on another collection of malicious events, converting them into signatures and families, then using the evolution pattern predictor to generate signatures of future malicious events.

4. Perform malicious event detection training by extracting features from labeled benign and malicious events, converting them into signatures, combining them with the future malicious event signatures, and training an event classifier to distinguish between benign and malicious events.

5. Perform event classification on new events, extracting and reducing features, and classifying each as benign or malicious based on the event classifier trained with future malicious event signatures.
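Steps 4 and 5 can be pictured with a toy classifier. The sketch below trains a 1-nearest-neighbor classifier twice, once without and once with a predicted future signature in the training set. The nearest-neighbor rule, the name `train_event_classifier`, and all signature values are assumptions for illustration; the source does not specify the classifier at this level of detail.

```python
import math
from typing import Callable, List

Signature = List[float]


def train_event_classifier(
    benign: List[Signature],
    malicious: List[Signature],
    predicted_future: List[Signature],
) -> Callable[[Signature], str]:
    """Return a 1-nearest-neighbor classifier over the combined training set."""
    training_set = [(sig, "benign") for sig in benign]
    training_set += [(sig, "malicious") for sig in malicious]
    # Predicted future signatures are labeled malicious, as in step 4.
    training_set += [(sig, "malicious") for sig in predicted_future]

    def classify(signature: Signature) -> str:
        nearest = min(training_set, key=lambda item: math.dist(item[0], signature))
        return nearest[1]

    return classify


# Toy, hand-made signatures (assumptions for illustration only).
benign = [[0.0, 3.0], [4.5, 2.0]]
malicious = [[1.0, 0.0], [2.0, 0.5], [3.0, 1.0]]
predicted_future = [[4.0, 1.5]]  # hypothetical predictor output
new_event = [4.1, 1.4]           # a variant that appears later

reactive = train_event_classifier(benign, malicious, [])
vaccinated = train_event_classifier(benign, malicious, predicted_future)
print(reactive(new_event), vaccinated(new_event))  # benign malicious
```

In this contrived data the later variant sits closer to a benign sample than to any observed malicious one, so only the classifier trained with the predicted signature flags it.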

Collection of events includes binary executable files

The system's collection of events for analysis may include computer code in a binary executable file format.

Feature reducer using singular value decomposition (SVD)

The feature reducer component can use singular value decomposition (SVD) for feature extraction from the input events.

Feature extraction of code block semantics

Feature extraction for an event can derive the semantics of a code block directly from the code of that block.

Family clustering with locality-sensitive hashing (LSH)

The family clusterer produces collections of families from the signatures using locality-sensitive hashing (LSH) schemes.

MinHash scheme for LSH

The LSH scheme utilized by the family clusterer can include the MinHash scheme for efficient similarity detection.
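The sketch below shows one common way MinHash and banded LSH can group similar samples, under stated assumptions. Each sample is represented as a set of string tokens (the `call_*` tokens are hypothetical), samples whose MinHash sketches agree on any band become candidate pairs, and candidate pairs are merged into families with union-find. The band and hash counts, the function names, and the merging rule are illustrative choices, not details taken from the patent.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def _hash(seed: int, token: str) -> int:
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(tokens: Set[str], num_hashes: int = 32) -> List[int]:
    """One minimum per seeded hash function; similar sets share many minima."""
    if not tokens:
        raise ValueError("cannot MinHash an empty feature set")
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def candidate_pairs(sketches: Dict[str, List[int]], bands: int = 8) -> Set[Tuple[str, str]]:
    """Pair up samples whose sketches are identical on at least one band."""
    rows = len(next(iter(sketches.values()))) // bands
    buckets: Dict[tuple, List[str]] = defaultdict(list)
    for name, sketch in sketches.items():
        for band in range(bands):
            key = (band, tuple(sketch[band * rows:(band + 1) * rows]))
            buckets[key].append(name)
    pairs: Set[Tuple[str, str]] = set()
    for names in buckets.values():
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                pairs.add((a, b) if a < b else (b, a))
    return pairs


def group_families(names: List[str], pairs: Set[Tuple[str, str]]) -> List[Set[str]]:
    """Merge candidate pairs into families with a small union-find."""
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups: Dict[str, Set[str]] = defaultdict(set)
    for n in names:
        groups[find(n)].add(n)
    return list(groups.values())


# Hypothetical token sets standing in for per-sample features.
base = {f"call_{i}" for i in range(10)}
samples = {
    "trojan_v1": base,
    "trojan_v1_repack": set(base),                    # identical features
    "trojan_v2": (base - {"call_0"}) | {"call_new"},  # near-duplicate
    "unrelated": {f"other_{i}" for i in range(10)},   # disjoint features
}
sketches = {name: minhash(tokens) for name, tokens in samples.items()}
families = group_families(list(samples), candidate_pairs(sketches))
print(families)
```

Samples with identical features always land in the same family. Near-duplicates such as `trojan_v2` are grouped with high probability rather than with certainty, which is the trade-off LSH makes to avoid comparing every pair of samples.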

SESAME system implementation for event classification

A SESAME system can be used to implement the event classifier and produce benign or malicious classifications for new events.

Together, these features define a system that defends against malware proactively by modeling malware evolution, predicting future variants, and feeding those predictions into machine-learning-based classifiers.

Stated Advantages

Shifts the advantage from attacker to defender by enabling prediction and preemptive mitigation of malware before it appears.

Improves detection of future malware variants that are not detectable by conventional reactive or signature-based systems.

Reduces the time lag between malware release and system defense capability through preemptive defense.

Allows classifiers to detect new malware variants without increasing false positive rates.

Supports strategic planning and prioritization by focusing on likely emerging threats.

Applicable to any attack type that can be represented by features and where attacks develop in families.

Enhances the effectiveness of machine-learning-based malware detection systems by incorporating predicted signatures.

Documented Applications

Malware attacks and network-based attacks where families of attacks evolve over time.

Classification and defense against evolving malware in binary executable formats, including Windows and Android applications.

Preemptive mitigation and detection of novel malware developed to evade signature-based and ML-based antivirus defenses.

Enterprise-level large-scale malware classification and detection, including endpoint and cloud-based systems.
