Adaptive malware binary rewriting

Inventors

Smith, Jared M. • Koch, Luke

Assignees

UT Battelle LLC

Publication Number

US-12019746-B1

Publication Date

2024-06-25

Expiration Date

2042-06-28

Interested in licensing this patent?

MTEC can help explore whether this patent might be available for licensing for your application.

Abstract

An adaptive malware writing system includes a targeting engine that classifies malware candidates as a malicious candidate or a benign candidate through a surrogate model. The surrogate model assigns a weight to each byte of the malware candidates through a saliency vector. The sum of the weights renders a malware classification score. An alteration engine alters a binary form of the malware candidates classified as malware by executing a functional analysis that traces application program interface calls and memory. The alteration engine alters the binary form of the malware candidates classified as malware to render a synthesized malware. A malware analysis engine determines if the synthesized malware is operational by comparing an image of the synthesized malware to an image of at least one of the plurality of malware candidates. A target classifier engine identifies the vulnerabilities of a targeted computer.

Core Innovation

The invention discloses an adaptive malware rewriting system that uses a targeting engine to classify malware candidates as malicious or benign via a surrogate model. This surrogate model, implemented as a convolutional neural network, assigns a weight to each byte in a malware candidate through a saliency vector, and the sum of those weights produces a malware classification score. By altering the fewest bytes necessary to shift this score, the system aims to have the malware reclassified as benign without compromising the malicious functionality of the binary.
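As a rough illustration of the scoring step described above, the sketch below sums a per-byte saliency vector into a single score and compares it with a threshold. The weights, the threshold value of 1.0, and the function names are assumptions made for this example; in the patent the weights come from a convolutional neural network surrogate model, which is not reproduced here.

```python
from typing import Sequence


def classification_score(saliency: Sequence[float]) -> float:
    """Sum the per-byte weights of a saliency vector into one score."""
    return sum(saliency)


def classify(saliency: Sequence[float], threshold: float = 1.0) -> str:
    """Label a candidate from its score.

    The threshold is a hypothetical value chosen for illustration only.
    """
    if classification_score(saliency) > threshold:
        return "malicious"
    return "benign"


# Toy saliency vector: one weight per byte of a four-byte candidate.
example_saliency = [0.5, -0.25, 1.0, 0.25]
example_score = classification_score(example_saliency)
example_label = classify(example_saliency)
```

Positive weights push the score toward the malicious label and negative weights pull it toward benign, which is why the position and sign of each weight matter to the classifier's decision.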

An alteration engine executes functional analysis using various tools to trace application program interface (API) calls, analyze memory, and identify behavioral signatures. This engine alters binaries that have been classified as malware, applying semantic changes such as appending benign code sections, padding with bytes, renaming headers, or injecting code blocks, to create synthesized malware that maintains operational capacity. The process is guided by optimization and recursive sampling to focus on alterations most likely to evade existing malware detection schemes.

A malware analysis engine then confirms whether the synthesized malware remains operational by comparing an image of the altered binary with an image of the original. If the malware functionality is intact, the new variants are integrated into malware profiles and used to generate vulnerability reports and training data. This training data is used to enhance malware detection models, supplementing traditional static, signature-based, and behavior-based malware detectors to cover threats that may not have been identifiable before exposure or infection.
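The validation step can be pictured as a comparison between two recorded images followed by a filter. The sketch below is only one possible reading: it assumes an image can be summarised as the set of API calls observed during analysis, and it treats a variant as operational when every call seen for the original is also seen for the variant. The `Image` type, its field, and the call names are hypothetical; the summary does not specify the image format.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Image:
    """Hypothetical summary of an analysis image: the API calls observed."""
    api_calls: FrozenSet[str]


def is_operational(original: Image, variant: Image) -> bool:
    """Assumed criterion: the variant still shows every call of the original."""
    return original.api_calls <= variant.api_calls


def keep_operational(original: Image, variants: List[Image]) -> List[Image]:
    """Discard variants whose image no longer matches the original's behavior."""
    return [v for v in variants if is_operational(original, v)]


original = Image(frozenset({"open_file", "write_file"}))
intact = Image(frozenset({"open_file", "write_file", "get_time"}))
broken = Image(frozenset({"open_file"}))
kept = keep_operational(original, [intact, broken])
```

Here `kept` contains only `intact`, because `broken` no longer exhibits the `write_file` call recorded for the original.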

Claims Coverage

There are three independent claims, each detailing the major inventive features of the adaptive malware rewriting system and process.

Non-transitory machine-readable medium for adaptive malware rewriting

A non-transitory machine-readable medium encoded with instructions to:
- Process multiple binary-form malware candidates that can disrupt, damage, or gain unauthorized access to a targeted computer.
- Classify each malware candidate as malicious or benign using a targeting engine with a surrogate model, which assigns a byte-wise saliency vector and generates a malware classification score.
- Alter the binary of candidates classified as malicious using an alteration engine, which applies multiple functional analysis tools that trace API calls and analyze memory, until some candidates are reclassified as benign, creating synthesized malware.
- Confirm operational status of the synthesized malware via a malware analysis engine that compares images of the original and altered binaries.
- Generate a vulnerability report identifying multiple security vulnerabilities of the targeted computer via a target classifier engine.

Process for generating synthesized malware with vulnerability output

A process using a non-transitory machine-readable medium to:
- Process a plurality of binary-form malware candidates capable of disrupting, damaging, or gaining unauthorized access to a targeted computer.
- Classify candidates as malicious or benign through a targeting engine using a surrogate model with saliency vectors and classification scores.
- Employ an alteration engine with functional analysis to alter binaries marked as malicious until some are reclassified as benign, yielding synthesized malware.
- Determine operational status by comparing images of synthesized malware with originals using a malware analysis engine.
- Produce training data from operational synthesized malware and generate a vulnerability report by a target classifier engine identifying security vulnerabilities.

Adaptive malware writing system architecture for synthesized malware detection and reporting

An adaptive malware writing system comprising:
- A targeting engine to classify operational malware candidates as malicious or benign using a surrogate model that assigns byte-level saliency vectors and computes a classification score.
- An alteration engine to modify the binary of malicious candidates, guided by multiple functional analysis tools that trace API calls and analyze memory, until benign classification is achieved, producing synthesized malware.
- A malware analysis engine to validate operational status by image comparison, discarding non-operational examples.
- A target classifier to identify targeted computer vulnerabilities and generate training data from synthesized and operational malware.

The independent claims collectively protect a system and process for adaptively rewriting malware binaries using a machine-learned surrogate model and functional analysis, to generate operational, evasive malware variants and vulnerability reports for enhancing detection capabilities.

Stated Advantages

Improves detection of malware threats regardless of the malicious software's origin or execution sequence by generating and identifying new forms and variants of malware.

Enhances traditional static, signature-based, and behavior-based malware detection systems through expanded, high-quality training data from synthesized malware samples.

Reduces the processing burden during detection by using optimization and dimensionality-reduction techniques to focus on the most viable synthesized malware candidates.

Enables identification and mitigation of malware designed to evade detection, and can be implemented in the cloud or locally, supporting scalable and flexible deployment.

Allows for automatic isolation or rollback of infected systems to pre-infection states, restoring systems to uncompromised operating conditions.

Documented Applications

Protecting computers from intrusive software and targeted cyber attacks by identifying, generating, and analyzing operational synthesized malware and its variants.

Generating vulnerability reports that identify critical and informational security weaknesses in targeted computer systems, including recommendations for mitigation.
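One way to picture such a report is as a list of findings grouped by severity, each carrying a mitigation recommendation. The sketch below is an assumption about structure only: the `Finding` fields, the two severity labels taken from the wording above, and the sample entries are all illustrative rather than drawn from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    """Hypothetical report entry; field names are assumptions."""
    severity: str          # "critical" or "informational"
    description: str
    recommendation: str


def build_report(findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Group findings by severity, rejecting labels the report does not use."""
    report: Dict[str, List[Finding]] = {"critical": [], "informational": []}
    for finding in findings:
        if finding.severity not in report:
            raise ValueError(f"unknown severity: {finding.severity}")
        report[finding.severity].append(finding)
    return report


sample_report = build_report([
    Finding("critical", "Detector misses a synthesized variant",
            "Retrain the detector with the new variant"),
    Finding("informational", "Signature database is out of date",
            "Schedule regular signature updates"),
])
```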

Training machine learning models to improve malware detection accuracy, including for new and evasive malware variants.
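The data side of this application can be sketched as adding validated variants to a labeled training set while skipping duplicates. The example below keys samples by SHA-256 digest and stores only a label; this is a simplification assumed for illustration, since a real detector would be trained on extracted features rather than digests, and the function and constant names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable

MALICIOUS = 1
BENIGN = 0


def add_samples(dataset: Dict[str, int], samples: Iterable[bytes], label: int) -> int:
    """Add samples to the labeled set, keyed by SHA-256 digest.

    Returns how many new entries were added; duplicates are skipped.
    """
    added = 0
    for blob in samples:
        digest = hashlib.sha256(blob).hexdigest()
        if digest not in dataset:
            dataset[digest] = label
            added += 1
    return added


training_set: Dict[str, int] = {}
first_batch = add_samples(training_set, [b"placeholder-a", b"placeholder-b"], MALICIOUS)
second_batch = add_samples(training_set, [b"placeholder-b", b"placeholder-c"], MALICIOUS)
```

Deduplicating by digest keeps repeated variants from being over-represented in the training set.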

Automatically isolating or rolling back infected or compromised computer systems to their pre-infection states.
