Method and framework for pattern of life analysis
Inventors
Pottenger, William M. • Nagy, James M. • Blasch, Erik P. • Tong, Tuanjie
Assignees
United States Department of the Air Force
Publication Number
US-11308384-B1
Publication Date
2022-04-19
Expiration Date
2038-09-04
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
In accordance with various embodiments of the disclosed subject matter, a method and framework are configured for modeling a pattern of life (POL) by processing both categorical data and non-categorical data (e.g., numeric, spatial, etc.), conducting pattern of life estimation (POLE), and detecting anomalous data in a multidimensional data set in a substantially simultaneous manner by comparing statistical POL results.
Core Innovation
The invention provides a method and framework for modeling a pattern of life (POL) by processing both categorical data and non-categorical data (such as numeric and spatial data), conducting pattern of life estimation (POLE), and detecting anomalous data in a multidimensional data set. These steps are performed in a substantially simultaneous manner by comparing statistical POL results to identify patterns and anomalies. The method groups data by type and dimension, assigns precision values, builds statistical models (kernel density estimation (KDE) for non-categorical data and a normal distribution for categorical data), and then labels data based on detected anomalies.
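The grouping and KDE steps described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the field names, the fixed bandwidth, and the relative-density threshold of 0.5 are all assumptions made for the example, and the sample data is hypothetical.

```python
import math

def group_by_type(records):
    """Split each dimension (field) into a numeric or a categorical group."""
    numeric, categorical = {}, {}
    for record in records:
        for field, value in record.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            target = numeric if is_number else categorical
            target.setdefault(field, []).append(value)
    return numeric, categorical

def gaussian_kde(samples, bandwidth):
    """Return a density function built from one Gaussian kernel per sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return density

def label_numeric(samples, bandwidth=1.0, rel_threshold=0.5):
    """Flag a sample when its density is below rel_threshold times the peak
    density among the samples (an assumed thresholding rule)."""
    density = gaussian_kde(samples, bandwidth)
    scores = [density(x) for x in samples]
    peak = max(scores)
    return [(x, score / peak < rel_threshold) for x, score in zip(samples, scores)]

# Hypothetical observations: arrival hour (numeric) and vehicle type (categorical).
arrivals = [8.0, 8.5, 9.0, 8.2, 8.8, 9.1, 23.0]
records = [{"arrival_hour": hour, "vehicle": "truck"} for hour in arrivals]

numeric, categorical = group_by_type(records)
labels = label_numeric(numeric["arrival_hour"])
anomalous = [x for x, flagged in labels if flagged]
print(anomalous)
```

In this toy data set only the late-night arrival sits in a low-density region, so it is the only value labeled anomalous. A real deployment would tune the bandwidth and threshold per data type rather than use the fixed values assumed here.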
The invention addresses deficiencies in existing data science applications, which focus primarily on analytics of text, imagery, and numeric data without mission-directed context and semantic interpretation. Existing activity-based intelligence techniques monitor streaming data from single sources without integrating multiple data types for comprehensive pattern of life estimation and anomaly detection. This invention remedies these limitations by providing a system that simultaneously processes multiple data types and dimensions, incorporating semantic, numeric, and spatial data to improve normalcy modeling and anomalous data identification within multidimensional data sets.
Claims Coverage
The patent contains several independent claims that detail methods, apparatus, and computer-readable media for detecting anomaly trends in datasets containing both numeric and categorical attributes.
Simultaneous processing of multiple data types in anomaly detection
Grouping received data by type and dimension into at least one categorical and one non-categorical data group; assigning precision values to non-categorical data; building a first statistical model with kernel density estimation (KDE) for non-categorical data and a second statistical model using normal distribution for categorical data; determining anomalous data items per probability thresholds within each model; labeling the dataset accordingly; and presenting the labeled dataset via numerical graphs, spatial overlays, or semantic listings.
Tuning of kernel density estimation based on historical pattern of life data
Adjusting each KDE model in accordance with historical pattern of life information specific to the data type to enhance detection accuracy.
Threshold-based anomaly determination grounded in uniform distribution
Utilizing probability thresholds based on uniform distributions, where events with probabilities less than 0.5 are considered anomalous.
Kernel reduction techniques to improve computational efficiency
Reducing the number of kernels used in KDE by responding to input signals indicating lower precision levels and limiting kernel usage based on maximum calculated probabilities for given values.
Anomaly detection approach for categorical data
Applying pattern of life analysis to categorical data by counting occurrences, treating counts as a normal distribution with the most frequent value as mean, and identifying categories with counts in the distribution tail as anomalous.
Extraction and processing of aggregative and collective data contexts
Extracting numeric, spatial, and categorical data from unstructured text for aggregative statistical modeling and anomaly labeling, as well as building collective statistical models from data collections for anomaly detection and labeling.
Apparatus and storage media implementing the anomaly detection methods
An apparatus and a tangible computer-readable medium configured to perform the described grouping, modeling using KDE and normal distribution, anomaly determination, labeling, and presentation of anomaly trends.
The independent claims cover methods, apparatus, and media for multidimensional anomaly detection by processing different data types with KDE and normal distribution models, tuning based on historical data, reducing computational loads through kernel management, detecting anomalies in various data contexts, and visually presenting anomaly trends to users.
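The categorical approach listed among the claims (count occurrences, treat the counts as a normal distribution whose mean is the most frequent value's count, and flag categories in the tail) might look like the sketch below. The spread calculation and the `z_cut` cutoff are assumptions chosen for illustration, since this summary does not specify them, and the vehicle data is hypothetical.

```python
import math
from collections import Counter

def label_categories(values, z_cut=1.0):
    """Flag categories whose counts fall in the lower tail of the count distribution.

    The mean is taken as the count of the most frequent category. The spread is
    the root-mean-square deviation of all counts from that mean (an assumption).
    """
    counts = Counter(values)
    mean = max(counts.values())
    spread = math.sqrt(sum((c - mean) ** 2 for c in counts.values()) / len(counts))
    if spread == 0:
        # All categories are equally frequent, so nothing stands out.
        return {category: False for category in counts}
    return {category: (mean - c) / spread > z_cut for category, c in counts.items()}

# Hypothetical vehicle sightings at a site; "tank" appears only once.
sightings = ["truck"] * 40 + ["car"] * 35 + ["bus"] * 30 + ["tank"] * 1
category_labels = label_categories(sightings)
print(category_labels)
```

With these counts the rare category lies almost two spreads below the mean and is the only one flagged, while the common categories stay within the cutoff.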
Stated Advantages
Improved processing of multidimensional data by simultaneously handling both categorical and non-categorical data, leading to enhanced pattern of life modeling and anomaly detection.
Superior outlier/anomaly detection performance relative to existing techniques across multiple datasets, as shown by comparative analyses.
Capability to identify anomalies not only from individual attribute values but also from joint collections of attributes, allowing more comprehensive detection.
User-tunable precision settings that reduce computational complexity while maintaining or improving detection accuracy.
Anomaly labeling and visual presentation via user-defined operating pictures (UDOP) that enhance situational awareness by depicting patterns and anomalies in numerical, spatial, and semantic formats.
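One way a precision setting can reduce KDE cost, as mentioned among the advantages, is to snap samples to a grid and merge duplicates into weighted kernels. The sketch below illustrates that general idea only. The binning scheme, the function names, and the sample values are assumptions, and it does not cover the claimed limit on kernels based on maximum calculated probabilities.

```python
import math
from collections import Counter

def reduce_kernels(samples, precision):
    """Snap samples to a grid of width `precision` and merge duplicates.

    Returns a Counter that maps each kernel center to its weight (sample count).
    """
    return Counter(round(s / precision) * precision for s in samples)

def weighted_kde(kernels, bandwidth):
    """Return a density function that evaluates one Gaussian per merged kernel."""
    total = sum(kernels.values())
    norm = 1.0 / (total * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(
            weight * math.exp(-0.5 * ((x - center) / bandwidth) ** 2)
            for center, weight in kernels.items()
        )
    return density

# Hypothetical measurements; a coarser precision merges nearby values.
samples = [8.01, 8.04, 7.98, 9.02, 8.97, 9.03, 12.5]
kernels = reduce_kernels(samples, precision=0.5)
density = weighted_kde(kernels, bandwidth=0.5)
print(len(samples), "samples reduced to", len(kernels), "kernels")
```

Each density evaluation then sums over the merged kernels instead of every raw sample, which trades resolution for speed as the precision becomes coarser.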
Documented Applications
Pattern of life estimation and anomaly detection for targets under study including entities such as regions, sites, equipment, and actors represented by multidimensional data sets.
Activity-based intelligence monitoring incorporating physics-based sensor modalities such as visible, infrared, wide-area motion imagery, and GPS data along with semantic, imagery, and numerical values.
Detection of anomalous data trends in networked data structures including social networks modeled by nodes, edges, and metadata with higher order learning-derived metrics.
Processing unstructured textual fields to extract numeric, spatial, and categorical data for aggregative anomaly detection.
Real-time situational awareness presentations via numerical graphs, spatial overlays, and semantic listings to communicate pattern of life and anomaly information to users.