Method and apparatus for predicting object properties and events using similarity-based information retrieval and modeling

Inventors

Birdwell, J. Douglas; Wang, Tse-Wei; Icove, David J.; Horn, Sally P.; Rader, Mark S.

Assignees

University of Tennessee Research Foundation; United States Department of the Army

Publication Number

US-8375032-B2

Publication Date

2013-02-12

Expiration Date

2030-06-25

Abstract

Method and apparatus for predicting properties of a target object, in particular, one of an origin and a source, comprise application of a search manager for analyzing parameters of a plurality of databases for a plurality of objects, the databases comprising an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD) and a database of image data whereby the databases store data objects containing identifying features, source information and information on site properties and context including time and frequency varying data. The method comprises application of multivariate statistical analysis and principal component analysis in combination with content-based image retrieval for providing two-dimensional attributes of three dimensional objects, for example, via preferential image segmentation using a tree of shapes and to predict further properties of objects by means of k-means clustering and related methods. By way of example, a fire event and residual objects may be located and qualified such that, for example, properties of the residual objects may be qualified, for example, via black body radiation and micro-body databases including charcoal assemblages.

Core Innovation

The invention provides a method and apparatus for predicting properties of a target object, including its origin or source, by using similarity-based information retrieval combined with modeling. It involves analyzing parameters across multiple databases containing data objects that store identifying features, source information, and site properties and context such as time and frequency varying data. These databases include an electrical, electromagnetic, acoustic spectral database (ESD), a micro-body assemblage database (MAD), and an image data database. The method applies multivariate statistical analysis and principal component analysis (PCA) combined with content-based image retrieval (CBIR) to extract two-dimensional attributes of three-dimensional objects, for example using preferential image segmentation based on a tree of shapes.
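As a rough illustration of the PCA step described above, the sketch below projects a few feature vectors onto their leading direction of variance. It is not the patented implementation: the function names `first_principal_component` and `project`, the toy feature values, and the choice of power iteration are all assumptions made for this example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the leading principal component.

    Power iteration on the sample covariance matrix is an assumption
    chosen for brevity; the patent summary does not name a solver.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all features are constant; no direction of variance
        v = [x / norm for x in w]
    return mean, v

def project(row, mean, direction):
    """Score of one feature vector along the principal direction."""
    return sum((x - m) * u for x, m, u in zip(row, mean, direction))

# Hypothetical two-feature measurements for four objects.
features = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8]]
mean, pc1 = first_principal_component(features)
scores = [project(r, mean, pc1) for r in features]
```

Each object is reduced to a single score along `pc1`, which is the kind of lower-dimensional attribute a similarity search could then index.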

The method proceeds to predict further properties of target objects by applying k-means clustering and related clustering techniques to the multivariate data. By way of example, the technique can locate and qualify a fire event and its residual objects, with the properties of the residual objects qualified using black body radiation data and micro-body assemblages that include charcoal particles.
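The clustering step can be pictured with a minimal k-means sketch (Lloyd's algorithm). The function name `kmeans`, the deterministic choice of the first k points as starting centroids, and the toy data are assumptions for illustration, not details taken from the patent.

```python
def kmeans(points, k, iterations=50):
    """Lloyd's algorithm, initialised with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical two-feature measurements forming two well-separated groups.
points = [[1.0, 1.0], [9.0, 9.0], [1.2, 0.8], [8.8, 9.2], [0.9, 1.1], [9.1, 8.9]]
labels, centroids = kmeans(points, k=2)
```

Objects that share a label can then be treated as a group of similar reference objects from which properties of an unknown object are inferred.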

The problem addressed by the invention relates to the need for effective prediction and inference of object properties, such as material composition, manufacturer, geographic origin, recognition of humans or vegetation, combustion products, fire causation, and environmental context. Prior systems focus on searching databases with exact or approximate matches, but a challenge remains to combine multiple types of measured data (electrical, electromagnetic, acoustic, chemical, image data, micro-body assemblages, and isotopic measurements) to perform accurate similarity-based searches that can be applied to infer object properties from diverse types of data and across multiple data repositories.

Claims Coverage

The patent includes several independent claims covering methods for predicting object properties using similarity-based search and modeling across multiple databases.

Similarity-based search and modeling utilizing multiple databases

A method of predicting properties of a target object by receiving queries at a processor search manager coupled to multiple databases including micro-body assemblage databases (pollen, charcoal, diatom, foraminifera) and frequency spectral databases, determining sets of similar objects, applying models (e.g., Bayesian models), comparing hypothetical relationships, and thereby predicting geographic location, origin, or environmental properties of the target object.
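One way to picture applying a Bayesian model to compare hypotheses about origin is the sketch below, which computes a posterior over candidate source regions from counts of micro-bodies in a sample. The multinomial likelihood, the region names, the taxon proportions, and the function name `region_posterior` are all hypothetical; the claim does not specify them.

```python
import math

def region_posterior(counts, profiles, priors):
    """Posterior P(region | counts) under a multinomial likelihood.

    The multinomial coefficient is the same for every region, so it
    cancels in the normalisation and is omitted.
    """
    log_post = {}
    for region, profile in profiles.items():
        log_like = sum(n * math.log(profile[taxon]) for taxon, n in counts.items())
        log_post[region] = math.log(priors[region]) + log_like
    # Normalise in log space to avoid underflow.
    peak = max(log_post.values())
    total = sum(math.exp(v - peak) for v in log_post.values())
    return {r: math.exp(v - peak) / total for r, v in log_post.items()}

# Hypothetical reference profiles: expected taxon proportions per region.
profiles = {
    "region_a": {"pine": 0.6, "oak": 0.3, "grass": 0.1},
    "region_b": {"pine": 0.2, "oak": 0.3, "grass": 0.5},
}
priors = {"region_a": 0.5, "region_b": 0.5}
counts = {"pine": 12, "oak": 5, "grass": 3}  # grains counted in the sample

posterior = region_posterior(counts, profiles, priors)
best_region = max(posterior, key=posterior.get)
```

With these made-up numbers the pine-heavy sample strongly favours `region_a`; real reference profiles would come from the retrieved set of similar objects.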

Use of tree-structured indexing for multifaceted data comparison

A method involving receiving input query data for micro-body assemblage information and electrical, electromagnetic, acoustic, chemical, mechanical, optical or isotopic measurements, utilizing tree-structured indexes to identify stored data most similar to the input, retrieving corresponding geographic, origin or environmental property information from reference databases, and predicting properties of a target object using models based on the retrieved data.
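The idea of using a tree-structured index to find the stored record most similar to a query can be sketched with a k-d tree. This is an assumption made for illustration: the claim does not say which tree structure is used, and the names `build_kdtree` and `nearest`, the feature vectors, and the site labels are hypothetical.

```python
def build_kdtree(items, depth=0):
    """Build a k-d tree from (vector, payload) pairs; returns a dict or None."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(node, query, best=None):
    """Return the (vector, payload) pair closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["item"][0]) < sq_dist(query, best[0]):
        best = node["item"]
    diff = query[node["axis"]] - node["item"][0][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(query, best[0]):
        best = nearest(far, query, best)
    return best

# Hypothetical reference records: feature vector -> stored origin information.
records = [
    ((0.10, 0.80), "site A"),
    ((0.55, 0.40), "site B"),
    ((0.90, 0.15), "site C"),
    ((0.30, 0.60), "site D"),
]
tree = build_kdtree(records)
match = nearest(tree, (0.50, 0.45))
```

Here `match[1]` holds the origin information stored with the most similar record, which a downstream model could then use to predict properties of the target object.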

Modeling methods for predicting geographic and environmental properties

Methods employing models such as Bayesian networks, least squares or maximum likelihood optimization, compartmental models, likelihood ratio testing, and multi-hypothesis testing applied to sets of similar objects to predict geographic location, origin or environmental characteristics of target objects based upon stored database information.
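Of the modeling options listed, likelihood ratio testing is the simplest to sketch. The example below compares two hypothesised origins for a set of measurements under Gaussian models. The Gaussian assumption, the parameter values, and the function names are hypothetical and chosen only to show the mechanics.

```python
import math

def gaussian_log_likelihood(values, mean, sd):
    """Log-likelihood of independent measurements under N(mean, sd^2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (v - mean) ** 2 / (2 * sd ** 2)
        for v in values
    )

def log_likelihood_ratio(values, h1, h0):
    """log[ P(values | h1) / P(values | h0) ]; positive values favour h1."""
    return gaussian_log_likelihood(values, *h1) - gaussian_log_likelihood(values, *h0)

# Hypothetical measurements and two candidate origins as (mean, sd) pairs.
measurements = [-7.9, -8.2, -8.0]
origin_1 = (-8.0, 0.5)
origin_0 = (-5.0, 0.5)

llr = log_likelihood_ratio(measurements, origin_1, origin_0)
favoured = "origin_1" if llr > 0 else "origin_0"
```

In a multi-hypothesis setting the same log-likelihoods could be computed for every candidate origin and compared pairwise or against a threshold.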

Content-based image retrieval using preferential image segmentation

Use of a content-based image retrieval database with a tree-structured index and preferential segmentation algorithm based on a tree of shapes model to analyze and identify micro-body assemblages (pollen, charcoal, dust, soil grains), enabling retrieval of similar images and subsequent prediction of target object properties.

The independent claims cover methods integrating similarity search across multiple heterogeneous databases of spectral, micro-body assemblage, and image data, employing advanced indexing and modeling techniques to predict diverse object properties including geographic origin, environmental context, and event association.

Stated Advantages

Rapid selection of objects with similar attributes from very large databases enables accurate prediction and modeling of target object properties.

Fusion of multiple types of data (electrical, electromagnetic, acoustic, isotopic, micro-body assemblages, and image data) increases accuracy and discriminatory power of predictions.

The database system dynamically incorporates new data, improving inference precision over time.

The use of advanced multivariate statistical analysis, clustering, and modeling (including optimization and Bayesian approaches) allows effective interpretation of complex data relationships.

Application of content-based image retrieval with preferential segmentation reduces human labor in micro-particle identification and supports automated classification.

Documented Applications

Predicting geographic origin, environmental properties, and time-varying data of objects based on similarity to reference objects in spectral and micro-body assemblage databases.

Forensic analysis such as fire event detection, fire residual qualification, forensic palynology, and tracing source regions from microfossil assemblages including pollen, charcoal, diatoms, and foraminifera.

Monitoring and tracking of livestock and intruders using thermal signature databases.

Detection and classification of vehicles, fire, intrusion events, and combustion products from passive electromagnetic and acoustic spectral data.

Automated classification and identification of microscopic particles in environmental and forensic samples using content-based image retrieval.

Financial applications including portfolio analysis based on time series clustering and correlation.

Data mining applications such as pattern detection in consumer purchasing, criminal activity pattern detection, network behavior analysis, and disease modeling or drug resistance studies.
