Nonlinear function approximation over high-dimensional domains

Inventors

Kirby, Michael J.; Jamshidi, Arthur A.

Assignees

National Science Foundation (NSF); Colorado State University Research Foundation

Publication Number

US-8521488-B2

Publication Date

2013-08-27

Expiration Date

2027-09-25

Abstract

An algorithm is disclosed for constructing nonlinear models from high-dimensional scattered data. The algorithm progresses iteratively adding a new basis function at each step to refine the model. The placement of the basis functions is driven by a statistical hypothesis test that reveals geometric structure when it fails. At each step the added function is fit to data contained in a spatio-temporally defined local region to determine the parameters, in particular, the scale of the local model. The proposed method requires no ad hoc parameters. Thus, the number of basis functions required for an accurate fit is determined automatically by the algorithm. The approach may be applied to problems including modeling data on manifolds and the prediction of financial time-series. The algorithm is presented in the context of radial basis functions but in principle can be employed with other methods for function approximation such as multi-layer perceptrons.

Core Innovation

An algorithm is disclosed for constructing nonlinear models from high-dimensional scattered data iteratively, adding a basis function at each step to refine the model. The placement of each basis function is driven by a statistical hypothesis test on the residuals, which checks whether they are independent and identically distributed (IID); when the test fails, the structure remaining in the residuals indicates where the next function should go. Each added function is fit to the data contained in a spatio-temporally defined local region to determine its parameters, including the scale of the local model. Because no ad hoc parameters are required, the number of basis functions needed for an accurate fit is determined automatically.

The invention addresses the challenge of extracting nonlinear relationships from large, high-dimensional scattered data sets, a task that arises in fields such as machine learning, optimal control, mathematical modeling of physical systems, financial time-series analysis, voice recognition, failure prediction, and artificial intelligence. Existing approaches require the model complexity and parameters to be fixed in advance or tuned, often through computationally intensive procedures with multiple ad hoc parameters, and they do not explicitly exploit the geometric and statistical structure of the residuals during training.

The invention overcomes these limitations by applying a statistical hypothesis test to the model residuals to detect structure and to guide the incremental addition and placement of basis functions. Using spatio-temporal balls to define the local regions in which basis functions are fit lets the method capture periodic data and data on manifolds, and supports fitting over high-dimensional domains and ranges. The method extends to multivariate outputs by testing the residuals of all output components jointly. It is designed to work without tuning parameters, which enables black-box nonlinear function approximation over diverse data sets, including batch and streaming data, and it can be extended to asymmetric basis functions and to other approximation architectures.
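
The overall loop described above can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration and not the patented procedure: the names `fit_incrementally` and `residuals_look_iid` are hypothetical, the new center is placed at the largest absolute residual as a stand-in for the patent's placement rule, the local region is the run of same-sign residuals around that center, and the scale is taken as half of the region's half-width. The constants `max_lag`, `z` and `max_terms` are conveniences of the sketch, so it does not reproduce the "no ad hoc parameters" property.

```python
import math

def gaussian(x, center, scale):
    u = (x - center) / scale
    return math.exp(-u * u)

def predict(model, x):
    # model is a list of (center, scale, weight) triples
    return sum(w * gaussian(x, c, s) for c, s, w in model)

def residuals_look_iid(res, max_lag=20, z=1.96):
    # Assumed test: at most 5% of sample autocorrelations outside +/- z/sqrt(n).
    n = len(res)
    mean = sum(res) / n
    d = [r - mean for r in res]
    c0 = sum(v * v for v in d)
    if c0 == 0.0:
        return True
    lags = range(1, min(max_lag, n - 1) + 1)
    bound = z / math.sqrt(n)
    outside = sum(
        abs(sum(d[i] * d[i + h] for i in range(n - h)) / c0) > bound
        for h in lags
    )
    return outside <= 0.05 * len(lags)

def fit_incrementally(xs, ys, max_terms=50):
    # xs must be sorted and distinct; max_terms is only a safety cap.
    model = []
    for _ in range(max_terms):
        res = [y - predict(model, x) for x, y in zip(xs, ys)]
        if residuals_look_iid(res):
            break  # no detectable structure left: stop adding functions
        # Simplified placement: sample with the largest absolute residual.
        k = max(range(len(res)), key=lambda i: abs(res[i]))
        # Simplified local region: neighbours whose residual has the same sign.
        lo = hi = k
        while lo > 0 and res[lo - 1] * res[k] > 0.0:
            lo -= 1
        while hi < len(res) - 1 and res[hi + 1] * res[k] > 0.0:
            hi += 1
        half_width = max(xs[hi] - xs[k], xs[k] - xs[lo])
        if half_width == 0.0:
            half_width = abs(xs[1] - xs[0])
        center, scale = xs[k], half_width / 2.0
        # Least-squares weight using only the data in the local region.
        g = [gaussian(xs[i], center, scale) for i in range(lo, hi + 1)]
        weight = sum(res[lo + j] * g[j] for j in range(len(g))) / sum(v * v for v in g)
        model.append((center, scale, weight))
    return model
```

Calling `fit_incrementally(xs, ys)` on sorted samples returns the list of fitted `(center, scale, weight)` triples; its length is the number of basis functions the loop decided to add before the residual test passed or the cap was reached.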

Claims Coverage

The patent includes multiple inventive features relating to methods for nonlinear function approximation using residual-based placement and fitting of basis functions, applicable across different data collections and applications.

Residual-based iterative basis function addition

A method in which the residuals of a model relating two data collections are used iteratively to decide whether and where to add basis functions, refining the model of the relationship between the collections.

Use of spatio-temporal local regions

Identifying points from the residuals, determining the subcollections of data near those points, and fitting basis functions to them, in particular using spatio-temporally defined local regions (space-time balls) to capture complex data structure.
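
As a rough illustration of why a space-time ball differs from a purely spatial neighbourhood, the sketch below selects samples that are close to a chosen sample both in input space and in time index. The function name `space_time_ball` and its two radius arguments are hypothetical; this summary does not say how the extent of the region is determined, so here the radii are simply passed in.

```python
import math

def space_time_ball(points, center_index, space_radius, time_radius):
    """Indices of samples near the centre sample in both space and time.

    points[t] is the input vector observed at time step t.
    """
    center = points[center_index]
    return [
        t for t, p in enumerate(points)
        if abs(t - center_index) <= time_radius
        and math.dist(p, center) <= space_radius
    ]

# A trajectory that revisits the same location every 8 steps.
orbit = [
    (math.cos(2 * math.pi * t / 8), math.sin(2 * math.pi * t / 8))
    for t in range(24)
]

local = space_time_ball(orbit, center_index=8, space_radius=0.8, time_radius=2)
spatial_only = space_time_ball(orbit, center_index=8, space_radius=0.8,
                               time_radius=len(orbit))
print(local)         # [7, 8, 9]
print(spatial_only)  # [0, 1, 7, 8, 9, 15, 16, 17, 23]
```

On this periodic trajectory the spatial-only neighbourhood mixes samples from three separate passes through the same location, while the space-time ball keeps only the pass that contains the chosen sample.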

Application of statistical hypothesis tests for residuals

Employing hypothesis testing on residuals, including autocorrelation analysis for IID (independent identically distributed) noise, to decide when to add functions and when to terminate iterative model refinement.
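
One common way to test residuals for IID noise through autocorrelation is to compare the sample autocorrelations with the approximate 95% bounds of plus or minus 1.96 divided by the square root of n. The sketch below assumes that form of test; the function names, the default of 20 lags, and the 5% tolerance are illustrative choices, not details taken from the patent.

```python
import math

def autocorrelation(residuals, max_lag):
    """Sample autocorrelations at lags 1..max_lag (biased estimator)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    c0 = sum(c * c for c in centered)
    if c0 == 0.0:
        return [0.0] * max_lag
    return [
        sum(centered[i] * centered[i + h] for i in range(n - h)) / c0
        for h in range(1, max_lag + 1)
    ]

def looks_iid(residuals, max_lag=20, z=1.96, allowed_fraction=0.05):
    """True when no more than allowed_fraction of the lags fall outside the bounds."""
    n = len(residuals)
    max_lag = min(max_lag, n - 1)
    bound = z / math.sqrt(n)
    rho = autocorrelation(residuals, max_lag)
    outside = sum(1 for r in rho if abs(r) > bound)
    return outside <= allowed_fraction * max_lag

# Residuals that still contain a sine wave are clearly not IID.
wave = [math.sin(2 * math.pi * i / 50) for i in range(200)]
print(looks_iid(wave))  # False
```

A failed test signals that structure remains in the residuals and another basis function is warranted; a passed test serves as the stopping criterion.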

Modulated or skewed approximation functions

Using approximation functions modulated or skewed by shape functions, extending basis functions to modulated asymmetric radial basis functions or other nonlinear architectures for improved fitting of asymmetric or skewed data.
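
To make the idea of a modulated basis function concrete, the sketch below multiplies a Gaussian radial basis function by a shape function that tilts it to one side. The specific shape function, `1 + erf(skew * u)`, is an assumption borrowed from the skew-normal family and is not taken from the patent; `skewed_rbf` and `skew` are hypothetical names.

```python
import math

def skewed_rbf(x, center, scale, skew):
    """Gaussian RBF modulated by an assumed erf-based shape function.

    skew = 0 recovers the ordinary symmetric Gaussian; skew > 0 puts more
    mass to the right of the centre, skew < 0 to the left.
    """
    u = (x - center) / scale
    return math.exp(-u * u) * (1.0 + math.erf(skew * u))

symmetric_left = skewed_rbf(-0.5, 0.0, 1.0, 0.0)
symmetric_right = skewed_rbf(0.5, 0.0, 1.0, 0.0)
skewed_left = skewed_rbf(-0.5, 0.0, 1.0, 2.0)
skewed_right = skewed_rbf(0.5, 0.0, 1.0, 2.0)
```

With `skew = 0` the two sides match; with `skew = 2` the right-hand value is larger than the left-hand one, so a single function can follow a lopsided feature that would otherwise need several symmetric ones.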

Multivariate and high-dimensional data processing

Handling high-dimensional data collections and multivariate outputs through testing residuals jointly, considering cross-correlations and auto-correlations among multiple output components for parsimonious modeling.
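
A joint test has to look at correlations between output components as well as within each one. The sketch below shows only one ingredient of such a test, under stated assumptions: it computes the sample cross-correlation between every pair of residual channels at a single lag and flags the pairs that exceed the approximate 95% bound. The function names and the bound are illustrative and not the patent's procedure.

```python
import math

def cross_correlation(a, b, lag=0):
    """Sample cross-correlation between a[i] and b[i + lag]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    norm = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if norm == 0.0:
        return 0.0
    return sum(da[i] * db[i + lag] for i in range(n - lag)) / norm

def correlated_output_pairs(residual_channels, lag=0, z=1.96):
    """Pairs (j, k) of output components whose residuals are correlated."""
    n = len(residual_channels[0])
    bound = z / math.sqrt(n)
    flagged = []
    for j in range(len(residual_channels)):
        for k in range(j + 1, len(residual_channels)):
            rho = cross_correlation(residual_channels[j], residual_channels[k], lag)
            if abs(rho) > bound:
                flagged.append((j, k))
    return flagged

# Toy residual channels: the second copies the first, the third is orthogonal.
first = [1.0, -1.0] * 20
second = list(first)
third = [1.0, 1.0, -1.0, -1.0] * 10
print(correlated_output_pairs([first, second, third]))  # [(0, 1)]
```

A flagged pair indicates shared structure across outputs, which a test applied to each component separately would not report.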

Optimization of basis function parameters

Iterative determination and optimization of basis function parameters such as center, scale, and weight using nonlinear optimization over local data to reduce fitting error.
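
The following sketch shows what a local nonlinear fit of center, scale, and weight can look like for a single one-dimensional Gaussian basis function. It uses plain gradient descent with a backtracking step as a stand-in; the summary does not name the optimizer used in the patent, and `refine`, `sse` and the iteration count are hypothetical.

```python
import math

def rbf(x, center, scale, weight):
    u = (x - center) / scale
    return weight * math.exp(-u * u)

def sse(params, xs, ys):
    """Sum of squared errors of one basis function over the local data."""
    return sum((rbf(x, *params) - y) ** 2 for x, y in zip(xs, ys))

def gradient(params, xs, ys):
    """Analytic gradient of sse with respect to (center, scale, weight)."""
    center, scale, weight = params
    g_c = g_s = g_w = 0.0
    for x, y in zip(xs, ys):
        u = (x - center) / scale
        g = math.exp(-u * u)
        err = weight * g - y
        g_c += 2.0 * err * weight * g * 2.0 * u / scale
        g_s += 2.0 * err * weight * g * 2.0 * u * u / scale
        g_w += 2.0 * err * g
    return g_c, g_s, g_w

def refine(params, xs, ys, iterations=200):
    """Gradient descent that only accepts steps which lower the error."""
    loss = sse(params, xs, ys)
    for _ in range(iterations):
        grad = gradient(params, xs, ys)
        step = 1.0
        while step > 1e-12:
            trial = tuple(p - step * g for p, g in zip(params, grad))
            if trial[1] > 0.0:  # keep the scale positive
                trial_loss = sse(trial, xs, ys)
                if trial_loss < loss:
                    params, loss = trial, trial_loss
                    break
            step *= 0.5
        else:
            break  # no improving step found
    return params, loss

# Local data drawn from a known bump, and a deliberately poor starting guess.
xs = [i / 20 for i in range(21)]
ys = [2.0 * math.exp(-((x - 0.5) / 0.2) ** 2) for x in xs]
start = (0.4, 0.3, 1.0)
fitted, final_loss = refine(start, xs, ys)
```

Because a step is accepted only when it lowers the error, the returned error is never worse than the error of the starting guess.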

Application to various fields including voice recognition, failure prediction, image processing and financial time series analysis

Methods applied to real-world domains that generate models from data collections and output model information for presentation or for identifying physical events, including multi-dimensional outputs and notifications.

The claims cover a comprehensive method of nonlinear modeling leveraging residual structure analysis to guide incremental basis function placement and fitting, applicable to high-dimensional and multivariate data, employing modulated functions, and providing outputs useful in practical applications including voice recognition, failure prediction, image processing, and financial time series analysis.

Stated Advantages

Requires no ad hoc parameters to be set or tuned for each data set, enabling black-box nonlinear function approximation applicable to diverse data without adjustment.

Automatically determines the number of basis functions required for accurate fitting, thus improving computational efficiency and preventing overfitting or underfitting.

Employs a statistically sound stopping criterion based on residual IID testing, improving model quality and generalization.

Utilizes spatio-temporal local regions (space-time balls) to better capture data structure in high-dimensional and periodic or quasi-periodic data, improving approximation quality.

Can model high-dimensional inputs and outputs in a parsimonious manner by extending residual tests to multivariate contexts and using inter-output correlations.

Modulated asymmetric radial basis functions allow better fitting of skewed or asymmetric data, yielding lower order models with potentially higher accuracy.

Documented Applications

Modeling data on manifolds as graphs of functions.

Prediction and modeling of financial time-series, including stock and bond market analysis.

Voice recognition by modeling relationships between data collections.

Failure prediction of complex systems such as chemical plants and space stations.

Image processing and image reconstruction, including noise reduction and edge detection.

Target recognition and automatic guidance or targeting systems in mobile and stationary devices.

Simulations of dynamical physical systems such as airflow around objects or water flow over hulls.

Technical financial modeling and market research, including the use of training data drawn from population samples.
