System and method for providing automated multi-source data provisioning for a reanalysis ensemble service
Inventors
Schnase, John L. • Duffy, Daniel Q. • Tamkin, Glenn S. • Li, Jian • Strong, Savannah L. • Gill, Roger
Assignees
National Aeronautics and Space Administration (NASA)
Publication Number
US-11555624-B1
Publication Date
2023-01-17
Expiration Date
2035-05-13
Interested in licensing this patent?
MTEC can help explore whether this patent might be available for licensing for your application.
Abstract
An extended reanalysis ensemble service includes a loader services application program interface configured to receive data parameters for a set of automated multisource data provisioning operations, provide climate source data from one or more disparate climate data collections specified in the data parameters to conversion utilities for transforming the climate source data into flat, serialized block compressed sequence files, and load the sequence files to a distributed file system of the extended reanalysis ensemble service, and a reanalysis ensemble service application program interface configured to receive operational parameters for the set of automated multisource data provisioning operations, convert the operational parameters to one or more methods recognized by a service interface of the extended reanalysis ensemble service to be converted to analytical operations executed by the extended reanalysis ensemble service, and provide results of the one or more analytical operations executed by the extended reanalysis ensemble service to a client.
Core Innovation
The invention is an extended reanalysis ensemble service that automatically retrieves and processes climate data from multiple disparate climate data collections. A loader services API receives data parameters and passes climate source data from the collections named in those parameters to conversion utilities, which transform the data into flat, serialized, block-compressed sequence files. These sequence files are then loaded into the service's distributed file system. A separate reanalysis ensemble service API receives operational parameters and converts them into methods recognized by a service interface; those methods are executed as analytical operations, and the results are returned to clients.
The invention addresses the problem that climate data comes from multiple disjoint sources, created by different organizations, that differ in format, variables, and geographical and temporal resolution. These differences make it hard to integrate and analyze the data and to deliver consistent climate data analytics. Current systems cannot automatically collect, align, sequence, and process data from such sources within a unified analytic framework. The disclosed extended reanalysis ensemble service automates the collection, alignment, sequencing, and analysis of multisource climate data, supports scheduled or on-demand updates, and delivers integrated analytic results.
Claims Coverage
The patent contains three independent claims. They cover an extended reanalysis ensemble service system and a method for providing such a service; their core inventive features are summarized below.
Automated multisource data provisioning and sequencing
An extended reanalysis ensemble service with a loader services API configured to receive data parameters, connect to disparate climate data collections specified in those parameters, use conversion utilities to transform climate source data into flat, serialized block compressed sequence files, and load those sequence files into a distributed file system while validating proper sequencing.
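The conversion and sequencing step described above can be illustrated with a minimal sketch. It is not the patented implementation: the record layout, the key format (`"T2M:1980-01"`), the function names, and the use of JSON with `zlib` are all assumptions chosen so the example runs with the Python standard library alone. A production system would more likely write a distributed-file-system container format.

```python
import json
import zlib
from typing import Iterable, Iterator

# Hypothetical record: (sortable key, flattened grid values).
Record = tuple[str, list[float]]


def to_blocks(records: Iterable[Record], block_size: int = 2) -> Iterator[bytes]:
    """Serialize key/value records and compress them one block at a time."""
    block: list[Record] = []
    for record in records:
        block.append(record)
        if len(block) == block_size:
            yield zlib.compress(json.dumps(block).encode("utf-8"))
            block = []
    if block:  # flush the final, possibly shorter, block
        yield zlib.compress(json.dumps(block).encode("utf-8"))


def read_blocks(blocks: Iterable[bytes]) -> Iterator[Record]:
    """Decompress each block and yield its records in stored order."""
    for blob in blocks:
        for key, values in json.loads(zlib.decompress(blob).decode("utf-8")):
            yield key, values


def is_properly_sequenced(blocks: Iterable[bytes]) -> bool:
    """Assumed validation rule: keys must appear in ascending order."""
    keys = [key for key, _ in read_blocks(blocks)]
    return keys == sorted(keys)
```

Here "proper sequencing" is taken to mean ascending key order, which is an assumption; the source does not define the validation rule.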
Operation parameter handling and analytics execution
A reanalysis ensemble service API configured to receive operational parameters, convert them into methods recognized by a service interface of the service, execute analytical operations on the sequenced climate data, and provide the analytical results to clients.
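One way to picture the conversion of operational parameters into recognized methods is a small dispatch table. The sketch below is hypothetical throughout: the parameter keys (`operation`, `variable`, `collections`), the operation names, and the in-memory `store` stand in for whatever the actual service interface defines.

```python
from statistics import fmean

# Hypothetical mapping from operation names to analytical methods.
OPERATIONS = {"avg": fmean, "max": max, "min": min}


def execute(params: dict, store: dict) -> dict:
    """Resolve the requested operation and apply it per collection.

    `store` maps (collection, variable) to a list of values and stands in
    for sequenced data already loaded into the service.
    """
    name = params["operation"]
    if name not in OPERATIONS:
        raise ValueError(f"unrecognized operation: {name!r}")
    operation = OPERATIONS[name]
    return {
        collection: operation(store[(collection, params["variable"])])
        for collection in params["collections"]
    }


if __name__ == "__main__":
    store = {("MERRA-2", "T2M"): [280.0, 282.0], ("CFSR", "T2M"): [279.0, 281.0]}
    request = {"operation": "avg", "variable": "T2M", "collections": ["MERRA-2", "CFSR"]}
    print(execute(request, store))
```

Rejecting unknown operation names before touching any data keeps client errors separate from failures in the analytics themselves.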
Scheduled and repeatable data provisioning and analytics
The operational parameters include a schedule for repeating the provisioning, transformation, and loading of climate source data over progressive temporal or spatial ranges, and for repeating the analytical operations accordingly. This enables automated periodic updating and analysis.
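Repetition over progressive temporal ranges can be sketched as a generator of successive windows that drives a provisioning step and then an analysis step. The monthly granularity and the `provision` and `analyze` callbacks are assumptions made for illustration; the source does not specify how a schedule is encoded.

```python
from datetime import date
from typing import Callable, Iterator


def next_month(d: date) -> date:
    """Return the first day of the month after `d`."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def monthly_windows(start: date, count: int) -> Iterator[tuple[date, date]]:
    """Yield `count` consecutive half-open [begin, end) monthly ranges."""
    current = date(start.year, start.month, 1)
    for _ in range(count):
        following = next_month(current)
        yield current, following
        current = following


def run_schedule(start: date, count: int,
                 provision: Callable[[date, date], list],
                 analyze: Callable[[list], float]) -> list:
    """For each window, provision the data for that range, then analyze it."""
    return [analyze(provision(begin, end))
            for begin, end in monthly_windows(start, count)]
```

In a deployed service the loop would be triggered by a scheduler rather than run in one pass; the sketch only shows how each cycle advances the range.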
The independent claims collectively cover a specialized computing system and method that automate retrieving, transforming, sequencing, and analyzing climate data from multiple disparate sources, including scheduled updates, and providing analytic results via defined service interfaces.
Stated Advantages
Automated retrieval and integration of climate data from multiple disparate sources into a common format suitable for high-performance analysis.
Capability to schedule periodic or on-demand updating, sequencing, and analytics of climate data, ensuring timely and consistent results.
Support for high-performance parallel operations over large climate datasets using distributed file systems and MapReduce frameworks.
Provision of an API infrastructure allowing clients to specify data and operational parameters, facilitating versatile and dynamic climate data analytics as a service.
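The MapReduce-style processing mentioned among the advantages above can be illustrated with a single-process sketch of an ensemble mean across collections. The record layout `(collection, variable, value)` and the function names are hypothetical, and a real deployment would run the phases in parallel on a distributed framework rather than in one Python process.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(records: Iterable[tuple[str, str, float]]) -> Iterator[tuple[str, tuple[float, int]]]:
    """Emit (variable, (value, 1)) so that sums and counts can be combined."""
    for _collection, variable, value in records:
        yield variable, (value, 1)


def reduce_phase(pairs: Iterable[tuple[str, tuple[float, int]]]) -> dict:
    """Accumulate partial sums and counts per variable, then take the mean."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, (partial_sum, count) in pairs:
        totals[key][0] += partial_sum
        totals[key][1] += count
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    records = [
        ("MERRA-2", "T2M", 280.0),
        ("CFSR", "T2M", 282.0),
        ("MERRA-2", "PRECIP", 2.0),
        ("CFSR", "PRECIP", 4.0),
    ]
    print(reduce_phase(map_phase(records)))
```

Emitting (sum, count) pairs instead of raw values lets partial results from different workers be merged without revisiting the underlying data.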
Documented Applications
Providing automatically updated reanalysis climate data analytics by integrating data from disparate climate data collections such as MERRA-2, ERA-Interim, CFSR, 20CRv2c, JRA-25, JRA-55, MODIS, LANDSAT, and GPCP.
Supporting scheduled or on-demand climate reanalysis studies with specified variables, temporal and spatial ranges, and analytic operations requested by client applications.
Implementing climate data analytics platforms as cloud-based Platform as a Service (PaaS) or Infrastructure as a Service (IaaS), such as NASA General Application Platform (NGAP) or NASA Advanced Data Analytics Platform (ADAPT).