System and method for providing an improved earth observing system forward processing data analytic service
Inventors
Schnase, John L. • Duffy, Daniel Q. • Tamkin, Glenn S. • Li, Jian • Strong, Savannah L. • Gill, Roger
Assignees
National Aeronautics and Space Administration (NASA)
Publication Number
US-11501394-B1
Publication Date
2022-11-15
Expiration Date
2035-05-13
Abstract
A reanalysis ensemble service includes a plurality of conversion utilities, each conversion utility configured to convert a specific one of a plurality of disparate climate data collections from different sources to common format files that are temporally and spatially registered, where the disparate climate data collections include reanalysis data sets and forward processing data products, a data analytics platform for storing and operating on the different sourced common format files, a service interface for mapping service requests to analytic operations performed on the different sourced common format files by the data analytics platform, and a services library that dynamically creates data objects from one or more of the different sourced common format files in response to the analytic operations, and delivers the data objects to the service interface.
Core Innovation
The invention is a reanalysis ensemble service that integrates multiple disparate climate data collections, including reanalysis datasets and forward processing data products from different sources, by converting them into common format files that are temporally and spatially registered. It comprises a plurality of conversion utilities to perform these conversions, a data analytics platform for storing and operating on the common format files, a service interface for mapping service requests to analytic operations on these files, and a services library that dynamically creates data objects in response to analytic operations and delivers them to the service interface.
The problem addressed is the difficulty of operating on multiple disparate climate data collections that differ in format, variables, geographical and spatial domains, and temporal resolution, which hinders intercomparisons among forward processing results and between forward processing results and reanalysis datasets. Existing platforms could not handle an ensemble of such climate data collections for analytics and intercomparison.
The disclosed system provides an extended data analytics platform that converts disparate datasets, such as NASA's MERRA-2 reanalysis, the European ERA-Interim reanalysis, NOAA's NCEP Climate Forecast System Reanalysis, and forward processing data products from NASA's GEOS-5, into a common file format that is temporally and spatially registered. It then supports parallel operations, such as MapReduce analytics, over these files, exposed through a service interface whose methods include order, status, and download. Users can thereby perform canonical climate data operations, as well as forward processing and reanalysis intercomparisons, through a single analytic service.
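The order, status, and download methods suggest an asynchronous request pattern: a client places an order, polls its status, and downloads the result once it is ready. The following is a minimal sketch of that pattern, not the patented implementation. The class `EnsembleService`, the `Job` record, the `run_pending` helper, and the in-memory job store are all hypothetical names introduced for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Job:
    """One service request; all field names are hypothetical."""
    operation: str
    params: dict
    state: str = "queued"
    result: Optional[List[float]] = None


class EnsembleService:
    """Toy in-memory stand-in for a service interface (illustrative only)."""

    def __init__(self, operations: Dict[str, Callable[[dict], List[float]]]):
        # Maps request names to analytic operations.
        self._operations = operations
        self._jobs: Dict[str, Job] = {}

    def order(self, operation: str, params: dict) -> str:
        """Register a request and return an id the client can poll."""
        if operation not in self._operations:
            raise KeyError(f"unknown operation: {operation}")
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = Job(operation, params)
        return job_id

    def run_pending(self) -> None:
        """Execute queued jobs; a real platform would run these asynchronously."""
        for job in self._jobs.values():
            if job.state == "queued":
                job.result = self._operations[job.operation](job.params)
                job.state = "complete"

    def status(self, job_id: str) -> str:
        """Report the current state of an order."""
        return self._jobs[job_id].state

    def download(self, job_id: str) -> List[float]:
        """Return the result of a completed order."""
        job = self._jobs[job_id]
        if job.state != "complete":
            raise RuntimeError("result not ready")
        return job.result
```

A client would call `order(...)`, poll `status(...)` until it reports `"complete"`, and then call `download(...)`. Separating the three calls lets long-running analytics proceed without holding a connection open.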
Claims Coverage
The patent claims encompass several inventive features that collectively define a reanalysis ensemble service system, method, and computer program product enabling conversion, storage, analytics, and dynamic data object creation over disparate climate data collections including reanalysis and forward processing datasets.
Conversion utilities for disparate climate data collections
- Includes individual utilities each configured to convert a specific climate data collection from different sources into temporally and spatially registered common format files.
High-performance data analytics platform
- Stores and operates on the common format files from the different sources, enabling distributed parallel processing.
Service interface for mapping requests to analytic operations
- Maps client service requests to particular analytic operations performed on the common format files within the data analytics platform.
Services library dynamically creating data objects
- Dynamically generates data objects from one or more common format files in response to analytic operations and delivers these objects through the service interface.
Sequencer, mapper, and reducer components within conversion utilities
- Sequencer utility temporally and spatially registers climate data and encodes it into sequence files partitioned by composite keys including timestamp and variable name; mapper filters these files; reducer creates subsets stored in the analytics platform.
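The sequencer, mapper, and reducer roles can be illustrated with a small pure-Python sketch. It assumes each record carries an ISO-style timestamp string, a variable name, and a flat list of values standing in for a gridded field. The function names (`sequence`, `map_filter`, `reduce_subset`) and the record layout are hypothetical, and the sketch does not use Hadoop or any real sequence file format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Composite key: (timestamp, variable name). The value stands in for a gridded field.
Record = Tuple[Tuple[str, str], List[float]]


def sequence(raw: Iterable[dict]) -> List[Record]:
    """Sequencer: encode raw records as key/value pairs sorted by composite key."""
    records = [((r["time"], r["variable"]), r["values"]) for r in raw]
    return sorted(records, key=lambda kv: kv[0])


def map_filter(records: Iterable[Record], variable: str,
               start: str, end: str) -> Iterator[Record]:
    """Mapper: keep only records for the requested variable and time span."""
    for (time, name), values in records:
        if name == variable and start <= time <= end:
            yield (time, name), values


def reduce_subset(records: Iterable[Record]) -> Dict[str, List[float]]:
    """Reducer: collect the surviving records into a subset keyed by timestamp."""
    subset: Dict[str, List[float]] = defaultdict(list)
    for (time, _name), values in records:
        subset[time].extend(values)
    return dict(subset)
```

Because the key combines timestamp and variable name, a filter on either component can be applied without decoding the values themselves.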
Support for specific reanalysis and forward processing datasets
- Reanalysis datasets including MERRA-2, ERA-Interim, CFSR, 20CR, JRA-25, JRA-55; forward processing products based on GEOS-5 in HDF-EOS format.
Service requests for variable retrieval and intercomparison analytics
- Requests include get variable operations on specified climate variables and collections; forward processing intercomparison requests comparing climate variable predictions offset in time; and reanalysis-forward processing intercomparison requests comparing forward processing predictions with reanalysis data over matched spatial and temporal extents.
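A reanalysis-forward processing intercomparison over matched temporal extents can be sketched as follows. The summary does not state which comparison metric is used, so the simple difference here is an assumption, as are the function name `intercompare` and the representation of each series as a timestamp-to-value mapping.

```python
from typing import Dict


def intercompare(forecast: Dict[str, float],
                 reanalysis: Dict[str, float]) -> Dict[str, float]:
    """Return forecast minus reanalysis at timestamps present in both series."""
    # Restrict the comparison to the matched temporal extent.
    shared = sorted(forecast.keys() & reanalysis.keys())
    return {t: forecast[t] - reanalysis[t] for t in shared}
```

Timestamps that appear in only one series are dropped, so the result covers only the period where both collections have data.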
Canonical operations for dynamic data object creation
- Includes seasonal maximum, seasonal average, vertical average, spatial average, anomaly determination, and standard deviation calculations operating over combined common format files from different sources.
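Several of these canonical operations are simple to state in code. The sketch below assumes monthly values keyed by month number and the conventional meteorological seasons (DJF, MAM, JJA, SON). The season definitions, the use of population standard deviation, and all function names are assumptions made for illustration rather than details taken from the patent.

```python
from statistics import mean, pstdev
from typing import Dict, List

# Assumed meteorological season for each month number.
SEASONS = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def seasonal_groups(monthly: Dict[int, float]) -> Dict[str, List[float]]:
    """Group monthly values by season."""
    groups: Dict[str, List[float]] = {}
    for month, value in monthly.items():
        groups.setdefault(SEASONS[month], []).append(value)
    return groups


def seasonal_average(monthly: Dict[int, float]) -> Dict[str, float]:
    """Mean of the monthly values in each season."""
    return {s: mean(v) for s, v in seasonal_groups(monthly).items()}


def seasonal_maximum(monthly: Dict[int, float]) -> Dict[str, float]:
    """Largest monthly value in each season."""
    return {s: max(v) for s, v in seasonal_groups(monthly).items()}


def anomaly(values: List[float], baseline: float) -> List[float]:
    """Departure of each value from a baseline such as a long-term mean."""
    return [v - baseline for v in values]


def standard_deviation(values: List[float]) -> float:
    """Population standard deviation of the values."""
    return pstdev(values)
```

Because every collection is first converted to the same registered format, the same functions could in principle be applied to values drawn from any of the source collections.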
Method for providing reanalysis ensemble service
- Involves converting disparate collections to a common temporally and spatially registered format, storing and operating on the files via the data analytics platform, mapping service requests to analytics, dynamically creating data objects, and delivering these to the service interface.
The claims define a comprehensive system and method that converts multiple climate data collections into a unified common format and enables flexible, high-performance analytical operations and intercomparisons through a service-oriented architecture comprising conversion utilities, a data analytics platform, a service interface, and a dynamic services library.
Stated Advantages
Provides extended capabilities to conduct operations over multiple disparate climate data collections including reanalysis and forward processing products.
Enables intercomparisons among forward processing results and between forward processing and reanalysis collections.
Supports commonly used canonical operations such as seasonal maximum, average, anomaly, and standard deviation over combined datasets.
Allows high-performance parallel processing over distributed file systems, facilitating efficient analytics.
Maintains metadata integrity by preserving NetCDF metadata within sequence files during processing.
Improves accessibility by providing a web services interface compatible with client-side libraries like Python CDSlib.
Documented Applications
Investigating climate variability and conducting intercomparisons among climate reanalysis datasets and forward processing forecast products.
Supporting national disaster, civil engineering, ecological forecasting, health and air quality, water resources, and agriculture applications.
Testing and evaluating near-real-time atmospheric products produced by NASA's GEOS-5 Forward Processing system.
Providing a platform for climate data analytics as a service, including operations over high-resolution, frequently updated forecast data products.