NASA’s Earth Observing Data and Information System – What Challenges for the Near Future?
Since the 1990s, a common set of data exchange and access principles created by the Canadian, Japanese, European and U.S. International Earth Observing System (IEOS) partners has encouraged NASA to develop a free, open and non-discriminatory data and information policy for its Earth observation program. NASA’s Earth Observing System (EOS) Data and Information System (EOSDIS) has been a central component of this program.
EOSDIS manages data covering a wide range of Earth science disciplines, including cryosphere, land cover change, polar processes, field campaigns, ocean surface, digital elevation, atmospheric dynamics and composition, inter-disciplinary research, and many others. One of the key components of EOSDIS is a set of twelve discipline-based Distributed Active Archive Centers (DAACs) distributed across the United States.
Managed by NASA’s Earth Science Data and Information System (ESDIS) Project at the Goddard Space Flight Center, these DAACs serve over 4 million users globally. The ESDIS Project provides the infrastructure support for EOSDIS, which includes other components such as common metadata and metrics management systems, specialized network systems, standards management, and centralized support for use of commercial cloud capabilities.
Given its long-term requirements, the rapid pace of change in information technology and the changing expectations of the user community, EOSDIS has evolved continually over the past three decades. However, many challenges remain. Challenges in three key areas (managing volume and variety, enabling data discovery and access, and incorporating user feedback and concerns) were addressed by Jeanne Behnke, Andrew Mitchell and Hampapuram Ramapriyan in the paper titled “NASA’s Earth Observing Data and Information System – Near-Term Challenges”, published in Data Science Journal.
The authors of the paper provided a summary and conclusions about each of these categories of challenges:
1. Managing volume and variety
In the early days of EOSDIS, the scientific community responsible for the data products was reluctant to accept the need for comprehensive metadata to ensure that users could discover, access and understand those products. Also, given the diversity of disciplines served by the EOSDIS DAACs, several “standards” were in use for providing metadata. Today the importance of metadata is well understood. The challenge of multiple standards has been addressed by developing a Unified Metadata Model (UMM), which allows easy mapping from one standard to another without an expensive conversion of older metadata into the recent ISO standards. A Common Metadata Repository (CMR) has been developed to manage a database of over 420 million individually addressed granules (files). Open source versions of the CMR software are available, along with programming interfaces that allow anyone to access the repository. Keywords, important for proper discovery of data, are managed by the GCMD Keyword Management System, which uses Simple Knowledge Organization System (SKOS) concepts. The ESDIS Project has used an independent review committee composed of metadata professionals to ensure that inconsistencies in metadata are detected and corrected.
Like many other space data archives, EOSDIS is moving towards the use of the commercial cloud for data storage and services. Advantages of the cloud include: using only as much compute as needed while the load fluctuates, as when reprocessing the data products of an entire mission from its beginning to the current date; computing on petabytes of data close to where the data are stored, obviating the need for large data transfers; and easier preparation for high-data-rate missions such as SWOT and NISAR, expected to be launched in the early 2020s.
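As an illustration of the programming interfaces to the CMR mentioned above, the following is a minimal Python sketch of a granule search. It is not taken from the paper. It assumes the public CMR search endpoint at cmr.earthdata.nasa.gov and its `short_name`, `temporal` and `page_size` query parameters. The function names are hypothetical, the product short name in the usage note is only an example, and the response layout read in `fetch_granule_titles` (a `feed` object holding a list of `entry` items with a `title`) is an assumption to check against the CMR documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CMR granule search endpoint (JSON flavour).
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"


def build_granule_query(short_name, start, end, page_size=10):
    """Build a CMR granule search URL for one product and one time range.

    start and end are ISO 8601 timestamps such as 2020-01-01T00:00:00Z.
    """
    params = {
        "short_name": short_name,       # product (collection) short name
        "temporal": f"{start},{end}",   # "start,end" time range
        "page_size": page_size,         # number of granules per page
    }
    return f"{CMR_GRANULE_SEARCH}?{urlencode(params)}"


def fetch_granule_titles(url, timeout=30):
    """Run the query and return granule titles (requires network access).

    Assumes the JSON response has the shape {"feed": {"entry": [...]}}.
    """
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["title"] for entry in payload["feed"]["entry"]]
```

A call such as `build_granule_query("MOD09GA", "2020-01-01T00:00:00Z", "2020-01-02T00:00:00Z")` only builds the URL. Passing the result to `fetch_granule_titles` performs the actual request. Keeping the two steps separate makes the query easy to inspect before any data is transferred.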
2. Enabling data discovery and access
Migration of most of the EOSDIS archives to online storage since 2006 has provided easier and faster access to data and has also enabled users to request services such as subsetting, reformatting and reprojection conveniently prior to downloading. However, the very high data rates (and volumes) expected from the upcoming missions raise new challenges, which might be best addressed by providing near-archive computation, which would reduce network loading. We have started examining analysis-ready data concepts. Several efforts are in progress by other groups in this area, especially with respect to remotely sensed land imaging. EOSDIS provides “visualization-ready” data through its Global Imagery Browse Services (GIBS) for nearly 800 different types of data that can be represented as images. However, defining analysis-ready data for different disciplines and preparing the data to meet their diverse needs would take significant effort. Ensuring access to data decades into the future is a challenge that is being met by developing a preservation content specification for use by NASA’s Earth observing missions, as well as by participating in the development of international standards for preservation of data and metadata.
3. Incorporating user feedback and concerns
There are several ways the ESDIS Project receives feedback on EOSDIS from the user community. These are discussed in detail in Ramapriyan and Behnke (2019). One of the mechanisms for feedback is an annual survey of users to derive the “American Customer Satisfaction Index (ACSI)”. While the ACSI is a number indicating how satisfied the users are, the survey also includes several questions for which users provide free-form answers. The free-form answers provide important information about user needs and assist in planning improvements to the system. Given NASA’s free and open data policy, until 2012 it was felt that there was no need to keep track of users through a registration process. As a result, accurate metrics could not be obtained, and users could not be contacted about new datasets or services offered by EOSDIS that were relevant to specific user groups. A registration system has been in use since 2012, with minimal information being collected from users. The need for better metrics and services to users is balanced against privacy rules.
Read the full paper here.