Common Data Index - CDI
The primary objective of the Common Data Index (CDI) is to give users detailed insight into the availability and geographical spread of marine data held by data centres and institutes across Europe. The CDI provides an index (metadatabase) to individual data sets, is based on the ISO19115 metadata standard, and paves the way to online data access.
CDI was initiated in the EU Sea-Search project. As part of the successor project, EU SeaDataNet, it is being further developed and its data coverage extended to all 35 countries participating in SeaDataNet. In the earlier CDI V0 system, all data centres used a common metadata format for describing their data sets, and this metadata was available through the CDI V0 user interface. However, users were given access to the data sets themselves at the data centres via a large variety of interfaces.
Therefore the CDI V1 system was developed, and it has recently been upgraded to CDI V2. An important feature of the V2 system is that it provides transparent access to the distributed data sets via a single user interface at the portal, together with the means to download data sets in common formats via a shopping basket mechanism. The data requests are forwarded automatically from the portal to the relevant data centres. Users can check the progress of their data requests via a personal online transaction register.
For the Netherlands, a dedicated NODC CDI V2 service is operational. It gives access to more than 40,000 data sets from Rijkswaterstaat, KNMI, TNO, NIOZ and NIOO-CEME:
- Visit the NODC CDI V2 discovery and delivery service - Quick Search
- Visit the NODC CDI V2 discovery and delivery service - Extended Search
The Netherlands NODC data collections are also included in the pan-European SeaDataNet CDI V2 service, which brings together circa 800,000 CDI metadata records from nearly all data centres within the SeaDataNet network. It also includes CDI entries from other European data centres that have joined the CDI metadata and data access infrastructure, because the CDI system has also been adopted by a number of associated marine data infrastructure initiatives, such as Geo-Seas for marine geological and geophysical data, Upgrade Black Sea SCENE for the Black Sea region, and the EMODnet pilot portals for Marine Chemistry, Hydrography and Biology.
All these initiatives are under way. They result in further populating and fine-tuning of the CDI V2 metadatabase and in a growing number of data centres that have connected their data systems to the CDI V2 system to provide harmonised access to their data sets.
- Visit the Pan-European CDI V2 data access system - Quick Search
- Visit the Pan-European CDI V2 data access system - Extended Search
How does it work?
The CDI V2 query interface enables users to search by a set of criteria. The selected data sets are listed, and their geographical locations are indicated on a map. Clicking on the display icon retrieves the full metadata of the data set, which gives information on the what, where, when, how, and who of the data set. It also gives standardised information on the data access restrictions that apply. The interface features a shopping mechanism, by which selected data sets can be added to a shopping basket.
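As a rough illustration of the search step, the sketch below filters a small in-memory list of metadata records by a bounding box, a time range and an optional parameter. The record fields, the sample values and the `search` function are assumptions made for this example; they do not reflect the actual CDI V2 query interface or its metadata fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    # Hypothetical, simplified stand-in for a CDI metadata record.
    identifier: str
    latitude: float
    longitude: float
    start: date
    parameter: str


def search(records, south, north, west, east, date_from, date_to, parameter=None):
    """Return records inside the bounding box and time range, optionally by parameter."""
    hits = []
    for rec in records:
        in_box = south <= rec.latitude <= north and west <= rec.longitude <= east
        in_time = date_from <= rec.start <= date_to
        matches_parameter = parameter is None or rec.parameter == parameter
        if in_box and in_time and matches_parameter:
            hits.append(rec)
    return hits


# Invented sample records, for illustration only.
records = [
    Record("A1", 52.5, 4.0, date(2005, 6, 1), "temperature"),
    Record("A2", 43.0, 30.0, date(2007, 3, 15), "salinity"),
    Record("A3", 53.2, 5.1, date(2009, 9, 9), "temperature"),
]

selected = search(records, 51.0, 56.0, 2.0, 8.0, date(2000, 1, 1), date(2008, 12, 31))
print([rec.identifier for rec in selected])
```

In this sketch the criteria are combined with a logical AND, so a record is listed only if it satisfies every criterion that was supplied.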
All users can freely query and browse the CDI V2 directory. However, submitting requests for data access via the shopping basket requires that users are registered in the SeaDataNet central user register and thereby agree to the overall SeaDataNet User Licence.
The data requests are forwarded automatically from the portal to the relevant data centres. This process is controlled via the Request Status Manager (RSM) Web service at the portal and a Download Manager (DM) Java software module implemented at each of the data centres. The RSM also enables registered users to check the status of their requests regularly and to download data sets after access has been granted. Data centres can follow all transactions for their data sets online and can handle requests that require their consent.
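The forwarding step can be pictured as splitting one shopping basket into a separate request per data centre. The sketch below shows only that grouping idea; the identifiers, the basket layout and the `split_basket` function are hypothetical and say nothing about how the RSM or the Download Manager are actually implemented.

```python
from collections import defaultdict


def split_basket(basket):
    """Group basket items by the data centre that holds each data set.

    Each item is a (data_set_id, data_centre) pair; both values are
    illustrative placeholders.
    """
    per_centre = defaultdict(list)
    for data_set_id, data_centre in basket:
        per_centre[data_centre].append(data_set_id)
    return dict(per_centre)


basket = [("ds-1", "centre-a"), ("ds-2", "centre-b"), ("ds-3", "centre-a")]
print(split_basket(basket))
```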
Each CDI V2 metadata record includes a data access restriction tag. It indicates under which conditions the data set is accessible to users. Its values range from ‘unrestricted’ to ‘no access’, with a number of values in between. During registration, every user is assigned one or more SeaDataNet roles by their national NODC / Marine Data Centre. For each data set request, the RSM service combines the given data access restriction with the role(s) of the user as registered in the SeaDataNet central user register. This determines, per data set request, whether the user gets direct access automatically, whether the request first has to be considered by the data centre (which might therefore contact the user), or whether no access is given.
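The three possible outcomes can be sketched as a lookup that combines the restriction tag with the user's roles. This is a minimal sketch under stated assumptions: apart from ‘unrestricted’ and ‘no access’, the restriction values, the role names and the rule table are invented for illustration and are not taken from the SeaDataNet specification.

```python
from enum import Enum


class Outcome(Enum):
    AUTOMATIC = "direct access granted automatically"
    NEEDS_CONSENT = "request must be considered by the data centre"
    DENIED = "no access"


# Hypothetical rule table: restriction tag -> roles that get automatic access.
# None means that every registered user qualifies.
AUTOMATIC_ROLES = {
    "unrestricted": None,
    "academic": {"academic_researcher"},
    "by negotiation": set(),
}


def decide(restriction, user_roles):
    """Combine a data set's restriction tag with the user's roles."""
    if restriction == "no access":
        return Outcome.DENIED
    if restriction not in AUTOMATIC_ROLES:
        # Unknown tags are treated conservatively in this sketch.
        return Outcome.NEEDS_CONSENT
    allowed = AUTOMATIC_ROLES[restriction]
    if allowed is None or allowed & set(user_roles):
        return Outcome.AUTOMATIC
    return Outcome.NEEDS_CONSENT


print(decide("academic", ["academic_researcher"]).value)
print(decide("academic", ["commercial_user"]).value)
```

Keeping the rules in a table rather than in nested conditions makes it easy to see, per restriction value, which roles bypass the data centre's manual review.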
Configuration, maintenance and formats
For purposes of standardisation and international exchange, the ISO19115 metadata standard has been adopted. The CDI V2 format is defined as a dedicated subset of this standard and is ISO compliant. A CDI V2 XML format supports the exchange between data centres and the central CDI manager, and ensures interoperability with other systems and networks. CDI V2 XML entries are generated by participating data centres directly from their databases. Data centres can use a dedicated Java tool (MIKADO) to generate CDI V2 XML files automatically, following a properties file that defines the mapping between the CDI format and the partner’s database fields, as well as the required local queries. CDI updates are produced and transferred at regular intervals.
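To make the idea of a field mapping concrete, the sketch below turns one database row into a small XML record by following a mapping table. It is written in Python purely for illustration and is not MIKADO: the element names, the column names and the `row_to_xml` function are hypothetical and do not follow the real CDI V2 XML schema or the MIKADO properties file syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: metadata element name -> column in the local database.
FIELD_MAPPING = {
    "identifier": "station_code",
    "title": "dataset_name",
    "startDate": "sampled_on",
    "accessRestriction": "access_policy",
}


def row_to_xml(row, mapping=FIELD_MAPPING):
    """Build one illustrative XML record from a database row given as a dict."""
    record = ET.Element("record")
    for element_name, column in mapping.items():
        child = ET.SubElement(record, element_name)
        child.text = str(row[column])
    return ET.tostring(record, encoding="unicode")


# Invented sample row, standing in for the result of a local query.
row = {
    "station_code": "NL-0001",
    "dataset_name": "Example CTD profile",
    "sampled_on": "2008-05-14",
    "access_policy": "unrestricted",
}
print(row_to_xml(row))
```

Because the mapping lives in data rather than in code, the same generation routine could in principle serve data centres with different database layouts, which is the role the properties file plays in the description above.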
The connection between the data systems of the data centres and the RSM Web service can be realised by the data centres installing and configuring a Download Manager Java component. This component handles the communication with the portal, retrieves the requested data sets and provides download services to the users.
More information and the software tools themselves can be found at the SeaDataNet website in the section 'Standards & Software'.
Common Vocabularies and Ontologies
The use of common vocabularies in all metadatabases and data formats is an important prerequisite for consistency and interoperability. It is of utmost importance that these vocabularies are supported by a large group of stakeholders, accessible to all users and kept up to date in a controlled way.
Therefore SeaDataNet has initiated a service to provide ‘controlled vocabularies’, which are used in the metadata and to label data. This SeaDataNet Vocabulary service provides access to lists of standardised terms that cover a broad spectrum of disciplines of relevance to the oceanographic and wider community.
The SeaDataNet Vocabulary service is based upon the NERC DataGrid (NDG) vocabulary Web service, developed and operated by BODC. For end-users, a vocabulary Client Interface has been developed and is operated by MARIS; it lets users search and browse the various vocabularies. An automatic synchronisation harvests the latest versions of the lists from the NDG Web service and loads the updates into a local buffer that feeds the Search and Browse interface.
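The synchronisation can be sketched as a version comparison between the remote lists and the local buffer. In this minimal sketch the list identifiers, the integer version numbers and the `synchronise` function are assumptions for illustration; the real service calls and versioning scheme of the NDG Web service are not shown.

```python
def synchronise(local_buffer, remote_lists):
    """Copy every list whose remote version is newer into the local buffer.

    Both arguments map a list identifier to a dict with 'version' and 'terms'.
    Returns the identifiers of the lists that were added or updated.
    """
    updated = []
    for list_id, remote in remote_lists.items():
        local = local_buffer.get(list_id)
        if local is None or remote["version"] > local["version"]:
            local_buffer[list_id] = {
                "version": remote["version"],
                "terms": list(remote["terms"]),
            }
            updated.append(list_id)
    return updated


# Invented example data: LIST_A is outdated locally, LIST_B is new, LIST_C is current.
local_buffer = {
    "LIST_A": {"version": 1, "terms": ["term one"]},
    "LIST_C": {"version": 4, "terms": ["term x"]},
}
remote_lists = {
    "LIST_A": {"version": 2, "terms": ["term one", "term two"]},
    "LIST_B": {"version": 1, "terms": ["term three"]},
    "LIST_C": {"version": 4, "terms": ["term x"]},
}
print(synchronise(local_buffer, remote_lists))
```

Serving searches from the local buffer means the Search and Browse interface stays responsive even when the remote service is slow, at the cost of showing data that is only as fresh as the last synchronisation.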
The vocabulary Web service works closely together with the MIKADO Java tool, which is available for CDI V2 XML generation. In addition, there is an XML Validation Web service, which helps data centres validate samples of their CDI V2 XML production.
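The kind of check a validation step performs can be sketched as follows: confirm that a sample is well-formed XML and that a set of required elements is present and non-empty. The element names and the `validate` function are hypothetical; the actual XML Validation Web service checks samples against the CDI V2 format, which this sketch does not reproduce.

```python
import xml.etree.ElementTree as ET

# Hypothetical required elements; not the real CDI V2 schema.
REQUIRED_ELEMENTS = ("identifier", "title", "accessRestriction")


def validate(xml_text):
    """Return a list of problems found; an empty list means the sample passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    for name in REQUIRED_ELEMENTS:
        element = root.find(name)
        if element is None or not (element.text or "").strip():
            problems.append(f"missing or empty element: {name}")
    return problems


sample = "<record><identifier>NL-0001</identifier><title></title></record>"
for problem in validate(sample):
    print(problem)
```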