Open Energy Metadata#

Access the Open Energy Metadata

https://github.com/OpenEnergyPlatform/oemetadata

What is it and what is it good for?#

Open Energy Metadata (OEMetadata) is a metadata standard designed specifically for data used in energy (systems) research. For science, a metadata standard can provide unambiguity, transparency, objectivity, reliability, verifiability, openness, integrity and novelty. In short, it can help with good scientific practice. OEMetadata adhere to the FAIR principles, i.e. they ensure Findability, Accessibility, Interoperability, and Reuse of digital assets.

Structural Design#

Data and metadata come in different levels of structuredness.

[Figure: Levels of structuredness]

OEMetadata are semi-structured and designed to accompany the data themselves. They can describe every structural element of tabular data.

[Figure: Tabular Data]

When designing OEMetadata the following existing standards and agreements were considered:

  • Dublin Core -> Documenting digital documents
  • Frictionless Data Package -> A container format for describing a collection of data in a single 'package'
  • ISO 19115 -> Geodata
  • INSPIRE -> Regulation on administrative and other specialized Geodata
  • DataCite -> Metadata Schema for data citations
  • schema.org -> Schemas for structured data markup on web pages
  • PROV -> W3C specification providing a vocabulary to interchange provenance information
  • DCAT-AP -> Application profile for data portals in Europe based on the Data Catalog Vocabulary

They shaped OEMetadata to varying degrees. Some of them were too general, others too specific. The following requirements led us to define our own standard:

  • Compatibility with csv and database tables
  • Machine and human readability
  • Coverage of all aspects of metadata
  • Coverage of all data and tailoring to energy system analysis
  • Compliance with FAIR criteria
  • Extensibility
  • Well-defined compatibility with ontologies and linked open data
  • Compatibility with DCAT-AP was originally planned, but the standard was found to be partly incompatible with data packages
  • Compatibility with all kinds of data: time series, geodata, parameter collections, data produced by machines, and collaboratively collected data

Our concept for including ontology references is depicted in a poster (pdf) that was created during the development stage. The resulting standard is based on Data Packages. The file format is JSON (and JSON-LD). In its simplest form, a Tabular Data Package is a csv file containing data, accompanied by a JSON file that describes the name and structure of the data. OEMetadata take the standard set of keys and possible values and extend it with ones useful for energy research. They are inspired by Dublin Core, INSPIRE and DataCite. The development process is organized on GitHub and open for everyone to see and participate in. The repository contains the following useful files:

  • metadata_key_description.md - contains a description of each metadata key
  • template.json - contains an empty metadata string
  • example.json - contains a basic metadata example with filled fields
  • schema.json - JSON schema ensures a well defined standard
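
The pairing of a csv file with a JSON descriptor can be illustrated with a small Python sketch. It is not taken from the repository: the file names, the column names and the field types are assumptions made for illustration, and only a handful of the keys documented further down are used.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_tabular_package(directory, rows):
    """Write a csv data file plus a JSON descriptor that names and describes it."""
    directory = Path(directory)
    data_path = directory / "example_table.csv"  # hypothetical file name
    with data_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "year", "capacity"])
        writer.writeheader()
        writer.writerows(rows)

    # Minimal descriptor; the field types are illustrative values only.
    descriptor = {
        "name": "example_table",
        "resources": [
            {
                "profile": "tabular-data-resource",
                "name": "example_table",
                "path": data_path.name,
                "format": "csv",
                "encoding": "UTF-8",
                "schema": {
                    "fields": [
                        {"name": "id", "type": "integer", "unit": None},
                        {"name": "year", "type": "integer", "unit": None},
                        {"name": "capacity", "type": "float", "unit": "MW"},
                    ],
                    "primaryKey": ["id"],
                },
            }
        ],
    }
    descriptor_path = directory / "example_table.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return data_path, descriptor_path


with tempfile.TemporaryDirectory() as tmp:
    data_file, descriptor_file = write_tabular_package(
        tmp, [{"id": 1, "year": 2016, "capacity": 3.2}]
    )
    loaded = json.loads(descriptor_file.read_text(encoding="utf-8"))
    # The descriptor's field names should mirror the csv header.
    field_names = [f["name"] for f in loaded["resources"][0]["schema"]["fields"]]
    header = data_file.read_text(encoding="utf-8").splitlines()[0].split(",")
    print(field_names == header)
```

A complete OEMetadata file carries many more keys than this; template.json in the repository is the authoritative starting point.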

Creation and management#

  • Creating a table on the OEP can be done through the wizard. The menu has a section that helps you fill out OEMetadata to accompany your data
  • To help with the creation of a standalone metadata file, the OEP has a metadata creator (You will need to be logged in to use it)
  • There is a review process to maintain the metadata on the OEP. This process was created to replace the now deprecated process on GitHub. As an owner of a table on the OEP, you can ask for a review, which will start a guided review process. At the end of the process, a badge depicting the level of completeness is assigned to the metadata:
  • Iron – Technically required for data structure
  • Bronze – Basic description of the data
  • Silver – Supplement description of the data
  • Gold – Extended description of the context
  • Platinum – Ontological annotation

Metadata keys with a description and example#

The standard is under active development and currently available in version 1.6.0. The table with a full key description is shown here for convenience, but may not be as up to date as in the repository.

General Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 1 | name | A file name or database table name. | oep_metadata_table_example_v16 |
| 2 | title | A human readable full title including author. | RLI - OEMetadata - Metadata example table |
| 3 | id | A Uniform Resource Identifier (URI) that unambiguously identifies the resource. This can be a URL of the data set. It can also be a Digital Object Identifier (DOI). | https://example.com |
| 4 | description | A description or abstract of the package. It should be usable as summary information for the entire package that is described by the metadata. | Example table used to illustrate the metadata structure and meaning. |
| 5 | language | An array of languages used within the described data structures (e.g. titles, descriptions). The language key can be repeated if more languages are used. Standard: IETF (BCP47) | en-GB, de-DE, fr-FR |
| 6 | subject | An array of objects with topics of the data in OEO terms. | |
| 6.1 | name | The class label of the OEO terms. | energy |
| 6.2 | path | The URI of the class. | https://openenergy-platform.org/ontology/oeo/OEO_00000150 |
| 7 | keywords | An array of keywords to assist users searching for the package in catalogs. | example, template, test |
| 8 | publicationDate | A date of publishing of the data or metadata. Date format is ISO 8601 (YYYY-MM-DD). | 2019-02-06 |
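
As a rough illustration, the general keys can be written as a Python dictionary that serializes directly to JSON. The values are the examples from the table; `check_general` is a hypothetical helper, not part of any OEP tooling, and it checks only the date format and the array-valued keys.

```python
import json
from datetime import date

general = {
    "name": "oep_metadata_table_example_v16",
    "title": "RLI - OEMetadata - Metadata example table",
    "id": "https://example.com",
    "description": "Example table used to illustrate the metadata structure and meaning.",
    "language": ["en-GB", "de-DE", "fr-FR"],
    "subject": [
        {
            "name": "energy",
            "path": "https://openenergy-platform.org/ontology/oeo/OEO_00000150",
        }
    ],
    "keywords": ["example", "template", "test"],
    "publicationDate": "2019-02-06",
}


def check_general(block):
    """Return a list of problems found in the general keys (hypothetical helper)."""
    problems = []
    try:
        date.fromisoformat(block["publicationDate"])
    except (KeyError, TypeError, ValueError):
        problems.append("publicationDate must be an ISO 8601 date (YYYY-MM-DD)")
    for key in ("language", "subject", "keywords"):
        if not isinstance(block.get(key), list):
            problems.append(f"{key} must be an array")
    return problems


problems = check_general(general)
print(json.dumps(general, indent=2))
print(problems)
```

For real validation, schema.json from the repository is the better tool; this sketch only makes the shape of the keys concrete.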

Context Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 9 | context | An object that describes the general setting, environment, or project leading to the creation or maintenance of this dataset. In science this can be the research project. | |
| 9.1 | homepage | A URL of the project. | https://openenergy-platform.org/ |
| 9.2 | documentation | A URL of the project documentation. | https://openenergy-platform.org/about/ |
| 9.3 | sourceCode | A URL of the project's source code. | https://github.com/OpenEnergyPlatform |
| 9.4 | contact | A reference to the creator or maintainer of the data set. It can be an email address or a GitHub handle. | contact@example.com |
| 9.5 | grantNo | An identifying grant number. In case of a publicly funded project, this number is assigned by the funding agency. | 01AB2345 |
| 9.6 | fundingAgency | A name of the entity providing the funding. This can be a government agency or a company. | Bundesministerium für Wirtschaft und Klimaschutz |
| 9.7 | fundingAgencyLogo | A URL to the logo or image of the funding agency. | https://commons.wikimedia.org/wiki/File:BMWi_Logo_2021.svg#/media/File:BMWi_Logo_2021.svg |
| 9.8 | publisherLogo | A URL to the logo of the publishing agency of data. | https://reiner-lemoine-institut.de//wp-content/uploads/2015/09/rlilogo.png |

Spatial and Temporal Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 10 | spatial | An object that describes the spatial context of the data it contains. | |
| 10.1 | location | A location of the data, for data whose location can be described as a point. May be specified as coordinates, URI or addresses with street, house number and zip code. | 52.433509, 13.535855 |
| 10.2 | extent | A covered area. May be the name of a region, or the geometry of a bounding box. | Europe |
| 10.3 | resolution | Pixel size in case of a regular raster image. Reference to administrative level or other spatial division that is present as the smallest spatially distinguished unit size. | 1 ha |
| 11 | temporal | An object with the time period covered in the data. Temporal information should either contain a "referenceDate" or the keys describing a time series; in rare cases both. | |
| 11.1 | referenceDate | The base year, month or day. Point in time for which the data is meant to be accurate. Census data or a satellite image, for example, will have a reference date. Date format is ISO 8601. | 2016-01-01 |
| 11.2 | timeseries | An array that describes the time series. | |
| 11.2.1 | start | The beginning point in time of a time series. | 2019-02-06T10:12:04+00:00 |
| 11.2.2 | end | The end point in time of a time series. | 2019-02-07T10:12:04+00:00 |
| 11.2.3 | resolution | The time span between individual points of information in a time series. | 30 s |
| 11.2.4 | alignment | An indicator whether timestamps in a time series are left, right or middle. | left |
| 11.2.5 | aggregationType | Indicates whether the values are a sum, average or current. | sum |
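
To show how the time series keys relate to each other, the following sketch derives the number of intervals from `start`, `end` and `resolution`, using the example values from the table. The unit lookup and the helper names are assumptions for this sketch; the standard only asks for a number and a unit separated by a space.

```python
from datetime import datetime, timedelta

timeseries = {
    "start": "2019-02-06T10:12:04+00:00",
    "end": "2019-02-07T10:12:04+00:00",
    "resolution": "30 s",
    "alignment": "left",
    "aggregationType": "sum",
}

# Assumed unit table for this sketch; extend it as needed.
UNIT_SECONDS = {"s": 1, "min": 60, "h": 3600, "d": 86400}


def parse_resolution(text):
    """Turn a string such as '30 s' into a timedelta."""
    value, unit = text.split()
    return timedelta(seconds=float(value) * UNIT_SECONDS[unit])


def count_intervals(series):
    """Number of resolution-sized steps between start and end."""
    start = datetime.fromisoformat(series["start"])
    end = datetime.fromisoformat(series["end"])
    return int((end - start) / parse_resolution(series["resolution"]))


intervals = count_intervals(timeseries)
print(intervals)  # one day at 30 s resolution: 2880
```

Whether the table holds one row per interval or one extra row for the closing timestamp depends on the data set; `alignment` tells a reader which edge of the interval each timestamp marks.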

Source Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 12 | sources | An array of objects with the used and underlying sources of the data and metadata. | |
| 12.1 | title | A human readable title of the source, a document title or organisation name. | IPCC Fifth Assessment Report |
| 12.2 | description | A free text description of the data set. | Scientific climate change report by the UN |
| 12.3 | path | A URL to the original source. | https://www.ipcc.ch/site/assets/uploads/2018/02/ipcc_wg3_ar5_full.pdf |
| 12.4 | licenses | An array of objects under which the source is provided. | |
| 12.4.1 | name | The SPDX identifier. | ODbL-1.0 |
| 12.4.2 | title | The official (human readable) title of the license. | Open Data Commons Open Database License 1.0 |
| 12.4.3 | path | A link to the license text. | https://opendatacommons.org/licenses/odbl/1-0/index.html |
| 12.4.4 | instruction | A short description of rights and restrictions. The use of tl;drLegal is recommended. | You are free to share and change, but you must attribute, and share derivations under the same license. See https://tldrlegal.com/license/odc-open-database-license-(odbl) for further information. |
| 12.4.5 | attribution | The copyright owner of the source. If attribution licenses are used, that name must be acknowledged. | © Intergovernmental Panel on Climate Change 2014 |

License Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 13 | licenses | An array of objects of the license(s) under which the described package is provided. It can depend on the licenses of the sources (copyleft or share-alike) or can be granted by the creator of the data. | |
| 13.1 | name | The SPDX identifier. | ODbL-1.0 |
| 13.2 | title | The official (human readable) title of the license. | Open Data Commons Open Database License 1.0 |
| 13.3 | path | A link to the license text. | https://opendatacommons.org/licenses/odbl/1-0/index.html |
| 13.4 | instruction | A short description of rights and restrictions. The use of tl;drLegal is recommended. | You are free to share and change, but you must attribute, and share derivations under the same license. See https://tldrlegal.com/license/odc-open-database-license-(odbl) for further information. |
| 13.5 | attribution | The copyright owner of the data. If attribution licenses are used, that name must be acknowledged. | © Reiner Lemoine Institut |
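
Because the same five license keys appear under both `sources` and `licenses`, a small completeness check can be shared between them. The sketch below is one possible way to do that; `missing_license_keys` is a hypothetical helper, and treating all five keys as required is an assumption of this sketch rather than a rule stated by the standard.

```python
LICENSE_KEYS = ("name", "title", "path", "instruction", "attribution")

licenses = [
    {
        "name": "ODbL-1.0",
        "title": "Open Data Commons Open Database License 1.0",
        "path": "https://opendatacommons.org/licenses/odbl/1-0/index.html",
        "instruction": (
            "You are free to share and change, but you must attribute, "
            "and share derivations under the same license."
        ),
        "attribution": "© Reiner Lemoine Institut",
    }
]


def missing_license_keys(entries):
    """Map the index of each license object to the keys it lacks or leaves empty."""
    report = {}
    for index, entry in enumerate(entries):
        missing = [key for key in LICENSE_KEYS if not entry.get(key)]
        if missing:
            report[index] = missing
    return report


print(missing_license_keys(licenses))
print(missing_license_keys([{"name": "CC0-1.0"}]))
```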

Provenance Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 14 | contributors | An array of objects of the people or organizations who contributed to the data or metadata. Each object refers to one contributor. Every contributor must have a title and property. The path, email, role and organization properties are optional. | |
| 14.1 | title | A name of the contributor. | Ludwig Hülk |
| 14.2 | email | An email address of the contributor or GitHub handle. | @Ludee |
| 14.3 | date | The date of the contribution. If the contribution took more than a day, use the date of the final contribution. Date format is ISO 8601. | 2016-06-16 |
| 14.4 | object | The target of the contribution. Which part of the package was supplied or changed. Can be the data, metadata or both (data and metadata). | data and metadata |
| 14.5 | comment | A free text commentary on what has been done. | Fixed a typo in the title. |

Resource Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 15 | resources | An array of objects of the data. It describes the data resource as an individual file or (database) table. | |
| 15.1 | profile | The profile of this descriptor according to the profiles specification. This information is retained in order to comply with the "Tabular Data Package" standard. Use "tabular-data-resource" for all tables. | tabular-data-resource |
| 15.2 | name | A name for the entire data package. The name must consist of only lowercase alphanumeric characters or underscore. It must not start with a number or underscore. In a database, this will be the name of the table within the schema containing it. The name can correspond to the file name (minus the file extension) of the data file describing the resource, if it complies with the naming convention above. The name also contains information about the schema on the OEP; use "." to separate the schema from the table name. | openstreetmap.osm_deu_line |
| 15.3 | path | A URL that should be a permanent http(s) address or other path directly linking to the resource. | https://openenergy-platform.org/dataedit/view/openstreetmap/osm_deu_line |
| 15.4 | format | The file extension. 'csv', 'xls', 'json' etc. would be expected to be the standard file extension for this type of resource. When you upload your data to the OEDB, in the shown metadata string, the format will be changed accordingly to 'PostgreSQL', since the data there are stored in a database. | PostgreSQL |
| 15.5 | encoding | Specifies the character encoding of the resource's data file. The values should be one of the "Preferred MIME Names" for a character encoding registered with IANA. If no value for this key is specified then the default is UTF-8. | UTF-8 |
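
The naming rule for `name` can be expressed as a regular expression. The following sketch is one reading of the rule: it assumes that the same restrictions apply to the schema part and the table part, and that at most one "." separates them. Neither assumption is spelled out in the key description.

```python
import re

# One lowercase letter, then lowercase letters, digits or underscores.
PART = r"[a-z][a-z0-9_]*"
# Optional "schema." prefix followed by the table name (assumed structure).
RESOURCE_NAME = re.compile(rf"(?:{PART}\.)?{PART}")


def is_valid_resource_name(name):
    """Check a resource name against the naming convention sketched above."""
    return RESOURCE_NAME.fullmatch(name) is not None


for candidate in ("openstreetmap.osm_deu_line", "osm_deu_line", "_osm", "1table", "OSM.line"):
    print(candidate, is_valid_resource_name(candidate))
```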

Resource Keys - Schema#

| # | Key | Description | Example |
|---|---|---|---|
| 15.6 | schema | An object that describes the structure of the present data. It contains all fields (columns of the table), the primary key and optional foreign keys. | |
| 15.6.1 | fields | An array of objects describing a column and providing name, description, type and unit. | |
| 15.6.1.1 | name | The name of the field. The name must consist of only lowercase alphanumeric characters or underscore. It must not start with a number or underscore. | year |
| 15.6.1.2 | description | A text describing the field. | Reference year for which the data were collected. |
| 15.6.1.3 | type | The data type of the field. In case of a geom column in a database, also indicate the shape and CRS. | geometry(Point, 4326) |
| 15.6.1.4 | unit | The unit, preferably SI-unit, that values in this field are mapped to. If 'unit' doesn't apply to a field, use 'null'. If the unit is given in a separate field, reference this field. | MW |
| 15.6.1.5 | isAbout | An array of objects that describe the field in OEO terms. | |
| 15.6.1.5.1 | name | The class label of the OEO terms. | wind energy converting unit |
| 15.6.1.5.2 | path | The URI of the class. | https://openenergy-platform.org/ontology/oeo/OEO_00000044 |
| 15.6.1.6 | valueReference | An array of objects for an extended description of the values in the column in OEO terms. | |
| 15.6.1.6.1 | value | The name of the value in the column. | onshore |
| 15.6.1.6.2 | name | The class label of the OEO terms. | onshore wind farm |
| 15.6.1.6.3 | path | The URI of the class. | https://openenergy-platform.org/ontology/oeo/OEO_00000311 |
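
The difference between `isAbout` (annotating the column as a whole) and `valueReference` (annotating individual cell values) is easier to see in a concrete field object. In the sketch below, the column name `technology`, its description and its type are invented for illustration, and `resolve_value` is a hypothetical helper; the OEO labels and URIs are the examples from the table.

```python
field = {
    "name": "technology",  # hypothetical column name
    "description": "Technology of the unit.",
    "type": "text",
    "unit": None,
    # Annotation of the column as a whole.
    "isAbout": [
        {
            "name": "wind energy converting unit",
            "path": "https://openenergy-platform.org/ontology/oeo/OEO_00000044",
        }
    ],
    # Annotation of individual values that occur in the column.
    "valueReference": [
        {
            "value": "onshore",
            "name": "onshore wind farm",
            "path": "https://openenergy-platform.org/ontology/oeo/OEO_00000311",
        }
    ],
}


def resolve_value(field, raw_value):
    """Return the OEO annotation for a cell value, or None if it is not annotated."""
    for reference in field.get("valueReference") or []:
        if reference["value"] == raw_value:
            return {"name": reference["name"], "path": reference["path"]}
    return None


print(resolve_value(field, "onshore"))
print(resolve_value(field, "offshore"))
```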

Resource Keys - Properties#

| # | Key | Description | Example |
|---|---|---|---|
| 15.6.2 | primaryKey | A primary key is a field or set of fields that uniquely identifies each row in the table. It is recorded as an array, since it is possible to define the primary key as made up of several columns. | id |
| 15.6.3 | foreignKeys | A foreign key is a field that refers to a column in another table. | |
| 15.6.3.1 | fields | The column in the table that is constrained by the foreign key. | version |
| 15.6.3.2 | reference | The reference to the foreign table. | |
| 15.6.3.2.1 | resource | The foreign resource (table). | schema.table |
| 15.6.3.2.2 | fields | The foreign resource column. | version |
| 15.7 | dialect | Object. A CSV Dialect defines a simple format to describe the various dialects of CSV files in a language agnostic manner. In case of a database, the values in the containing fields are 'null'. | |
| 15.7.1 | delimiter | The delimiter specifies the character sequence which should separate fields (columns). Common characters are "," (comma), "." (point) and "\t" (tab). | , |
| 15.7.2 | decimalSeparator | A symbol used to separate the integer part from the fractional part of a number written in decimal form. Depending on language and region this symbol can be "." or ",". | . |
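
A reader of a csv resource would typically combine `dialect`, `fields` and `primaryKey`. The sketch below parses a small in-memory csv using a semicolon delimiter and a decimal comma (deliberately different from the example values in the table) and then checks that the primary key is unique. The sample data, the type names and both helper functions are assumptions made for illustration.

```python
import csv
import io

dialect = {"delimiter": ";", "decimalSeparator": ","}
schema = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "capacity", "type": "float"},
    ],
    "primaryKey": ["id"],
}
raw = "id;capacity\n1;3,2\n2;10,75\n"  # made-up sample data


def read_rows(text, schema, dialect):
    """Parse csv text into typed rows using the delimiter and decimal separator."""
    reader = csv.DictReader(io.StringIO(text), delimiter=dialect["delimiter"])
    rows = []
    for record in reader:
        row = {}
        for field in schema["fields"]:
            cell = record[field["name"]]
            if field["type"] == "integer":
                row[field["name"]] = int(cell)
            elif field["type"] == "float":
                # Normalise the decimal separator before converting.
                row[field["name"]] = float(cell.replace(dialect["decimalSeparator"], "."))
            else:
                row[field["name"]] = cell
        rows.append(row)
    return rows


def primary_key_is_unique(rows, schema):
    """True if no two rows share the same primary key value(s)."""
    keys = [tuple(row[name] for name in schema["primaryKey"]) for row in rows]
    return len(keys) == len(set(keys))


rows = read_rows(raw, schema, dialect)
print(rows)
print(primary_key_is_unique(rows, schema))
```

The primary key is handled as a tuple so that the same check works when it spans several columns.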

Review Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 16 | @id | A Uniform Resource Identifier (URI) that links the resource via the DBpedia Databus. | https://databus.dbpedia.org/kurzum/mastr/bnetza-mastr/01.04.00 |
| 17 | @context | Explanation of metadata keys in ontology terms. | https://github.com/OpenEnergyPlatform/oemetadata/blob/master/metadata/latest/context.json |
| 18 | review | Data uploaded through the OEP will go through a review process. The review will cover the data and metadata. It is done by the OEP community. See the OEP Data Review for detailed information. The review itself is documented at the specified path and a badge is awarded with regard to completeness. | |
| 18.1 | path | A URL that should be a permanent http(s) address directly linking to the documented review. | https://www.example.com |
| 18.2 | badge | A badge of either Bronze, Silver, Gold or Platinum is used to label the given data and metadata based on its quality. | Platinum |

MetaMetadata Keys#

| # | Key | Description | Example |
|---|---|---|---|
| 19 | metaMetadata | An object that describes the metadata themselves, their format, version and license. These fields should already be provided when you are filling out your metadata. | |
| 19.1 | metadataVersion | The type and version number of the metadata. | OEP-1.5.2 |
| 19.2 | metadataLicense | The license of the provided metadata. | |
| 19.2.1 | name | The SPDX identifier. | CC0-1.0 |
| 19.2.2 | title | The official (human readable) title of the license. | Creative Commons Zero v1.0 Universal |
| 19.2.3 | path | A link to the license text. | https://creativecommons.org/publicdomain/zero/1.0/ |

Comments#

| # | Key | Description | Example |
|---|---|---|---|
| 20 | _comment | An array of objects. This section is used as a self-description of the final metadata file. It is text, intended for humans and includes a link to the metadata documentation, required value formats and similar remarks. | |
| 20.1 | metadata | Reference to the metadata documentation in use. | Metadata documentation and explanation (https://github.com/OpenEnergyPlatform/oemetadata) |
| 20.2 | dates | Comment on date and time format. | Dates and time must follow the ISO8601 including time zone (YYYY-MM-DD or YYYY-MM-DDThh:mm:ss±hh) |
| 20.3 | units | Comment on units. | Use a space between numbers and units (100 m) |
| 20.4 | languages | Comment on language format. | Languages must follow the IETF (BCP47) format (en-GB, en-US, de-DE) |
| 20.5 | licenses | Comment on license format. | License name must follow the SPDX License List (https://spdx.org/licenses/) |
| 20.6 | review | Reference to review documentation. | Following the OEP Data Review (https://github.com/OpenEnergyPlatform/data-preprocessing/blob/master/data-review/manual/review_manual.md) |
| 20.7 | null | Comment on fields that don't apply. | If not applicable use: null |
| 20.8 | todo | Comment on fields that are not yet available and will be inserted later on. | If a value is not yet available, use: todo |
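
Since `todo` marks values that still have to be filled in, it can be handy to list them before asking for a review. The following sketch walks a nested metadata structure and reports where the marker occurs. The helper name, the dotted path notation and the sample metadata are assumptions of this sketch.

```python
def find_placeholders(node, marker="todo", path=""):
    """Yield the path of every value equal to `marker` in nested metadata."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, marker, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, marker, f"{path}[{index}]")
    elif node == marker:
        yield path


# Made-up, incomplete metadata for illustration.
metadata = {
    "name": "example_table",
    "publicationDate": "todo",
    "context": {"grantNo": None, "contact": "todo"},
    "keywords": ["example", "todo"],
}

open_items = list(find_placeholders(metadata))
print(open_items)
```

Values set to `null` (`None` in Python) are deliberately not reported, because they state that a key does not apply rather than that it is still missing.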