
Validate Metadata - Datapackage¶
In [ ]:
__copyright__ = "Reiner Lemoine Institut"
__license__ = "GNU Affero General Public License Version 3 (AGPL-3.0)"
__url__ = "https://www.gnu.org/licenses/agpl-3.0.html"
__author__ = "christian-rli, Ludee"
In [ ]:
from datapackage import Package
import pprint as pp
Instructions¶
Frictionless Data offers a Python package, datapackage-py, to work with data packages and validate the metadata string.
- Save the metadata string as a .json file in the same folder as this notebook
- Load it with [Package('string')] and validate it with [dp.valid]
- If the validation fails, the error description [dp.errors] is printed
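The first step can be sketched with the standard library alone. The metadata string, the helper name `save_metadata` and the default file name are placeholders chosen for illustration; parsing the string before writing it catches malformed JSON early.

```python
import json
from pathlib import Path

# Hypothetical metadata string; replace it with the real OEP metadata.
metadata_string = (
    '{"name": "example_table", '
    '"resources": [{"name": "example_table", "path": "example_table.csv"}]}'
)


def save_metadata(metadata: str, filename: str = "oep_metadata_example.json") -> Path:
    """Parse the metadata string and write it as a formatted JSON file."""
    # json.loads raises json.JSONDecodeError if the string is not valid JSON.
    descriptor = json.loads(metadata)
    if not isinstance(descriptor, dict):
        raise ValueError("A Data Package descriptor must be a JSON object.")
    path = Path(filename)
    path.write_text(
        json.dumps(descriptor, indent=4, ensure_ascii=False),
        encoding="utf-8",
    )
    return path
```

Calling `save_metadata(metadata_string)` writes the file into the current working directory, where `Package('oep_metadata_example.json')` can then load it.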
Datapackage Requirements¶
Taken from https://frictionlessdata.io/specs/data-package/ and https://frictionlessdata.io/specs/data-resource/.
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.
- [FILE] A Data Package descriptor MUST be a valid JSON object. (JSON is defined in RFC 4627). When available as a file it MUST be named datapackage.json and it MUST be placed in the top-level directory (relative to any other resources provided as part of the data package).
- [resources] The descriptor MUST contain a resources property describing the data resources. The resources property is required, with at least one resource. Packaged data resources are described in the resources property of the package descriptor. This property MUST be an array of objects. Each object MUST follow the Data Resource specification.
- [name] A short url-usable (and preferably human-readable) name of the package. This MUST be lower-case and contain only alphanumeric characters along with ".", "_" or "-" characters. It will function as a unique identifier and therefore SHOULD be unique in relation to any registry in which this package will be deposited (and preferably globally unique).
- [licenses] MUST be an array. Each item in the array is a license. Each MUST be an object. The object MUST contain a name property and/or a path property. It MAY contain a title property.
  - [name]: The name MUST be an Open Definition license ID
  - [path]: A url-or-path string, that is a fully qualified HTTP address, or a relative POSIX path
- [contributors] The people or organizations who contributed to this Data Package. It MUST be an array. Each entry is a Contributor and MUST be an object. A Contributor MUST have a title property and MAY contain path, email, role and organization properties.
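To make the requirements above concrete, the following is a minimal sketch of how they could be checked by hand on an already parsed descriptor. The function `check_descriptor` and its messages are hypothetical and cover only the rules listed here; it is not how datapackage-py validates, and it does not replace `dp.valid`. It assumes that name, licenses and contributors are optional and only checks them when present.

```python
import re

# Characters allowed in the package name according to the [name] rule above.
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")


def check_descriptor(descriptor):
    """Return a list of problems found; an empty list means none were found."""
    if not isinstance(descriptor, dict):
        return ["descriptor must be a JSON object"]
    problems = []

    # [resources] is required and must be a non-empty array of objects.
    resources = descriptor.get("resources")
    if not isinstance(resources, list) or not resources:
        problems.append("resources must be a non-empty array")
    elif not all(isinstance(r, dict) for r in resources):
        problems.append("each resource must be an object")

    # [name] is only checked when present (assumption: it is optional).
    name = descriptor.get("name")
    if name is not None:
        if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
            problems.append("name must be lower-case alphanumeric plus '.', '_' or '-'")

    # [licenses] must be an array of objects with a name and/or a path.
    licenses = descriptor.get("licenses")
    if licenses is not None:
        if not isinstance(licenses, list):
            problems.append("licenses must be an array")
        else:
            for lic in licenses:
                if not isinstance(lic, dict) or not ("name" in lic or "path" in lic):
                    problems.append("each license must be an object with name and/or path")

    # [contributors] must be an array of objects that each have a title.
    contributors = descriptor.get("contributors")
    if contributors is not None:
        if not isinstance(contributors, list):
            problems.append("contributors must be an array")
        else:
            for person in contributors:
                if not isinstance(person, dict) or "title" not in person:
                    problems.append("each contributor must be an object with a title")

    return problems
```

A check like this is useful mainly for reading the rules side by side with a descriptor; the Open Definition license ID and the url-or-path format are not verified here.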
OEP metadata v1.4¶
In [ ]:
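# Load oep_metadata_example.json and validate it as a Data Package
try:
    dp = Package('oep_metadata_example.json')
    if dp.valid:
        print('Metadata is a valid DataPackage!')
    else:
        # dp.errors holds the validation errors found in the descriptor
        print(dp.errors)
except Exception:
    # Raised when the file is missing or cannot be parsed as JSON
    print('No valid JSON file!')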
In [ ]:
# print JSON
dp = Package('oep_metadata_example.json')
pp.pprint(dp.descriptor)
In [ ]:
dp = Package('oep_metadata_example.json')