Research Data Management (RDM): Metadata

UCT Libraries Research Data Services provide guidance and support for all aspects of the data lifecycle, from planning your data management strategy during the proposal phase through preserving your data at the conclusion of your project.

What is Metadata?

Metadata is commonly described as "data about data." While easy to remember, this definition is far too vague to be useful. The definitions below provide better explanations in plain English. 

Definition from the National Information Standards Organization (NISO)
"Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource." 

Definition from Steven Miller, Information and Metadata Lecturer
“Extra baggage associated with any resource that enables a real or potential user to find that resource and to determine value…”

Definition from Karen Coyle, Digital Librarian and Author of Coyle's InFormation
“Metadata is constructed, constructive, and actionable.”

  • Constructed - a man-made artifice, not naturally occurring
  • Constructive - serving a useful purpose, to solve some problem
  • Actionable - can be acted upon, processed by humans and machines

Metadata Tools

Dublin Core Metadata Generator

The Dublin Core Metadata Generator lets you generate Dublin Core metadata code for your data, using either the basic or the advanced option, each offering a range of elements.
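
As a rough illustration of the kind of code such a generator produces, the Python sketch below renders a few Dublin Core elements as HTML meta tags, following the common "DC." naming convention. The function name `dublin_core_meta` and all sample values are hypothetical; this is not the generator's own code.

```python
from html import escape

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags (hypothetical helper)."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        # Every element is repeatable, so accept one value or a list of values.
        if isinstance(values, str):
            values = [values]
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Sample values are invented for illustration only.
example = {
    "title": "Rainfall measurements, 2020-2022",
    "creator": ["A. Researcher", "B. Colleague"],
    "date": "2023-01-15",
    "rights": "CC BY 4.0",
}
print(dublin_core_meta(example))
```

Rejecting unknown element names is a design choice made here to catch typos early; a real generator may be more permissive.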

Annotare

Annotare is forms-based software for annotating biomedical investigations and resulting data. It supports biomedical ontologies, contains standard templates for common experimental types, and includes a design wizard for creating your own forms.

CEDAR Workbench

CEDAR Workbench is an open source tool to manage metadata, using rigorous semantic principles if desired. It allows users to specify templates using a UI (like survey forms in Google Forms or Survey Monkey), then to fill out those forms efficiently using drop-down menus, help tips, and intelligent suggestions. Templates and metadata can be shared with other users and groups. Metadata also can be downloaded in JSON-LD, simple JSON, or RDF, or exported to connected repositories, which can be integrated using the full API suite.

ISA Creator

ISA Creator is an open source, stand-alone application that assists with planning and describing experiments and facilitates export and import of data directly to and from some public repositories. Additional tools in the ISA-Tools software suite parse ISA-Tab into R data structures, and Perl and Python parsers for ISA-Tab are also available. ISA-Tab is the required format for publishing data in Nature Publishing's Scientific Data journal. This software creates separate descriptive files for your experimental files.

Morpho

Morpho allows you to describe ecological experiments and to create a catalog of data and descriptions that you can query. It includes an interface to the Knowledge Network for Biocomplexity (KNB) for sharing, querying, viewing, and retrieving data. 

OMERO

OMERO is repository software for importing, viewing, organizing, describing, analyzing, and sharing microscopy images from anywhere you have Internet access. It includes the ability to create user groups with different permissions for sharing data.

OntoMaton

OntoMaton provides ontology searching and automated tagging via NCBO's Bioportal of biomedical ontologies within Google spreadsheets. OntoMaton is part of the ISA-Tools suite. Annotations are generated within your tabular data file.

RightField

RightField is an open source tool that allows searching and selecting of ontology terms from within Microsoft Excel. RightField allows you to assign a pre-determined list of options to a particular cell within the spreadsheet. All annotations are embedded within the spreadsheet. You can select from the NCBO's BioPortal ontologies or import an ontology from a URL or your local machine.

Source: Stanford Libraries - Data Management Services

Metadata Type

Each metadata type is listed below with its example properties and primary uses.

Descriptive metadata
Common fields which help users to discover online sources through searching and browsing
  • Example properties: Title, Author, Subject, Genre, Publication date
  • Primary uses: Discovery, Display, Interoperability

Technical metadata
Fields which describe the information required to access the data
  • Example properties: File type, File size, Creation date/time, Compression scheme
  • Primary uses: Interoperability, Digital object management, Preservation

Administrative metadata - Preservation
Fields that facilitate the management of resources
  • Example properties: Checksum, Preservation event
  • Primary uses: Interoperability, Digital object management, Preservation

Administrative metadata - Rights
Fields which deal with intellectual property rights
  • Example properties: Copyright status, License terms, Rights holder
  • Primary uses: Interoperability, Digital object management

Structural metadata
Fields which describe how different components of a set of associated data relate to one another
  • Example properties: Sequence, Place in hierarchy
  • Primary uses: Navigation

Markup languages
Languages which integrate metadata and flags for other structural or semantic features within content
  • Example properties: Paragraph, Heading, List, Name, Date
  • Primary uses: Navigation, Interoperability

Source: National Information Standards Organization (NISO), United States; Research Data Management LibGuide, The University of Queensland
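
Several of the technical and preservation properties listed above (file type, file size, date/time, compression scheme, checksum) can be read directly from a file. The Python sketch below shows one possible way to collect them using only the standard library. The function name `technical_metadata` and the output field names are hypothetical. It records the last-modified time rather than the creation time, because creation time is not available consistently across operating systems.

```python
import hashlib
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path):
    """Collect a few technical and preservation fields for one file (hypothetical helper)."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    stat = os.stat(path)
    # The guess is based on the file name only, not on the file's contents.
    file_type, compression = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "file_type": file_type,
        "compression_scheme": compression,
        "file_size_bytes": stat.st_size,
        "last_modified": modified.isoformat(),
        "checksum_sha256": sha256.hexdigest(),
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file with placeholder contents.
    with tempfile.TemporaryDirectory() as folder:
        sample = os.path.join(folder, "observations.csv.gz")
        with open(sample, "wb") as handle:
            handle.write(b"placeholder bytes")
        for field, value in technical_metadata(sample).items():
            print(f"{field}: {value}")
```

Recomputing the checksum later and comparing it with the stored value is a simple way to detect whether a file has changed or become corrupted.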

README File

A readme file provides information about a data file and is intended to help ensure that the data can be correctly interpreted, by yourself at a later date or by others when sharing or publishing data. Standards-based metadata is generally preferable, but where no appropriate standard exists, for internal use, writing “readme” style metadata is an appropriate strategy.

Here are some best practices for creating comprehensive README files.

  • Create a separate README file for each individual data file or a single README file for the dataset as a whole
  • Write your README document as a plain text file
  • Name your README file as "readme.xxx"

Here are some recommended contents for the README files of your research data. The table is adapted from Guide to writing "readme" style metadata, Cornell University Research Data Management Service Group and README guidance from Dryad.

General information
  • Title of the dataset
  • Names and contact information (e.g. PI, contributors, contact persons)
  • Date/Date range of data collection
  • Geographic location of data collection
  • Keywords
  • Language
  • Funding information
Data and file overview
  • Description of the file structure and relationship between data files
  • Short description of each data file and its relationship to the contents (e.g. tables, figures) of the related publications
  • Date that the file was created/updated (if any)
  • File format
  • Information specific to the particular data file
Sharing and access information
  • Licenses/Restriction information
  • Related publications/datasets (URLs)
  • Recommended citation
Methodological information
  • Methods for data collection/generation
  • Data processing steps
  • Any instrument-specific information needed to understand or interpret the data, e.g., details of any particular operating system/software required to make use of the data
  • Quality-assurance procedures (if applicable)
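
If you prepare README files for many datasets, a short script can generate an empty skeleton with the recommended sections above. The Python sketch below is one possible approach. The names `README_SECTIONS`, `readme_skeleton`, and `write_readme` are hypothetical, and the field labels are shortened from the list above.

```python
# Section headings and field labels, shortened from the recommended contents.
README_SECTIONS = {
    "General information": [
        "Title of the dataset",
        "Names and contact information",
        "Date or date range of data collection",
        "Geographic location of data collection",
        "Keywords",
        "Language",
        "Funding information",
    ],
    "Data and file overview": [
        "File structure and relationship between data files",
        "Short description of each data file",
        "Date each file was created or updated",
        "File format",
    ],
    "Sharing and access information": [
        "Licenses or restrictions",
        "Related publications or datasets (URLs)",
        "Recommended citation",
    ],
    "Methodological information": [
        "Methods for data collection or generation",
        "Data processing steps",
        "Instrument- or software-specific information",
        "Quality-assurance procedures",
    ],
}


def readme_skeleton(sections=README_SECTIONS):
    """Return a plain-text README skeleton with one empty field per line."""
    lines = []
    for heading, fields in sections.items():
        lines.append(heading.upper())
        lines.append("-" * len(heading))
        for field in fields:
            lines.append(f"{field}: ")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


def write_readme(path="readme.txt"):
    """Write the skeleton to a plain-text file, as the best practices suggest."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(readme_skeleton())


print(readme_skeleton())
```

The skeleton only supplies the headings; the content of each field still has to be written by someone who knows the data.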

You can adapt the README File Template below as appropriate. The template is adapted from Guide to Writing "readme" Style Metadata, Comprehensive Data Management Planning & Services, Cornell University.

README File Template