Data Documentation (AKA Metadata)

Metadata: why does it matter?

Data is not self-describing.

Metadata, or "data about data," explains your dataset and lets you document the information needed for:

      • Finding the data later
      • Understanding what the data is later
      • Sharing the data (both with collaborators and future secondary data users)

Consider it an investment of time that will save you several times as much trouble later.


Metadata standards


Examples:

  • FGDC (Federal Geographic Data Committee)
  • DDI (Data Documentation Initiative)
  • Dublin Core
  • Darwin Core
  • ABCD (Access to Biological Collections Data)
  • AVMS (Astronomy Visualization Metadata Standard)
  • CSDGM (Content Standard for Digital Geospatial Metadata)


Advantages:

  • Ensure you have a complete, standard set of information about each part of your data
  • Enable your dataset to be organized with other datasets 
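
As a rough illustration of the first advantage, the sketch below checks a record against the 15 elements of the Dublin Core Metadata Element Set, one of the standards listed above. The function name `missing_elements` and the sample record are hypothetical. Dublin Core itself treats every element as optional, so this is a project-level completeness check, not a validity rule from the standard.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def missing_elements(record):
    """Return the Dublin Core elements that are absent or blank in record.

    Hypothetical helper: keys are compared case-insensitively, and a value
    that is empty or only whitespace counts as missing.
    """
    present = {
        key.lower() for key, value in record.items() if str(value).strip()
    }
    return sorted(DUBLIN_CORE_ELEMENTS - present)


# Hypothetical record used only to exercise the check.
example_record = {"Creator": "A. Researcher", "Title": "Example dataset", "Date": ""}
missing = missing_elements(example_record)
print(missing)
```

Running a check like this before sharing a dataset is one way to spot fields that were never filled in.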


Metadata

Do what works for you!

Document and describe your data in whatever way works for you.

"Good enough" documentation is better than none.


Metadata: our case study


Possible metadata options:

1. Dublin Core

  • General metadata standard
  • Widely applicable
  • Used in many different repositories

2. Darwin Core

  • For biological diversity
  • Emphasizes taxonomy, which I don't care about
  • Frequently used in biodiversity databases 



Metadata: our case study

Our directory: sam_monarch_wing_20150415

Metadata for this directory:

  • Creator: Katherine McNeill
  • Subject: monarch butterfly wing
  • Description: This directory contains Sashimi ESEM images of a monarch butterfly wing. I took them after finding the butterfly floating by the Charles River near MIT.
  • Contributor: Mark Clemente helped me with these images
  • Date: 20150415
  • Original Format: Sashimi Microscope format (.sam)
  • Relation: this is a directory that will contain multiple files
  • Type: image
  • Coverage: By the Charles River in Cambridge, MA, MIT side
  • Rights: Monarch Butterfly Research Foundation (funder) owns the data (grant number: 00213)
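
A few of the directory-level fields above could be written to an XML file along the following lines. This is a minimal sketch, not a prescribed format: the wrapper element `metadata` and the function name `to_dublin_core_xml` are assumptions made for illustration, while the namespace URI is the one published for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(record):
    """Serialize a dict of Dublin Core elements to an XML string.

    The root element name "metadata" is a hypothetical wrapper chosen
    for this example; repositories typically define their own.
    """
    root = ET.Element("metadata")
    for element, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# A subset of the directory metadata from the case study.
record = {
    "creator": "Katherine McNeill",
    "subject": "monarch butterfly wing",
    "contributor": "Mark Clemente",
    "type": "image",
}
xml_text = to_dublin_core_xml(record)
print(xml_text)
```

Saving the resulting string in a file next to the images keeps the description with the data it describes.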


Metadata: our case study

Metadata for this image:

  • Title: sam_monarch_wing_20150415_CM_001.tif
  • Source: abcdefghijklmnopqrstuvwxyz.sam
  • Relation: is a file in the directory sam_monarch_wing_20150415
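
A structured filename like the title above can also be read back by a script. The sketch below assumes the name follows the pattern format_subject_date_initials_sequence.extension. That reading is a guess made for illustration, since the slides do not spell out the naming convention, and the function name `parse_filename` is hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: <format>_<subject>_<YYYYMMDD>_<INITIALS>_<sequence>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<source_format>[a-z]+)_"
    r"(?P<subject>[a-z_]+)_"
    r"(?P<date>\d{8})_"
    r"(?P<initials>[A-Z]+)_"
    r"(?P<sequence>\d+)\."
    r"(?P<extension>\w+)$"
)


def parse_filename(filename):
    """Split a filename into metadata fields under the assumed pattern."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the assumed pattern: {filename}")
    fields = match.groupdict()
    # Normalize the compact date to ISO 8601 (YYYY-MM-DD).
    fields["date"] = datetime.strptime(fields["date"], "%Y%m%d").date().isoformat()
    return fields


parsed = parse_filename("sam_monarch_wing_20150415_CM_001.tif")
print(parsed)
```

Names that do not fit the pattern raise an error, which makes inconsistently named files easy to find.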


Metadata: capturing it

  • In a filename
  • In a readme file
  • In a spreadsheet
  • In an XML file
  • In a database
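
For the spreadsheet option, one possible approach is a CSV file with one row per data file and one column per metadata field. The sketch below uses the file-level fields from the case study. The column choice and the function name `records_to_csv` are assumptions for illustration, and the text is built in memory here where a real project would write it to a file.

```python
import csv
import io

# Columns for the metadata spreadsheet (an illustrative choice).
FIELDS = ["title", "source", "relation"]


def records_to_csv(records):
    """Render a list of metadata dicts as CSV text, one row per file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [
    {
        "title": "sam_monarch_wing_20150415_CM_001.tif",
        "source": "abcdefghijklmnopqrstuvwxyz.sam",
        "relation": "sam_monarch_wing_20150415",
    },
]
csv_text = records_to_csv(records)
print(csv_text)
```

A CSV file opens in any spreadsheet program, so collaborators can read and extend it without special tools.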

When choosing how to capture metadata, consider:

  • Expertise at your disposal
  • Complexity of your project
  • Collaborators
  • Your own comfort level