Study the Data Management Plans section of the US Geological Survey. Notice the templates and examples that are provided. How would you evaluate these data management plans? How could you apply one of the templates to develop your own DMP for a situation from your professional experience?
Now that you can describe and explain DMPs, let's cover a few careers within the data management field.
Science Center Data Management Strategy
Appendix
A. USGS Data Lifecycle Model
B. Definitions
- Approved Data - Those data that have USGS approval for release.
- Approved USGS Operational Database – An online database, such as NWIS, that is approved for release of USGS data. These databases are under the care of data managers who assure the quality, integrity, and preservation of the data and provide appropriate metadata.
- Data - Observations or measurements (unprocessed or processed) represented as text, numbers, or multimedia.
- Data Lifecycle - The USGS Science Data Lifecycle Model is a structure of data management activities that relate to research project workflows, from conception through preservation and sharing. This structure is used to ensure that USGS data products will be well-described, preserved, accessible, and fit for re-use. For more information see USGS OFR 2013-1265 and www.usgs.gov/datamanagement/.
- Data Management Plan (DMP) - A structured document that is submitted with a project proposal to summarize intentions and necessary resources for data management, then updated throughout the data lifecycle to serve as an official record of the data collected and how it has been managed.
- Dataset - A structured collection of data.
- Database - Datasets and other items stored together to serve one or more purposes or applications, often including data query or search and retrieval capabilities.
- Fundamental Science Practices (FSP) – The set of USGS policies that govern the management and release of data as well as scientific publications. These Chapters of the Survey Manual can be found at www.usgs.gov/fsp/ and are enforced by the Office of Science Quality and Integrity.
- Metadata - A structured, machine-readable file that provides basic information about data (who, what, when, where, why, and how) that is essential to promote scientific collaboration; enable discovery, interpretation, and effective use of the data; and document its nature and quality. Current Approved Standards: FGDC Content Standard for Digital Geospatial Metadata or the International Organization for Standardization (ISO). Extensions to the standards exist, and any FGDC- or ISO-approved profiles or extensions that apply must be used.
- Provisional Data - USGS data, such as real-time data or preliminary measurements, that are permitted to be released prior to approval to meet an immediate need, with the stipulation that they are subject to revision. For more information about restrictions on release of preliminary data, see http://internal.usgs.gov/fsp/toolbox/provisional_data_information_release.html
- Source Data - Primary or Secondary data used as input to produce products. Primary data is data measured or observed by the researcher, and is in a basic form that has been calibrated, converted to standard units, and has passed quality control procedures that remove or flag incorrect data. Secondary data is defined as data collected by someone other than the user.
- USGS Data Portal - A USGS maintained data storage system that can ensure the long-term preservation, discoverability, accessibility, and usability of USGS data that is released to the public.
- USGS Trusted Digital Repository - A USGS storage system that meets the standards at https://my.usgs.gov/confluence/display/cdi/Trusted+Digital+Repository.
C. Roles
- Generic Classifications:
- Approving Officials - Science Center Directors (or their designees) and Bureau Approving Officials in the OSQI, who collaborate with authors, mission area managers, and others as needed regarding review and approval of scientific data. They have latitude in determining what is needed to uphold USGS standards for data quality, including ensuring the necessary reviews are obtained and the method of release is appropriate.
- Data Management Staff - The assigned or designated individuals, teams, or organizations that are responsible for stewarding scientific data through the release process using designated tools for creation of metadata and Digital Object Identifiers (DOIs) and USGS data portals. They collaborate with their mission area Science Center Directors, managers, supervisors, and scientists in the conduct of their data stewardship activities and interact with USGS data portals and other technical infrastructure for preservation of data.
- IT Staff - The assigned or designated individuals, teams, or organizations that are responsible for maintaining website servers, USGS data portals, and other technical infrastructure for access, discovery, and preservation of data.
- Center Level Managers - The assigned or designated individuals who oversee project operations and are responsible for understanding data management requirements and providing projects with guidance to ensure compliance.
- Data Producers - USGS scientists and authors who ensure that data are in a non-proprietary, publicly available format and that sufficient metadata records and Digital Object Identifiers (DOIs) are created for each data, software, and other information product they produce, in accordance with requirements. This includes ensuring that the appropriate metadata review, peer review, editorial review, and approval are obtained for the products they produce.
- Information Reviewers - The assigned or designated individuals, teams, or organizations with the skills necessary to accurately review data and metadata for products produced by USGS scientists and authors.
- Specific Classifications:
- Branch, Project, or Section Chiefs and Supervisors - Persons who oversee projects within a center.
- Bureau Approving Official (BAO) – A person who works for the Office of Science Quality and Integrity and is responsible for ensuring that our science center complies with FSP policies and for approval of publications that contain new interpretive content.
- Data Manager - A person who coordinates data governance and data stewardship activities, oversees data management projects, and supervises data management activities.
- Data Quality Specialist - A person who can review data for publications.
- Data Steward - A person knowledgeable in a particular area or topic who is assigned accountability for data specifications and data quality for a specific project or dataset.
- Database Administrator – For NWIS and other USGS database applications, an IT professional responsible for the installation, configuration, upgrade, administration, monitoring, maintenance, security, and backup of databases.
- DOI Manager - A person who creates and updates Digital Object Identifier numbers.
- IPDS Manager - A person who creates and monitors IPDS records. Answers publications process questions and inquiries. Guides scientists through IPDS routing and requirements.
- Metadata Specialist - A person who can provide metadata training and review metadata for publications. This requires running metadata validation software and knowledge of XML file structure.
- PI/Project Chief/Researcher - A person responsible for a project and its resulting publications. These people are identified in BASIS+ workplans.
- Science Center Director - The science center director is responsible for approving the release of data that are not considered new interpretive, and for determining which data releases and publications must be approved by the BAO. This staff member can also be responsible for overall planning and management of research activities at the science center.
- Science Center Web Staff - A person responsible for project pages and data release on web pages.
- USGS Data Portal Manager - A person who manages dataset organization and permissions of a data portal, oversees the process that ensures the quality of data added to the database or service, routine data reviews, and documentation of methods and procedures.
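The Metadata Specialist role above mentions running metadata validation software and knowing XML file structure. As a rough illustration only, the sketch below checks that a record is well-formed XML and that a few top-level sections are present. It is not the USGS validation tooling. The root element name `metadata` and the section names `idinfo` and `metainfo` are assumptions based on the FGDC CSDGM layout, and the function name `check_metadata_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the record uses a <metadata> root and must carry these
# top-level sections; adjust the tuple to the standard or profile in use.
REQUIRED_SECTIONS = ("idinfo", "metainfo")


def check_metadata_xml(xml_text):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not well-formed: nothing else can be checked.
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != "metadata":
        problems.append(f"unexpected root element: {root.tag}")
    for section in REQUIRED_SECTIONS:
        # find() only looks at direct children of the root element.
        if root.find(section) is None:
            problems.append(f"missing section: {section}")
    return problems


sample = (
    "<metadata><idinfo><descript><abstract>Example</abstract>"
    "</descript></idinfo></metadata>"
)
print(check_metadata_xml(sample))  # ['missing section: metainfo']
```

A check like this only catches structural gaps. It does not replace a metadata review against the approved standard.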
D. Public Release of Data Packages
The section below describes the classification of data releases and associated requirements. There is nothing to fill out below; the Science Center is only required to acknowledge these requirements.
- Approved (Data Release, Models, Software, Code)
- All data and models, software, or code intended for public release must meet USGS FSP review, approval, and release requirements. These requirements are a minimum of two reviews that include one data review and one metadata review followed by Bureau approval documented in IPDS in addition to any related written publication reviews and approval.
- Special Considerations:
- "New Interpretive" information requires BAO approval.
- Software, models, and code products do not require FGDC XML metadata files. Meta information should be documented within the code.
- Must include appropriate disclaimer. (https://www2.usgs.gov/fsp/fsp_disclaimers.asp#1)
DMP | Metadata | Access
YES | YES | USGS Data Portal
- Provisional
- Emergency and non-emergency provisional data are those data (such as real-time data, preliminary measurements) that are subject to revision, and may be released prior to approval to meet an immediate need.
- Special Considerations:
- Must include appropriate disclaimer: https://www2.usgs.gov/fsp/fsp_disclaimers.asp#11
- Emergency provisional data do not require finalized metadata.
DMP | Metadata | Access
YES | YES | Personal communication, websites
- USGS Operational Database Collections
- Data collections that are part of USGS supported database operations (e.g., NWIS, Borehole LogArchiver, and Biodata).
- Special Considerations:
- Follow other Science Center and database SOPs as authoritative procedural documentation and methods of metadata creation.
DMP | Metadata | Access
NO | NO | USGS Operational Database
- Unpublished USGS Operational Database Parameters
- Data that are collected and entered into a Bureau database, but not approved for release through the database because of conflict with USGS science community data standards.
- Special Considerations:
- Follow other Science Center and database SOPs as authoritative procedural documentation in addition to maintaining a DMP and metadata.
DMP | Metadata | Access
YES | YES | USGS Data Portal
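The approval requirements for an Approved release in section D (at least two reviews including one data review and one metadata review, Bureau approval documented in IPDS, BAO approval for new interpretive information, and an appropriate disclaimer) can be sketched as a simple pre-release check. This is a minimal illustration, not an official USGS tool. The class `ReleasePackage`, its field names, and the function `release_blockers` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReleasePackage:
    """Hypothetical record of where a data release stands."""
    reviews: List[str] = field(default_factory=list)  # completed review types
    bureau_approved_in_ipds: bool = False
    new_interpretive: bool = False
    bao_approved: bool = False
    has_disclaimer: bool = False


def release_blockers(pkg: ReleasePackage) -> List[str]:
    """List the section D requirements that the package does not yet meet."""
    blockers = []
    if len(pkg.reviews) < 2:
        blockers.append("fewer than two reviews")
    for required in ("data", "metadata"):
        if required not in pkg.reviews:
            blockers.append(f"missing {required} review")
    if not pkg.bureau_approved_in_ipds:
        blockers.append("Bureau approval not documented in IPDS")
    if pkg.new_interpretive and not pkg.bao_approved:
        blockers.append("new interpretive content lacks BAO approval")
    if not pkg.has_disclaimer:
        blockers.append("missing disclaimer")
    return blockers


draft = ReleasePackage(reviews=["data"], has_disclaimer=True)
print(release_blockers(draft))
# ['fewer than two reviews', 'missing metadata review',
#  'Bureau approval not documented in IPDS']
```

Returning a list of blockers rather than a single yes/no makes it easier to tell a data producer exactly what is still outstanding.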
E. Data Not Suitable for Public Release
The section below describes the classifications of data not suitable for public release and associated requirements. There is nothing to fill out below; the Science Center is only required to acknowledge these requirements.
- Restricted Data
- Proprietary or sensitive data collected or purchased by a data producer on behalf of the USGS.
- Storage and Access:
- The source data should be stored on an encrypted device with backup capability to preserve the information.
- The source data must not be released to the public through any USGS data portal, FTP, or website.
- Derivative products to which the proprietary and sensitive constraints no longer apply can be released in the final data package for publication. They must have a DOI and use the Science Center's designated USGS Data Portal as the primary public access location.
- Special Considerations:
- The proprietary or sensitive data may be such that the data management staff does not have privileges to handle the data.
- The source data will not be approved for public release, but data producers and Science Center Directors are responsible for meeting preservation requirements.
DMP | Metadata | Access
YES | YES | NONE
- Raw Data
- Data that is collected and remains unprocessed or unverified, often requiring data producer interpretation to create meaningful information.
- Storage and Access:
- Raw data is required to be stored on the Science Center's internal network storage location or a USGS data portal if access controls are provided to restrict public access.
- Field note submission to National Archives and Records Administration (NARA) may apply to meet archival requirements.
- Special Considerations:
- The data will not be approved for public release, but data producers and Science Center Directors are responsible for meeting preservation requirements.
DMP | Metadata | Access
YES | YES | LOCAL
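The DMP, Metadata, and Access values listed for each classification in sections D and E can be collected into one lookup table, which may be convenient when a project needs to check what a given kind of data requires. The sketch below simply restates those values. The dictionary keys, the `Requirements` type, the `public` flag (True for section D, False for section E), and the function `requirements_for` are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Requirements(NamedTuple):
    dmp: bool       # is a DMP required?
    metadata: bool  # is metadata required?
    access: str     # access location as listed in sections D and E
    public: bool    # True for section D classifications, False for section E


REQUIREMENTS = {
    "approved": Requirements(True, True, "USGS Data Portal", True),
    "provisional": Requirements(True, True, "Personal communication, websites", True),
    "operational_database_collection": Requirements(False, False, "USGS Operational Database", True),
    "unpublished_operational_database_parameter": Requirements(True, True, "USGS Data Portal", True),
    "restricted": Requirements(True, True, "NONE", False),
    "raw": Requirements(True, True, "LOCAL", False),
}


def requirements_for(classification: str) -> Requirements:
    """Look up the requirements for a classification key defined above."""
    try:
        return REQUIREMENTS[classification]
    except KeyError:
        raise ValueError(f"unknown classification: {classification!r}") from None


print(requirements_for("raw").access)          # LOCAL
print(requirements_for("provisional").public)  # True
```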