This is the "Best Practices for Managing Your Data" page of the "Data Management Planning" guide.

Data Management Planning   Tags: archiving, data_curation, digital_assets, grant_proposals, publishing, research_data  

Tips and tools of data management for researchers.
Last Updated: Mar 11, 2014 URL: http://libguides.ucmerced.edu/data-management

Best Practices for Managing Your Data

Local Help for Data Management

Contact:

Susan Borda - Digital Curation Librarian
209-631-8961

 

File Formats

Best Practices:

  • Choose formats that will remain accessible in the future: non-proprietary and commonly used by your research community
  • Keep files unencrypted and uncompressed
  • Prefer open formats over proprietary ones: PDF not Word, XML or RDF not RDBMS, CSV not XLS

 

Establish a Descriptive File and Dataset Naming Convention

A consistent convention will help you easily identify your files and what they contain. Use abbreviated descriptive information such as

  • project
  • content or parameter
  • location, date and/or time (yyyymmdd for easy sorting; hhmmssTZD for time)
  • version number (establish numbering system for versions)

Use only numbers, letters, dashes, and underscores. Do not use spaces or special characters. Keep names concise so they stay practical.
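
The convention above can be sketched in a few lines of Python. This is only an illustration: the function name build_filename, the underscore separator between parts, the two-digit "v02" version style, and the sample values are all assumptions, not part of this guide. Times are converted to UTC so that "Z" can serve as the time zone designator.

```python
import re
from datetime import datetime, timezone

def build_filename(project, content, location, when, version, ext="csv"):
    """Join abbreviated descriptive parts into one sortable file name.

    Hypothetical helper: the part order and separators are assumptions.
    """
    # yyyymmdd sorts chronologically; hhmmss plus "Z" marks the time as UTC.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%SZ")
    parts = [project, content, location, stamp, f"v{version:02d}"]
    # Drop spaces and special characters; keep letters, digits, and dashes.
    cleaned = [re.sub(r"[^A-Za-z0-9-]", "", part) for part in parts]
    # Underscores separate the parts so each one stays easy to pick out.
    return "_".join(cleaned) + "." + ext

when = datetime(2014, 3, 11, 15, 30, 0, tzinfo=timezone.utc)
print(build_filename("merced", "soil temp", "site-A", when, 2))
# prints merced_soiltemp_site-A_20140311-153000Z_v02.csv
```

Generating names from one function, rather than typing them by hand, keeps every file in a project consistent with the convention.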

 

Data Documentation and Metadata

Best Practices:

  • Make good use of "readme.txt" files for documenting details
  • Document:
    • Data collection methods
    • Context of data collection
    • Variable names and description
    • Algorithms used
    • Transformations of data from the raw data through analysis
    • Software and systems used for analysis
  • Use discipline-specific metadata standards
  • Use a script rather than a GUI during data analysis; a script documents each step and makes results easier to reproduce
  • Incorporate a workflow tool such as Kepler, Taverna or VisTrails
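
As one possible way to act on the documentation list above, the following Python sketch writes a "readme.txt" with one section per item. The names README_FIELDS, render_readme, and write_readme, and the layout of the file, are assumptions made for illustration; adapt them to your discipline's conventions.

```python
from pathlib import Path

# Section headings taken from the documentation checklist in this guide.
README_FIELDS = [
    "Data collection methods",
    "Context of data collection",
    "Variable names and description",
    "Algorithms used",
    "Transformations of data from the raw data through analysis",
    "Software and systems used for analysis",
]

def render_readme(title, notes):
    """Return readme text with one section per documentation item."""
    lines = [title, "=" * len(title), ""]
    for field in README_FIELDS:
        lines.append(field + ":")
        # Flag undocumented items instead of silently skipping them.
        lines.append("  " + notes.get(field, "TODO: not yet documented"))
        lines.append("")
    return "\n".join(lines)

def write_readme(folder, title, notes):
    """Write the rendered text to readme.txt inside the given folder."""
    path = Path(folder) / "readme.txt"
    path.write_text(render_readme(title, notes), encoding="utf-8")
    return path
```

Because the readme is produced by a script, it can be regenerated whenever the analysis changes, which keeps the documentation in step with the data.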

 

About this guide

Acknowledgements: Sara Rutter, University of Hawaii at Manoa, for sharing her guide; UC3 (University of California Curation Center).

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

 

File Organization

 

Using Excel

Best Practices:

  • Use in conjunction with a "Data Dictionary" (similar to that listed below) containing information about:
    • Variable name
    • Variable types
    • Codes and Ranges
    • Missing values
  • Place variable names in row 1
  • Always have a unique identifier per entity
  • Keep track of changes made to the worksheet
  • Format columns to match the variable type (date, numeric, text, etc.)
  • Data entry guidelines:
    • Freeze column headings so they will not scroll off the screen
    • Enter string variables in a consistent case
    • Do not leave any blank rows in the spreadsheet
    • Do not include unessential text or fancy formatting in the spreadsheet
    • Remove formulas: copy the entire spreadsheet into a new sheet using the Paste Special "Values" option
    • Sort data with caution (always save first)
  • Verify data using double data entry
  • Save as .csv for forward compatibility and interoperability
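
Once a sheet has been saved as .csv, a few of the guidelines above can be checked with a short script. The sketch below is a minimal example using Python's standard csv module; the function names check_rows and compare_entries and the sample column name "id" are hypothetical. It assumes variable names sit in row 1, and it compares two independently keyed copies cell by cell for double data entry.

```python
import csv
import io

def check_rows(csv_text, id_column):
    """List blank rows and duplicate identifiers in CSV text."""
    problems = []
    seen = set()
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # row 1 holds the variable names
    id_index = header.index(id_column)
    for row_no, row in enumerate(reader, start=2):
        if not any(cell.strip() for cell in row):
            problems.append(f"row {row_no}: blank row")
            continue
        key = row[id_index]
        if key in seen:
            problems.append(f"row {row_no}: duplicate identifier {key!r}")
        seen.add(key)
    return problems

def compare_entries(first_text, second_text):
    """Double data entry: report cells where two keyed copies disagree.

    Returns (row, column, first_value, second_value) tuples, 1-indexed.
    zip stops at the shorter copy, so compare row counts separately.
    """
    first = list(csv.reader(io.StringIO(first_text)))
    second = list(csv.reader(io.StringIO(second_text)))
    mismatches = []
    for r, (row_a, row_b) in enumerate(zip(first, second), start=1):
        for c, (a, b) in enumerate(zip(row_a, row_b), start=1):
            if a != b:
                mismatches.append((r, c, a, b))
    return mismatches
```

For example, comparing two copies that differ only in one transposed value ("21.0" keyed as "12.0" in row 3, column 2) returns a single mismatch pointing at that cell, which can then be checked against the original record.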

Resources:

  • DataUp - An Excel add-in that helps individuals document and prepare Excel files for archiving and sharing
  • Elliott, A. C. (2006). Preparing data for analysis using Microsoft Excel. Journal of Investigative Medicine, 54(6), 334-341.
 

Define Your Data Dictionary

Example Data Dictionary

Example from Hook, Les A., et al. 2010. Best Practices for Preparing Environmental Data Sets to Share and Archive. Available online (http://daac.ornl.gov/PI/BestPractices-2010.pdf) from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/BestPractices-2010
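
To illustrate how a data dictionary can be put to work, here is a minimal Python sketch that stores variable names, types, codes, ranges, and missing-value codes, then checks a record against them. Every variable, range, and code shown is a hypothetical example and is not taken from Hook et al.; the function name validate_record is likewise an assumption.

```python
# Hypothetical data dictionary: all names, ranges, and codes are invented
# for illustration only.
DATA_DICTIONARY = {
    "site_id": {"type": str, "description": "Unique site identifier"},
    "air_temp_c": {
        "type": float,
        "description": "Air temperature in degrees Celsius",
        "range": (-50.0, 60.0),
        "missing": "-9999",
    },
    "land_cover": {
        "type": str,
        "description": "Land cover class",
        "codes": {"F": "forest", "G": "grassland"},
    },
}

def validate_record(record, dictionary=DATA_DICTIONARY):
    """Check raw string values in one record against the dictionary."""
    problems = []
    for name, raw in record.items():
        spec = dictionary.get(name)
        if spec is None:
            problems.append(f"{name}: not in data dictionary")
            continue
        if raw == spec.get("missing"):
            continue  # declared missing-value code, nothing more to check
        try:
            value = spec["type"](raw)
        except ValueError:
            problems.append(f"{name}: {raw!r} is not {spec['type'].__name__}")
            continue
        if "range" in spec:
            low, high = spec["range"]
            if not low <= value <= high:
                problems.append(f"{name}: {value} outside {low}..{high}")
        if "codes" in spec and value not in spec["codes"]:
            problems.append(f"{name}: unknown code {value!r}")
    return problems
```

Keeping the dictionary in a machine-readable form means the same definitions that document the data can also be used to catch entry errors.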
