
Research Data Management: Organize Your Data

Naming your Research Data Files

Research projects can generate hundreds of data files. Short, descriptive file names and a simple file hierarchy make these files easier to navigate and locate. Set up conventions for your project, document them for all team members, and apply them consistently.

Recommended conventions:

Include dates in file names, using YYYYMMDD format.
Because the year comes first, files whose names begin with the date sort chronologically in an ordinary alphabetical listing.

Include an abbreviated identifier when possible
Abbreviations keep file names short. Share the meanings of all abbreviations with the research team.

Very briefly describe the contents of the file
Use brief, clear language (e.g. 'questionnaire').

Avoid spaces or special characters in file names
Use underscores or capitals to separate words (for example, pilot_survey or PilotSurvey). Spaces and special characters are not handled consistently across operating systems and software.

Use version numbers and/or dates within file names
These make it easier to track the sequence and development of research documents and files. Pad version numbers with one or two leading zeros (for example, v01 or v001) so that versions sort in the correct order.

Keep folder structures simple and folder names clear
Simpler structures speed up backups and make files easier to find.
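
The conventions above can be combined into a small helper. The Python sketch below is one possible approach, not a prescribed tool: the function name build_file_name, the order of the elements (date, identifier, description, version), and the sample values are all assumptions made for illustration.

```python
import re
from datetime import date


def build_file_name(day, identifier, description, version, extension):
    """Assemble a name such as 20240315_SRV_Pilot_Questionnaire_v001.csv.

    The element order used here is an assumption; adapt it to the
    conventions your own project has agreed on.
    """
    parts = [
        day.strftime("%Y%m%d"),   # YYYYMMDD date
        identifier,               # abbreviated identifier
        description,              # very brief description of the contents
        f"v{version:03d}",        # zero-padded version number, e.g. v001
    ]
    # Replace spaces and special characters with underscores.
    cleaned = [re.sub(r"[^A-Za-z0-9_]+", "_", part).strip("_") for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


if __name__ == "__main__":
    names = [
        build_file_name(date(2024, 3, 15), "SRV", "Pilot Questionnaire", 1, "csv"),
        build_file_name(date(2023, 11, 2), "SRV", "Pilot Questionnaire", 2, "csv"),
    ]
    # A plain alphabetical sort puts the older file first.
    for name in sorted(names):
        print(name)
```

Because the date leads the name and the version number is zero-padded, an ordinary alphabetical sort lists the files in chronological order and keeps v002 ahead of v010.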

Documentation

Clearly document the steps you take throughout the research process. Documentation supports shared understanding among research teams, helps researchers recall the details of the methods and procedures of the research, and provides context if the research data are later reused or analyzed further.

Documentation can be stored in numerous ways, but one of the best methods is to include a text file in each folder that contains research data or other research documents.

Consider documenting the following:

  • The background and context of the research project, including research team members
  • Data collection methods, described in granular detail
  • Structure of files
  • Procedures for data checking and validation
  • Any modifications made to data
  • Confidentiality and permissions
  • Names of labels and variables
  • Explanations of codes and classifications

Documentation should be as clear as possible. Will you or anybody else be able to decipher the research ten years down the road?
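
One way to make the text-file approach routine is to generate an empty template in each data folder. The Python sketch below is illustrative only: the file name README.txt, the function name write_readme_template, and the "TODO" placeholders are assumptions, and the headings simply restate the list above.

```python
from pathlib import Path

# Headings taken from the list of things to document.
README_SECTIONS = [
    "Background and context (including research team members)",
    "Data collection methods",
    "Structure of files",
    "Procedures for data checking and validation",
    "Modifications made to data",
    "Confidentiality and permissions",
    "Names of labels and variables",
    "Explanations of codes and classifications",
]


def write_readme_template(folder, file_name="README.txt"):
    """Create a documentation skeleton in `folder` unless one already exists."""
    path = Path(folder) / file_name
    if path.exists():
        return path  # never overwrite documentation someone has already written
    lines = []
    for heading in README_SECTIONS:
        # Heading, underline, placeholder, blank line.
        lines += [heading, "-" * len(heading), "TODO", ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

Calling write_readme_template("data/raw") (a hypothetical folder that must already exist) would leave a plain-text file there for the team to fill in; existing files are left untouched.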

Metadata

For very large research projects, you might consider using an established metadata standard to describe the entire project, subsets of the project, or individual files. A metadata standard is simply a structured way of describing certain elements of the project or dataset.
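
To make "a structured way of describing" concrete, the Python sketch below builds a small machine-readable record for one dataset. It is a generic illustration and does not follow any particular standard: the field names (title, creator, created, description, keywords) and the function name describe_dataset are assumptions. A real standard defines its own required elements and vocabulary.

```python
import json


def describe_dataset(title, creator, created, description, keywords):
    """Return a JSON description of a dataset, refusing incomplete records."""
    record = {
        "title": title,
        "creator": creator,
        "created": created,          # e.g. a YYYYMMDD string
        "description": description,
        "keywords": keywords,        # list of short subject terms
    }
    # Every element must be filled in, so records stay comparable.
    missing = [name for name, value in record.items() if not value]
    if missing:
        raise ValueError("Missing metadata elements: " + ", ".join(missing))
    return json.dumps(record, indent=2)
```

The value of the structure is that every dataset is described with the same elements, so descriptions can be checked, searched, and compared automatically.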

Metadata standards vary, and many data repositories, disciplines, and organizations have developed their own. For example:

  • Darwin Core describes biological diversity by providing reference definitions, examples, and commentaries
  • DDI (Data Documentation Initiative) describes data in the social and behavioural sciences
  • CASRAI (Consortia Advancing Standards in Research Administration Information) describes research administration information

The UK Digital Curation Centre (DCC) maintains a comprehensive list of metadata standards to help you find the most appropriate standard for your research data: http://www.dcc.ac.uk/resources/metadata-standards.

If you are deciding which metadata standard to use, consult your Subject Librarian.