MRU Library Website: Open Scholarship: Open Data

Open Data

Open Research Data

Open research data benefits both original researchers and the broader research community.

The benefits of open data

Increases transparency and trust in your findings
Reduces barriers to research, allowing others to verify, reuse, or build on your work
Forms part of the scholarly record along with publications, reports, and other products of research that are increasingly recognized in research assessment
Can raise the profile of the research, leading to more citations and greater impact and recognition
Helps meet funder and publisher requirements
Is easier to do than you might think, especially with trusted repositories

Related links: Research Data Management | Data Repository

Common Misconceptions About Open Data

“I have to share all of my data”

Most data policies only require you to share or deposit data that directly supports research findings. Nearly all open data requirements make exceptions for confidential or proprietary data.

“Open data isn’t useful unless it’s big”

Even small or partial datasets can be useful for verification or replication, or may be combined with other data for some types of secondary research.

“Open means that anyone can do anything with my data”

Many repositories allow you to choose licenses and access levels with which you are comfortable.

“It’s too complicated to publish data”

Repositories provide support to guide you through the process. Collecting data with the intent to make it open at the end can greatly reduce the amount of work required to make your data publication-ready.

“There’s no benefit to sharing qualitative data”

While some qualitative methodologies depend on direct relationships with participants, qualitative data may help readers better understand the research findings and lead to novel secondary uses.

“Data sharing provisions will make it harder to recruit participants”

Many participants want their contributions to have the widest possible impact on research. Others can opt out of data sharing when providing consent.

Getting Started with Open Research Data

Quick guide to FAIR data

You don’t need to be a data scientist to meet FAIR principles. Here are the basics to get you started.

FAIR Principle	Key Step
Findable	Deposit data in a searchable repository that assigns a DOI
Accessible	Choose open access or, if restricted, clearly described access conditions
Interoperable	Use open or widely available formats like CSV, TXT, or R
Reusable	Add a README file, license, and information about how the data was collected

What to deposit

Identify the data that supports your research findings and include any information that another researcher would need to validate your findings or reuse your data. Most datasets include at least some of the following:

Clean data files in open formats (like CSV, TXT, R, JSON) or widely used proprietary formats (like XLSX, SAV)
Code or scripts
Documentation like data dictionaries and codebooks
Survey instruments or protocols
Consent forms

Where to deposit

Choose a repository that fits your data type, size, and discipline. There are a number of good options to choose from:

Repository	Why deposit here?
MRU Data Repository	Locally managed and suitable for most research outputs
FRDR	Accepts large data files that are highly discoverable
Dryad or OpenICPSR	Used for discipline-specific, well-curated datasets

A Deeper Dive Into Open Data

Understanding FAIR Data

To ensure your data has the greatest impact, adhere as closely as possible to the FAIR guidelines, which outline the characteristics that data and metadata should have in order to be optimally Findable, Accessible, Interoperable, and Reusable. FAIRness is a joint effort of researchers, data curators, and repository managers, each helping to ensure that data and metadata are clearly described and readable by both humans and machines. The FAIR principles apply to all types of research data, even data that is restricted due to sensitivity or intellectual property concerns.

Findable

Make your data easy to locate by assigning an identifier, writing good metadata, and using searchable repositories.

Ensure your dataset has a unique identifier. Choose a repository that assigns a persistent identifier (like a DOI) so your data can be reliably cited and found over time.
Describe the dataset with rich metadata. Include details like title, author(s), keywords, abstract, and subject area so your data can be easily indexed and discovered.
Deposit in a searchable repository. Choose a repository (like MRU Data Repository or FRDR) that is indexed by search engines and research platforms.
Use clear, standardized titles. Avoid acronyms or unclear shorthand—use full, descriptive titles that others can understand.
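
As a rough illustration of the metadata step above, the following Python sketch assembles a descriptive record and refuses to continue if a core field is empty. The function name, the field names, and every value are hypothetical placeholders; a real repository defines its own deposit form and assigns the DOI itself.

```python
import json

REQUIRED_FIELDS = ("title", "creators", "keywords", "abstract", "subject")

def build_metadata(title, creators, keywords, abstract, subject, identifier=None):
    """Return a dict of descriptive metadata, rejecting empty required fields."""
    record = {
        "title": title,
        "creators": creators,
        "keywords": keywords,
        "abstract": abstract,
        "subject": subject,
        # Usually a DOI assigned by the repository after deposit.
        "identifier": identifier,
    }
    missing = [name for name in REQUIRED_FIELDS if not record[name]]
    if missing:
        raise ValueError(f"Missing metadata fields: {', '.join(missing)}")
    return record

# Placeholder values only; none of this describes a real dataset.
example = build_metadata(
    title="Example survey of student study habits, 2023",
    creators=["Researcher, Example"],
    keywords=["study habits", "undergraduate students", "survey"],
    abstract="Placeholder abstract describing what was collected and why.",
    subject="Education",
)
print(json.dumps(example, indent=2))
```

Drafting the record before deposit makes it easier to spot a vague title or missing keywords while the project details are still fresh.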

Accessible

Let others know how to access your data—whether it's open or has conditions for use.

Enable open access when possible. Whenever appropriate, allow users to download the data directly, without needing special tools, software or permissions.
Explain any access restrictions. Clearly describe who can access restricted data, why it's restricted, and how someone can request access.
Keep metadata visible. Ensure the dataset’s descriptive information remains accessible even if the data files are not openly shared.

Interoperable

Use widely accepted formats and metadata standards so others (and machines) can use your data.

Use open or widely supported file formats. Prefer formats like CSV, TXT, JSON, or XML. Avoid proprietary formats unless they are standard in your field.
Apply metadata standards. Choose a formal metadata schema (e.g. Dublin Core, DDI) suitable for your discipline and data type.
Define variables clearly. Document units of measure, column names, codes, and values to make your data interpretable by others.
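
One way to approach the "define variables clearly" step is to draft a data dictionary directly from the columns of a CSV file. The sketch below uses only the Python standard library; the function name, the column names, and the sample rows are invented for illustration, and the descriptions still have to be written by someone who knows the data.

```python
import csv
import io

def data_dictionary(csv_text, descriptions):
    """Build one data-dictionary entry per CSV column.

    descriptions maps column name -> (description, unit).
    Columns without an entry are flagged as UNDOCUMENTED.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames:
        description, unit = descriptions.get(column, ("UNDOCUMENTED", ""))
        values = [row[column] for row in rows]
        present = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": column,
            "description": description,
            "unit": unit,
            "missing_count": len(values) - len(present),
            "example_value": present[0] if present else "",
        })
    return entries

# Hypothetical sample data; the second row has no height recorded.
sample_csv = "participant_id,height_cm,group\nP01,172.5,1\nP02,,2\n"
descriptions = {
    "participant_id": ("Anonymized participant code", ""),
    "height_cm": ("Standing height", "cm"),
}
dictionary = data_dictionary(sample_csv, descriptions)
for entry in dictionary:
    print(entry)
```

Here the `group` column is flagged as undocumented, which is a prompt to record what the codes 1 and 2 mean before depositing.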

Reusable

Make your dataset easy to understand and legally usable for others who were not involved in your project.

Provide detailed documentation. Include a README file, data dictionary, codebook, and any contextual information needed for reuse.
Add a clear reuse license. Apply a license (such as CC BY) to let others know how they can use your dataset legally.
Follow disciplinary norms. Format and structure your data according to what is typical in your field to support trust and adoption.
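
A README does not need to be elaborate. The Python sketch below renders a minimal plain-text README from a handful of fields. The layout, the function name, and all values are assumptions made for illustration; they are not a template required by any repository.

```python
def render_readme(title, creators, collected, methods, files, license_name):
    """Return README text covering the context a secondary user needs."""
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        f"Collected: {collected}",
        f"License: {license_name}",
        "",
        "Methods:",
        methods,
        "",
        "Files:",
    ]
    # One line per file: the file name, then what it contains.
    lines += [f"  {name}: {description}" for name, description in files.items()]
    return "\n".join(lines) + "\n"

# Placeholder values only.
readme = render_readme(
    title="Example survey of student study habits, 2023",
    creators=["Researcher, Example"],
    collected="2023-01-10 to 2023-03-31, Example City",
    methods="Placeholder: online questionnaire completed by undergraduate students.",
    files={
        "survey_responses.csv": "Cleaned responses, one row per participant",
        "data_dictionary.csv": "Variable names, descriptions, units, and codes",
    },
    license_name="CC BY 4.0",
)
print(readme)
```

Saving the result as README.txt alongside the data files keeps the documentation in an open format.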

Publishing Your Data

Publishing FAIR data requires consideration of both the structure of the data and the repository. The MRU Data Repository incorporates several elements that will help you make your data FAIR, including persistent identifiers, licensing, metadata fields that conform to standards across disciplines, and integration with discovery tools. Other elements of FAIRness may be incorporated by depositors prior to and at the time of deposit.

Documentation

For data to be reusable, it must be well-documented. Documentation should include the contexts of data collection, including dates and locations, methods, sample or population, participant consent, and any other information that a secondary user would need to accurately use the data. Documentation may also include codebooks, data dictionaries, and instructions for using specialized software or code. If possible, documentation should be shared even if data files are restricted.

File Formats and Structures

Using open, non-proprietary formats helps ensure preservation of data files. Common open formats include TXT, CSV, JSON, XML, and JPEG. If it’s not possible to use open formats, try to use formats that are widely used in your discipline. The MRU Data Repository will automatically convert tabular data from Excel, SPSS, Stata, and R to an open TAB format, when possible, so that users may download data in either the original or open format.

The format or structure of the data depends on its type and disciplinary norms, but it should be as clean as possible and easily decipherable by a disciplinary expert. That may mean defining headings and units of measure, standardizing data formatting, removing or masking identifiers, and addressing null values.
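
The cleaning steps described above (standardizing headings, masking identifiers, and addressing null values) might look something like the following Python sketch. It is one possible approach under stated assumptions: the list of null tokens, the `NA` marker, the salt, and the column names are all hypothetical, and a salted hash is pseudonymization rather than full anonymization, so it does not replace an ethics review of re-identification risk.

```python
import csv
import hashlib
import io
import re

# Assumed spellings of "no value" in the raw file; adjust for your own data.
NULL_TOKENS = {"", "na", "n/a", "null"}

def clean_header(name):
    """Lowercase a heading and replace runs of other characters with underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")

def mask_identifier(value, salt):
    """Replace an identifier with a short salted hash."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:10]

def clean_rows(csv_text, id_column, salt, null_marker="NA"):
    """Return cleaned rows: tidy headings, masked IDs, one consistent null marker."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        out = {}
        for key, value in row.items():
            column = clean_header(key)
            value = (value or "").strip()
            if column == id_column:
                out[column] = mask_identifier(value, salt)
            elif value.lower() in NULL_TOKENS:
                out[column] = null_marker
            else:
                out[column] = value
        cleaned.append(out)
    return cleaned

# Hypothetical raw export with untidy headings and an inconsistent null value.
raw = "Participant ID, Height (cm) ,Group\nP01,172.5,1\nP02,n/a,2\n"
cleaned = clean_rows(raw, id_column="participant_id", salt="example-salt")
for row in cleaned:
    print(row)
```

Whatever null marker you choose, state it in the README or data dictionary so that secondary users do not mistake it for a real value.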

Metadata

When depositing data, fill in all relevant metadata fields that describe the dataset and its component files. The MRU Data Repository allows depositors to add variable-level metadata for some types of data. Variable-level metadata greatly improves discoverability of datasets and allows users to explore the data prior to download.

Data Repositories

Repository choice may be based on data type, discipline, or personal preference. Some repositories accept only specific types of data (e.g. GenBank, PANGAEA), while others are more general in scope. The following repositories support FAIR data deposit.

Name	Type	Good For	Key Features	Limitations	Access Model
MRU Data Repository	Institutional	Most MRU researchers	Licensing, DOIs, and standardized metadata; Canadian storage; Local support	5 GB file size limit, multiple files permitted	Open; Restricted; Embargo
Federated Research Data Repository (FRDR)	General	Large datasets	Licensing, DOIs, and standardized metadata; Canadian storage; Up to 1 TB of storage	Open datasets only	Open; Embargo
Dryad	General, with a focus on biological sciences	Biological sciences researchers	Licensing, DOIs, and standardized metadata; Highly curated; Integration with journals for submission and citation	Fee for deposit unless publishing in a partner journal; All datasets receive a public domain license (CC0)	Open; Embargo
OpenICPSR	Subject: social, behavioural, and health sciences	Social science data	Licensing, DOIs, and standardized metadata; Datasets up to 30 GB and 1000 files	No curation, resulting in variable quality of data and documentation	Open; Restricted; Embargo
Open Science Framework	General	Collaborative projects	Facilitates licensing and DOI registration; Publish data along with registrations, preprints, and other project documents	5 GB limit for private projects	Open; Restricted

Open Research Data Policies

Many research funding agencies and academic publishers now require or encourage researchers to make their data openly available. This section outlines the most relevant policies and expectations for researchers preparing grants or manuscripts.

Funder Policies

Tri-Agency Research Data Management Policy

The Tri-Agency Research Data Management Policy requires researchers to deposit data underlying publications and preprints resulting from agency-funded research.

Deposits should be made at the time of publication.
Data is not required to be open, but openness is strongly encouraged when possible. As of now, the requirement applies to a limited number of grants, but it is expected to expand by 2026.

Other Funder Policies

Genome Canada: Data Release and Sharing Policy
Heart & Stroke Foundation: Open Access to Research Outputs Policy
National Institutes of Health: Data Management and Sharing Policy

Publisher Policies

Publisher data policies usually include terms for both open data and data availability statements.

Open Data

A few things to know about journal data policies:

Publishers' open data requirements vary widely from journal to journal, ranging from encouraging to requiring authors to make the data underlying articles openly accessible.
Some publishers, like Taylor & Francis and Wiley, use tiered open data policies, and each of their journals typically adopts one of the tiers. For example, Taylor & Francis’ Basic Data Sharing Policy only encourages data sharing, while its Open & FAIR Policy comes with strict data sharing and licensing requirements.
Open data policies may only apply to certain types of articles, and most journals will make exceptions for reasons related to confidentiality or intellectual property.

Please Note

Researchers should review journal data policies early in the research process, ideally prior to applying for ethics approval or submitting manuscripts. This ensures that your data sharing plans align with the journal’s requirements and helps avoid unexpected barriers later.

Data Availability Statements

Data availability statements are a common requirement when submitting an article to a journal. They require authors to indicate whether the data underlying the article’s findings will be shared and, if so, how. When writing data availability statements, consider:

If you are using a repository, include the name of the repository and a persistent link to the dataset.
If data will be available only by request, specify access procedures (e.g. prior ethics approval) and conditions of use (e.g. particular types of research).
If you plan to share data only by request, consider archiving the data in a repository that permits access restrictions (like the MRU Data Repository). This approach ensures your data is securely preserved and discoverable, while still allowing you to control who can access the data and under what conditions.
If data will not be shared, provide reasons (e.g. absence of participant consent).
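
To show how the considerations above translate into a statement, here is a small Python sketch that drafts wording for three sharing scenarios. The sentence templates, the function name, and the placeholder link are illustrative assumptions rather than text required by any journal; always check the target journal's own examples before submitting.

```python
def availability_statement(mode, repository=None, link=None,
                           procedure=None, conditions=None, reason=None):
    """Draft a data availability statement for one of three sharing modes."""
    if mode == "repository":
        if not (repository and link):
            raise ValueError("Name the repository and give a persistent link.")
        return (f"The data supporting this article are available in "
                f"{repository} at {link}.")
    if mode == "on_request":
        if not (procedure and conditions):
            raise ValueError("Describe the access procedure and conditions of use.")
        text = (f"The data supporting this article are available on request. "
                f"Access procedure: {procedure}. Conditions of use: {conditions}.")
        # Optional: note where the restricted dataset is archived.
        if repository and link:
            text += f" The restricted dataset is archived in {repository} at {link}."
        return text
    if mode == "not_shared":
        if not reason:
            raise ValueError("Give a reason the data cannot be shared.")
        return f"The data supporting this article are not shared because {reason}."
    raise ValueError(f"Unknown mode: {mode}")

# The link below is a placeholder, not a real DOI.
open_text = availability_statement(
    "repository",
    repository="the MRU Data Repository",
    link="https://doi.org/10.0000/example",
)
closed_text = availability_statement(
    "not_shared",
    reason="participants did not consent to data sharing",
)
print(open_text)
print(closed_text)
```

Each branch insists on the details listed above, so a draft cannot be produced without a persistent link, an access procedure, or a stated reason.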

Cambridge University Press Example Data Availability Statements

Learn More

Finding Research and Government Open Data

See the Statistics and Data subject guide for lists of resources or contact your subject librarian.

Contact

Brian Jackson

He/Him

Email: bjackson@mtroyal.ca
Phone: 403.440.5032
Office: EL4423X
