Open research data

What is open data?

Open data is data that is:

Freely available to download in a reusable form. Large or complex data may be accessible via a service or facility that enables access in situ or the compilation of sub-sets.
Licensed with minimal reuse restrictions.
Well described with provenance and reuse information provided.
Available in convenient, modifiable, and open formats.
Managed by the provider on an ongoing basis.

See the Library Guide Research data management for more information.

Benefits of open data

Who benefits from open data? Everyone! Open data supports:

New research and new types of research
The application of automated knowledge discovery tools online
The verification of previous results
A broader base set of data than any one researcher can hope to collect
The exploration of topics not envisioned by the initial investigators
The creation of new data sets, information and knowledge when data from multiple sources are combined
The transfer of factual information to promote development and capacity building in developing countries
Interdisciplinary, inter-sectoral, inter-institutional and international research

FAIR Data Principles

The FAIR Data Principles (Findable, Accessible, Interoperable, Reusable) were drafted at a Lorentz Center workshop in Leiden in the Netherlands in 2015. They have since been recognised worldwide by organisations including FORCE11, the National Institutes of Health (NIH), and the European Commission as a useful framework for sharing data in a way that enables maximum use and reuse. They are a way of thinking about getting the most out of your research data and its place in the wider research community.

Findable

Can your data be found if someone is looking for it? Does it have a DOI or a Handle? Does it have rich metadata? Is it discoverable through a research portal or a repository?

Accessible

Can your data and its metadata be retrieved using a standardised protocol? Your data does not necessarily have to be "open". There are sometimes good reasons why data cannot be made open, e.g. privacy concerns, national security, or commercial interests. If it is not open, there should be clarity and transparency around the conditions governing access and reuse.

Interoperable

To be interoperable, the data will need to use community-agreed formats, language, and vocabularies. Will someone who finds your data be able to meaningfully reuse it, and build on or reproduce your work? The metadata will also need to use community-agreed standards and vocabularies, and contain links to related information using identifiers.

Reusable

Reusable data should maintain its initial richness. For example, it should not be reduced to only what is needed to explain the findings in one particular publication. It needs a clear machine-readable licence and provenance information on how the data was produced. It should also follow discipline-specific data and metadata standards, which give it the rich contextual information that allows for reuse.
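As a rough illustration of the four principles above, the sketch below checks a dataset's metadata record for a few fields that map onto them. The field names, the EXPECTED_FIELDS table, and the example record (including its placeholder DOI) are hypothetical and are not taken from any particular repository's schema; a real deposit should follow the metadata standard of the chosen repository or discipline.

```python
# Hypothetical field names for illustration only; real repositories
# define their own metadata schemas.
EXPECTED_FIELDS = {
    "identifier": "Findable: a persistent identifier such as a DOI or Handle",
    "description": "Findable: rich descriptive metadata",
    "access_conditions": "Accessible: clear conditions governing access",
    "file_format": "Interoperable: a community-agreed format",
    "licence": "Reusable: a clear machine-readable licence",
    "provenance": "Reusable: how the data was produced",
}


def missing_fair_fields(record):
    """Return the expected field names that are absent or empty in record."""
    missing = []
    for name in EXPECTED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Placeholder record; the DOI below is not a real identifier.
example_record = {
    "identifier": "https://doi.org/10.1234/example-dataset",
    "description": "Hourly air temperature readings from a campus sensor.",
    "access_conditions": "Open",
    "file_format": "text/csv",
    "licence": "CC-BY-4.0",
    "provenance": "",
}

for field in missing_fair_fields(example_record):
    print(f"Missing {field} ({EXPECTED_FIELDS[field]})")
```

A check like this only confirms that fields are present. It does not judge whether the description is rich enough or whether the licence suits the data.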

CARE Principles for Indigenous Data Governance

The CARE Principles for Indigenous Data Governance guide appropriate use and reuse of Indigenous data. This set of principles indicates the significant and crucial role of data in advancing Indigenous innovation and self-determination.

Collective benefit

Data ecosystems should be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.

Authority to control

Indigenous Peoples' rights and interests in Indigenous data must be recognised and their authority to control such data should be empowered. Indigenous data governance enables Indigenous Peoples to determine how they are represented within data.

Responsibility

Those working with Indigenous data have a responsibility to share how this data is used to support Indigenous Peoples' self-determination and collective benefit.

Ethics

Indigenous Peoples' rights and wellbeing should be the primary concern at all stages of the data life cycle.

CARE Principles
More information from the Research Data Alliance International Indigenous Data Sovereignty Interest Group.

How to make data open

To be made open and FAIR, data should be deposited in a data repository. This is a service that exists to preserve and provide access to research data, and is a future-proofed vehicle for ensuring that data remain accessible and usable over the long term. Depositing in a data repository is preferable to sharing data as supplementary files alongside a published article, sharing via cloud-based file storage services, or keeping data in private storage and sharing on request only.

A data repository should not be confused with cloud-based services that provide file storage and sharing, such as Google Drive or the Open Science Framework. A data repository performs a number of specific functions:

It actively preserves data, e.g. by replicating and validating data files and migrating them to preservation formats.
It publishes metadata to enable online discovery.
It assigns persistent unique identifiers (e.g. DOIs or Handles) to datasets and makes them citable.
It quality-controls datasets and enhances metadata, e.g. by applying standard vocabularies (not all repositories do this).
It manages access to data so that they can be used by other people.
It applies licence notices, to make terms of use and attribution requirements clear.
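The "validating data files" function in the list above is often done with checksums (sometimes called fixity checks). The sketch below shows the general idea using SHA-256 from the Python standard library. It is a simplified illustration, not a description of how any specific repository works, and the function names and the manifest structure are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """Return manifest entries that are now missing or have a different checksum."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest recorded at deposit time can be compared against the stored files later. Any name returned by changed_files points to a file that was altered, corrupted, or removed since the manifest was built.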

RMIT Repository

RMIT Research Repository
The RMIT Research Repository is where your data can be stored and published openly.

Discipline Repositories

It may be most strategic for you to publish your data in a discipline-specific data repository, where it will be found by other researchers in your field. You may already be aware of prominent data repositories in your area of study. Your colleagues or your supervisor might also be able to point you towards suitable repositories. Another method of finding suitable discipline-specific repositories is by consulting a registry of data repositories such as:

re3data
re3data offers detailed descriptions of more than 2600 repositories. These descriptions are based on the re3data Metadata Schema and can be accessed via the re3data API.
Directory of Open Access Repositories
Try this directory to locate other repositories. OpenDOAR is the quality-assured, global Directory of Open Access Repositories.
FAIRsharing
A catalogue of databases, described according to the BioDBcore guidelines, along with the standards used within them. Partly compiled with the support of Oxford University Press.
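For readers who want to search re3data programmatically, here is a minimal sketch of reading a repository list over its API. It assumes the list endpoint at https://www.re3data.org/api/v1/repositories returns XML shaped like SAMPLE_RESPONSE below; that shape, the sample repository names and IDs, and the helper function names are illustrative assumptions, so check the current re3data API documentation before relying on them.

```python
import urllib.request
import xml.etree.ElementTree as ET

RE3DATA_LIST_URL = "https://www.re3data.org/api/v1/repositories"

# Assumed response shape with made-up entries, for illustration only.
SAMPLE_RESPONSE = """<list>
  <repository>
    <id>r3d000000001</id>
    <name>Example Marine Data Archive</name>
  </repository>
  <repository>
    <id>r3d000000002</id>
    <name>Example Social Survey Repository</name>
  </repository>
</list>"""


def parse_repository_list(xml_data):
    """Turn the XML list (str or bytes) into a list of {'id', 'name'} dicts."""
    root = ET.fromstring(xml_data)
    return [
        {
            "id": repo.findtext("id", default=""),
            "name": repo.findtext("name", default=""),
        }
        for repo in root.iter("repository")
    ]


def find_by_keyword(repositories, keyword):
    """Return repositories whose name contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [repo for repo in repositories if needle in repo["name"].lower()]


def fetch_repository_list(url=RE3DATA_LIST_URL, timeout=30):
    """Download and parse the live list. Requires network access."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_repository_list(response.read())


# Offline demonstration using the sample data above.
sample = parse_repository_list(SAMPLE_RESPONSE)
print(find_by_keyword(sample, "marine"))
```

Filtering by name is a crude search. The re3data website itself supports browsing by subject, content type, and country, which is usually a better starting point for finding a discipline-specific repository.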

General Repositories

There are also several general repositories where you can create a free account and deposit research data from any discipline. These include:

Zenodo
Developed by CERN to support the open access and open science movement in Europe, but available for use by researchers worldwide.
OSF
Developed by the Center for Open Science in the US to support open science and reproducibility, but available for use by researchers worldwide.
Dryad
The Dryad Digital Repository is a curated resource that makes research data discoverable, freely reusable, and citable. Dryad provides a general-purpose home for a wide diversity of data types.

Via a data journal

You may also want to promote your data by publishing a data paper in a data journal. Data papers provide an opportunity for you to describe your dataset in detail and have your work peer-reviewed. Here are some methods of finding data journals:

This 2014 blog post by Katherine Akers lists data journals.
A list of data journals is also publicly available as supplementary material to the following paper:
Candela, L., Castelli, D., Manghi, P., & Tani, A. (2015). Data journals: A survey. Journal of the Association for Information Science and Technology, 66(9), 1747–1762. https://doi.org/10.1002/asi.23358
