Selecting data to keep for long-term preservation is subjective, and predicting what information may be required in the future is not a precise process.

It would be impractical to save all data at the end of a project. Before submitting data to a repository or considering long-term archival, identify what needs to be kept and what can be deleted without issue. Consider the following:

  • Data must be easily discovered and usable. A large dataset with only a few useful bits of information isn’t as accessible or useful as carefully selected data stored effectively.
  • The costs associated with storage and long-term archival are significant. Storing unnecessary data can be a waste of money.
  • Any information stored may be subject to Freedom of Information requests, and the data may have to be disclosed.
  • What is needed to validate findings in your publications?
  • Are you obliged to destroy anything?


Selecting what to keep

The University of Nottingham recommends considering the following questions when deciding what data to keep:

  1. What are my funder and institutional requirements on what data to keep?
  2. Who holds the intellectual property and legal rights to this data in relation to storage and re-use? If it is not me, can I negotiate these rights?
  3. Is there sufficient metadata to enable future users to locate the data effectively?
  4. If the costs of storing the data are my responsibility, can I afford it?
  5. Is the data transient or a ‘one off’ that cannot be replicated, e.g. weather records?

The Digital Curation Centre (DCC) offers a useful guide on deciding what to keep: Five steps to decide what data to keep


Additional Resources
