When planning a research project, it is important to consider which file formats you will use to store your data. In some cases, this will be dictated by the software you are using or the conventions of your discipline. In other cases, you may have to choose between several options.
In some cases, it might be best to use one format for data collection and analysis, and then convert your data to another format for archiving once your project is complete.
If you are not aware of any disciplinary standards, these are some good file formats for the preservation of the most common data types:
Ideal File Format Types:
The file format you choose for saving your research data has long-term implications for usage and access. For example, if the file format is proprietary, its long-term accessibility and subsequent usage are unpredictable, because they depend on the success and longevity of the business that owns it. Technology changes, so researchers should plan for both hardware and software obsolescence and make file format decisions that will ensure long-term usage and accessibility. The following are some guidelines to help you choose an appropriate file format for your research:
Preferred File Formats:
Oregon State University provides a table that lists other acceptable formats in addition to the preferred file formats.
Type of data | Recommended formats | Acceptable formats |
Tabular data with extensive metadata (variable labels, code labels, and defined missing values) | | Proprietary formats of statistical packages: SPSS (.sav), Stata (.dta), MS Access (.mdb/.accdb) |
Tabular data with minimal metadata (column headings, variable names) | | |
Geospatial data (vector and raster data) | | |
Textual data | | |
Image data | TIFF 6.0 uncompressed (.tif) | |
Audio data | Free Lossless Audio Codec (FLAC) (.flac) | |
Video data | | AVCHD video (.avchd) |
Documentation and scripts | | |
https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
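One way to put the table to work is to check the files in a project against it before archiving. The following is a minimal sketch in Python, not an official tool: the function names `classify_format` and `audit` are hypothetical, and the extension sets contain only the formats that appear in the table above, so you would need to extend them with the formats that apply to your own discipline.

```python
from pathlib import Path

# Extensions copied from the table above (assumption: this is only a
# partial list; cells that are empty in the table are not filled in here).
RECOMMENDED = {".tif", ".flac"}
ACCEPTABLE = {".sav", ".dta", ".mdb", ".accdb", ".avchd"}


def classify_format(filename: str) -> str:
    """Return 'recommended', 'acceptable', or 'unlisted' for a file name."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    if suffix in RECOMMENDED:
        return "recommended"
    if suffix in ACCEPTABLE:
        return "acceptable"
    return "unlisted"


def audit(filenames):
    """Group file names by how the table rates their format."""
    report = {"recommended": [], "acceptable": [], "unlisted": []}
    for name in filenames:
        report[classify_format(name)].append(name)
    return report


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    files = ["scan_001.TIF", "interview.flac", "survey.sav", "notes.xyz"]
    for status, names in audit(files).items():
        print(f"{status}: {names}")
```

Files reported as "acceptable" or "unlisted" are candidates for conversion to a preferred format once data collection and analysis are complete.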