File formats can affect long-term preservation and reuse. While researchers may use proprietary file formats for analysis, converting data to open and/or standard formats will help ensure the data can be rendered and accessed in the future. Researchers can also choose to make data available in both preservation-friendly formats and original file formats.
Best practice is to select formats that are open, documented standards and that are non-proprietary, unencrypted, uncompressed, and commonly used by your research community. For example, save spreadsheet-based (tabular) data as comma-separated values (.csv) instead of Excel (.xls, .xlsx), and save text documents as plain text (.txt) or PDF/A (.pdf) instead of Microsoft Word (.doc, .docx).
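As a rough illustration of the examples above, the following Python sketch scans a list of file names and flags those in proprietary formats, suggesting an open alternative for each. The mapping contains only the formats named in this guide; the names `OPEN_ALTERNATIVES` and `suggest_open_formats` and the sample file names are hypothetical and are not part of any repository's tooling.

```python
from pathlib import Path

# Mapping based only on the examples given in this guide; extend it to match
# the formats preferred by your repository or research community.
# (Hypothetical name, for illustration.)
OPEN_ALTERNATIVES = {
    ".xls": [".csv"],
    ".xlsx": [".csv"],
    ".doc": [".txt", ".pdf (PDF/A)"],
    ".docx": [".txt", ".pdf (PDF/A)"],
}


def suggest_open_formats(filenames):
    """Return {filename: [suggested formats]} for files in proprietary formats."""
    suggestions = {}
    for name in filenames:
        # Compare extensions case-insensitively, so ".XLSX" matches ".xlsx".
        ext = Path(name).suffix.lower()
        if ext in OPEN_ALTERNATIVES:
            suggestions[name] = OPEN_ALTERNATIVES[ext]
    return suggestions


if __name__ == "__main__":
    # Sample file names are made up for this example.
    files = ["survey_results.xlsx", "codebook.docx", "readings.csv", "notes.txt"]
    for name, alternatives in suggest_open_formats(files).items():
        print(f"{name}: consider also saving as {' or '.join(alternatives)}")
```

The sketch only reports suggestions; it does not convert anything. The conversion itself is usually done with the "Save As" or "Export" option in the original application, keeping the original file alongside the preservation copy if desired.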
Repositories may provide a list of preferred file formats. The Library of Congress and the National Archives and Records Administration (NARA) also provide information on recommended file formats. Here is a list of the typical file formats we recommend using:
*preferred document format.