Research integrity (data)

The University relies on OneDrive as its primary storage, where 1 TB of space is available. We have not yet agreed on the physical transfer of the laboratory’s data centre; therefore, until further notice, you may use OneDrive as your main storage area.

However, the laboratory NAS should be used as secondary storage to back up and organise your data and documents. The NAS also stores data from alumni and past research activities. It has multiple redundancies, but it is currently not connected to a fast line and the disaster recovery copy is updated only sporadically.

Your data and documents should be stored with a logical file structure. The following guidelines might seem a bit too complex or rigid at first read. In fact, they are simple and they will help us to work effectively as a team and, equally important, they will ensure that we respect the terms and conditions of funders, the basic requirements for scientific reporting and the effective long-term archiving of our data. Your Cronus folder will have to be organised in logical collections of files.

I mandate the use of the following collections, but you can add others if necessary:

  • PRJ_short_project_identifier | to store all data and analysis related to a specific project. This folder may or may not be further organised, but I advise using identifiable sub-folders with prefixes DATA_ , ANALYSIS_ , SOFTWARE_ , MANUSCRIPT_ to identify different types of collections:
    • For instance, you may wish to organise all imaging data in one folder and all western blots in another, or all results belonging to a task within the project in one folder. The important thing is that there is logic in your organisation. If experiments are made of different files, which is often the case, create a folder for each collection of data. Either the data file, or the folder containing a collection of data files, must store the date of the experiment and basic experimental notes (see below).
    • Simple data analysis might be stored within DATA folders; however, you might have more complex analyses that tap into different data. In this case, also create folders for data analysis (Excel files, MATLAB code, etc.).
    • Similarly, you might need to develop software to analyse your data and you can have code within the DATA or ANALYSIS sub-folder. However, if your software grows large and it is of general use, you might wish to create a dedicated SOFTWARE subfolder.
    • You might identify several different collections; for example, one could contain all the files needed to assemble a MANUSCRIPT.
  • LITERATURE (optional) | you can be flexible with this folder, but I advise having a dedicated folder where you store all your PDFs, at the level of organisation you prefer. I also really advise storing notes about papers, or preparing literature reviews for yourself. Here, you could also save your Mendeley database.
  • PRESENTATIONS | Store here all internal presentations, posters or talks prepared for conferences or retreats; in other words, everything that was presented internally or publicly. I advise structuring the folder into collections of files for CONFERENCES, LABMEETINGS and POSTERS. Clearly date all presentations. Feel free to add journal clubs or the more elaborate slides you may prepare for our 1:1 meetings, but these are optional.
  • RESOURCES | Store here all files related to plasmids, cell lines, or other materials you need to inventory and that could be important for the group. Also, create a LAB_BOOKS folder. You will store here all your digital lab books, or PDF exports. Should you wish to have a distinct lab book per project, feel free to do so, but ensure that every project folder contains its own LAB_BOOK collection.
  • Collect all other non-organised material in OTHER folders, or create other collections. However, make sure that they are identifiable. Once you leave the lab, I will probably delete the OTHER folder to avoid archiving unnecessary files.

Confused? Search for the folder ‘FOLDER_STRUCTURE_DEMO’ on the NAS and you will see how simple this is.
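
The collections listed above can also be created with a short script. The following is only a minimal Python sketch, not an official lab tool: the function name create_skeleton and the example sub-folder names are hypothetical, and placing LAB_BOOKS at the top level of your folder (rather than inside RESOURCES) is an assumption. The FOLDER_STRUCTURE_DEMO folder on the NAS remains the reference.

```python
from pathlib import Path

# Top-level collections named in the guidelines above.
# Assumption: LAB_BOOKS sits at the top level rather than inside RESOURCES.
TOP_LEVEL = ["LITERATURE", "PRESENTATIONS", "RESOURCES", "LAB_BOOKS", "OTHER"]
PRESENTATION_SUBFOLDERS = ["CONFERENCES", "LABMEETINGS", "POSTERS"]


def create_skeleton(root, project_id, project_subfolders=()):
    """Create the folder skeleton under `root` and return the list of folders.

    `project_subfolders` are names such as "DATA_imaging" or "ANALYSIS_FRET"
    (hypothetical examples using the advised prefixes).
    """
    root = Path(root)
    folders = [root / name for name in TOP_LEVEL]
    folders += [root / "PRESENTATIONS" / sub for sub in PRESENTATION_SUBFOLDERS]
    project = root / f"PRJ_{project_id}"
    folders.append(project)
    folders += [project / sub for sub in project_subfolders]
    for folder in folders:
        # exist_ok=True means existing folders are left untouched.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

For example, create_skeleton("my_folder", "ERK_sensor", ["DATA_imaging", "ANALYSIS_FRET"]) would create my_folder/PRJ_ERK_sensor/DATA_imaging alongside the other collections. Running it twice is harmless, because existing folders are not modified.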

How to save data

It is probably impossible to give a policy for each data type or file you will handle. Once you have stored files in the right place, half of the job is already done. The next step is to ensure that we can easily identify a logical link between data, analysis, lab books and, eventually, manuscripts. Therefore, date your experiments, and record the names of the source data in the analysis files and in the documents you create to disseminate our work.

File names

Experimental data, or the folder containing them when multiple files are generated, should be named in the format:

YYYYMMDD_brief_description_[v#], for example 20180131_mTagBFP_1.sdt or 20180131_SW48_ERK_sensor (folder)

We use the YYYYMMDD format because when you sort files or folders alphabetically, they will also be ordered chronologically. We do not rely on the OS time stamp.

For analysis files, manuscripts or other files that you will update but wish to keep a record of, version them by adding a ‘v#’ at the end, where # is a progressive number.
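
The naming convention can be built and checked programmatically. The sketch below is an illustration only: the helper names build_name and parse_name are hypothetical, and it assumes that the optional version suffix is written as ‘_v#’ just before the file extension. A trailing number without the ‘v’ (as in 20180131_mTagBFP_1.sdt) is treated as part of the description.

```python
import re
from datetime import datetime

# YYYYMMDD, an underscore, a description, an optional _v# suffix
# and an optional file extension (assumed layout, see the text above).
NAME_PATTERN = re.compile(r"^(\d{8})_(.+?)(?:_v(\d+))?(\.[A-Za-z0-9]+)?$")


def build_name(day, description, version=None, extension=""):
    """Return a name such as 20180131_SW48_ERK_sensor or 20180131_analysis_v2.xlsx."""
    stem = f"{day:%Y%m%d}_{description.strip().replace(' ', '_')}"
    if version is not None:
        stem += f"_v{version}"
    return stem + extension


def parse_name(name):
    """Return (date, description, version), or None if the name does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        # Rejects impossible dates such as 20181331.
        day = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        return None
    version = int(match.group(3)) if match.group(3) else None
    return day, match.group(2), version
```

Because the date comes first and is zero-padded, a plain alphabetical sort of such names (for example with Python's sorted) is also a chronological sort, so no OS time stamp is needed.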

Experimental notes [additional to lab books]

Please read the guidelines for lab bookkeeping. Do have an entry for each experiment in your lab book. However, several experimental parameters or observations might be more conveniently stored within the sub-folder containing the data. For instance, when we do imaging, we do not enter imaging parameters in lab books; rather, we keep a ‘readme.txt’ or ‘notes.txt’ file. Usually, we use similar settings from one experimental session to the next and, therefore, we can simply copy these files and update whatever has changed. Note that if you use a confocal or a motorised microscope, most information is stored in the metadata of your experiment. However, some instruments do not record this metadata. The bottom line is that, either in your digital lab book or within the data folder, there should be enough information to understand how the sample was prepared and how the experiment was executed.
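
Copying the notes file from one session to the next can be scripted if you prefer. This is a minimal Python sketch under stated assumptions: the function name start_session_notes is hypothetical, the default file name ‘notes.txt’ is one of the two names mentioned above, and refusing to overwrite an existing notes file is a design choice of this example rather than a lab rule.

```python
import shutil
from pathlib import Path


def start_session_notes(previous_session, new_session, notes_name="notes.txt"):
    """Copy the notes file of a previous session into a new session folder.

    The copy is a starting point only: open it afterwards and update
    every setting that changed for the new experiment.
    """
    source = Path(previous_session) / notes_name
    target_dir = Path(new_session)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / notes_name
    if target.exists():
        # Never silently replace notes that were already written.
        raise FileExistsError(f"{target} already exists; edit it instead")
    shutil.copy(source, target)
    return target
```

For example, start_session_notes("20180131_SW48_ERK_sensor", "20180207_SW48_ERK_sensor") would create the second folder if needed and place a copy of the earlier notes in it, ready to be edited.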