# Overview

Data storage and preservation are integral to CRC1261's commitment to data integrity and reproducibility.

:::{admonition} Data Handling Guidelines
:class: important

- **Use Central Storage**
  Research data may be stored locally *only temporarily* during acquisition or initial processing. All data must be synchronized to the designated central storage systems to ensure:
  - Security and protection of data
  - Traceability and reproducibility
  - Access and collaboration within the research team

- **Organize and Document Data**
  Use clear, type-based folder structures. Each dataset must include an `INFO` file or `README.md` documenting:
  - Methods and instruments used to generate the data
  - Directory structure and file naming conventions
  - Any context necessary for reuse or interpretation

  See: [Guidelines for a Good README](../best_practices/readme.md).

- **Assign Responsibilities**
  Principal Investigators (PIs) are responsible for:
  - Ensuring proper data storage practices within their research group
  - Confirming timely synchronization of locally stored data to central storage

- **Manage Experimental Data**
  Non-sensitive experimental data and lab records must be maintained in [eLabFTW](../eln/elabftw.md) for traceability and collaborative use.

- **Handle Sensitive or Personal Data Carefully**
  Sensitive or personal data must be stored in pseudonymized form in the **REDCap** system (hosted by the Department of Neurology, fully EU GDPR compliant).

- **Store and Share Code**
  Software, scripts, and documentation must be managed in the [CRC1261 GitLab Group](https://cau-git.rz.uni-kiel.de/CRC-1261/).
  See platform usage: [GitLab RZ CAU](../version_management/cau_gitlab.md).
:::

## Storage Services

| Service | Purpose / Notes | Quota | Backup |
| ------- | --------------- | ----- | ------ |
| [**Home Network Drive**](./home_drive.md) | Personal files or temporary local storage (must sync centrally). | 50 GB[^1] | Daily incremental (60 days) |
| [**Project Network Drive**](./network_drive.md) | Shared project data ≤ 1 TB. For small/medium project datasets. | On request | Daily incremental (60 days) |
| [**Research Data Storage**](./research_data_storage.md) | Shared project data > 1 TB or long-term collaborative work. | On request | Daily incremental (60 days) |
| [**CAU-Cloud Sync & Share**](./cau_cloud.md) | Synchronization across devices / external collaboration. | 20 GB per user[^2] | Managed via recycle bin; 7–30 day retention[^3] |
| [**GitLab RZ CAU**](../version_management/cau_gitlab.md) | Version control and collaborative development (code & documentation). | 20 GB per repo | Daily incremental (60 days) |
| [**SAMBA Storage Service**](https://www.rz.uni-kiel.de/en/our-portfolio/storage/storage-service-samba) | Legacy / transition storage only. | - | - |

## Which storage to use?

The flowchart below guides the selection of appropriate storage based on **data type, size, and sharing needs**.

```mermaid
flowchart TD
    A["Start: You need to store research data"] --> B{"Is the data code or documentation?"}
    B -->|"Yes"| G["GitLab RZ CAU"]
    B -->|"No"| C{"Is the data experimental / lab data (non-sensitive)?"}
    C -->|"Yes"| H["eLabFTW"]
    C -->|"No"| D{"Is the data sensitive?"}
    D -->|"Yes"| M["REDCap (pseudonymous, GDPR-compliant)"]
    D -->|"No"| E{"Is it personal work?"}
    E -->|"Yes"| I["Home Network Drive (temporary)"]
    E -->|"No"| F{"Is total project data ≤ 1 TB?"}
    F -->|"Yes"| J["Project Network Drive"]
    F -->|"No"| K["Research Data Storage"]
    G --> L["Ensure daily backup / sync"]
    H --> L
    I --> L
    J --> L
    K --> L
    M --> L
    L --> N["Data stored according to CRC1261 policy<br/>Local storage only temporary, must be synced centrally"]
```

## Backup

Data stored on the central CRC1261 storage systems (Home Network Drive, Project Network Drive, Research Data Storage) is protected by the Computing Centre with **daily incremental backups (60-day retention)**.

However, **data located on local devices (laptops, lab PCs, acquisition computers)** is **not backed up** and is at risk of loss. Therefore:

**Local research data must be synchronized daily to central storage using secure transfer methods.**

- Files synchronized to the central storage systems are automatically included in the Computing Centre’s backup schedule.
- Software and code stored in **GitLab RZ CAU** are version-controlled, but **raw data and results still need to be synchronized** to central storage.
- PIs are responsible for ensuring that daily synchronization to central storage takes place and that research group members understand this workflow.

:::{admonition} Example
:class: tip

### Synchronizing local data to the central storage

```sh
# Mount the network drive at a local mount point
# (replace //my_network_path, uid/gid, and the username with your own values)
sudo mount -t cifs //my_network_path /mnt/my_home_drive -o uid=1000,gid=1000,rw,user,username=suabc123,domain=uni-kiel.de

# Use rsync to mirror a source directory to the mounted network drive
rsync -av --delete /path/to/source/directory /mnt/my_home_drive/backup/

# Unmount the network drive once the transfer has finished
sudo umount /mnt/my_home_drive
```

Explanation of the rsync options used:

- `-a`: Archive mode (preserves permissions, ownership, timestamps, etc.)
- `-v`: Verbose mode (provides detailed output)
- `--delete`: Deletes files from the destination copy that are no longer present in the source directory, so the copy on central storage mirrors the source

Because the source path has no trailing slash, rsync creates `directory` itself inside `backup/` rather than copying only its contents.
:::

[^1]: Upon informal request to [hotline@rz.uni-kiel.de](mailto:hotline@rz.uni-kiel.de), more storage space is provided within reasonable limits.

[^2]: Upon informal request to [cloud@rz.uni-kiel.de](mailto:cloud@rz.uni-kiel.de), more storage space is provided within reasonable limits.

[^3]: Ideally, users restore files themselves via the recycle bin. The Computing Centre guarantees restoration of the data for 7 days regardless of the quota, and for up to 30 days if the recycle bin still fits into the quota. In addition, the Computing Centre keeps daily incremental backups of the last 60 days from which files can be restored.