Set-up and synchronization
==========================

.. To synchronize your data with the cloud, you need to set up an account for our services.
.. Register for a new account at: https://qdrive.qutech.nl/register.html.
.. After a successful registration, login credentials will be sent to your TU Delft email address.

The qDrive package can be used to synchronize live data into the cloud. This document describes how to set up the synchronization process.

.. contents:: Overview
    :local:
    :depth: 2

Setting Up The Synchronization
------------------------------

The qDrive package manages data synchronization via a separate process that starts automatically when the package is imported in Python, i.e., when you run ``import qdrive``.

.. note::

    This means the synchronization process will not start until qDrive is imported in Python after a system startup. We aim to automate this process in future releases.

.. tip::

    When working on a server with no graphical environment, you can log in using an API-Token. Instructions on how to do this can be found :ref:`here <using-api-tokens-for-authentication>`.

Launching the Synchronization GUI
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The simplest way to manage synchronization sources is through the GUI. To launch the GUI, run the following command:

.. code-block:: console

    python -c "import qdrive; qdrive.launch_GUI()"

This will open the qDrive user interface:

.. image:: _static/GUI_qdrive.png
    :width: 1000

From the GUI, click the **Add source** button to add new synchronization sources. The available source types are: FileBase, QCoDeS, Quantify (QMI), and Core-Tools.

Setting Up a Synchronization for a FileBase Source
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This synchronization agent works well for synchronizing (arbitrary) file structures. For example:

.. code-block::

    main_folder
    ├── 20240101
    │   ├── 20240101-211245-165-731d85-experiment_1      <-- This is a dataset
    │   │   ├── my_metadata.json
    │   │   ├── my_data.hdf5
    ├── 20240102
    │   ├── 20240102-220655-268-455d85-experiment_2      <-- This is a dataset
    │   │   ├── my_metadata.json
    │   │   ├── my_data.hdf5
    │   │   ├── analysis
    │   │   │   ├── analysis_metadata.json
    │   │   │   ├── analysis_data.hdf5
    ├── some_other_folder                                 <-- This is a dataset
    │   ├── my_data.json

Here we see that datasets can be found at different levels in the folder structure. To synchronize this data, a file called ``_QH_dataset_info.yaml`` can be placed in every folder from which you want to create a dataset. In this file you can also specify additional metadata and methods to convert files (if needed). More information on how to create these files can be found :ref:`here `.

You can set up the synchronization using this method in the GUI by:

* Selecting the scope to which the data should be synchronized.
* Selecting the folder to synchronize (e.g., ``main_folder`` in this example).
* Choosing whether the location is on a local or network drive. Note that performance may suffer on a network drive, so you might want to try both options to see which works best.

Once these settings are configured, the synchronization agent will start looking for ``_QH_dataset_info.yaml`` files in the folders.
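If you already have many dataset folders, you can create the marker files in bulk with a short script. The sketch below is only an illustration: it assumes that an empty ``_QH_dataset_info.yaml`` is enough to mark a folder as a dataset, and it treats every folder that directly contains an ``.hdf5`` file as a dataset folder. Adapt the selection rule and the file contents (see the reference above) to your own layout.

.. code-block:: python

    import pathlib

    # Root folder that is (or will be) registered as a FileBase synchronization source.
    main_folder = pathlib.Path("/path/to/main_folder")

    for folder in main_folder.rglob("*"):
        # Illustrative rule: any folder directly containing an .hdf5 file becomes a dataset.
        if folder.is_dir() and any(f.suffix == ".hdf5" for f in folder.iterdir()):
            marker = folder / "_QH_dataset_info.yaml"
            if not marker.exists():
                marker.touch()  # empty marker file; add metadata/conversion options as needed
                print(f"Created {marker}")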
Setting Up QCoDeS Synchronization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To add a QCoDeS database for synchronization:

* Open the Add Source menu.
* Define a name that indicates which database is being synchronized, specify the set-up on which your measurements were performed, and select your QCoDeS database, e.g., ``mydatabase.db``.

The synchronization should begin immediately once the database is selected.

.. note::

    While using QCoDeS, it is also possible to add custom attributes to your datasets. A guide on how to do this can be found :ref:`here <adding-qcodes-metadata-to-a-dataset>`.

.. note::

    It is also possible to add the QCoDeS database programmatically by running the following code.

    .. code-block:: python

        import pathlib

        from etiket_client.sync.backends.qcodes.qcodes_sync_class import QCoDeSSync, QCoDeSConfigData
        from etiket_client.sync.backends.sources import add_sync_source
        from etiket_client.python_api.scopes import get_scope_by_name

        data_path = pathlib.Path('/path/to/my/database.db')
        scope = get_scope_by_name('scope_name')

        # optional: add extra attributes
        extra_attributes = {'attribute_name': 'attribute_value'}

        configData = QCoDeSConfigData(database_directory=data_path,
                                      set_up="my_setup",
                                      extra_attributes=extra_attributes)

        add_sync_source('my_sync_source_name', QCoDeSSync, configData, scope)

Setting Up Quantify (QMI) Synchronization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Quantify data, the expected folder structure should resemble the following format:

.. code-block::

    main_folder
    ├── 20240101
    │   ├── 20240101-211245-165-731d85-experiment_1
    │   │   ├── 01-01-2024_01-01-01.json
    │   │   ├── 01-01-2024_01-01-01.hdf5
    ├── 20240102
    │   ├── 20240102-220655-268-455d85-experiment_2
    │   │   ├── 02-01-2024_02-02-02.json
    │   │   ├── 02-01-2024_02-02-02.hdf5

To set up synchronization for Quantify data:

* Open the Add Source menu.
* Define a name that indicates which data is being synchronized, specify the set-up on which your measurements were performed, and select the folder containing your Quantify data, e.g., ``main_folder`` in this example.

The synchronization should start automatically after the folder is selected.

Setting Up Core-Tools Synchronization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To configure synchronization with Core-Tools, you will need the credentials for the Core-Tools database. These credentials are usually stored in the ``ct_config.yaml`` file or initialized within the core-tools setup, for example:

.. code-block:: python

    from core_tools.data.SQL.connect import SQL_conn_info_local

    SQL_conn_info_local(dbname='dbname', user="user_name", passwd="password",
                        host="localhost", port=5432)

.. warning::

    Please avoid syncing data from the host ``vanvliet.qutech.tudelft.nl`` to the cloud.
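Since the GUI lists Core-Tools as a source type, a programmatic hook presumably exists as well. The snippet below is only a sketch modelled on the QCoDeS example above: the module path, the ``CoreToolsSync``/``CoreToolsConfigData`` names, and the field names are assumptions, so verify them against your installed ``etiket_client`` version before relying on this.

.. code-block:: python

    # Hypothetical sketch -- module path and class names below are assumed, not confirmed.
    from etiket_client.sync.backends.core_tools.core_tools_sync_class import CoreToolsSync, CoreToolsConfigData
    from etiket_client.sync.backends.sources import add_sync_source
    from etiket_client.python_api.scopes import get_scope_by_name

    scope = get_scope_by_name('scope_name')

    # Assumed fields mirroring the SQL_conn_info_local credentials shown above.
    config_data = CoreToolsConfigData(dbname='dbname', user='user_name',
                                      password='password', host='localhost', port=5432)

    add_sync_source('my_core_tools_source', CoreToolsSync, config_data, scope)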
Setting Up Labber Synchronization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To set up synchronization for Labber data, you need to provide the path to your Labber dataset directory. Labber typically organizes data in a hierarchical folder structure. For example:

.. code-block::

    main_folder
    ├── 2024
    │   ├── 01
    │   │   ├── Data_0101
    │   │   │   ├── measurement_1.hdf5
    │   │   │   ├── measurement_2.hdf5
    │   │   ├── Data_0102
    │   │   │   ├── measurement_1.hdf5
    │   │   │   ├── measurement_2.hdf5

In this structure, you should provide the root folder (``main_folder``) to the sync agent.

**Setting up Labber synchronization programmatically:**

.. code-block:: python

    import pathlib

    from etiket_client.sync.backends.labber.labber_sync_class import LabberSync, LabberConfigData
    from etiket_client.sync.backends.sources import add_sync_source
    from etiket_client.python_api.scopes import get_scope_by_name

    # Define the path to your Labber data directory
    data_path = pathlib.Path('C:/path/to/your/labber/main_folder')

    # Get the target scope for synchronization
    scope = get_scope_by_name('my_scope')

    # Configure the Labber synchronization settings
    config_data = LabberConfigData(
        labber_directory=data_path,
        set_up="my_experimental_setup"
    )

    # Add the synchronization source
    add_sync_source(
        'labber_sync_source',
        LabberSync,
        config_data,
        scope
    )

.. note::

    The Labber synchronization currently cannot be added through the GUI.

Uploading a folder
------------------

Use ``upload_folder`` to upload a directory (recursively) as a dataset. All regular files are included except manifest files; optional Zarr conversion is supported.

Minimal example
^^^^^^^^^^^^^^^

.. code-block:: python

    from qdrive.utility.uploads import upload_folder
    from qdrive.scopes import get_scope

    upload_scope = get_scope("my_scope")
    folder_path = "/my/location"

    upload_folder(folder_path, scope=upload_scope.uuid)

If the dataset name is not provided, it defaults to the folder name (e.g., ``location`` in this example).

Parameters
^^^^^^^^^^

- **folder_path**: Path to the directory to upload.
- **scope**: Scope name or UUID. Defaults to the current default scope.
- **dataset_name**: Dataset name. Defaults to the basename of ``folder_path``.
- **dataset_description**: Optional description.
- **dataset_tags**: Optional list of tags.
- **dataset_attributes**: Optional key/value attributes.
- **dataset_alt_uid**: Optional alternative identifier.
- **dataset_collected**: Optional; the datetime when the dataset was collected. By default, the earliest creation time of all files in ``folder_path`` is used.
- **direct_upload**: If True, upload directly to the server; if False, register locally and let the sync agent upload.
- **convert_zarr_to_hdf5**: If True, each ``.zarr`` directory is converted to a single ``.hdf5`` file (NetCDF via xarray) and the original ``.zarr`` is skipped.
- **allow_scope_override**: If the existing manifest has a different scope, set this to True to overwrite it; otherwise an error is raised.

Behavior notes
^^^^^^^^^^^^^^

- **Manifest and scope**: A manifest stored alongside the folder tracks the dataset. If the folder was previously uploaded to a different scope, you must pass ``allow_scope_override=True`` to change the scope, or remove the manifest (see the sketch after this list).
- **Zarr conversion**: When ``convert_zarr_to_hdf5=True``, each ``.zarr`` directory is converted using xarray into a sibling ``.hdf5`` file. The version id of the converted file is computed from the maximum last-modified time across all files inside the source ``.zarr`` directory.
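For instance, re-uploading a folder that was first uploaded under another scope might look like the following sketch (the scope name is illustrative):

.. code-block:: python

    from qdrive.utility.uploads import upload_folder
    from qdrive.scopes import get_scope

    folder = "/my/location"

    # The first upload went to a different scope; the manifest next to the folder remembers this.
    # To re-upload the same folder into "scope_new", the scope change must be allowed explicitly.
    new_scope = get_scope("scope_new")
    upload_folder(folder, scope=new_scope.uuid, allow_scope_override=True)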
Complete example
^^^^^^^^^^^^^^^^

.. code-block:: python

    import datetime

    from qdrive.utility.uploads import upload_folder
    from qdrive.scopes import get_scope

    scope = get_scope("my_scope")
    folder = "/data/experiment_42"

    upload_folder(
        folder,
        scope=scope.uuid,
        dataset_name="experiment_42",
        dataset_description="Room-temperature sweep",
        dataset_tags=["rt", "sweep"],
        dataset_attributes={"operator": "alice", "run": "42"},
        dataset_alt_uid="exp-42",
        dataset_collected=datetime.datetime.now(),
        direct_upload=True,
        convert_zarr_to_hdf5=True,
        allow_scope_override=False,
    )

Copying a dataset
-----------------

Use ``copy_dataset(dataset_src_uuid, scope_src_uuid, scope_dst_uuid)`` to duplicate a dataset from one scope to another. The destination dataset receives a new UUID. The ``alt_uid`` is preserved if present; if not, it is set to the source dataset UUID, enabling idempotent re-runs in the destination.

Parameters
^^^^^^^^^^

- **dataset_src_uuid**: UUID of the dataset to copy (string).
- **scope_src_uuid**: UUID of the source scope.
- **scope_dst_uuid**: UUID of the destination scope.

What gets copied
^^^^^^^^^^^^^^^^

- **Metadata**: name, description, keywords, collected, creator, ranking, attributes, and ``alt_uid``.
- **Files and versions**: all files, including all version IDs.

Example
^^^^^^^

.. code-block:: python

    from qdrive.utility.copy import copy_dataset
    from qdrive.scopes import get_scope

    source_scope = get_scope("scope_1")
    target_scope = get_scope("scope_2")

    copy_dataset("90c7fc9474754691a989f061bf5a6752", source_scope.uuid, target_scope.uuid)

Managing Datasets and Files in the DataQruiser App
--------------------------------------------------

In addition to browsing and visualizing your data, you can also create datasets and upload files directly within the DataQruiser app.

.. _adding-qcodes-metadata-to-a-dataset:

Adding QCoDeS metadata to a dataset
-----------------------------------

You can attach tags and attributes to a dataset by adding the ``QHMetaData`` instrument to your QCoDeS ``Station``. Because the instrument is recorded in the QCoDeS snapshot, the tags and attributes can be extracted and added to the QHarbor dataset.

You can add both static and dynamic metadata:

- Static metadata stays the same for all your measurements (e.g., project, lab PC).
- Dynamic metadata changes per run (e.g., calibration type, measurement type, ...).

Initializing the instrument
^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are two ways to define the instrument: directly in Python or via a Station YAML file.

Using Python
""""""""""""

.. code-block:: python

    from qdrive.utility.qcodes_metadata import QHMetaData
    from qcodes import Station

    station = Station()

    # Create and register the instrument
    qh_meta = QHMetaData("qh_meta",
                         static_tags=["cool_experiment"],
                         static_attributes={"project": "resonator project"},
                         )
    station.add_component(qh_meta)

.. note::

    The instrument name is fixed to ``qh_meta``; do not change this.

Using a YAML configuration file
"""""""""""""""""""""""""""""""

Alternatively, define the instrument in your Station YAML configuration:

.. code-block:: yaml

    instruments:
      qh_meta:
        type: qdrive.utility.qcodes_metadata.QHMetaData
        init:
          static_tags:
            - "cool_experiment"
          static_attributes:
            "project": "resonator project"
            "measurement_computer": "LAB-A-PC-5"

.. note::

    The instrument name is fixed to ``qh_meta``; do not change this key.

The instrument can be loaded as:
.. code-block:: python

    from qcodes import Station

    # replace by your qc_config.yaml name
    station = Station(config_file='qc_config.yaml')
    station.load_instrument("qh_meta")

For more information, see the QCoDeS documentation on configuring the Station with a YAML configuration file.

Runtime usage
"""""""""""""

Access the instrument and set per-run metadata in one place:

.. code-block:: python

    # If defined via Station YAML
    qh_meta = station.components["qh_meta"]

    # Start of run: clear dynamic values, then set new ones
    qh_meta.reset()
    qh_meta.add_tags(["testing"])        # e.g. ["dynamic_tag1", "dynamic_tag2"]
    qh_meta.add_attributes({
        "calibration": "ALLXY",          # e.g. {"key": "value"}
    })

    # measurement happens here

Behavior notes
""""""""""""""

- **Static vs dynamic**: Static tags/attributes are set at construction and persist across ``reset()``. Dynamic ones are cleared by ``reset()``.
- **Overrides**: A dynamic attribute with the same key as a static attribute overrides the static value in the combined view.
- **Tags order/duplicates**: Dynamic tags are appended; duplicates are preserved.

.. _using-api-tokens-for-authentication:

Using API-Tokens for Authentication
-----------------------------------

When working on a server with no graphical environment, you can log in using an API-Token. An API-Token can be created in the DataQruiser app:

1. Open the DataQruiser app.
2. Click on the account icon (👤) in the top right corner.
3. Navigate to the "API-Tokens" section.
4. Click the "+" button to generate a new token.
5. Enter a descriptive name for your token (e.g., "Synchronization server token").
6. Copy the generated token immediately; it will only be shown once.

Now it is possible to authenticate on the server using your API-Token:

.. code-block:: python

    from qdrive import login_with_api_token

    login_with_api_token('your_api_token@qharborserver.nl')

    # Verify the authentication was successful
    from qdrive.scopes import get_scopes
    print(f"Successfully authenticated. You have access to {len(get_scopes())} scopes.")

.. tip::

    The API-Token is a secret key that should be kept confidential. We do not recommend storing it in any files. If you suspect your API-Token has been compromised, immediately revoke it in the DataQruiser app.
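One way to follow this advice is to supply the token through an environment variable instead of hard-coding it in a script. A minimal sketch, assuming the token has been exported as ``QHARBOR_API_TOKEN`` (the variable name is arbitrary) in the shell or service definition that starts your script:

.. code-block:: python

    import os

    from qdrive import login_with_api_token

    # Read the token from the environment instead of storing it in a file or script.
    api_token = os.environ.get("QHARBOR_API_TOKEN")
    if api_token is None:
        raise RuntimeError("Set the QHARBOR_API_TOKEN environment variable before running this script.")

    login_with_api_token(api_token)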