Duqduq demo: large scale validation
This notebook shows how to use duqtools for large scale validation.
It will go over the steps required to do uncertainty quantification from a sequence of data sets.
Where duqtools does UQ for a single data set, duqduq loops over multiple data sets to do UQ in sequence.
We define two directories:
- the duqduq directory, where the duqtools and UQ config reside; this is also the directory we work in with duqduq
- the run directory, which slurm has access to and where all the simulation files and data are stored
from pathlib import Path
import os

# Directory holding the duqduq/duqtools config; we work from here
duqtools_dir = Path('/afs/eufus.eu/user/g/g2ssmee/duqduq_demo')
# Copy of the same demo with all steps already completed (used below)
duqtools_dir_done = Path('/afs/eufus.eu/user/g/g2ssmee/duqduq_demo_done')
# Run directory where slurm stores the simulation files and data
run_dir = Path('/afs/eufus.eu/user/g/g2ssmee/jetto_runs/duqduq_long')

os.chdir(duqtools_dir)
duqduq help
The main interface for duqduq is the CLI. You can run duqduq --help to get a list of the available subcommands. You will notice that the subcommands mimic what is available in duqtools.
!duqduq --help
duqduq setup
The starting point for duqduq is two files:
- duqtools.template.yaml, the template config that duqduq setup will use to generate each duqtools.yaml
- data.csv, where each entry corresponds to an IMAS data set
Below is an example data.csv file. This is how you tell duqduq which data to do UQ for.
%cat data.csv
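If you cannot see the cell output above, the file can also be inspected with pandas. A quick sketch (the exact columns depend on your data, but together they identify the IMAS handle of each data set):

import pandas as pd

# The index holds the run names (data_01, data_02, ...); the columns
# identify each IMAS data set (for example user, db, shot, run).
datasets = pd.read_csv('data.csv', index_col=0)
print(datasets)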
Below is an example duqtools.template.yaml.
The index of each entry in the data.csv file will be used as the run name (run.name).
The details for each entry in data.csv will be written to the template_data section.
Machine- and dataset-specific parameters, such as the major radius or the start time, are grabbed from the IDS.
For more information, see the documentation for large scale validation.
%cat duqtools.template.yaml
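For readers without access to the file, a template has roughly the following shape. This is an illustrative sketch with placeholder paths, not a complete config; consult the duqtools documentation for the full schema:

# Sketch only: duqduq setup renders the jinja placeholders once per
# entry in data.csv and writes a duqtools.yaml for each run.
create:
  runs_dir: /path/to/jetto_runs/duqduq_demo/{{ run.name }}
  template: /path/to/jetto/template_run
  template_data:
    user: {{ handle.user }}
    db: {{ handle.db }}
    shot: {{ handle.shot }}
    run: {{ handle.run }}
  # sampler, dimensions, etc. follow the usual duqtools config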
Running duqduq setup will generate a new directory for each dataset in data.csv. Each directory is in itself a valid duqtools directory.
!duqduq setup --yes --force
This is what the directory looks like after setup.
!tree .
It creates a duqtools config in each of the subdirectories. At this stage you could modify each of the duqtools.yaml files if you wish. The config is no different from that for a single UQ run. This means you could cd data_01 and treat it as a single UQ run.
%cat data_01/duqtools.yaml
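For example, a sketch (this assumes duqtools accepts the same --dry-run flag that we use with duqduq below):

%cd data_01
!duqtools create --dry-run
%cd ..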
Create runs using duqduq create
This is the equivalent of duqtools create, but for a large number of runs.
It will take each of the generated duqtools configs and set up the jetto runs and imas data according to the specification.
Since this would take a long time, we will use the --dry-run option.
!duqduq create --force --dry-run
Submit to slurm using duqduq submit
Use duqduq submit to submit the jobs to slurm. This tool will find all jobs (.llcmd files in the subdirectories) and submit them to slurm.
Use the --array option to submit the jobs as a single slurm job array; the --max_jobs option below limits the number of jobs submitted.
os.chdir(duqtools_dir_done)
!duqduq submit --array --max_jobs 10 --force --dry-run
duqduq status
Query the status using duqduq status. This essentially parses all the jetto.status files in the run directory.
!duqduq status
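As a rough illustration of what the status command looks at (not its actual implementation), you could tally the jetto.status files yourself:

from collections import Counter

# Count the contents of every jetto.status file under the run directory.
counts = Counter(
    path.read_text().strip() for path in run_dir.glob('**/jetto.status')
)
for status, count in counts.items():
    print(f'{count:3d} x {status}')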
Overview of LSV output directory
The output of duqduq differs from a single run in that there is an additional directory layer named after each data entry. The logs directory contains the slurm logs.
os.chdir(run_dir)
!tree -L 1
Each directory is a run directory as you know it from a single UQ run.
!tree 'data_01' -L 1
Merge data using duqduq merge
os.chdir(duqtools_dir_done)
!duqduq merge --force --dry-run
Merged data are stored in a local imasdb for each data entry in the run directory.
os.chdir(run_dir)
!tree 'data_01/imasdb'
Data exploration with duqtools dash
The imas handles for each merged data set are stored in merge_data.csv. They can be visualized using the duqtools dashboard.
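You can also inspect the handles directly with pandas. A sketch, assuming merge_data.csv is written to the duqduq working directory (the exact columns depend on your data):

import pandas as pd

# IMAS handles of the merged data sets, one row per data entry.
merged = pd.read_csv(duqtools_dir_done / 'merge_data.csv', index_col=0)
print(merged)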
os.chdir(duqtools_dir_done)
!duqtools dash