Large scale validation
Set up large scale validations using the helper tool duqduq.
This page documents the different subcommands available in this CLI program.
To get started:
duqduq --help
For information on how to configure your UQ runs via duqtools.yaml, check out the configuration page.
To start with large scale validation, two files are needed:
- data.csv contains the template data
- duqtools.template.yaml is the duqtools config template
Then, run the programs in the intended sequence: duqduq setup, duqduq create, duqduq submit, duqduq status.
Each of these commands mimics its duqtools equivalent. For example, duqduq create is the large scale equivalent of duqtools create.
Input data
data.csv contains a list of IMAS handles pointing to the template data. duqduq setup will loop over the entries in this file and, for each entry, create a new directory (named after the index column) in the current directory with input for duqtools.
,user,db,shot,run
run_1,user,jet,12345,0002
run_2,user,jet,98760,0002
run_3,user,jet,2222,0002
run_4,user,jet,3333,0002
run_5,user,jet,4444,0001
Each column will be exposed through the handle dataclass in the config template below.
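To make the mapping from rows to handles concrete, here is a minimal sketch using only the Python standard library. The `Handle` dataclass and the `read_handles` function are hypothetical stand-ins written for this illustration; they are not the duqtools implementation, which may use different field types or carry extra fields.

```python
import csv
import io
from dataclasses import dataclass


# Hypothetical stand-in for the handle object exposed to the template;
# the real duqtools class may differ.
@dataclass
class Handle:
    user: str
    db: str
    shot: int
    run: int


# Same layout as the data.csv example above: the unnamed first column is the index.
DATA_CSV = """\
,user,db,shot,run
run_1,user,jet,12345,0002
run_2,user,jet,98760,0002
"""


def read_handles(text: str) -> dict[str, Handle]:
    """Map each index value (the future directory name) to a Handle."""
    handles = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("")  # the index column has an empty header
        handles[name] = Handle(
            user=row["user"],
            db=row["db"],
            shot=int(row["shot"]),
            run=int(row["run"]),
        )
    return handles


handles = read_handles(DATA_CSV)
for name, handle in handles.items():
    print(name, handle)
```

In this sketch the dictionary keys play the role of the directory names, and each attribute of `Handle` corresponds to one `{{ handle.<column> }}` placeholder in the template.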
Config template
duqtools.template.yaml is a template for the duqtools create config. It contains a few placeholders for variable data (see the documentation for setup).
The template uses jinja2 as a templating language.
tag: {{ run.name }}
create:
  runs_dir: /pfs/work/username/jetto_runs/duqduq/{{ run.name }}
  template: /pfs/work/username/jetto/runs/path/to/template/
  template_data:
    user: {{ handle.user }}
    db: {{ handle.db }}
    shot: {{ handle.shot }}
    run: {{ handle.run }}
  operations:
    - variable: major_radius
      operator: copyto
      value: {{ variables.major_radius | round(4) }}
    - variable: b_field
      operator: copyto
      value: {{ variables.b_field | round(4) }}
    - variable: t_start
      operator: copyto
      value: {{ variables.t_start | round(4) }}
    - variable: t_end
      operator: copyto
      value: {{ (variables.t_start + 0.01) | round(4) }}
  sampler:
    method: latin-hypercube
    n_samples: 3
  dimensions:
    - variable: zeff
      operator: add
      values: [0.01, 0.02, 0.03]
    - variable: t_e
      operator: multiply
      values: [0.8, 1.0, 1.2]
system:
  name: jetto
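As a rough illustration of what the operations and dimensions in this template describe, the sketch below enumerates every combination of dimension values and shows one plausible reading of the three operators. It rests on assumptions: that add and multiply combine the given value with the existing data, that copyto overwrites it, and that the sampler draws n_samples points from the full grid of combinations. The function names `apply_operator` and `candidate_grid` are made up for this example and are not part of duqtools.

```python
import itertools

# Dimensions copied from the template above.
DIMENSIONS = [
    {"variable": "zeff", "operator": "add", "values": [0.01, 0.02, 0.03]},
    {"variable": "t_e", "operator": "multiply", "values": [0.8, 1.0, 1.2]},
]


def apply_operator(base: float, operator: str, value: float) -> float:
    """Assumed meaning of the three operators used in the template."""
    if operator == "add":
        return base + value
    if operator == "multiply":
        return base * value
    if operator == "copyto":
        return value  # overwrite the existing data with the given value
    raise ValueError(f"unsupported operator: {operator}")


def candidate_grid(dimensions):
    """All combinations of one value per dimension (the full hypercube)."""
    per_dim = [
        [(d["variable"], d["operator"], v) for v in d["values"]]
        for d in dimensions
    ]
    return list(itertools.product(*per_dim))


grid = candidate_grid(DIMENSIONS)
print(len(grid), "candidate combinations")
for combo in grid[:3]:
    print(combo)
```

With two dimensions of three values each, the grid holds nine combinations; under the assumption above, n_samples: 3 would select three of them rather than running all nine.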
Split base and UQ directories
With duqtools you can generate a base run (no sampling) and use the results of the base run as the template for subsequent UQ runs.
There are different ways to achieve this. Below is a variation of the config above that does it with a single template. It uses the run.output attribute and jinja2 statements to control where the jetto template is read from.
tag: {{ run.name }}
create:
  runs_dir: /pfs/work/username/jetto_runs/duqduq/{{ run.name }}
{% if run.output == 'base' -%}
  template: /pfs/work/username/jetto/runs/path/to/template
{% else -%}
  template: /pfs/work/username/jetto/runs/duqduq/{{ run.name }}/base
{% endif -%}
  template_data:
    ...
  operations:
    ...
  sampler:
    ...
  dimensions:
    ...
system:
  name: jetto
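The branch in this template can be read as a small decision function. The sketch below restates it in plain Python to show which path each value of run.output selects; `template_path` is a hypothetical helper for illustration only, and the paths are the placeholder paths from the template.

```python
def template_path(run_name: str, output: str) -> str:
    """Mirror the if/else branch of the template in plain Python."""
    if output == "base":
        # Base runs start from the hand-prepared jetto template.
        return "/pfs/work/username/jetto/runs/path/to/template"
    # Any other output (for example 'uq') starts from the finished base run.
    return f"/pfs/work/username/jetto/runs/duqduq/{run_name}/base"


for output in ("base", "uq"):
    print(output, "->", template_path("run_1", output))
```

Keeping the choice in one template means the base and UQ configs cannot drift apart; only the template source differs between the two.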
Create and submit base runs
The first step is to set up, create and run the base runs. --no-sampling means that duqtools performs the runs with just the operations; anything under dimensions is skipped. -p is a filter that tells duqtools where to load the instructions from.
duqduq setup --output base
duqduq create --no-sampling -p 'base/**'
duqduq submit -p 'base/**'
duqduq status -p 'base/**'
Create and submit UQ runs
Set up and perform the full UQ run.
duqduq setup --output uq
duqduq create -p 'uq/**'
duqduq submit -p 'uq/**'
duqduq status -p 'uq/**'
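The two command sequences above differ only in the output subdirectory and the --no-sampling flag. A minimal sketch of a driver that builds both sequences follows; the `workflow` function is hypothetical, it only prints the commands, and it uses no flags beyond those shown on this page.

```python
import shlex


def workflow(output: str, sampling: bool = True) -> list[list[str]]:
    """Build the four duqduq invocations for one output subdirectory."""
    pattern = f"{output}/**"
    create = ["duqduq", "create"]
    if not sampling:
        create.append("--no-sampling")
    create += ["-p", pattern]
    return [
        ["duqduq", "setup", "--output", output],
        create,
        ["duqduq", "submit", "-p", pattern],
        ["duqduq", "status", "-p", pattern],
    ]


# Base runs first (no sampling), then the full UQ runs.
commands = workflow("base", sampling=False) + workflow("uq")
for argv in commands:
    print(shlex.join(argv))
    # To actually execute a step: subprocess.run(argv, check=True)
```

Because the UQ runs use the base run results as their template, a real driver would need to wait for the base jobs to finish before starting the UQ steps; this sketch does not attempt that.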
duqduq
For more information, check out the documentation:
https://duqtools.readthedocs.io/large_scale_validation
Usage:
duqduq [OPTIONS] COMMAND [ARGS]...
Options:
Name | Type | Description | Default |
---|---|---|---|
--help | boolean | Show this message and exit. | False |
Subcommands
- create: Create data sets for large scale validation.
- merge: Merge data sets with error propagation.
- setup: Set up large scale validation.
- status: Check status large scale validation runs.
- submit: Submit large scale validation runs.
duqduq create
Create data sets for large scale validation.
Example to only match config files in subdirectories matching jet*:
duqduq create --pattern 'jet*/**'
Usage:
duqduq create [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--force | boolean | Overwrite existing run directories and IDS data. | False |
-p, --pattern | text | Only create data for configs in subdirectories matching this glob pattern. | None |
-i, --input | text | Only create data for configs where template_data matches a handle in this data.csv. | None |
--no-sampling | boolean | Create base runs (ignores dimensions/sampler). | False |
--dry-run | boolean | Execute without any side-effects. | False |
--yes | boolean | Answer yes to questions automatically. | False |
--debug | boolean | Enable debug print statements. | False |
--logfile, -l | text | where to send the logfile, the special values stderr/stdout will send it there respectively. | duqtools.log |
--help | boolean | Show this message and exit. | False |
duqduq merge
Merge data sets with error propagation.
By default, duqduq merge attempts to merge all known variables.
Use --variable to select which variables to merge.
Usage:
duqduq merge [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--force | boolean | Overwrite existing data | False |
-v, --variable | text | Name of the variables. | None |
--dry-run | boolean | Execute without any side-effects. | False |
--yes | boolean | Answer yes to questions automatically. | False |
--debug | boolean | Enable debug print statements. | False |
--logfile, -l | text | where to send the logfile, the special values stderr/stdout will send it there respectively. | duqtools.log |
--help | boolean | Show this message and exit. | False |
duqduq setup
Set up large scale validation.
Usage:
duqduq setup [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
-i, --input | path | Input file, i.e. data.csv or runs.yaml | data.csv |
-t, --template | path | Template duqtools.yaml | duqtools.template.yaml |
--force | boolean | Overwrite existing run config directories | False |
-o, --output | text | Output subdirectory | None |
--dry-run | boolean | Execute without any side-effects. | False |
--yes | boolean | Answer yes to questions automatically. | False |
--debug | boolean | Enable debug print statements. | False |
--logfile, -l | text | where to send the logfile, the special values stderr/stdout will send it there respectively. | duqtools.log |
--help | boolean | Show this message and exit. | False |
duqduq status
Check status large scale validation runs.
Usage:
duqduq status [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--detailed | boolean | Detailed info on progress | False |
--progress | boolean | Fancy progress bar | False |
-p, --pattern | text | Show status only for runs in subdirectories matching this glob pattern. | None |
--debug | boolean | Enable debug print statements. | False |
--logfile, -l | text | where to send the logfile, the special values stderr/stdout will send it there respectively. | duqtools.log |
--help | boolean | Show this message and exit. | False |
duqduq submit
Submit large scale validation runs.
Usage:
duqduq submit [OPTIONS]
Options:
Name | Type | Description | Default |
---|---|---|---|
--force | boolean | Re-submit running or completed jobs. | False |
--schedule | boolean | Schedule and submit jobs automatically. | False |
-j, --max_jobs | integer | Maximum number of jobs running simultaneously. | 10 |
-s, --status | text | Only submit jobs with this status. | None |
-p, --pattern | text | Only submit jobs for runs in subdirectories matching this glob pattern. | None |
-i, --input | text | Only submit jobs for configs where template_data matches a handle in this data.csv. | None |
-a, --array | boolean | Submit jobs as array. | False |
--array-script | boolean | Create script to submit jobs as array. Like --array, but does not submit. | False |
--limit | integer | Limits total number of jobs to submit. | None |
--max_array_size | integer | Maximum array size for slurm (usually 1001, default = 100). | 100 |
--dry-run | boolean | Execute without any side-effects. | False |
--yes | boolean | Answer yes to questions automatically. | False |
--debug | boolean | Enable debug print statements. | False |
--logfile, -l | text | where to send the logfile, the special values stderr/stdout will send it there respectively. | duqtools.log |
--help | boolean | Show this message and exit. | False |