# Create
The `create` subcommand creates the UQ run files. To run the command:

```
duqtools create
```

Check out the command-line interface for more info on how to use this command.
## The create config

The options of the `create` subcommand are stored in the `create` key in the config.
`runs_dir`
- Relative location from the workspace, which specifies the folder where to store all the created runs. This defaults to `workspace/duqtools_experiment_x`, where `x` is an integer that is not yet in use.

`template`
- Template directory to modify. Duqtools copies and updates the settings required for the specified system from this directory. This can be a directory with a finished run, or one just stored by JAMS (but not yet started). By default, duqtools extracts the input IMAS database entry from the settings file (e.g. jetto.in) to find the data to modify for the UQ runs.

`template_data`
- Specify the location of the template data to modify. This overrides the location of the data specified in the settings file in the template directory.

`operations`
- These operations are always applied to the data. All operations specified here are added to any operations sampled from the dimensions. They can be used to, for example, set the start time for an experiment or update some physical parameters. This parameter is optional.

`sampler`
- For efficient UQ, it may not be necessary to sample the entire matrix or hypercube. By default, the Cartesian product is taken (`method: cartesian-product`). For more efficient sampling of the space, the following `method` choices are available: `latin-hypercube`, `sobol`, `halton`. `n_samples` gives the number of samples to extract.

`dimensions`
- The `dimensions` field specifies the dimensions of the matrix to sample from. Each dimension is a compound set of operations to apply. From this, a matrix of all possible combinations is generated. Essentially, it generates the Cartesian product of all operations. By specifying a different `sampler`, a subset of this hypercube can be efficiently sampled. This parameter is optional.

`data`
- Required for system `jetto-v210921`, ignored for other systems. Where to store the in/output IDS data. The data key specifies the machine or IMAS database name where to store the data (`imasdb`). Duqtools will write the input data files for UQ starting with the run number given by `run_in_start_at`. The data generated by the UQ runs (e.g. from jetto) will be stored starting from the run number given by `run_out_start_at`.
For example:
```yaml
create:
  runs_dir: /pfs/work/username/jetto/runs/run_1
  template: /pfs/work/username/jetto/runs/duqtools_template
  operations:
    - variable: t_start
      operator: copyto
      value: 2.875
    - variable: t_end
      operator: copyto
      value: 2.885
  dimensions:
    - variable: t_e
      operator: multiply
      values: [0.9, 1.0, 1.1]
      scale_to_error: false
    - variable: zeff
      operator: multiply
      values: [0.9, 1.0, 1.1]
      scale_to_error: false
  sampler:
    method: latin-hypercube
    n_samples: 3
```
## Jetto output directory

If you do not specify anything, the jetto output location depends on the location of `duqtools.yaml`:

- If `duqtools.yaml` is outside `$JRUNS`: `$JRUNS/duqtools_experiment_xxx`
- If `duqtools.yaml` is inside `$JRUNS`: the parent directory of `duqtools.yaml`

You can override the `$JRUNS` directory by setting the `jruns` variable. This must be a directory that `rjettov` can write to.
```yaml
system:
  name: jetto
  jruns: /pfs/work/username/jetto/runs/
```
You can modify the duqtools output directory via `runs_dir`:

```yaml
runs_dir: my_experiment
```
## Specify the template data

By default, the template IMAS data to modify is extracted from the path specified in the `template` field.

```yaml
template: /pfs/work/username/jetto/runs/duqtools_template
```
In some cases, it may be useful to re-use the same set of model settings, but with different input data. If the `template_data` field is specified, these data will be used instead. To do so, specify `template_data` with the fields below:

`relative_location`
- The location relative to the imasdb location, if a local imasdb is used.

`user`
- Username.

`db`
- IMAS db/machine name.

`shot`
- IMAS shot number.

`run`
- IMAS run number.
For example:
```yaml
template: /pfs/work/username/jetto/runs/duqtools_template
template_data:
  user: username
  db: jet
  shot: 91234
  run: 5
```
## Data location

Specification for the data generated by the create step.

When setting up a sequence of UQ runs, duqtools reads the source data from the template. For each individual UQ run, two locations must be defined:

1. The location of the input data. This is where duqtools stores the modified source data.
2. The location of the output data. The modelling software must know in advance where to store the results of the simulation.
Input data locations are defined by `run_in_start_at`, and output data locations by `run_out_start_at`. A sequence is generated starting from these numbers. For example, with `run_in_start_at: 7000` and `run_out_start_at: 8000`, the generated input stored at run number 7000 would correspond to output 8000, 7001 to 8001, 7002 to 8002, etc.
Note that these sequences may overlap with existing data sets. Duqtools will stop if it detects that data will be overwritten.
`user`
- Username for the IMAS database to use; defaults to the current user.

`imasdb`
- IMAS database or machine name.

`run_in_start_at`
- The sequence of input data files starts with this run number.

`run_out_start_at`
- The sequence of output data files starts with this run number.
For example:
```yaml
data:
  imasdb: test
  run_in_start_at: 7000
  run_out_start_at: 8000
```
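The pairing of the two sequences can be pictured in a few lines of Python (purely illustrative; duqtools handles this internally):

```python
# Illustrative sketch of how input and output run numbers pair up,
# assuming three UQ runs; the names mirror the config keys above.
run_in_start_at = 7000
run_out_start_at = 8000
n_runs = 3

for i in range(n_runs):
    print(f'input run {run_in_start_at + i} -> output run {run_out_start_at + i}')
# input run 7000 -> output run 8000
# input run 7001 -> output run 8001
# input run 7002 -> output run 8002
```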
## Samplers

Depending on the number of dimensions, a hypercube is constructed from which duqtools will select a number of entries. For a setup with 3 dimensions of sizes \(i\), \(j\), \(k\), a hypercube of \(i \times j \times k\) elements will be constructed, where each element is one of the combinations.
By default, the entire hypercube is sampled:

```yaml
sampler:
  method: cartesian-product
```
For smarter sampling, use one of the other methods: `latin-hypercube`, `sobol`, or `halton`. `n_samples` gives the number of samples to extract. For example:

```yaml
sampler:
  method: latin-hypercube
  n_samples: 5
```
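As a rough sketch of what such a sampler does, the snippet below uses `scipy.stats.qmc` to draw 5 latin-hypercube samples and map them onto a hypothetical 3 by 3 grid of operations (duqtools' internal implementation may differ):

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical 3 x 3 hypercube: 2 dimensions with 3 values each.
grid = [3, 3]

sampler = qmc.LatinHypercube(d=len(grid), seed=0)
unit_samples = sampler.random(n=5)            # points in [0, 1)^2
indices = np.floor(unit_samples * grid).astype(int)
print(indices)  # each row indexes one combination of operations
```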
## Dimensions

These instructions operate on the template model. Note that these are compound operations, so they are expanded to fill the matrix with possible entries for data modifications (depending on the sampling method).

### Arithmetic operations

Apply a set of arithmetic operations to the IDS. Takes the IDS data and subtracts, adds, multiplies, etc. with each of the given values.
`values`
- Values to use with the operator on the field to create the sampling space.

`operator`
- Which operator to apply to the data in combination with any of the given values. This can be any of the basic numpy arithmetic operations. Available choices: `add`, `multiply`, `divide`, `power`, `subtract`, `floor_divide`, `mod`, `none`, and `remainder`. These directly map to the equivalent numpy functions, i.e. `add` -> `np.add`.

`scale_to_error`
- If True, multiply the value(s) by the error (sigma). With asymmetric errors (i.e. both lower/upper error nodes are available), scale to the lower error node for values < 0, and to the upper error node for values > 0.

`clip_min`
- If set, clip (limit) the data at this value (lower bound). Uses `np.clip`.

`clip_max`
- If set, clip (limit) the data at this value (upper bound). Uses `np.clip`.

`linear_ramp`
- Linearly ramp the operation using the given start and stop values. The first value (start) corresponds to the multiplier at the beginning of the data range, the second value (stop) to the multiplier at the end. The ramp is linearly interpolated between the start and stop values. The linear ramp acts as a multiplier of the specified `value`. For example, for `operator: add`: `new_data = data + np.linspace(start, stop, len(data)) * value`.

`custom_code`
- Custom Python code to apply for the `custom` operator. This will be evaluated as if it were inline Python code. Two variables are accessible: `data` corresponds to the variable data, and `value` corresponds to the passed value. For example, an implementation of `operator: multiply`: `custom_code: 'value * data'`. The resulting data must be of the same shape.

`variable`
- IDS variable for the data to modify. The time slice can be denoted with `*`; this will match all time slices in the IDS. Alternatively, you can specify the time slice directly, i.e. `profiles_1d/0/t_i_ave`, to only match and update the 0-th time slice.
For example:
```yaml
variable: zeff
operator: add
values: [0.01, 0.02, 0.03]
```
will generate 3 entries: `zeff += 0.01`, `zeff += 0.02`, and `zeff += 0.03`.
```yaml
variable: t_i_ave
operator: multiply
values: [1.1, 1.2, 1.3]
```
will generate another 3 entries: `t_i_ave *= 1.1`, `t_i_ave *= 1.2`, and `t_i_ave *= 1.3`.
With these 2 entries, the parameter hypercube would consist of 9 entries total (3 for `zeff` times 3 for `t_i_ave`). With the default `sampler: cartesian-product`, this means 9 new data files will be written.
**Note:** The Python equivalent is essentially `np.<operator>(ids, value, out=ids)` for each of the given values.
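A minimal sketch of both steps, assuming numpy arrays for the IDS data (the names here are illustrative, not duqtools internals):

```python
import itertools
import numpy as np

# Expand the two dimensions above into the 3 x 3 hypercube of operations.
zeff_ops = [('zeff', np.add, v) for v in (0.01, 0.02, 0.03)]
ti_ops = [('t_i_ave', np.multiply, v) for v in (1.1, 1.2, 1.3)]
for combo in itertools.product(zeff_ops, ti_ops):
    print(combo)  # one of the 9 entries

# Applying a single operation in place, per the note above:
data = np.array([1.0, 1.1, 1.2])
np.add(data, 0.01, out=data)  # `operator: add` with value 0.01
```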
**Note:** If you want to match all time slices, you can use `path: profiles_1d/*/t_i_ave`. The `*` wildcard tells duqtools to apply the operation to all available time slices.
### Clipping profiles

Values can be clipped to a lower or upper bound by specifying `clip_min` or `clip_max`. This can be helpful to guard against unphysical values. The example below will clip the profile for Zeff at 1 (lower bound):
```yaml
variable: zeff
operator: multiply
values: [0.8, 0.9, 1.0, 1.1, 1.2]
clip_min: 1
```
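The effect of this combination can be reproduced with plain numpy (a sketch with made-up profile values):

```python
import numpy as np

# `operator: multiply` with value 0.8, then `clip_min: 1`.
zeff = np.array([0.9, 1.0, 1.5])
new_zeff = np.clip(zeff * 0.8, a_min=1, a_max=None)
print(new_zeff)  # [1.  1.  1.2] -- values below 1 are raised to 1
```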
### Linear ramps

Before applying the operator, the given value can be ramped along the horizontal axis (rho) by specifying the `linear_ramp` keyword. The two values represent the start and stop values of a linear ramp. In the example below (`linear_ramp: [1, 2]`), for each value in `values`, the data at \(\rho = 0\) are multiplied by `1 * value`, and the data at \(\rho = 1\) by `2 * value`. All values in between are multiplied based on a linear interpolation between those two values.
```yaml
variable: t_e
operator: multiply
values: [0.8, 1.0, 1.2]
linear_ramp: [1, 2]
```
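Following the `new_data = data <op> np.linspace(start, stop, len(data)) * value` pattern described above, the multiply case can be sketched as (illustrative values, not duqtools internals):

```python
import numpy as np

# `linear_ramp: [1, 2]` with `operator: multiply` and value 1.2: the
# effective multiplier grows linearly from 1.2 at rho=0 to 2.4 at rho=1.
t_e = np.array([100.0, 200.0, 300.0, 400.0])
ramp = np.linspace(1, 2, len(t_e))
new_t_e = t_e * (ramp * 1.2)
```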
### Custom functions

If the standard operators are not suitable for your use case, you can define your own functions using the `custom` operator. This can be any custom Python code. Two variables are accessible: `data` corresponds to the variable data, and `value` to one of the specified values in the `values` field. The only restriction is that the output of the code must have the same dimensions as the input.

The example below shows an implementation of `operator: multiply` with lower and upper bounds using a custom function.
```yaml
variable: t_e
operator: custom
values: [0.8, 1.0, 1.2]
custom_code: 'np.clip(data * value, a_min=0, a_max=100)'
```
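Conceptually, the `custom_code` string is evaluated with `data` and `value` in scope, roughly like this (a sketch, not the actual duqtools implementation):

```python
import numpy as np

custom_code = 'np.clip(data * value, a_min=0, a_max=100)'
data = np.array([40.0, 80.0, 120.0])
value = 1.2

# Evaluate the expression with `data` and `value` exposed to it.
result = eval(custom_code, {'np': np}, {'data': data, 'value': value})
print(result)  # [ 48.  96. 100.] -- same shape as the input
```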
### Variables

To specify additional variables, you can use the `extra_variables` lookup file. The examples use the `name` attribute to look up the location of the data. For example, `variable: zeff` will refer to the entry with `name: zeff`.
For more info about variables, see here.
### Value ranges

Although it is possible to list the values explicitly in an operator, sometimes it may be easier to specify a range. There are two ways to specify ranges in duqtools.
#### By number of samples

Generates evenly spaced numbers over a specified interval. See the implementation of `numpy.linspace` for more details.

`start`
- Start value of the sequence.

`stop`
- End value of the sequence.

`num`
- Number of samples to generate.
This example generates a range from 0.7 to 1.3 with 10 steps:
```yaml
variable: t_i_ave
operator: multiply
values:
  start: 0.7
  stop: 1.3
  num: 10
```
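This corresponds directly to the numpy call:

```python
import numpy as np

# Both endpoints are included; 10 multipliers from 0.7 to 1.3.
values = np.linspace(0.7, 1.3, num=10)
```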
#### By stepsize

Generates evenly spaced numbers within a given interval. See the implementation of `numpy.arange` for more details.

`start`
- Start of the interval. Includes this value.

`stop`
- End of the interval. Excludes this value.

`step`
- Spacing between values.
This example generates a range from 0.7 to 1.3 with steps of 0.1:
```yaml
variable: t_i_ave
operator: multiply
values:
  start: 0.7
  stop: 1.3
  step: 0.1
```
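This corresponds directly to the numpy call:

```python
import numpy as np

# Multipliers 0.7, 0.8, ... in steps of 0.1; note that np.arange
# excludes the stop value (up to floating-point rounding).
values = np.arange(0.7, 1.3, step=0.1)
```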
### Sampling between error bounds

According to the data model convention, only the upper error node (`_error_upper`) should be filled in the case of symmetrical error bars. If the lower error node (`_error_lower`) is also filled, duqtools will scale to the upper error for values larger than 0, and to the lower error for values smaller than 0.
The following example takes `t_e` and generates a range from \(-2\sigma\) to \(+2\sigma\) in the given steps:

```yaml
variable: t_e
operator: add
values: [-2, -1, 0, 1, 2]
scale_to_error: True
```
The following example takes `t_i_ave` and generates a range from \(-3\sigma\) to \(+3\sigma\) in 10 evenly spaced steps:

```yaml
variable: t_i_ave
operator: add
values:
  start: -3
  stop: 3
  num: 10
scale_to_error: True
```
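The scaling rule described above can be sketched as follows (illustrative numbers; duqtools reads the sigmas from the error nodes):

```python
def scale_to_error(value, sigma_upper, sigma_lower):
    # Positive values scale with the upper error, negative with the lower.
    sigma = sigma_upper if value > 0 else sigma_lower
    return value * sigma

t_e, upper, lower = 1000.0, 50.0, 30.0
samples = [t_e + scale_to_error(v, upper, lower) for v in (-2, -1, 0, 1, 2)]
print(samples)  # [940.0, 970.0, 1000.0, 1050.0, 1100.0]
```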
**Note:** When you specify a sigma range, make sure you use `add` as the operator. While the other operators are also supported, they do not make much sense in this context.
### Coupling Variables

It is possible to couple the sampling of two variables: simply add them as a single `List` entry to the configuration file:

```yaml
- - variable: t_start
    operator: copyto
    values: [0.1, 0.2, 0.3]
  - variable: t_end
    operator: copyto
    values: [1.1, 1.2, 1.3]
```
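Coupled entries advance together instead of being expanded into a product, so the example above yields 3 samples rather than 9. A sketch of the pairing (duqtools' internals may differ):

```python
# The coupled operations are paired index-by-index, like zip().
t_start_values = [0.1, 0.2, 0.3]
t_end_values = [1.1, 1.2, 1.3]
for t_start, t_end in zip(t_start_values, t_end_values):
    print(t_start, t_end)  # one coupled entry per sample
```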