The create subcommand creates the UQ run files.
To run the command: `duqtools create`
Check out the command-line interface for more info on how to use this command.
The options of the `create` subcommand are stored under the `create` key in the configuration file:
- `runs_dir`: Relative location from the workspace, which specifies the folder where to store all the created runs. This defaults to an automatically generated directory name suffixed with an integer `x`, where `x` is chosen such that the directory does not yet exist.
- `template`: Template directory to modify. Duqtools copies and updates the settings required for the specified system from this directory. This can be a directory with a finished run, or one just stored by JAMS (but not yet started). By default, duqtools extracts the input IMAS database entry from the settings file (e.g. jetto.in) to find the data to modify for the UQ runs.
- `template_data`: Specify the location of the template data to modify. This overrides the location of the data specified in the settings file in the template directory.
- `operations`: These operations are always applied to the data. All operations specified here are added to any operations sampled from the dimensions. They can be used, for example, to set the start time for an experiment or to update some physical parameters. This parameter is optional.
- `sampler`: For efficient UQ, it may not be necessary to sample the entire matrix or hypercube. By default, the Cartesian product is taken (`method: cartesian-product`). For more efficient sampling of the space, other `method` choices are available, such as `latin-hypercube` (see the examples below). `n_samples` gives the number of samples to extract.
- `dimensions`: Specifies the dimensions of the matrix to sample from. Each dimension is a compound set of operations to apply. From this, a matrix of all possible combinations is generated; essentially, it generates the Cartesian product of all operations. By specifying a different `sampler`, a subset of this hypercube can be sampled efficiently. This parameter is optional.
- `data`: Required for system `jetto-v210921`, ignored for other systems. Specifies where to store the in/output IDS data. The `imasdb` key specifies the machine or IMAS database name where to store the data. Duqtools will write the input data files for UQ starting with the run number given by `run_in_start_at`. The data generated by the UQ runs (e.g. from jetto) will be stored starting at the run number given by `run_out_start_at`.
```yaml
create:
  runs_dir: /pfs/work/username/jetto/runs/run_1
  template: /pfs/work/username/jetto/runs/duqtools_template
  operations:
    - variable: t_start
      operator: copyto
      value: 2.875
    - variable: t_end
      operator: copyto
      value: 2.885
  dimensions:
    - variable: t_e
      operator: multiply
      values: [0.9, 1.0, 1.1]
      scale_to_error: false
    - variable: zeff
      operator: multiply
      values: [0.9, 1.0, 1.1]
      scale_to_error: false
  sampler:
    method: latin-hypercube
    n_samples: 3
```
Jetto output directory
If you do not specify anything, the jetto output location depends on the location of `$JRUNS`, the parent directory of the jetto runs.
You can override the `$JRUNS` directory by setting the `jruns` variable. This must be a directory that `rjettov` can write to.
```yaml
system:
  name: jetto
  jruns: /pfs/work/username/jetto/runs/
```
You can modify the duqtools output directory via the `runs_dir` key.
Specify the template data
By default, the template IMAS data to modify is extracted from the path specified in the settings file (e.g. jetto.in) in the template directory.
In some cases, it may be useful to re-use the same set of model settings, but with different input data. If the
`template_data` field is specified, these data will be used instead. To do so, specify `template_data` with the fields below:
- `user`: IMAS username; set as the relative location to the imasdb location if a local imasdb is used.
- `db`: IMAS db/machine name.
- `shot`: IMAS shot number.
- `run`: IMAS run number.
```yaml
template: /pfs/work/username/jetto/runs/duqtools_template
template_data:
  user: username
  db: jet
  shot: 91234
  run: 5
```
Specification for the data generated by the create step.
When setting up a sequence of UQ runs, duqtools reads the source data from the template. For each individual UQ run, two locations must be defined:

1. The location of the input data. This is where duqtools stores the modified source data.
2. The location of the output data. The modelling software must know in advance where to store the results of the simulation.
Input data are defined by `run_in_start_at`, and output data by `run_out_start_at`. A sequence is generated starting from these numbers.
For example, with `run_in_start_at: 7000` and `run_out_start_at: 8000`, the generated input stored at run number 7000 would correspond to output run 8000, 7001 to 8001, 7002 to 8002, etc.
Note that these sequences may overlap with existing data sets. Duqtools will stop if it detects that data will be overwritten.
- `user`: Username for the IMAS database to use; defaults to the current user.
- `imasdb`: IMAS database or machine name.
- `run_in_start_at`: The sequence of input data files starts with this run number.
- `run_out_start_at`: The sequence of output data files starts with this run number.
```yaml
data:
  imasdb: test
  run_in_start_at: 7000
  run_out_start_at: 8000
```
Depending on the number of dimensions, a hypercube is constructed from which duqtools will select a number of entries. For a setup with 3 dimensions of size \(i\), \(j\), \(k\), a hypercube of \(i\times j\times k\) elements will be constructed, where each element is one of the combinations.
By default the entire hypercube is sampled:
```yaml
sampler:
  method: cartesian-product
```
To sample a subset of the hypercube instead, select a different method and specify the number of samples, for example:

```yaml
sampler:
  method: latin-hypercube
  n_samples: 5
```
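To illustrate the idea behind Latin hypercube sampling on a discrete grid, here is a rough sketch in Python. This is not duqtools' implementation (duqtools uses a proper LHS routine); the function name `latin_hypercube_indices` is made up for this example:

```python
import random

def latin_hypercube_indices(dim_sizes, n_samples, seed=0):
    """Pick n_samples grid points such that, within each dimension,
    the chosen levels are spread over the whole range."""
    rng = random.Random(seed)
    columns = []
    for size in dim_sizes:
        # spread the sample indices evenly over the available levels ...
        levels = [int(i * size / n_samples) for i in range(n_samples)]
        # ... then shuffle, so pairings across dimensions are randomized
        rng.shuffle(levels)
        columns.append(levels)
    return list(zip(*columns))

# e.g. 3 samples from a 3 x 3 grid of parameter combinations
samples = latin_hypercube_indices([3, 3], n_samples=3)
```

Each returned tuple indexes one combination in the hypercube, and every level of every dimension is visited at most once, which is the defining property of a Latin hypercube design.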
These instructions operate on the template model. Note that these are compound operations, so they are expanded to fill the matrix with possible entries for data modifications (depending on the sampling method).
Apply set of arithmetic operations to IDS.
It takes the IDS data and subtracts, adds, multiplies, etc. with each of the given values.
- `values`: Values to use with the operator on the field to create the sampling space.
- `operator`: Which operator to apply to the data in combination with any of the given values. This can be any of the basic numpy arithmetic operations, e.g. `add`, `multiply`, or `remainder`. These directly map to the equivalent numpy functions, i.e. `add` maps to `np.add`.
- `scale_to_error`: If True, multiply the value(s) by the error (sigma). With asymmetric errors (i.e. when both lower/upper error nodes are available), scale to the lower error node for values < 0, and to the upper error node for values > 0.
- `clip_max`: If set, clip (limit) the data at this value (upper bound). Uses `np.clip`.
- `clip_min`: If set, clip (limit) the data at this value (lower bound). Uses `np.clip`.
- `linear_ramp`: Linearly ramp the operation using the start and stop values given. The first value (start) corresponds to the multiplier at the beginning of the data range, the second value (stop) to the multiplier at the end. The ramp is linearly interpolated between the start and stop values and acts as a multiplier of the specified `value`. For example, for `operator: add`: `new_data = data + np.linspace(start, stop, len(data)) * value`
- `custom_code`: Custom Python code to apply for the `custom` operator. This will be evaluated as if it were inline Python code. Two variables are accessible: `data` corresponds to the variable data, and `value` corresponds to the passed value. For example, an implementation of the `multiply` operator would be `custom_code: 'value * data'`. The resulting data must be of the same shape.
- `variable`: IDS variable for the data to modify. The time slice can be denoted with '*'; this will match all time slices in the IDS. Alternatively, you can specify the time slice directly, i.e. `profiles_1d/0/t_i_ave` to only match and update the 0-th time slice.
```yaml
variable: zeff
operator: add
values: [0.01, 0.02, 0.03]
```
will generate 3 entries: `zeff += 0.01`, `zeff += 0.02`, and `zeff += 0.03`.
```yaml
variable: t_i_ave
operator: multiply
values: [1.1, 1.2, 1.3]
```
will generate another 3 entries: `t_i_ave *= 1.1`, `t_i_ave *= 1.2`, and `t_i_ave *= 1.3`.
With these 2 entries, the parameter hypercube would consist of 9 entries total (3 for `zeff` times 3 for `t_i_ave`). With the default `method: cartesian-product`, this means 9 new data files will be written.
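The expansion into 9 combinations can be sketched with `itertools.product` (the tuple representation of an operation is made up for this illustration):

```python
import itertools

# each dimension expands into one operation per value
zeff_ops = [('zeff', 'add', v) for v in (0.01, 0.02, 0.03)]
t_i_ops = [('t_i_ave', 'multiply', v) for v in (1.1, 1.2, 1.3)]

# the parameter hypercube is the Cartesian product of the dimensions:
# every zeff operation paired with every t_i_ave operation
hypercube = list(itertools.product(zeff_ops, t_i_ops))
print(len(hypercube))  # 9
```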
The Python equivalent is essentially `np.<operator>(ids, value, out=ids)` for each of the given values.
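As a sketch, applying a `multiply` dimension with three values to some profile data (the array values here are made up):

```python
import numpy as np

t_e = np.array([1.0, 2.0, 3.0])   # hypothetical profile data
values = [0.9, 1.0, 1.1]

# one modified copy of the data per value, mirroring
# np.<operator>(ids, value, out=ids) applied to a fresh copy each time
samples = [np.multiply(t_e, v) for v in values]
```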
If you want to update all time slices, you can use `path: profiles_1d/*/t_i_ave`. The `*` substring instructs duqtools to apply the operation to all available time slices.
Values can be clipped to a lower or upper bound by specifying `clip_min` or `clip_max`. This can be helpful to guard against unphysical values. The example below will clip the profile for Zeff at 1 (lower bound):
```yaml
variable: zeff
operator: multiply
values: [0.8, 0.9, 1.0, 1.1, 1.2]
clip_min: 1
```
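In terms of numpy, the clipping step amounts to the following (a sketch with made-up profile values):

```python
import numpy as np

zeff = np.array([1.0, 1.2, 1.5])   # hypothetical profile data
scaled = np.multiply(zeff, 0.8)    # multiply by one of the values
clipped = np.clip(scaled, 1, None) # clip_min: 1, no upper bound
```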
Before applying the operator, the given value can be ramped along the horizontal axis (rho) by specifying the `linear_ramp` field. The two values represent the start and stop value of a linear ramp. For each value in `values`, the data at \(\rho = 0\) are multiplied by `1 * value`, and the data at \(\rho = 1\) are multiplied by `2 * value`. All values in between are multiplied based on a linear interpolation between those 2 values.
```yaml
variable: t_e
operator: multiply
values: [0.8, 1.0, 1.2]
linear_ramp: [1, 2]
```
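Following the formula given above for `add`, the `multiply` analogue of this example can be sketched as follows (assuming the ramped value multiplies the data; array values are made up):

```python
import numpy as np

data = np.ones(5)     # hypothetical profile over rho
value = 1.2           # one entry from `values`
start, stop = 1, 2    # linear_ramp: [1, 2]

# the ramp interpolates linearly from start to stop over the data range
ramp = np.linspace(start, stop, len(data))  # [1.0, 1.25, 1.5, 1.75, 2.0]
new_data = data * (ramp * value)  # 1.2x at rho=0 up to 2.4x at rho=1
```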
If the standard operators are not suitable for your use-case, you can define your own functions via the `custom` operator and the `custom_code` field. This can be any custom Python code. Two variables are accessible: `data` corresponds to the variable data, and `value` to one of the specified values in the `values` field. The only restriction is that the output of the code must have the same dimensions as the input.
The example below shows an implementation of `operator: multiply` with lower and upper bounds using a custom function.
```yaml
variable: t_e
operator: custom
values: [0.8, 1.0, 1.2]
custom_code: 'np.clip(data * value, a_min=0, a_max=100)'
```
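Conceptually, the snippet is evaluated with `data` and `value` in scope, roughly like the sketch below (this is only an illustration, not duqtools' actual evaluation mechanism; the array values are made up):

```python
import numpy as np

custom_code = 'np.clip(data * value, a_min=0, a_max=100)'
data = np.array([50.0, 80.0, 120.0])  # hypothetical profile data
value = 1.2                           # one entry from `values`

# evaluate the snippet with np, data, and value accessible
result = eval(custom_code, {'np': np, 'data': data, 'value': value})
# 120 * 1.2 = 144 exceeds the upper bound and is clipped to 100
```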
To specify additional variables, you can use the `extra_variables` lookup file. The examples use the `name` attribute to look up the location of the data. For example, `variable: zeff` will refer to the entry with `name: zeff`.
For more info about variables, see here.
Although it is possible to list the values explicitly in an operator, it is sometimes easier to specify a range.
There are two ways to specify ranges in duqtools.
By number of samples
Generates evenly spaced numbers over a specified interval.
See the implementation of numpy.linspace for more details.
- `start`: Start value of the sequence.
- `stop`: End value of the sequence.
- `num`: Number of samples to generate.
This example generates a range from 0.7 to 1.3 with 10 steps:
```yaml
variable: t_i_ave
operator: multiply
values:
  start: 0.7
  stop: 1.3
  num: 10
```
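This corresponds directly to `numpy.linspace`:

```python
import numpy as np

# 10 evenly spaced multipliers, including both endpoints
multipliers = np.linspace(0.7, 1.3, num=10)
```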
By step size
Generates evenly spaced numbers within a given interval.
See the implementation of numpy.arange for more details.
- `start`: Start of the interval. Includes this value.
- `stop`: End of the interval. Excludes this value.
- `step`: Spacing between values.
This example generates a range from 0.7 to 1.3 with steps of 0.1:
```yaml
variable: t_i_ave
operator: multiply
values:
  start: 0.7
  stop: 1.3
  step: 0.1
```
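This corresponds directly to `numpy.arange`. Note that, as with `numpy.arange`, floating-point step sizes can make the endpoint behaviour surprising:

```python
import numpy as np

multipliers = np.arange(0.7, 1.3, 0.1)
# starts at 0.7 and advances in steps of 0.1; whether a value very
# close to 1.3 is included depends on floating-point rounding
```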
Sampling between error bounds
From the data model convention, only the upper error node (
_error_upper) should be filled in case of symmetrical error bars. If the lower error node (
_error_lower) is also filled, duqtools will scale to the upper error for values larger than 0, and to the lower error for values smaller than 0.
The following example takes `t_e` and generates a range from \(-2\sigma\) to \(+2\sigma\) in steps of \(1\sigma\):
```yaml
variable: t_e
operator: add
values: [-2, -1, 0, 1, 2]
scale_to_error: True
```
The following example takes `t_i_ave` and generates a range from \(-3\sigma\) to \(+3\sigma\) in 10 evenly spaced steps:
```yaml
variable: t_i_ave
operator: add
values:
  start: -3
  stop: 3
  num: 10
scale_to_error: True
```
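With symmetric errors, the sampled profiles amount to the sketch below (the data and error values are made up; `sigma_upper` stands in for the `_error_upper` node):

```python
import numpy as np

t_i_ave = np.array([2.0, 4.0])      # hypothetical profile data
sigma_upper = np.array([0.2, 0.4])  # hypothetical _error_upper node

# scale_to_error: each value is interpreted in units of sigma
values = np.linspace(-3, 3, num=10)
samples = [t_i_ave + v * sigma_upper for v in values]
```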
When you specify a sigma range, make sure you use `add` as the operator. While the other operators are also supported, they do not make much sense in this context.
It is possible to couple the sampling of two variables; simply add them as a single list entry to the configuration file:
```yaml
- - variable: t_start
    operator: copyto
    values: [0.1, 0.2, 0.3]
  - variable: t_end
    operator: copyto
    values: [1.1, 1.2, 1.3]
```
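Coupled variables are sampled together rather than independently, so the entry above yields 3 `(t_start, t_end)` pairs instead of the 9 combinations a Cartesian product would give; the pairing behaves like Python's `zip`:

```python
t_start_values = [0.1, 0.2, 0.3]
t_end_values = [1.1, 1.2, 1.3]

# coupled: one sampling dimension of 3 (t_start, t_end) pairs,
# instead of a 3 x 3 grid of 9 combinations
coupled = list(zip(t_start_values, t_end_values))
```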