Evaluating a Surrogate Model

This example demonstrates how to evaluate a trained surrogate model. NearestPointSurrogate is used as the example surrogate model. See Creating a surrogate model for details on how this surrogate is created. See Training a surrogate model for details on how to train a surrogate model.

Overview

In general, a SurrogateModel object takes its training data either from .rd files or from the name of the training object itself. The former performs training and evaluation in two separate steps; the latter performs both in a single step. Since most practical applications train once and evaluate many times, this example focuses on the two-step method. Every SurrogateModel object has a public member function, evaluate, that returns the surrogate's estimate of a quantity of interest given the parameters as input. Some specialized surrogates, like PolynomialChaos, provide additional functions for computing statistics. However, this example focuses on the simple NearestPointSurrogate, which only uses the evaluate function.
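To make the role of the evaluate function concrete, the following is a minimal Python sketch of a nearest-point lookup. It is an illustration only, not the MOOSE implementation: the class name NearestPointSketch, the unscaled Euclidean distance, and the two training points are assumptions made for this example.

```python
import math


class NearestPointSketch:
    """Illustrative nearest-point surrogate (hypothetical, not MOOSE code)."""

    def __init__(self, points, values):
        # points: parameter tuples seen during training
        # values: quantity of interest recorded at each training point
        if not points or len(points) != len(values):
            raise ValueError("need one value per training point")
        self.points = [tuple(p) for p in points]
        self.values = list(values)

    def evaluate(self, x):
        # Return the stored value of the training point closest to x
        # (plain Euclidean distance, assumed here for simplicity).
        nearest = min(
            range(len(self.points)),
            key=lambda i: math.dist(self.points[i], x),
        )
        return self.values[nearest]


# Two hypothetical training points (k, q, L, T_inf) with average temperatures
surrogate = NearestPointSketch(
    points=[(1.0, 9000.0, 0.01, 290.0), (10.0, 11000.0, 0.05, 310.0)],
    values=[290.3, 310.9167],
)
estimate = surrogate.evaluate((2.0, 9100.0, 0.02, 295.0))
print(estimate)
```

Because the distance in this sketch is unscaled, the parameter with the largest magnitude (here q) dominates the choice of the nearest point; a practical implementation may treat this differently.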

Model Problem

This example uses a one-dimensional heat conduction problem with uncertain parameters as the full-order model. The model equation and boundary conditions are as follows:

-k\frac{d^2T}{dx^2} = q \,, \quad x\in[0, L]

\begin{aligned} \left.\frac{dT}{dx}\right|_{x=0} &= 0 \\ T(x=L) &= T_{\infty} \end{aligned}

The quantities of interest are the average and maximum temperature:

\begin{aligned} \bar{T} &= \frac{\int_{0}^{L}T(x)dx}{L} \\ T_{\max} &= \max_{x\in[0,L]}T(x) \end{aligned}

Parameter Uncertainty

For demonstration, each of the uncertain parameters is assigned two types of probability distributions: Uniform ( $\mathcal{U}(a,b)$ ) and Normal ( $\mathcal{N}(\mu,\sigma)$ ), where $a$ and $b$ are the minimum and maximum bounds of the uniform distribution, respectively, and $\mu$ and $\sigma$ are the mean and standard deviation of the normal distribution, respectively.

The uncertain parameters for this model problem are:

Parameter	Symbol	Uniform	Normal
Conductivity	$k$	$\sim\mathcal{U}(1, 10)$	$\sim\mathcal{N}(5, 2)$
Volumetric Heat Source	$q$	$\sim\mathcal{U}(9000, 11000)$	$\sim\mathcal{N}(10000, 500)$
Domain Size	$L$	$\sim\mathcal{U}(0.01, 0.05)$	$\sim\mathcal{N}(0.03, 0.01)$
Right Boundary Temperature	$T_{\infty}$	$\sim\mathcal{U}(290, 310)$	$\sim\mathcal{N}(300, 10)$

Analytical Solutions

This simple model problem has analytical descriptions for the field temperature, average temperature, and maximum temperature:

\begin{aligned} T(x,k,q,L,T_{\infty}) &= \frac{q}{2k}\left(L^2 - x^2\right) + T_{\infty} \\ \bar{T}(k,q,L,T_{\infty}) &= \frac{qL^2}{3k} + T_{\infty} \\ T_{\max}(k,q,L,T_{\infty}) &= \frac{qL^2}{2k} + T_{\infty} \end{aligned}

Because the field temperature is quadratic in $x$, discretizing with quadratic elements yields the exact solution.
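As a quick check of the analytical expressions, the sketch below evaluates them in Python at the means of the normal distributions from the parameter table. The function names are chosen for this example only. It also applies Simpson's rule on a single interval, which is exact for quadratics, to confirm that the average-temperature formula matches the integral of the field temperature.

```python
def temperature(x, k, q, L, T_inf):
    # Field temperature T(x) from the analytical solution
    return q / (2.0 * k) * (L**2 - x**2) + T_inf


def average_temperature(k, q, L, T_inf):
    # Spatial average of T(x) over [0, L]
    return q * L**2 / (3.0 * k) + T_inf


def max_temperature(k, q, L, T_inf):
    # Maximum of T(x), attained at the insulated boundary x = 0
    return q * L**2 / (2.0 * k) + T_inf


# Nominal parameters: the means of the normal distributions
params = dict(k=5.0, q=10000.0, L=0.03, T_inf=300.0)

t_avg = average_temperature(**params)
t_max = max_temperature(**params)

# Simpson's rule on one interval [0, L]; exact because T(x) is quadratic
L = params["L"]
simpson_avg = (
    temperature(0.0, **params)
    + 4.0 * temperature(L / 2.0, **params)
    + temperature(L, **params)
) / 6.0

print(t_avg, t_max, simpson_avg)
```

At these nominal values the average temperature is about 300.6 and the maximum about 300.9, and the Simpson estimate agrees with the closed-form average to rounding error.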

Input File

Below is the input file (sub.i) used to solve the one-dimensional heat conduction model.

[Mesh]
  type = GeneratedMesh
  dim = 1
  nx = 100
  xmax = 1
  elem_type = EDGE3
[]

[Variables]
  [T]
    order = SECOND
    family = LAGRANGE
  []
[]

[Kernels]
  [diffusion]
    type = MatDiffusion
    variable = T
    diffusivity = k
  []
  [source]
    type = BodyForce
    variable = T
    value = 1.0
  []
[]

[Materials]
  [conductivity]
    type = GenericConstantMaterial
    prop_names = k
    prop_values = 2.0
  []
[]

[BCs]
  [right]
    type = DirichletBC
    variable = T
    boundary = right
    value = 300
  []
[]

[Executioner]
  type = Steady
  solve_type = PJFNK
  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'
[]

[Postprocessors]
  [avg]
    type = AverageNodalVariableValue
    variable = T
  []
  [max]
    type = NodalExtremeValue
    variable = T
    value_type = max
  []
[]

[Outputs]
[]

In this input file, the uncertain parameters correspond to the following input parameters:

$k\rightarrow$ Materials/conductivity/prop_values
$q\rightarrow$ Kernels/source/value
$L\rightarrow$ Mesh/xmax
$T_{\infty}\rightarrow$ BCs/right/value

The values of these parameters in the sub.i file are arbitrary, since the stochastic master app modifies them.

Evaluation

This section shows how to set up an input file to load and evaluate a surrogate model. The training data was created with the steps from Training a surrogate model. To demonstrate the usefulness of a surrogate model, the model will be evaluated using Monte Carlo sampling with parameter distributions described in the previous section. The results of this sampling will be used to compute statistical moments and produce probability distributions.

Omitting Solve

Any input file in MOOSE needs to include a Mesh, Variables, and Executioner block. However, the stochastic master app does not actually create or solve a system, so the StochasticToolsAction builds a minimal model to satisfy these requirements:

[StochasticTools]
[]

Surrogate Model

The surrogate model is loaded by inputting the training data file with the "filename" parameter. In this example, two surrogates are loaded with two different training data files for average temperature and maximum temperature.

[Surrogates]
  [nearest_point_avg]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_avg.rd'
  []
  [nearest_point_max]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_max.rd'
  []
[]

Defining a Sampler

In this example, the surrogate is evaluated at points given by a sampler. Here we use the MonteCarloSampler to generate random points defined by a Uniform or a Normal distribution for each parameter. See Example 1: Monte Carlo for more details on setting up this sampler.

Uniform distribution for each parameter

[Distributions]
  [k_dist]
    type = Uniform
    lower_bound = 1
    upper_bound = 10
  []
  [q_dist]
    type = Uniform
    lower_bound = 9000
    upper_bound = 11000
  []
  [L_dist]
    type = Uniform
    lower_bound = 0.01
    upper_bound = 0.05
  []
  [Tinf_dist]
    type = Uniform
    lower_bound = 290
    upper_bound = 310
  []
[]

Normal distribution for each parameter

[Distributions]
  [k_dist]
    type = Normal
    mean = 5
    standard_deviation = 2
  []
  [q_dist]
    type = Normal
    mean = 10000
    standard_deviation = 500
  []
  [L_dist]
    type = Normal
    mean = 0.03
    standard_deviation = 0.01
  []
  [Tinf_dist]
    type = Normal
    mean = 300
    standard_deviation = 10
  []
[]

Monte Carlo sampler

[Samplers]
  [sample]
    type = MonteCarlo
    num_rows = 100000
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    execute_on = initial
  []
[]
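For intuition about what the sampler provides, the following Python sketch builds a matrix with one row per sample and one column per distribution, in the same order as the distributions list above. It uses only the standard library and the uniform bounds from the parameter table; the function name monte_carlo_rows and the seed handling are assumptions of this sketch and do not reflect how MOOSE generates its random numbers.

```python
import random

# Uniform bounds (lower, upper), ordered like 'k_dist q_dist L_dist Tinf_dist'
BOUNDS = {
    "k": (1.0, 10.0),
    "q": (9000.0, 11000.0),
    "L": (0.01, 0.05),
    "T_inf": (290.0, 310.0),
}


def monte_carlo_rows(num_rows, bounds, seed=0):
    # Each row is one independent draw of all parameters
    rng = random.Random(seed)
    return [
        [rng.uniform(lo, hi) for lo, hi in bounds.values()]
        for _ in range(num_rows)
    ]


rows = monte_carlo_rows(1000, BOUNDS)
print(len(rows), len(rows[0]))
```

For the normal case, rng.uniform(lo, hi) could be swapped for rng.gauss(mean, standard_deviation) with the values from the parameter table.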

Sampling Surrogate

Evaluating a surrogate model occurs within objects that obtain the surrogate object's reference and call the evaluate function. In this example, we will use EvaluateSurrogate to evaluate the surrogate. EvaluateSurrogate takes in a sampler and the surrogate model as inputs, and evaluates the surrogate at the points given by the sampler.

[Reporters]
  [samp]
    type = EvaluateSurrogate
    model = 'nearest_point_avg nearest_point_max'
    sampler = sample
    parallel_type = ROOT
  []
[]

The results of evaluating the surrogate can then be used to compute statistics like mean and standard deviation:

[Reporters]
  [stats]
    type = StatisticsReporter
    reporters = 'samp/nearest_point_avg samp/nearest_point_max'
    compute = 'mean stddev'
  []
[]
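The sample-then-summarize workflow can be mimicked in a few lines of Python. The sketch below is not the MOOSE objects themselves: it uses the analytical average temperature as a stand-in for the surrogate, draws the parameters from the uniform distributions, and computes the mean and the sample standard deviation. Since the parameters are independent, the exact mean is also available in closed form for comparison. All names here are chosen for this example.

```python
import math
import random
import statistics


def average_temperature(k, q, L, T_inf):
    # Analytical average temperature, standing in for the surrogate
    return q * L**2 / (3.0 * k) + T_inf


rng = random.Random(2024)  # arbitrary seed, for repeatability only
samples = [
    average_temperature(
        rng.uniform(1.0, 10.0),       # k
        rng.uniform(9000.0, 11000.0), # q
        rng.uniform(0.01, 0.05),      # L
        rng.uniform(290.0, 310.0),    # T_inf
    )
    for _ in range(100_000)
]

mean = statistics.fmean(samples)
stddev = statistics.stdev(samples)  # sample standard deviation

# Exact mean for independent uniform parameters:
# E[q] * E[L^2] * E[1/k] / 3 + E[T_inf]
mean_q = 10000.0
mean_L2 = (0.05**3 - 0.01**3) / (3.0 * (0.05 - 0.01))
mean_inv_k = math.log(10.0 / 1.0) / (10.0 - 1.0)
exact_mean = mean_q * mean_L2 * mean_inv_k / 3.0 + 300.0

print(mean, stddev, exact_mean)
```

The exact mean is roughly 300.88, and the sampled mean should agree to within Monte Carlo error. The spread is dominated by the boundary temperature, whose uniform distribution alone has a standard deviation of about 5.77.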

Results

The inputs from the previous sections produce CSV files of the evaluation data. These files can then be used to produce probability distributions like those in Figure 1 and Figure 2. Even for this simple model problem, evaluating the surrogate was orders of magnitude faster than running the full-order model, with significantly less memory consumption.

Figure 1: Temperature distributions with uniform parameter distribution

Figure 2: Temperature distributions with normal parameter distribution

(contrib/moose/modules/stochastic_tools/examples/surrogates/nearest_point_uniform.i)

[StochasticTools]
[]

[Distributions]
  [k_dist]
    type = Uniform
    lower_bound = 1
    upper_bound = 10
  []
  [q_dist]
    type = Uniform
    lower_bound = 9000
    upper_bound = 11000
  []
  [L_dist]
    type = Uniform
    lower_bound = 0.01
    upper_bound = 0.05
  []
  [Tinf_dist]
    type = Uniform
    lower_bound = 290
    upper_bound = 310
  []
[]

[Samplers]
  [sample]
    type = MonteCarlo
    num_rows = 100000
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    execute_on = initial
  []
[]

[Reporters]
  # Sampling surrogate
  [samp]
    type = EvaluateSurrogate
    model = 'nearest_point_avg nearest_point_max'
    sampler = sample
    parallel_type = ROOT
  []
  # Computing statistics
  [stats]
    type = StatisticsReporter
    reporters = 'samp/nearest_point_avg samp/nearest_point_max'
    compute = 'mean stddev'
  []
[]

[Surrogates]
  [nearest_point_avg]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_avg.rd'
  []
  [nearest_point_max]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_max.rd'
  []
[]

[Outputs]
  csv = true
[]

(contrib/moose/modules/stochastic_tools/examples/surrogates/nearest_point_normal.i)

[StochasticTools]
[]

[Distributions]
  [k_dist]
    type = Normal
    mean = 5
    standard_deviation = 2
  []
  [q_dist]
    type = Normal
    mean = 10000
    standard_deviation = 500
  []
  [L_dist]
    type = Normal
    mean = 0.03
    standard_deviation = 0.01
  []
  [Tinf_dist]
    type = Normal
    mean = 300
    standard_deviation = 10
  []
[]

[Samplers]
  [sample]
    type = MonteCarlo
    num_rows = 100000
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    execute_on = initial
  []
[]

[Reporters]
  # Sampling surrogate
  [samp]
    type = EvaluateSurrogate
    model = 'nearest_point_avg nearest_point_max'
    sampler = sample
    parallel_type = ROOT
  []
  # Computing statistics
  [stats]
    type = StatisticsReporter
    reporters = 'samp/nearest_point_avg samp/nearest_point_max'
    compute = 'mean stddev'
  []
[]

[Surrogates]
  [nearest_point_avg]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_avg.rd'
  []
  [nearest_point_max]
    type = NearestPointSurrogate
    filename = 'nearest_point_training_out_nearest_point_max.rd'
  []
[]

[Outputs]
  csv = true
[]

