Response surface models¶

Response surface models, also known as surrogate models or metamodels, can be used to develop a fast-running approximation to an expensive simulation. para-atm provides the ResponseSurface base class to define general functionality that can be shared across different response surface implementations. A Gaussian process regression capability, powered by scikit-learn, is provided by the SklearnGPRegressor class. Additional response surface model types may be added in the future.

To get started immediately, refer to the Examples section.

Response surface base class¶

class paraatm.rsm.base.ResponseSurface(X, Y)¶

Base class for response surface models

Derived classes should:

Implement an __init__ method that fits the model and calls super().__init__
Implement a __call__ method for making predictions. See paraatm.rsm.gp.SklearnGPRegressor.__call__().

plot(ax=None, lb=None, ub=None, range_factor=1.1, ci=True, show_data=True, input_vals=None, ivar=None, color=None, legend=True, ncol_leg=3)¶

Create 1D response surface plots

By default, plot 1D curves of all variables on the same graph, but can also plot individual curves or subsets.

Parameters:

ax (matplotlib.axes or None) – Axis to plot on. If None, create new axis.
lb (float) – Lower bound, only supported for single-variable plot
ub (float) – Upper bound, only supported for single-variable plot
range_factor (float) – Factor used to determine default plotting range. The plotting range is determined as a multiple of the data range.
ci (bool) – Whether or not to plot confidence intervals for 1-D plots
show_data (bool) – Whether to plot the training data. For slice plots, training data are projected
input_vals (array) – Values to use for non-varying inputs (default to mean of training data)
ivar (int or list of ints) – Variable index to plot. Can be scalar or a list. Default is to plot all variables.
color (str) – Plotting color, for single-variable plots only
legend (bool) – Whether to include legend
ncol_leg (int) – Number of columns to use for formatting legend

surface_plot(ax=None, lb=None, ub=None, range_factor=1.1, ngrid=40, show_data=True, input_vals=None, slice_vars=None, surf_args=None)¶

Create 2D surface plot

Parameters:	ax (matplotlib.axes or None) – Axis to plot on. If None, create new axis lb (array) – Lower bound to use for each variable ub (array) – Upper bound to use for each variable range_factor (float) – Factor used to determine default plotting range. The plotting range is determined as a multiple of the data range. ngrid (int) – Grid density used in each dimension show_data (bool) – Whether to plot training data input_vals (array) – Array of input values, required when plotting a slice of a surface with more than two input variables. slice_vars (tuple) – Length-2 tuple with indices of the variables to plot. Required when the response surface has more than two input variables. surf_args (dict) – Additional keyword arguments to send to Axes3D.plot_surface
Returns:
Return type:	matplotlib.axes

Gaussian process regression¶

class paraatm.rsm.gp.SklearnGPRegressor(X, Y, noise=False, n_restarts_optimizer=10, alpha=1e-10, kernel=None, optimizer='fmin_l_bfgs_b')¶

Gaussian Process regression using scikit-learn

__init__(X, Y, noise=False, n_restarts_optimizer=10, alpha=1e-10, kernel=None, optimizer='fmin_l_bfgs_b')¶

Create GP regression instance and fit to training data

Parameters:

X (array) – 2d array of input values, with each row representing a case and each column a variable
Y (array) – 1d array of output values
noise (bool) – Whether the outputs are observed with noise. If True, the noise variance is estimated as part of the fitting. Additionally, scikit-learn includes the noise in the prediction standard deviation.
n_restarts_optimizer (int) – Number of times optimizer is restarted using randomly selected initial points
alpha (float) – Value added to diagonal of covariance matrix during fitting. Unlike the noise variance that can be fit through the WhiteKernel, the alpha value is not included in the predicted standard deviation.
kernel (kernel object) – Kernel specifying the covariance function of the GP. If None, RBF (a.k.a., squared exponential) is used with unique length parameters for each dimension.
optimizer (str, callable, or None) – Passed on to sklearn. In particular, None can be used to fix the kernel parameters instead of optimizing them.

__call__(X, return_stdev=False)¶

Compute predicted mean and standard deviation (optionally) at specified input(s)

Parameters:	X (array, 1d or 2d) – Input values, either as a single point (1d array) or a 2d array containing a set of points return_stdev (bool) – Whether to return standard deviation of the prediction, in addition to the mean.
Returns:	The return value is either a tuple containing the mean and standard deviation (if return_stdev==True) or the mean value only. If ndim(X)==2, then the mean and standard deviations are returned as arrays, otherwise they are floats.
Return type:	tuple, float, or array

fit_rsquared()¶: Get fit R-squared value

loo_rsquared()¶

Get leave-one-out R-squared value

Kernel parameters and normalization constants are not changed. The intent is to test holding out the observations but not re-fitting the kernel parameters.

Examples¶

Here is an example showing a GP fit to a 1D function, with confidence bounds:

import matplotlib.pyplot as plt
import numpy as np
from paraatm.rsm.gp import SklearnGPRegressor

X = np.array([1., 3., 5., 6., 7., 8.])
Y = X * np.sin(X)
X = X[:,np.newaxis] # Make input array 2d

# Use n_restarts_optimizer=0 to get reproducible behavior
gp = SklearnGPRegressor(X, Y, n_restarts_optimizer=0)

gp.plot()

plt.show()

(Source code, png, hires.png, pdf)

The Gaussian Process model can also accommodate noisy data, as shown in the below example:

import matplotlib.pyplot as plt
import numpy as np
from paraatm.rsm.gp import SklearnGPRegressor

ndata = 150
np.random.seed(1)
X = np.linspace(0,10,ndata)
Y = np.sin(X) + np.random.normal(0,.2, size=ndata)
X = X[:,np.newaxis]

gp = SklearnGPRegressor(X, Y, n_restarts_optimizer=0, noise=True)

gp.plot()

plt.show()

(Source code, png, hires.png, pdf)

And an example of a 2D function, using surface_plot():

import matplotlib.pyplot as plt
import numpy as np
from paraatm.rsm.gp import SklearnGPRegressor

np.random.seed(1)
X = np.random.uniform(-2, 2, size=(30,2))
Y = X[:,0] * np.exp(-X[:,0]**2 - X[:,1]**2)

gp = SklearnGPRegressor(X, Y, n_restarts_optimizer=0)

ax = gp.surface_plot()
ax.view_init(17, -70)

plt.show()

(Source code, png, hires.png, pdf)

para-atm

Navigation

Related Topics

Response surface models¶

Response surface base class¶

Gaussian process regression¶

Examples¶