Scikit Bayesian Optimizer

This algorithm class provides a wrapper for the Gaussian-process-based Bayesian optimizer implemented in scikit-optimize.

experiment:
    algorithms:
        BayesianOptimizer:
            seed: null
            n_initial_points: 10
            acq_func: gp_hedge
            alpha: 1.0e-10
            n_restarts_optimizer: 0
            noise: "gaussian"
            normalize_y: False
            parallel_strategy:
                of_type: StatusBasedParallelStrategy
                strategy_configs:
                    broken:
                        of_type: MaxParallelStrategy
                default_strategy:
                    of_type: NoParallelStrategy
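The same configuration can also be expressed as a Python dict, for example when building an experiment programmatically. This is a sketch mirroring the YAML above; passing it as the `algorithms` argument of `orion.client.build_experiment` is an assumption about your setup.

```python
# BayesianOptimizer configuration as a Python dict, mirroring the YAML above.
algorithm_config = {
    "BayesianOptimizer": {
        "seed": None,
        "n_initial_points": 10,
        "acq_func": "gp_hedge",
        "alpha": 1e-10,
        "n_restarts_optimizer": 0,
        "noise": "gaussian",
        "normalize_y": False,
        "parallel_strategy": {
            "of_type": "StatusBasedParallelStrategy",
            "strategy_configs": {
                "broken": {"of_type": "MaxParallelStrategy"},
            },
            "default_strategy": {"of_type": "NoParallelStrategy"},
        },
    }
}
```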
class orion.algo.skopt.bayes.BayesianOptimizer(space, seed=None, n_initial_points=10, acq_func='gp_hedge', alpha=1e-10, n_restarts_optimizer=0, noise='gaussian', normalize_y=False, parallel_strategy=None, convergence_duplicates=5)

Wrapper for skopt’s Bayesian optimizer

Parameters
space: orion.algo.space.Space

Problem’s definition

seed: int (default: None)

Seed used for the random number generator

n_initial_points: int (default: 10)

Number of trials evaluated with initialization points before the objective is approximated with base_estimator. Points provided as x0 count as initialization points. If len(x0) < n_initial_points, additional points are sampled at random.

acq_func: str (default: gp_hedge)

Function to minimize over the posterior distribution. Can be: ["LCB", "EI", "PI", "gp_hedge", "EIps", "PIps"]. Check skopt docs for details.

alpha: float or array-like (default: 1e-10)

Value added to the diagonal of the kernel matrix during fitting. Larger values correspond to an increased noise level in the observations and reduce potential numerical issues during fitting. If an array is passed, it must have the same number of entries as the data used for fitting and is used as a datapoint-dependent noise level. Note that this is equivalent to adding a WhiteKernel with c=alpha. Allowing the noise level to be specified directly as a parameter is mainly for convenience and for consistency with Ridge.
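An illustrative sketch (not Orion or skopt source code) of what alpha does numerically: adding it to the kernel matrix diagonal keeps the Cholesky factorization used during GP fitting stable, even when inputs are duplicated and the kernel matrix would otherwise be singular. The RBF kernel and alpha value below are hypothetical.

```python
import numpy as np

# Two identical points make the plain RBF kernel matrix singular.
X = np.array([[0.0], [0.0], [1.0]])
K = np.exp(-0.5 * (X - X.T) ** 2)  # RBF kernel, unit length scale

# Adding alpha to the diagonal restores positive definiteness.
alpha = 1e-6
K_reg = K + alpha * np.eye(len(X))
L = np.linalg.cholesky(K_reg)  # succeeds thanks to the added jitter
```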

n_restarts_optimizer: int (default: 0)

The number of restarts of the optimizer for finding the kernel’s parameters which maximize the log-marginal likelihood. The first run of the optimizer is performed from the kernel’s initial parameters; the remaining ones (if any) start from thetas sampled log-uniformly at random from the space of allowed theta values. If greater than 0, all bounds must be finite. Note that n_restarts_optimizer == 0 implies that one run is performed.
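A minimal sketch of the restart scheme described above (an assumption about the underlying scikit-learn behavior, not its source): the first run starts from the kernel's initial theta, and each restart starts from a theta drawn log-uniformly within finite bounds. The bounds and initial value are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_restarts_optimizer = 2
low, high = 1e-3, 1e3  # hypothetical finite bounds on theta

starts = [1.0]  # kernel's initial theta: always the first run
for _ in range(n_restarts_optimizer):
    # log-uniform draw within the allowed theta range
    starts.append(float(np.exp(rng.uniform(np.log(low), np.log(high)))))
# n_restarts_optimizer == 0 would leave only the single initial run
```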

noise: str (default: “gaussian”)

If set to “gaussian”, then it is assumed that y is a noisy estimate of f(x) where the noise is gaussian.

normalize_y: bool (default: False)

Whether the target values y are normalized, i.e., the mean of the observed target values becomes zero. This parameter should be set to True if the target values’ mean is expected to differ considerably from zero. When enabled, the normalization effectively modifies the GP’s prior based on the data, which contradicts the likelihood principle; normalization is thus disabled by default.
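Conceptually, normalize_y amounts to centering the observed targets so their mean is zero, matching the GP's zero-mean prior. A minimal sketch with hypothetical objective values:

```python
import numpy as np

y = np.array([102.0, 98.5, 100.3])  # hypothetical objective values
y_centered = y - y.mean()           # mean of the centered targets is zero
```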

parallel_strategy: dict or None, optional

The configuration of a parallel strategy to use for pending trials or broken trials. Default is a MaxParallelStrategy for broken trials and NoParallelStrategy for pending trials.

convergence_duplicates: int, optional

Number of duplicate points the algorithm may sample before considering itself as done. Default: 5.
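The stopping condition above can be sketched as follows. This is an illustration only, not Orion's implementation; the function name and the duplicate-counting scheme are assumptions.

```python
def should_stop(suggestions, convergence_duplicates=5):
    """Stop once the number of duplicate suggestions reaches the threshold."""
    duplicates = len(suggestions) - len(set(suggestions))
    return duplicates >= convergence_duplicates
```

A converged optimizer that keeps re-suggesting the same point would quickly trigger this condition, while a run still exploring distinct points would not.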