Store and load skopt optimization results

Mikhail Pak, October 2016. Reformatted by Holger Nahrstaedt 2020

Problem statement

We often want to store optimization results in a file. This can be useful, for example,

  • if you want to share your results with colleagues;

  • if you want to archive and/or document your work;

  • or if you want to postprocess your results in a different Python instance or on another computer.

The process of converting an object into a byte stream that can be stored in a file is called _serialization_. Conversely, _deserialization_ means loading an object from a byte stream.
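
To make the two terms concrete, here is a minimal sketch that uses only the standard-library pickle module. The `result` dictionary is a hypothetical stand-in for an optimization result, not real skopt output.

```python
import pickle

# Hypothetical stand-in for an optimization result (not real skopt output).
result = {"x": [0.25], "fun": -0.5, "n_calls": 15}

# Serialization: Python object -> byte stream.
payload = pickle.dumps(result)

# Deserialization: byte stream -> a new, equal Python object.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == result)      # True
print(restored is result)      # False: a copy, not the same object
```

The byte stream in `payload` is what ends up in a file when you write it to disk.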

Warning: Deserialization is not secure against malicious or erroneous code. Never load serialized data from untrusted or unauthenticated sources!
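
One possible precaution, sketched below under stated assumptions, is to compare a file's SHA-256 digest with a value received through a trusted channel before deserializing it. The helpers `sha256_of` and `load_if_trusted` are hypothetical names introduced for this illustration. A matching digest only shows that the file was not modified; it does not make an untrusted file safe to load.

```python
import hashlib
import os
import pickle
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_if_trusted(path, expected_digest):
    """Unpickle `path` only if its digest matches `expected_digest`."""
    if sha256_of(path) != expected_digest:
        raise ValueError("digest mismatch: refusing to load " + path)
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Demo with a throwaway file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "result.pkl")
with open(path, "wb") as fh:
    pickle.dump({"fun": -0.5}, fh)

# In practice the digest would reach the recipient through a trusted channel.
trusted_digest = sha256_of(path)
loaded = load_if_trusted(path, trusted_digest)

# Simulate tampering: append one byte, so the check must now fail.
with open(path, "ab") as fh:
    fh.write(b"\x00")
try:
    load_if_trusted(path, trusted_digest)
    tamper_detected = False
except ValueError:
    tamper_detected = True

print(loaded, tamper_detected)
```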

print(__doc__)
import numpy as np

from skopt import gp_minimize

Simple example

We will use the same optimization problem as in the Bayesian optimization with skopt notebook:

noise_level = 0.1


def obj_fun(x, noise_level=noise_level):
    return np.sin(5 * x[0]) * (1 - np.tanh(x[0] ** 2)) + np.random.randn() * noise_level


res = gp_minimize(
    obj_fun,  # the function to minimize
    [(-2.0, 2.0)],  # the bounds on each dimension of x
    x0=[0.0],  # the starting point
    acq_func="LCB",  # the acquisition function (optional)
    n_calls=15,  # the number of evaluations of f including at x0
    n_random_starts=3,  # the number of random initial points
    random_state=777,
)
D:\git\scikit-optimize\skopt\optimizer\optimizer.py:517: UserWarning: The objective has been evaluated at point [5.2414561579894325e-09] before, using random point [-1.1498279780295497]
  warnings.warn(

As long as your Python session is active, you can access all the optimization results via the res object.

So how can you store this data in a file? skopt conveniently provides functions skopt.dump and skopt.load that handle this for you. These functions are essentially thin wrappers around the joblib module’s joblib.dump and joblib.load.

We will now show how to use skopt.dump and skopt.load for storing and loading results.

Using skopt.dump() and skopt.load()

For storing optimization results into a file, call the skopt.dump function:

from skopt import dump, load

dump(res, 'result.pkl')

To load the results from the file, call skopt.load:

res_loaded = load('result.pkl')

res_loaded.fun
-0.33287005499757594

You can fine-tune the serialization and deserialization process by calling skopt.dump and skopt.load with additional keyword arguments. See the joblib documentation for joblib.dump and joblib.load for the additional parameters.

For instance, you can specify the compression algorithm and compression level (highest in this case):

dump(res, 'result.gz', compress=9)

from os.path import getsize

print('Without compression: {} bytes'.format(getsize('result.pkl')))
print('Compressed with gz:  {} bytes'.format(getsize('result.gz')))
Without compression: 75397 bytes
Compressed with gz:  27335 bytes
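
To get a feel for how the compression level trades file size against effort, the sketch below compresses a pickled payload at several gzip levels. It uses the standard-library gzip and pickle modules as stand-ins so that it runs without skopt or joblib, and the `data` dictionary is a hypothetical, highly repetitive payload. Sizes produced by skopt.dump will differ.

```python
import gzip
import pickle

# Hypothetical, repetitive payload standing in for a results object.
data = {
    "x_iters": [[i / 1000.0] for i in range(5000)],
    "func_vals": [0.0] * 5000,
}
raw = pickle.dumps(data)

# Compress the same bytes at a low, a medium, and the highest level.
sizes = {
    level: len(gzip.compress(raw, compresslevel=level))
    for level in (1, 6, 9)
}

print("uncompressed:", len(raw), "bytes")
for level, size in sizes.items():
    print("gzip level", level, "->", size, "bytes")

# Compression is lossless: the round trip restores the original object.
round_trip = pickle.loads(gzip.decompress(gzip.compress(raw, compresslevel=9)))
```

Higher levels usually give smaller files at the cost of more CPU time, so a moderate level can be a reasonable choice when results are written often.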

Unserializable objective functions

Note that if your objective function is non-trivial (e.g. it calls the MATLAB engine from Python), it might not be serializable, and skopt.dump will raise an exception when you try to store the optimization results. In this case, disable storing the objective function by calling skopt.dump with the keyword argument store_objective=False:

dump(res, 'result_without_objective.pkl', store_objective=False)

Notice that the entry 'func' is absent in the loaded object but is still present in the local variable:

res_loaded_without_objective = load('result_without_objective.pkl')

print('Loaded object: ', res_loaded_without_objective.specs['args'].keys())
print('Local variable:', res.specs['args'].keys())
Loaded object:  dict_keys(['dimensions', 'base_estimator', 'n_calls', 'n_random_starts', 'n_initial_points', 'initial_point_generator', 'acq_func', 'acq_optimizer', 'x0', 'y0', 'random_state', 'verbose', 'callback', 'n_points', 'n_restarts_optimizer', 'xi', 'kappa', 'n_jobs', 'model_queue_size', 'space_constraint'])
Local variable: dict_keys(['func', 'dimensions', 'base_estimator', 'n_calls', 'n_random_starts', 'n_initial_points', 'initial_point_generator', 'acq_func', 'acq_optimizer', 'x0', 'y0', 'random_state', 'verbose', 'callback', 'n_points', 'n_restarts_optimizer', 'xi', 'kappa', 'n_jobs', 'model_queue_size', 'space_constraint'])
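
The following sketch illustrates why an objective can be unserializable and why dropping it helps. It is an illustration with the standard-library pickle module, not skopt's implementation: `EngineBackedObjective` is a hypothetical class, a `threading.Lock` stands in for an external engine session, and `specs` is a simplified stand-in for the `specs` attribute of a results object.

```python
import pickle
import threading


class EngineBackedObjective:
    """Hypothetical objective holding a handle that cannot be pickled."""

    def __init__(self):
        # Assumption: a lock stands in for an external engine session.
        self._engine = threading.Lock()

    def __call__(self, x):
        with self._engine:
            return (x[0] - 1.0) ** 2


# Simplified stand-in for the `specs` dictionary of a results object.
specs = {"args": {"func": EngineBackedObjective(), "n_calls": 15}}

# Serializing everything fails because of the lock inside the objective.
try:
    pickle.dumps(specs)
    failed = False
except (TypeError, pickle.PicklingError) as exc:
    failed = True
    print("cannot serialize:", type(exc).__name__)

# Serializing a copy of the arguments without the objective succeeds.
args_without_func = {k: v for k, v in specs["args"].items() if k != "func"}
payload = pickle.dumps({"args": args_without_func})
restored = pickle.loads(payload)
print(sorted(restored["args"]))  # ['n_calls']
```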

Possible problems

  • Python version incompatibility: In general, objects serialized in Python 2 cannot be deserialized in Python 3 and vice versa.

  • Security issues: Once again, do not load any files from untrusted sources.

  • Extremely large results objects: If your optimization results object is extremely large, calling skopt.dump with store_objective=False might cause performance issues. This is because a deep copy of the results object is created without the objective function. If the objective function is not critical to you, you can simply delete it before calling skopt.dump. In this case, no deep copy is created:

del res.specs['args']['func']

dump(res, 'result_without_objective_2.pkl')
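
If you want to avoid the deep copy but still keep the objective in memory afterwards, one option is to remove the entry only for the duration of the dump and then put it back. The sketch below shows the idea with the standard-library pickle module. `dump_without_func` is a hypothetical helper, not part of skopt, and `result` is a plain dictionary standing in for a results object (a real one uses attribute access such as `res.specs`).

```python
import os
import pickle
import tempfile


def dump_without_func(result, path):
    """Pickle `result` with its "func" entry temporarily removed.

    Hypothetical helper: no deep copy is made, and the entry is restored
    afterwards even if pickling fails.
    """
    args = result["specs"]["args"]
    missing = object()  # sentinel: distinguishes "absent" from None
    saved = args.pop("func", missing)
    try:
        with open(path, "wb") as fh:
            pickle.dump(result, fh)
    finally:
        if saved is not missing:
            args["func"] = saved


# Plain-dict stand-in for a results object; the lambda cannot be pickled.
result = {
    "fun": -0.5,
    "specs": {"args": {"func": lambda x: x[0] ** 2, "n_calls": 15}},
}
path = os.path.join(tempfile.mkdtemp(), "result_without_objective.pkl")
dump_without_func(result, path)

with open(path, "rb") as fh:
    on_disk = pickle.load(fh)

print("func" in on_disk["specs"]["args"])  # False
print("func" in result["specs"]["args"])   # True
```

The try/finally block is the important part of the design: without it, a failed dump would leave the in-memory object without its objective function.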
