Dask-Optuna
===========

.. toctree::
   :maxdepth: 1
   :hidden:

   install
   api
   changelog

Dask-Optuna helps improve integration between Optuna and Dask.

What Dask-Optuna does
---------------------
Dask-Optuna leverages Optuna's existing distributed optimization capabilities to run
optimization trials in parallel on a Dask cluster. It does this by providing a
Dask-compatible :class:`dask_optuna.DaskStorage` storage class which wraps an
Optuna storage class (e.g. Optuna's in-memory or SQLite storage) and can be used
directly by Optuna. For example:

.. code-block:: python

   import dask.distributed
   import dask_optuna

   client = dask.distributed.Client()

   # Wraps Optuna's in-memory storage
   storage_1 = dask_optuna.DaskStorage()

   # Wraps Optuna's SQLite DB storage
   storage_2 = dask_optuna.DaskStorage("sqlite:///example.db")

The underlying Optuna storage object lives on the cluster's scheduler, and any method
call on the ``DaskStorage`` instance results in the same method being called on the
underlying Optuna storage object.
This offers two primary benefits:

1. It helps extend Optuna's ``InMemoryStorage`` class to run across multiple processes.
   This is important when using remote workers in a Dask cluster, or in situations
   where Python's GIL leads to less-than-ideal parallelization.
2. It reduces setup when using persistent storage (e.g. creating a SQLite DB that's
   globally available), because the underlying Optuna storage class on the scheduler
   is accessible to all workers in a Dask cluster.
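
The forwarding behaviour described above can be illustrated with a small,
library-free sketch. It is not how ``DaskStorage`` is implemented: the
``InMemoryStore`` and ``ForwardingStore`` classes and their method names are
hypothetical, and both objects live in a single process, whereas Dask-Optuna
forwards calls to an object held on the scheduler. The sketch only shows the
general idea of a wrapper whose method calls are replayed on a wrapped object.

.. code-block:: python

   class InMemoryStore:
       """Hypothetical stand-in for an underlying storage object."""

       def __init__(self):
           self._studies = {}
           self._next_id = 0

       def create_new_study(self, name):
           # Register a study and return its integer id
           study_id = self._next_id
           self._studies[study_id] = name
           self._next_id += 1
           return study_id

       def get_study_name(self, study_id):
           return self._studies[study_id]


   class ForwardingStore:
       """Hypothetical wrapper that forwards method calls to a wrapped object."""

       def __init__(self, wrapped):
           self._wrapped = wrapped
           self.calls = []  # names of forwarded methods, kept for illustration

       def __getattr__(self, name):
           # Only reached for attributes that are not defined on the wrapper
           target = getattr(self._wrapped, name)
           if not callable(target):
               return target

           def forward(*args, **kwargs):
               self.calls.append(name)
               return target(*args, **kwargs)

           return forward


   backend = InMemoryStore()
   store = ForwardingStore(backend)

   # Each call on the wrapper runs the same method on the wrapped object
   study_id = store.create_new_study("demo")
   print(store.get_study_name(study_id))

Because the state is held only by the wrapped object, any number of wrappers
pointing at the same object observe the same data. This is the property that
lets a single storage object serve many workers.
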
Example
-------

.. code-block:: python

   import optuna
   import joblib
   import dask.distributed
   import dask_optuna


   def objective(trial):
       # suggest_float replaces suggest_uniform, which is deprecated in Optuna 3.0
       x = trial.suggest_float("x", -10, 10)
       return (x - 2) ** 2


   with dask.distributed.Client() as client:
       # Create a study using Dask-compatible storage
       storage = dask_optuna.DaskStorage()
       study = optuna.create_study(storage=storage)

       # Optimize in parallel on your Dask cluster
       with joblib.parallel_backend("dask"):
           study.optimize(objective, n_trials=100, n_jobs=-1)

       print(f"best_params = {study.best_params}")

Community discussion
--------------------
Discussions on improving integration between Dask and Optuna are taking place in both the
Dask issue tracker and the Optuna issue tracker. Please feel free to join
these conversations if you'd like to get involved.

If you have feedback or thoughts on how Dask-Optuna may be improved, please feel free to open
an issue in Dask-Optuna's issue tracker.

FAQ
---
When would I use this?
^^^^^^^^^^^^^^^^^^^^^^
Dask-Optuna is useful if you want to use Optuna's ``InMemoryStorage`` when running trials in
parallel across multiple processes, or if the workers in your Dask cluster don't use the same
filesystem that your Dask ``Client`` uses. If, for example, you're using a
``dask.distributed.LocalCluster``, you may be better served by Optuna's built-in storage classes.