This guide explains the common interface shared by all benchmarks in DecisionFocusedLearningBenchmarks.jl. Understanding this interface is essential for using existing benchmarks and implementing new ones.
All benchmarks work with DataSample objects that encapsulate the data needed for decision-focused learning:
```julia
@kwdef struct DataSample{I,F,S,C}
    x::F = nothing         # Input features
    θ_true::C = nothing    # True cost/utility parameters
    y_true::S = nothing    # True optimal solution
    instance::I = nothing  # Problem instance object/additional data
end
```

The DataSample struct is flexible: depending on the benchmark type and use case, not all fields need to be populated.
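For instance, the keyword constructor makes it easy to build a sample with only the fields a given setting requires. A minimal sketch, assuming the package is installed and exports DataSample:

```julia
using DecisionFocusedLearningBenchmarks

# A supervised-style sample: features and true costs only;
# the y_true and instance fields default to nothing.
sample = DataSample(; x=randn(5), θ_true=rand(5))
```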
The package defines a hierarchy of three abstract types:
```
AbstractBenchmark
├── AbstractStochasticBenchmark{exogenous}
└── AbstractDynamicBenchmark{exogenous}
```
- AbstractBenchmark: static, single-stage optimization problems
- AbstractStochasticBenchmark{exogenous}: stochastic, single-stage optimization problems
- AbstractDynamicBenchmark{exogenous}: multi-stage sequential decision-making problems
The {exogenous} type parameter indicates whether the uncertainty distribution comes from external sources (true) or is influenced by decisions (false); this affects which methods are available.
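For illustration, methods can dispatch on this flag. The helper below is hypothetical and not part of the package:

```julia
# Hypothetical helper dispatching on the exogenous type parameter:
uncertainty_kind(::AbstractStochasticBenchmark{true}) = "exogenous (decision-independent)"
uncertainty_kind(::AbstractStochasticBenchmark{false}) = "endogenous (decision-dependent)"
```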
Every benchmark must implement a data generation method:
```julia
# Generate a single sample
generate_sample(benchmark::AbstractBenchmark, rng::AbstractRNG; kwargs...) -> DataSample
```

This method should generate a single DataSample given a random number generator and optional keyword arguments.
If needed, benchmarks can instead override the generate_dataset method to directly create the entire dataset:
```julia
generate_dataset(benchmark::AbstractBenchmark, size::Int; kwargs...) -> Vector{DataSample}
```

The default generate_dataset implementation calls generate_sample repeatedly, but benchmarks can override it for custom dataset generation logic.
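As a sketch of how a new benchmark plugs into this interface, consider a hypothetical toy benchmark. MyToyBenchmark and its linear cost model are illustrative assumptions, not part of the package:

```julia
using Random

# Hypothetical static benchmark with 5-dimensional features and costs.
struct MyToyBenchmark <: AbstractBenchmark end

function generate_sample(::MyToyBenchmark, rng::AbstractRNG; dim=5)
    x = randn(rng, dim)
    θ_true = 2 .* x .+ 0.1 .* randn(rng, dim)  # costs correlated with features
    return DataSample(; x, θ_true)
end
```

With this single method implemented, the default generate_dataset works out of the box.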
Benchmarks provide the building blocks for decision-focused learning policies:
```julia
# Create a statistical model (e.g., a neural network)
generate_statistical_model(benchmark::AbstractBenchmark; kwargs...)

# Create an optimization maximizer/solver
generate_maximizer(benchmark::AbstractBenchmark; kwargs...)
```

The statistical model typically maps features x to cost parameters θ. The maximizer solves the optimization problem given cost parameters θ (and potentially additional problem-dependent keyword arguments), returning a decision y.
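Put together, these building blocks form the standard decision-focused learning pipeline. A sketch, where benchmark is assumed to be any concrete AbstractBenchmark instance:

```julia
model = generate_statistical_model(benchmark)
maximizer = generate_maximizer(benchmark)
dataset = generate_dataset(benchmark, 10)

sample = dataset[1]
θ_pred = model(sample.x)    # features -> predicted cost parameters
y_pred = maximizer(θ_pred)  # predicted costs -> decision
```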
Benchmarks can provide baseline policies for comparison and evaluation:
```julia
# Get baseline policies for comparison
generate_policies(benchmark::AbstractBenchmark) -> Tuple{Policy}
```

This returns a tuple of Policy objects representing different benchmark-specific policies:
```julia
struct Policy{F}
    name::String
    description::String
    policy_function::F
end
```

A Policy is simply a function bundled with a name and description.
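For example, a benchmark-specific heuristic could be wrapped as follows (the available_actions helper is hypothetical, standing in for whatever the concrete benchmark provides):

```julia
# Hypothetical baseline: pick a feasible action uniformly at random.
random_policy = Policy(
    "random",
    "Selects a feasible decision uniformly at random",
    env -> rand(available_actions(env)),  # available_actions is hypothetical
)
```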
Policies can be evaluated across multiple instances/environments using:
```julia
evaluate_policy!(policy::Policy, instances; kwargs...) -> (rewards, data_samples)
```

Optional methods are available for analysis and visualization:
```julia
# Visualize data samples
plot_data(benchmark::AbstractBenchmark, sample::DataSample; kwargs...)
plot_instance(benchmark::AbstractBenchmark, instance; kwargs...)
plot_solution(benchmark::AbstractBenchmark, sample::DataSample, solution; kwargs...)

# Compute optimality gap
compute_gap(benchmark::AbstractBenchmark, dataset, model, maximizer) -> Float64

# Evaluate objective value
objective_value(benchmark::AbstractBenchmark, sample::DataSample, solution)
```

Static benchmarks follow the basic interface described above.
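These evaluation utilities combine naturally when assessing a trained pipeline. A sketch, assuming benchmark, dataset, model, and maximizer as in the earlier sections:

```julia
# Average optimality gap of the learned pipeline over the dataset
gap = compute_gap(benchmark, dataset, model, maximizer)

# Objective value of a single predicted solution
sample = dataset[1]
y = maximizer(model(sample.x))
val = objective_value(benchmark, sample, y)
```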
Exogenous stochastic benchmarks add methods for scenario generation and anticipative solutions:
```julia
# Generate uncertainty scenarios (for exogenous benchmarks)
generate_scenario(benchmark::AbstractStochasticBenchmark{true}, instance; kwargs...)

# Solve the anticipative optimization problem for a given scenario
generate_anticipative_solution(benchmark::AbstractStochasticBenchmark{true}, instance, scenario; kwargs...)
```

To model sequential decision-making, dynamic benchmarks additionally work with environments. For this, they implement methods to create environments from instances or datasets:
```julia
# Create an environment for sequential decision-making
generate_environment(benchmark::AbstractDynamicBenchmark, instance, rng; kwargs...) -> <:AbstractEnvironment

# Generate multiple environments
generate_environments(benchmark::AbstractDynamicBenchmark, dataset; kwargs...) -> Vector{<:AbstractEnvironment}
```

As with generate_dataset and generate_sample, one only needs to implement generate_environment; generate_environments has a default implementation that calls it repeatedly.
The AbstractEnvironment interface is defined as follows:
```julia
# Environment methods
get_seed(env::AbstractEnvironment)                # Get current RNG seed
reset!(env::AbstractEnvironment; reset_rng::Bool, seed=get_seed(env))  # Reset to initial state
observe(env::AbstractEnvironment) -> (obs, info)  # Get current observation
step!(env::AbstractEnvironment, action) -> reward # Take action, receive reward
is_terminated(env::AbstractEnvironment) -> Bool   # Check if episode ended
```
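A policy rollout built on this interface might look like the following sketch:

```julia
# Roll out a policy function in an environment, accumulating reward.
function rollout(env::AbstractEnvironment, policy_fn)
    reset!(env)
    total_reward = 0.0
    while !is_terminated(env)
        obs, info = observe(env)
        total_reward += step!(env, policy_fn(obs))
    end
    return total_reward
end
```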