Commit cdacf3b

committed: Improve documentation
1 parent 508483a commit cdacf3b

10 files changed

Lines changed: 215 additions & 65 deletions

docs/src/index.md

Lines changed: 6 additions & 6 deletions
@@ -24,20 +24,20 @@ x \;\longrightarrow\; \boxed{\,\text{Statistical model } \varphi_w\,}
 ```
 
 Where:
-- **Statistical model** $\varphi_w$: machine learning predictor (e.g., neural network)
-- **CO algorithm** $f$: combinatorial optimization solver
 - **Instance** $x$: input data (e.g., features, context)
+- **Statistical model** $\varphi_w$: machine learning predictor (e.g., neural network)
 - **Parameters** $\theta$: predicted parameters for the optimization problem solved by `f`
+- **CO algorithm** $f$: combinatorial optimization solver
 - **Solution** $y$: output decision/solution
 
 ## Package Overview
 
 **DecisionFocusedLearningBenchmarks.jl** provides a collection of benchmark problems for evaluating decision-focused learning algorithms. The package offers:
 
-- **Standardized benchmark problems** spanning diverse application domains
-- **Common interfaces** for creating datasets, statistical models, and optimization algorithms
-- **Ready-to-use DFL policies** compatible with [InferOpt.jl](https://github.com/JuliaDecisionFocusedLearning/InferOpt.jl) and the whole [JuliaDecisionFocusedLearning](https://github.com/JuliaDecisionFocusedLearning) ecosystem
-- **Evaluation tools** for comparing algorithm performance
+- **Collection of benchmark problems** spanning diverse applications
+- **Common tools** for creating datasets, statistical models, and optimization algorithms
+- **Generic interface** for building custom benchmarks
+- Compatibility with [InferOpt.jl](https://github.com/JuliaDecisionFocusedLearning/InferOpt.jl) and the whole [JuliaDecisionFocusedLearning](https://github.com/JuliaDecisionFocusedLearning) ecosystem
 
 ## Benchmark Categories
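The pipeline described above (instance `x` → statistical model `φ_w` → parameters `θ` → CO algorithm `f` → solution `y`) can be instantiated end-to-end with toy components in plain Julia. The following is a hypothetical sketch for intuition only; the linear model and argmax "solver" are stand-ins, not the package's API:

```julia
# Toy DFL pipeline: x -> φ_w -> θ -> f -> y (illustrative stand-ins only).

w = [1.0 0.0; 0.0 -1.0]   # weights of the "statistical model"
φ(x) = w * x              # predicts cost parameters θ from instance x

# the simplest possible "combinatorial solver": pick the best coordinate
function f(θ)
    y = zeros(length(θ))
    y[argmax(θ)] = 1.0
    return y              # one-hot decision vector
end

x = [2.0, 3.0]  # instance features
θ = φ(x)        # predicted parameters: [2.0, -3.0]
y = f(θ)        # solution: [1.0, 0.0]
```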

docs/src/using_benchmarks.md

Lines changed: 48 additions & 32 deletions
@@ -1,8 +1,42 @@
 # Using Benchmarks
 
-This guide covers everything you need to work with existing benchmarks in
-DecisionFocusedLearningBenchmarks.jl: generating datasets, assembling DFL pipeline
-components, and evaluating results.
+This guide covers everything you need to work with existing benchmarks in DecisionFocusedLearningBenchmarks.jl: generating datasets, assembling DFL pipeline components, applying algorithms, and evaluating results.
+
+---
+
+## What is a benchmark?
+
+A benchmark bundles a problem family (an instance generator, a combinatorial solver, and a statistical model architecture) into a single object. It provides everything needed to run a Decision-Focused Learning experiment out of the box, without having to create each component from scratch.
+Three abstract types cover the main settings:
+- **`AbstractBenchmark`**: static problems (one instance, one decision)
+- **`AbstractStochasticBenchmark{exogenous}`**: stochastic problems (type parameter indicates whether uncertainty is exogenous)
+- **`AbstractDynamicBenchmark`**: sequential / multi-stage problems
+
+The sections below explain what changes between these settings. For most purposes, start with a static benchmark to understand the core workflow.
+
+---
+
+## Core workflow
+
+Every benchmark exposes three key methods. For any static benchmark:
+
+```julia
+bench = ArgmaxBenchmark()
+model = generate_statistical_model(bench; seed=0) # Flux model
+maximizer = generate_maximizer(bench) # combinatorial oracle
+dataset = generate_dataset(bench, 100; seed=0) # Vector{DataSample}
+```
+
+- **`generate_statistical_model`**: returns an untrained neural network that maps input features `x` to cost parameters `θ`.
+- **`generate_maximizer`**: returns a callable `(θ; context...) -> y` that solves the combinatorial problem given cost parameters.
+- **`generate_dataset`**: returns labeled training data as a `Vector{DataSample}`.
+
+At inference time, the model and the maximizer compose naturally as an end-to-end policy:
+
+```julia
+θ = model(sample.x) # predict cost parameters
+y = maximizer(θ; sample.context...) # solve the optimization problem
+```
 
 ---
 
@@ -18,8 +52,7 @@ All data in the package is represented as [`DataSample`](@ref) objects.
 | `context` | `NamedTuple` | Solver kwargs spread into `maximizer(θ; sample.context...)` |
 | `extra` | `NamedTuple` | Non-solver data (scenario, reward, step, …), never passed to the solver |
 
-Not all fields are populated in every sample. For convenience, named entries inside
-`context` and `extra` can be accessed directly on the sample via property forwarding:
+Not all fields are populated in every sample, depending on the setting. For convenience, named entries inside `context` and `extra` can be accessed directly on the sample via property forwarding:
 
 ```julia
 sample.instance # looks up :instance in context first, then in extra
@@ -28,12 +61,11 @@ sample.scenario # looks up :scenario in context first, then in extra
 
 ---
 
-## Generating datasets for training
+## Benchmark type specifics
 
 ### Static benchmarks
 
-For static benchmarks (`<:AbstractBenchmark`) the framework already computes the
-ground-truth label `y`:
+For static benchmarks (`<:AbstractBenchmark`), `generate_dataset` may compute a default ground-truth label `y` if the benchmark implements it:
 
 ```julia
 bench = ArgmaxBenchmark()
@@ -43,15 +75,13 @@ dataset = generate_dataset(bench, 100; seed=0) # Vector{DataSample} with x, y,
 You can override the labels by providing a `target_policy`:
 
 ```julia
-my_policy = sample -> DataSample(; sample.context..., x=sample.x,
-    y=my_algorithm(sample.instance))
+my_policy = sample -> DataSample(; sample.context..., x=sample.x, y=my_algorithm(sample.instance))
 dataset = generate_dataset(bench, 100; seed=0, target_policy=my_policy)
 ```
 
 ### Stochastic benchmarks (exogenous)
 
-For `AbstractStochasticBenchmark{true}` benchmarks the default call returns
-*unlabeled* samples, each sample carries one scenario in `sample.extra.scenario`:
+For `AbstractStochasticBenchmark{true}` benchmarks, the default call returns *unlabeled* samples; each sample carries one scenario in `sample.extra.scenario`:
 
 ```julia
 bench = StochasticVehicleSchedulingBenchmark()
@@ -85,20 +115,22 @@ Dynamic benchmarks use a two-step workflow:
 ```julia
 bench = DynamicVehicleSchedulingBenchmark()
 
-# Step 1 create environments (reusable across experiments)
+# Step 1: create environments (reusable across experiments)
 envs = generate_environments(bench, 10; seed=0)
 
-# Step 2 roll out a policy to collect training trajectories
+# Step 2: roll out a policy to collect training trajectories
 policy = generate_baseline_policies(bench)[1] # e.g. lazy policy
 dataset = generate_dataset(bench, envs; target_policy=policy)
 # dataset is a flat Vector{DataSample} of all steps across all trajectories
 ```
 
-`target_policy` is **required** for dynamic benchmarks (there is no default label).
+`target_policy` is **required** to create datasets for dynamic benchmarks (there is no default label).
 It must be a callable `(env) -> Vector{DataSample}` that performs a full episode
 rollout and returns the resulting trajectory.
 
-### Seed / RNG control
+---
+
+## Seed / RNG control
 
 All `generate_dataset` and `generate_environments` calls accept either `seed`
 (creates an internal `MersenneTwister`) or `rng` for full control:
@@ -111,22 +143,6 @@ dataset = generate_dataset(bench, 50; rng=rng)
 
 ---
 
-## DFL pipeline components
-
-```julia
-model = generate_statistical_model(bench; seed=0) # untrained Flux model
-maximizer = generate_maximizer(bench) # combinatorial oracle
-```
-
-These two pieces compose naturally:
-
-```julia
-θ = model(sample.x) # predict cost parameters
-y = maximizer(θ; sample.context...) # solve the optimization problem
-```
-
----
-
 ## Evaluation
 
 ```julia
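The property forwarding documented above (named entries in `context` and `extra` accessible directly on the sample) can be sketched with a hypothetical stand-in struct by overloading `Base.getproperty`. This is illustration only, not the package's implementation:

```julia
# Minimal sketch of DataSample-style property forwarding (hypothetical struct).
struct ToySample
    x::Any
    context::NamedTuple
    extra::NamedTuple
end

function Base.getproperty(s::ToySample, name::Symbol)
    name in fieldnames(ToySample) && return getfield(s, name)
    ctx = getfield(s, :context)
    haskey(ctx, name) && return ctx[name]   # context takes precedence...
    return getfield(s, :extra)[name]        # ...then fall back to extra
end

s = ToySample([1.0, 2.0], (; instance=42), (; scenario=7))
s.x         # real field, returned directly
s.instance  # found in context
s.scenario  # falls through to extra
```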

docs/src/warcraft_tutorial.md

Lines changed: 155 additions & 0 deletions
```@meta
EditURL = "tutorials/warcraft_tutorial.jl"
```

# Path-finding on image maps

In this tutorial, we showcase the capabilities of DecisionFocusedLearningBenchmarks.jl on one of its main benchmarks: the Warcraft benchmark.
This benchmark is a simple path-finding problem where the goal is to find the shortest path between the top-left and bottom-right corners of a given image map.
The map is a 2D image of a 12x12 grid, where each cell has an unknown travel cost depending on the terrain type.

First, let's load the package and create a benchmark object:

````@example warcraft_tutorial
using DecisionFocusedLearningBenchmarks
b = WarcraftBenchmark()
````

## Dataset generation

Benchmark objects behave as generators for the various elements needed to build an algorithm for the problem.
First of all, every benchmark can generate datasets through the [`generate_dataset`](@ref) method.
This method takes as input the benchmark object and a second argument specifying the number of samples to generate:

````@example warcraft_tutorial
dataset = generate_dataset(b, 50);
nothing #hide
````

We obtain a vector of [`DataSample`](@ref) objects containing all the data needed for the problem.
Subdatasets can be created through regular slicing:

````@example warcraft_tutorial
train_dataset, test_dataset = dataset[1:45], dataset[46:50]
````

Indexing an individual sample returns a [`DataSample`](@ref) with its `x`, `θ`, and `y` fields populated:

````@example warcraft_tutorial
sample = test_dataset[1]
````

`x` corresponds to the input features, i.e. the input image (a 3D array) in the Warcraft benchmark case:

````@example warcraft_tutorial
x = sample.x
````

`θ` corresponds to the true unknown terrain weights. We store the opposite of the true weights in order to formulate the optimization problem as a maximization problem:

````@example warcraft_tutorial
θ_true = sample.θ
````

`y` corresponds to the optimal shortest path, encoded as a binary matrix:

````@example warcraft_tutorial
y_true = sample.y
````

`context` is not used in this benchmark (no solver kwargs are needed), so it is empty:

````@example warcraft_tutorial
isempty(sample.context)
````

For some benchmarks, the [`plot_data`](@ref) method is provided to visualize the data:

````@example warcraft_tutorial
plot_data(b, sample)
````

We can see here the terrain image, the true terrain weights, and the true shortest path avoiding the high-cost cells.

## Building a pipeline

DecisionFocusedLearningBenchmarks also provides methods to build a hybrid machine learning and combinatorial optimization pipeline for the benchmark.
First, the [`generate_statistical_model`](@ref) method generates a machine learning predictor that predicts cell weights from the input image:

````@example warcraft_tutorial
model = generate_statistical_model(b)
````

In the case of the Warcraft benchmark, the model is a convolutional neural network built using the Flux.jl package.

````@example warcraft_tutorial
θ = model(x)
````

Note that the model is not trained yet, and its parameters are randomly initialized.

Finally, the [`generate_maximizer`](@ref) method generates a combinatorial optimization algorithm that takes the predicted cell weights as input and returns the corresponding shortest path:

````@example warcraft_tutorial
maximizer = generate_maximizer(b; dijkstra=true)
````

In the case of the Warcraft benchmark, this method has an additional keyword argument to choose the algorithm to use: Dijkstra's algorithm or the Bellman-Ford algorithm.

````@example warcraft_tutorial
y = maximizer(θ)
````

As we can see, the untrained pipeline currently predicts random noise as cell weights, and therefore the maximizer returns a straight-line path.

````@example warcraft_tutorial
plot_data(b, DataSample(; x, θ, y))
````

We can evaluate the current pipeline performance using the optimality gap metric:

````@example warcraft_tutorial
starting_gap = compute_gap(b, test_dataset, model, maximizer)
````
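For intuition, here is a hedged sketch of what an optimality-gap metric like `compute_gap` plausibly measures: the relative excess cost of the predicted path over the optimal one, under the convention above that `θ` stores negated weights. This is an assumed definition for illustration; the package's implementation may differ (e.g. by averaging over the dataset):

```julia
# Assumed definition: (cost of predicted path - cost of optimal path) / cost of optimal path.
# θ stores negated terrain weights, so the true path cost is -θ ⋅ y.
path_cost(θ_true, y) = -sum(θ_true .* y)

function optimality_gap(θ_true, y_pred, y_opt)
    c_pred = path_cost(θ_true, y_pred)
    c_opt = path_cost(θ_true, y_opt)
    return (c_pred - c_opt) / abs(c_opt)
end

θ_toy = [-1.0, -2.0, -5.0]   # negated weights of three cells
y_opt = [1.0, 1.0, 0.0]      # cheap path, cost 3
y_bad = [1.0, 0.0, 1.0]      # detour through the costly cell, cost 6
optimality_gap(θ_toy, y_bad, y_opt)  # 1.0: the bad path costs twice the optimum
```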
## Using a learning algorithm

We can now train the model using the InferOpt.jl package:

````@example warcraft_tutorial
using InferOpt
using Flux
using Plots

perturbed_maximizer = PerturbedMultiplicative(maximizer; ε=0.2, nb_samples=100)
loss = FenchelYoungLoss(perturbed_maximizer)

opt_state = Flux.setup(Adam(1e-3), model)
loss_history = Float64[]
for epoch in 1:50
    val, grads = Flux.withgradient(model) do m
        sum(loss(m(x), y) for (; x, y) in train_dataset) / length(train_dataset)
    end
    Flux.update!(opt_state, model, grads[1])
    push!(loss_history, val)
end

plot(loss_history; xlabel="Epoch", ylabel="Loss", title="Training loss")
````

After training, we can recompute the optimality gap:

````@example warcraft_tutorial
final_gap = compute_gap(b, test_dataset, model, maximizer)
````

And visualize the predictions of the trained pipeline:

````@example warcraft_tutorial
θ = model(x)
y = maximizer(θ)
plot_data(b, DataSample(; x, θ, y))
````

---

*This page was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*

src/DynamicAssortment/DynamicAssortment.jl

Lines changed: 1 addition & 1 deletion
@@ -139,7 +139,7 @@ function Utils.generate_baseline_policies(::DynamicAssortmentBenchmark)
         "policy that selects the assortment with the highest expected revenue",
         expert_policy,
     )
-    return (expert, greedy)
+    return (; expert, greedy)
 end
 
 export DynamicAssortmentBenchmark
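The `(expert, greedy)` → `(; expert, greedy)` change swaps a plain tuple for a NamedTuple: iteration and positional indexing still work, but the policies also become accessible by name. A small self-contained illustration with stand-in policies (the real ones wrap benchmark-specific logic):

```julia
# Stand-in policies, each a callable taking an environment.
expert = env -> :expert_action
greedy = env -> :greedy_action

policies = (; expert, greedy)   # NamedTuple, equivalent to (expert=expert, greedy=greedy)

policies.greedy(nothing)   # access by name
policies[1](nothing)       # positional access still works
keys(policies)             # (:expert, :greedy)
```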

src/DynamicVehicleScheduling/DynamicVehicleScheduling.jl

Lines changed: 1 addition & 15 deletions
@@ -115,20 +115,6 @@ end
 """
 $TYPEDSIGNATURES
 
-Generate an anticipative solution for the dynamic vehicle scheduling benchmark.
-The solution is computed using the anticipative solver with the benchmark's feature configuration.
-"""
-function Utils.generate_anticipative_solution(
-    b::DynamicVehicleSchedulingBenchmark, args...; kwargs...
-)
-    return anticipative_solver(
-        args...; kwargs..., two_dimensional_features=b.two_dimensional_features
-    )
-end
-
-"""
-$TYPEDSIGNATURES
-
 Return the anticipative solver for the dynamic vehicle scheduling benchmark.
 The callable takes a scenario and solver kwargs (including `instance`) and returns a
 training trajectory as a `Vector{DataSample}`.
@@ -160,7 +146,7 @@ function Utils.generate_baseline_policies(::DynamicVehicleSchedulingBenchmark)
         "Greedy policy that dispatches vehicles to the nearest customer.",
         greedy_policy,
     )
-    return (lazy, greedy)
+    return (; lazy, greedy)
 end
 
 """

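As documented for dynamic benchmarks, a `target_policy` must be a callable `(env) -> Vector{DataSample}` that performs a full episode rollout. A hypothetical, self-contained sketch of that shape, with a toy environment and plain NamedTuples standing in for `DataSample`:

```julia
# Toy environment: an episode of fixed length with a scalar state.
struct ToyEnv
    horizon::Int
end

# Rollout policy: maps an environment to a trajectory of training samples.
function rollout(env::ToyEnv)
    trajectory = NamedTuple[]
    state = 0.0
    for step in 1:env.horizon
        x = state                     # features observed at this step
        y = x < 2 ? 1.0 : 0.0         # action chosen by the target policy
        push!(trajectory, (; x, y))   # record one DataSample-like entry
        state += y                    # environment transition
    end
    return trajectory
end

traj = rollout(ToyEnv(4))
length(traj)  # 4 steps collected from one episode
```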
src/FixedSizeShortestPath/FixedSizeShortestPath.jl

Lines changed: 0 additions & 2 deletions
@@ -142,6 +142,4 @@ function Utils.generate_statistical_model(
 end
 
 export FixedSizeShortestPathBenchmark
-export generate_dataset, generate_maximizer, generate_statistical_model
-
 end

src/Maintenance/Maintenance.jl

Lines changed: 2 additions & 3 deletions
@@ -22,7 +22,6 @@ The number of simultaneous maintenance operations is limited by a maintenance ca
 
 # Fields
 $TYPEDFIELDS
-
 """
 struct MaintenanceBenchmark <: AbstractDynamicBenchmark{true}
     "number of components"
@@ -126,7 +125,7 @@ end
 """
 $TYPEDSIGNATURES
 
-Returns two policies for the dynamic assortment benchmark:
+Returns a policy for the maintenance benchmark:
 - `Greedy`: maintains components when they are in the last state before failure, up to the maintenance capacity
 """
 function Utils.generate_baseline_policies(::MaintenanceBenchmark)
@@ -135,7 +134,7 @@ function Utils.generate_baseline_policies(::MaintenanceBenchmark)
         "policy that maintains components when they are in the last state before failure, up to the maintenance capacity",
         greedy_policy,
     )
-    return (greedy,)
+    return (; greedy)
 end
 
 export MaintenanceBenchmark
