fix tutorial

BatyLeo · BatyLeo · commit 115113be1197 · 2026-03-17T17:01:00.000+01:00
diff --git a/docs/src/benchmarks/maintenance.md b/docs/src/benchmarks/maintenance.md
@@ -7,17 +7,17 @@ The Maintenance problem with resource constraint is a sequential decision-making
 
 ### Overview
 
-In this benchmark, a system consists of $N$ identical components, each of which can degrade over $n$ discrete states. State $1$ means that the component is new, state $n$ means that the component is failed. At each time step, the agent can maintain up to $K$ components.  
+In this benchmark, a system consists of ``N`` identical components, each of which can degrade over ``n`` discrete states. State ``1`` means that the component is new, state $n$ means that the component is failed. At each time step, the agent can maintain up to $K$ components.  
 
 This forms an endogenous multistage stochastic optimization problem, where the agent must plan maintenance actions over the horizon.
 
 ### Mathematical Formulation
 
 The maintenance problem can be formulated as a finite-horizon Markov Decision Process (MDP) with the following components:
 
-**State Space** $\mathcal{S}$: At time step $t$, the state $s_t \in [1:n]^N$ is the degradation state for each component.
+**State Space** ``\mathcal{S}``: At time step ``t``, the state ``s_t \in [1:n]^N`` is the degradation state for each component.
 
-**Action Space** $\mathcal{A}$: The action at time $t$ is the set of components that are maintained at time $t$:
+**Action Space** ``\mathcal{A}``: The action at time ``t`` is the set of components that are maintained at time ``t``:
 ```math
 a_t \subseteq \{1, 2, \ldots, N\} \text{ such that } |a_t| \leq K
 ```
@@ -51,9 +51,9 @@ Here, \(p\) is the degradation probability, \(s_t^i\) is the current state of co
 
 The immediate cost at time \(t\) is:
 
-$$
+```math
 c(s_t, a_t) = \Big( c_m \cdot |a_t| + c_f \cdot \#\{ i : s_t^i = n \} \Big)
-$$
+```
 
 Where:
 
diff --git a/docs/src/tutorials/warcraft_tutorial.jl b/docs/src/tutorials/warcraft_tutorial.jl
@@ -8,6 +8,7 @@ The map is represented as a 2D image representing a 12x12 grid, each cell having
 
 # First, let's load the package and create a benchmark object as follows:
 using DecisionFocusedLearningBenchmarks
+using Plots
 b = WarcraftBenchmark()
 
 # ## Dataset generation
@@ -59,7 +60,6 @@ starting_gap = compute_gap(b, test_dataset, model, maximizer)
 # We can now train the model using the InferOpt.jl package:
 using InferOpt
 using Flux
-using Plots
 
 perturbed_maximizer = PerturbedMultiplicative(maximizer; ε=0.2, nb_samples=100)
 loss = FenchelYoungLoss(perturbed_maximizer)