diff --git a/README.md b/README.md
index ffbb9bf..8765141 100644
--- a/README.md
+++ b/README.md
@@ -6,15 +6,84 @@
 [![Coverage](https://codecov.io/gh/JuliaDecisionFocusedLearning/DecisionFocusedLearningBenchmarks.jl/branch/main/graph/badge.svg)](https://app.codecov.io/gh/JuliaDecisionFocusedLearning/DecisionFocusedLearningBenchmarks.jl)
 [![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/JuliaDiff/BlueStyle)
 
-This repository contains a collection of benchmark problems for decision-focused learning algorithms.
-It provides a common interface for creating datasets, associated statistical models and combinatorial optimization maximizers for building decision-focused learning pipelines.
-They can be used for instance as benchmarks for tools in [InferOpt.jl](https://github.com/JuliaDecisionFocusedLearning/InferOpt.jl), but can be used in any other context as well.
-
-Currently, this package provides the following benchmark problems (many more to come!):
-- `SubsetSelectionBenchmark`: a minimalist subset selection problem.
-- `FixedSizeShortestPathBenchmark`: shortest path problem with on a graph with fixed size.
-- `WarcraftBenchmark`: shortest path problem on image maps
-- `PortfolioOptimizationBenchmark`: portfolio optimization problem.
-- `StochasticVehicleSchedulingBenchmark`: stochastic vehicle scheduling problem.
-
-See the [documentation](https://JuliaDecisionFocusedLearning.github.io/DecisionFocusedLearningBenchmarks.jl/stable/) for more details.
+## What is Decision-Focused Learning?
+
+Decision-focused learning (DFL) is a paradigm that integrates machine learning prediction with combinatorial optimization to make better decisions under uncertainty. Unlike traditional "predict-then-optimize" approaches that optimize prediction accuracy independently of downstream decision quality, DFL directly optimizes end-to-end decision performance.
+
+A typical DFL algorithm involves training a parametrized policy that combines a statistical predictor with an optimization component:
+```math
+\xrightarrow[\text{Instance}]{x}
+\fbox{Statistical model $\varphi_w$}
+\xrightarrow[\text{Parameters}]{\theta}
+\fbox{CO algorithm $f$}
+\xrightarrow[\text{Solution}]{y}
+```
+
+Where:
+- **Instance** $x$: input data (e.g., features, context)
+- **Statistical model** $\varphi_w$: machine learning predictor (e.g., neural network)
+- **Parameters** $\theta$: predicted parameters for the optimization problem
+- **CO algorithm** $f$: combinatorial optimization solver
+- **Solution** $y$: final decision/solution
+
+## Package Overview
+
+**DecisionFocusedLearningBenchmarks.jl** provides a comprehensive collection of benchmark problems for evaluating decision-focused learning algorithms. The package offers:
+
+- **Standardized benchmark problems** spanning diverse application domains
+- **Common interfaces** for datasets, statistical models, and optimization components  
+- **Ready-to-use pipelines** compatible with [InferOpt.jl](https://github.com/JuliaDecisionFocusedLearning/InferOpt.jl) and the whole [JuliaDecisionFocusedLearning](https://github.com/JuliaDecisionFocusedLearning) ecosystem
+- **Evaluation tools** for comparing algorithm performance
+
+## Benchmark Categories
+
+The package organizes benchmarks into three main categories based on their problem structure:
+
+### Static Benchmarks (`AbstractBenchmark`)
+Single-stage optimization problems with no randomness involved:
+- [`ArgmaxBenchmark`](@ref): argmax toy problem
+- [`Argmax2DBenchmark`](@ref): 2D argmax toy problem
+- [`RankingBenchmark`](@ref): ranking problem
+- [`SubsetSelectionBenchmark`](@ref): select optimal subset of items
+- [`PortfolioOptimizationBenchmark`](@ref): portfolio optimization problem
+- [`FixedSizeShortestPathBenchmark`](@ref): find shortest path on grid graphs with fixed size
+- [`WarcraftBenchmark`](@ref): shortest path on image maps
+
+### Stochastic Benchmarks (`AbstractStochasticBenchmark`)  
+Single-stage problems with random noise affecting the objective:
+- [`StochasticVehicleSchedulingBenchmark`](@ref): stochastic vehicle scheduling under delay uncertainty
+
+### Dynamic Benchmarks (`AbstractDynamicBenchmark`)
+Multi-stage sequential decision-making problems:
+- [`DynamicVehicleSchedulingBenchmark`](@ref): multi-stage vehicle scheduling under customer uncertainty
+- [`DynamicAssortmentBenchmark`](@ref): sequential product assortment selection
+
+## Getting Started
+
+In a few lines of code, you can create benchmark instances, generate datasets, initialize learning components, and evaluate performance, using the same syntax across all benchmarks:
+
+```julia
+using DecisionFocusedLearningBenchmarks
+
+# Create a benchmark instance for the argmax problem
+benchmark = ArgmaxBenchmark()
+
+# Generate training data
+dataset = generate_dataset(benchmark, 100)
+
+# Initialize policy components
+model = generate_statistical_model(benchmark)
+maximizer = generate_maximizer(benchmark)
+
+# Training algorithm you want to use
+# ... your training code here ...
+
+# Evaluate performance
+gap = compute_gap(benchmark, dataset, model, maximizer)
+```
+
+## Related Packages
+
+This package is part of the [JuliaDecisionFocusedLearning](https://github.com/JuliaDecisionFocusedLearning) organization and is built to be compatible with other packages in the ecosystem:
+- **[InferOpt.jl](https://github.com/JuliaDecisionFocusedLearning/InferOpt.jl)**: differentiable optimization layers and losses for decision-focused learning
+- **[DecisionFocusedLearningAlgorithms.jl](https://github.com/JuliaDecisionFocusedLearning/DecisionFocusedLearningAlgorithms.jl)**: collection of generic black-box implementations of decision-focused learning algorithms
diff --git a/docs/make.jl b/docs/make.jl
index 4a85f93..b4eaaf9 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -9,12 +9,11 @@ tutorial_dir = joinpath(@__DIR__, "src", "tutorials")
 benchmarks_dir = joinpath(@__DIR__, "src", "benchmarks")
 api_dir = joinpath(@__DIR__, "src", "api")
 
-api_files = map(x -> joinpath("api", x), readdir(api_dir))
 tutorial_files = readdir(tutorial_dir)
 md_tutorial_files = [split(file, ".")[1] * ".md" for file in tutorial_files]
 benchmark_files = [joinpath("benchmarks", e) for e in readdir(benchmarks_dir)]
 
-include_tutorial = true
+include_tutorial = false
 
 if include_tutorial
     for file in tutorial_files
@@ -29,10 +28,13 @@ makedocs(;
     sitename="DecisionFocusedLearningBenchmarks.jl",
     format=Documenter.HTML(; size_threshold=typemax(Int)),
     pages=[
-        "Home" => "index.md",
+        "Home" => [
+            "Getting started" => "index.md",
+            "Understanding Benchmark Interfaces" => "benchmark_interfaces.md",
+        ],
         "Tutorials" => include_tutorial ? md_tutorial_files : [],
         "Benchmark problems list" => benchmark_files,
-        "API reference" => "api/api.md",
+        "API reference" => "api.md",
     ],
 )
 
diff --git a/docs/src/api/api.md b/docs/src/api.md
similarity index 89%
rename from docs/src/api/api.md
rename to docs/src/api.md
index 36135ca..9f45d03 100644
--- a/docs/src/api/api.md
+++ b/docs/src/api.md
@@ -2,15 +2,11 @@
 
 ## Interface
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Utils]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Utils]
 Public = false
@@ -18,15 +14,11 @@ Public = false
 
 ## Argmax2D
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Argmax2D]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Argmax2D]
 Public = false
@@ -34,15 +26,11 @@ Public = false
 
 ## Argmax
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Argmax]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Argmax]
 Public = false
@@ -50,15 +38,11 @@ Public = false
 
 ## Dynamic Vehicle Scheduling
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.DynamicVehicleScheduling]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.DynamicVehicleScheduling]
 Public = false
@@ -66,15 +50,11 @@ Public = false
 
 ## Dynamic Assortment
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.DynamicAssortment]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.DynamicAssortment]
 Public = false
@@ -82,15 +62,11 @@ Public = false
 
 ## Fixed-size shortest path
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.FixedSizeShortestPath]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.FixedSizeShortestPath]
 Public = false
@@ -98,15 +74,11 @@ Public = false
 
 ## Portfolio Optimization
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.PortfolioOptimization]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.PortfolioOptimization]
 Public = false
@@ -114,15 +86,11 @@ Public = false
 
 ## Ranking
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Ranking]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Ranking]
 Public = false
@@ -130,15 +98,11 @@ Public = false
 
 ## Subset selection
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.SubsetSelection]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.SubsetSelection]
 Public = false
@@ -146,15 +110,11 @@ Public = false
 
 ## Stochastic Vehicle Scheduling
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.StochasticVehicleScheduling]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.StochasticVehicleScheduling]
 Public = false
@@ -162,15 +122,11 @@ Public = false
 
 ## Warcraft
 
-### Public
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Warcraft]
 Private = false
 ```
 
-### Private
-
 ```@autodocs
 Modules = [DecisionFocusedLearningBenchmarks.Warcraft]
 Public = false
diff --git a/docs/src/benchmark_interfaces.md b/docs/src/benchmark_interfaces.md
new file mode 100644
index 0000000..19fc231
--- /dev/null
+++ b/docs/src/benchmark_interfaces.md
@@ -0,0 +1,153 @@
+# Understanding Benchmark Interfaces
+
+This guide explains the common interfaces that benchmarks implement in DecisionFocusedLearningBenchmarks.jl.
+Understanding these interfaces is essential for using existing benchmarks and implementing new ones.
+
+## Core Concepts
+
+### DataSample Structure
+
+All benchmarks work with [`DataSample`](@ref) objects that encapsulate the data needed for decision-focused learning:
+
+```julia
+@kwdef struct DataSample{I,F,S,C}
+    x::F = nothing           # Input features  
+    θ_true::C = nothing      # True cost/utility parameters
+    y_true::S = nothing      # True optimal solution
+    instance::I = nothing    # Problem instance object/additional data
+end
+```
+
+`DataSample` is flexible: not every field needs to be populated, and which ones are set depends on the benchmark type and use case.
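+
+As a minimal sketch of this flexibility, the snippet below builds a sample with only two fields set. `SketchSample` is a hypothetical local stand-in that mirrors the struct above so the example runs without loading the package; it is not the package's own type.
+
+```julia
+# Hypothetical stand-in mirroring the `DataSample` layout shown above.
+Base.@kwdef struct SketchSample{I,F,S,C}
+    x::F = nothing
+    θ_true::C = nothing
+    y_true::S = nothing
+    instance::I = nothing
+end
+
+# Only features and true costs are provided; the other fields default to `nothing`.
+sample = SketchSample(; x=[0.2, 0.8], θ_true=[1.0, -1.0])
+
+isnothing(sample.y_true)    # true
+isnothing(sample.instance)  # true
+```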
+
+### Benchmark Type Hierarchy
+
+The package defines a hierarchy of three abstract types:
+
+```
+AbstractBenchmark
+└── AbstractStochasticBenchmark{exogenous}
+    └── AbstractDynamicBenchmark{exogenous}
+```
+
+- **`AbstractBenchmark`**: static, single-stage optimization problems
+- **`AbstractStochasticBenchmark{exogenous}`**: stochastic, single-stage optimization problems
+- **`AbstractDynamicBenchmark{exogenous}`**: multi-stage sequential decision problems
+
+The `{exogenous}` type parameter indicates whether the uncertainty distribution is independent of decisions (`true`) or influenced by them (`false`); this determines which methods are available.
+
+## Common Interface Methods
+
+### Data Generation
+
+Every benchmark must implement a data generation method:
+
+```julia
+# Generate a single sample
+generate_sample(benchmark::AbstractBenchmark, rng::AbstractRNG; kwargs...) -> DataSample
+```
+This method should generate a single `DataSample` given a random number generator and optional parameters.
+
+If needed, benchmarks can instead override the [`generate_dataset`](@ref) method to directly create the entire dataset:
+```julia
+generate_dataset(benchmark::AbstractBenchmark, size::Int; kwargs...) -> Vector{DataSample}
+```
+
+The default `generate_dataset` implementation calls `generate_sample` once per sample, so overriding it is only needed for custom dataset generation logic.
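+
+The default behaviour described above can be sketched as follows. This is an illustration under assumptions, not the package's implementation: `ToyBenchmark`, `toy_generate_sample`, `toy_generate_dataset`, and the `seed` keyword are hypothetical names, and a named tuple stands in for a `DataSample`.
+
+```julia
+using Random
+
+struct ToyBenchmark end  # hypothetical benchmark type
+
+# Hypothetical single-sample generator; a named tuple stands in for a `DataSample`.
+toy_generate_sample(::ToyBenchmark, rng::AbstractRNG) = (; x=randn(rng, 3))
+
+# Call the single-sample generator `size` times, sharing one seeded RNG.
+function toy_generate_dataset(b::ToyBenchmark, size::Int; seed=0)
+    rng = MersenneTwister(seed)
+    return [toy_generate_sample(b, rng) for _ in 1:size]
+end
+
+dataset = toy_generate_dataset(ToyBenchmark(), 5)
+length(dataset)  # 5
+```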
+
+### DFL Policy Components
+
+Benchmarks provide the building blocks for decision-focused learning policies:
+
+```julia
+# Create a statistical model (e.g., a neural network)
+generate_statistical_model(benchmark::AbstractBenchmark; kwargs...)
+
+# Create an optimization maximizer/solver
+generate_maximizer(benchmark::AbstractBenchmark; kwargs...)
+```
+
+The statistical model typically maps from features `x` to cost parameters `θ`.
+The maximizer solves optimization problems given cost parameters `θ` (and potentially additional problem-dependent keyword arguments), returning a decision `y`.
+
+### Benchmark Policies
+
+Benchmarks can provide baseline policies for comparison and evaluation:
+
+```julia
+# Get baseline policies for comparison
+generate_policies(benchmark::AbstractBenchmark) -> Tuple{Vararg{Policy}}
+```
+This returns a tuple of `Policy` objects representing different benchmark-specific policies:
+```julia
+struct Policy{F}
+    name::String
+    description::String  
+    policy_function::F
+end
+```
+A `Policy` is just a function with a name and description.
+
+Policies can be evaluated across multiple instances/environments using:
+```julia
+evaluate_policy!(policy::Policy, instances; kwargs...) -> (rewards, data_samples)
+```
+
+### Evaluation Methods
+
+Optional methods for analysis and visualization:
+
+```julia
+# Visualize data samples
+plot_data(benchmark::AbstractBenchmark, sample::DataSample; kwargs...)
+plot_instance(benchmark::AbstractBenchmark, instance; kwargs...)  
+plot_solution(benchmark::AbstractBenchmark, sample::DataSample, solution; kwargs...)
+
+# Compute optimality gap
+compute_gap(benchmark::AbstractBenchmark, dataset, model, maximizer) -> Float64
+
+# Evaluate objective value
+objective_value(benchmark::AbstractBenchmark, sample::DataSample, solution)
+```
+
+## Benchmark-Specific Interfaces
+
+### Static Benchmarks
+
+Static benchmarks follow the basic interface above.
+
+### Stochastic Benchmarks
+
+Exogenous stochastic benchmarks add methods for scenario generation and anticipative solutions:
+
+```julia
+# Generate uncertainty scenarios (for exogenous benchmarks)
+generate_scenario(benchmark::AbstractStochasticBenchmark{true}, instance; kwargs...)
+
+# Solve anticipative optimization problem for given scenario
+generate_anticipative_solution(benchmark::AbstractStochasticBenchmark{true}, 
+                               instance, scenario; kwargs...)
+```
+
+### Dynamic Benchmarks
+
+In order to model sequential decision-making, dynamic benchmarks additionally work with environments.
+For this, they implement methods to create environments from instances or datasets:
+```julia
+# Create environment for sequential decision-making
+generate_environment(benchmark::AbstractDynamicBenchmark, instance, rng; kwargs...) -> <:AbstractEnvironment
+
+# Generate multiple environments
+generate_environments(benchmark::AbstractDynamicBenchmark, dataset; kwargs...) -> Vector{<:AbstractEnvironment}
+```
+Similarly to `generate_dataset` and `generate_sample`, one only needs to implement `generate_environment`, as `generate_environments` has a default implementation that calls it repeatedly.
+
+The [`AbstractEnvironment`](@ref) interface is defined as follows:
+```julia
+# Environment methods
+get_seed(env::AbstractEnvironment)  # Get current RNG seed
+reset!(env::AbstractEnvironment; reset_rng::Bool, seed=get_seed(env))  # Reset to initial state
+observe(env::AbstractEnvironment) -> (obs, info)    # Get current observation  
+step!(env::AbstractEnvironment, action) -> reward   # Take action, get reward
+is_terminated(env::AbstractEnvironment) -> Bool     # Check if episode ended
+```
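+
+To show how these methods fit together, here is a rough sketch of one episode rollout. `CountdownEnv` and `rollout` are hypothetical, and the functions defined below are local stand-ins that only mimic the interface names (for instance, this `reset!` takes no keyword arguments); they are not the package's methods.
+
+```julia
+# Hypothetical stand-in environment: terminates after `horizon` steps.
+mutable struct CountdownEnv
+    remaining::Int
+    horizon::Int
+end
+
+# Local stand-ins mimicking the interface names listed above.
+reset!(env::CountdownEnv) = (env.remaining = env.horizon; nothing)
+observe(env::CountdownEnv) = (env.remaining, nothing)
+step!(env::CountdownEnv, action) = (env.remaining -= 1; -float(action))
+is_terminated(env::CountdownEnv) = env.remaining <= 0
+
+# Reset, then alternate observe/step! until the episode ends, summing rewards.
+function rollout(policy, env)
+    reset!(env)
+    total = 0.0
+    while !is_terminated(env)
+        obs, _ = observe(env)
+        total += step!(env, policy(obs))
+    end
+    return total
+end
+
+rollout(obs -> 1, CountdownEnv(0, 3))  # -3.0
+```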
diff --git a/docs/src/benchmarks/dvsp.md b/docs/src/benchmarks/dvsp.md
index 2b96c67..f76d443 100644
--- a/docs/src/benchmarks/dvsp.md
+++ b/docs/src/benchmarks/dvsp.md
@@ -1,3 +1,132 @@
 # Dynamic Vehicle Scheduling
 
-[`DynamicVehicleSchedulingBenchmark`](@ref).
+The Dynamic Vehicle Scheduling Problem (DVSP) is a sequential decision-making problem where an agent must dynamically dispatch vehicles to serve customers that arrive over time.
+
+## Problem Description
+
+### Overview
+
+In the dynamic vehicle scheduling problem, a fleet operator must decide at each time step which customer requests to serve immediately and which to postpone to future time steps.
+The goal is to serve all customers by the end of the planning horizon while minimizing total travel time.
+
+This is a simplified version of the more complex Dynamic Vehicle Routing Problem with Time Windows (DVRPTW), focusing on the core sequential decision-making aspects without capacity or time window constraints.
+
+The problem is characterized by:
+- **Exogenous noise**: customer arrivals are stochastic and follow a fixed known distribution, independent of the agent's actions
+- **Combinatorial action space**: at each time step, the agent must build vehicle routes to serve selected customers, so the number of possible actions grows combinatorially with the number of pending requests
+
+### Mathematical Formulation
+
+The dynamic vehicle scheduling problem can be formulated as a finite-horizon Markov Decision Process (MDP):
+
+**State Space** ``\mathcal{S}``: At time step ``t``, the state ``s_t`` consists of:
+```math
+s_t = (R_t, D_t, t)
+```
+where:
+- ``R_t`` are the pending customer requests (not yet served), where each request ``r_i \in R_t`` contains:
+  - ``x_i, y_i``: 2D spatial coordinates of the customer location
+  - ``\tau_i``: start time when the customer needs to be served
+  - ``s_i``: service time required to serve the customer
+- ``D_t`` indicates which requests must be dispatched at this time step, i.e. requests that cannot be postponed further because their start time would make them infeasible at the next time step
+- ``t \in \{1, 2, \ldots, T\}`` is the current time step
+
+The state also implicitly includes (constant over time):
+- Travel duration matrix ``d_{ij}``: time to travel from location ``i`` to location ``j``
+- Depot location
+
+**Action Space** ``\mathcal{A}``: The action at time step ``t`` is a set of vehicle routes:
+```math
+a_t = \{r_1, r_2, \ldots, r_k\}
+```
+where each route ``r_i`` is a sequence of customers that starts and ends at the depot.
+
+A route is feasible if:
+- It starts and ends at the depot
+- It satisfies the time constraints, i.e. every customer is served on time
+
+**Transition Dynamics** ``\mathcal{P}(s_{t+1} | s_t, a_t)``: After executing routes ``a_t``:
+
+1. **Remove served customers** from the pending request set
+2. **Generate new customer arrivals** according to the underlying exogenous distribution
+3. **Update must-dispatch set** based on postponement rules
+
+**Reward Function** ``r(s_t, a_t)``: The immediate reward is the negative total travel time of the routes:
+
+```math
+r(s_t, a_t) = - \sum_{r \in a_t} \sum_{(i,j) \in r} d_{ij}
+```
+
+where ``d_{ij}`` is the travel duration from location ``i`` to location ``j``, and the sum is over all consecutive location pairs in each route ``r``.
+
+**Objective**: Find a policy ``\pi: \mathcal{S} \to \mathcal{A}`` that maximizes expected cumulative reward:
+```math
+\max_\pi \mathbb{E}\left[\sum_{t=1}^T r(s_t, \pi(s_t)) \right]
+```
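+
+Since the reward is the negative travel time, substituting the definition of ``r(s_t, a_t)`` with ``a_t = \pi(s_t)`` shows that this objective amounts to minimizing the expected total travel time over the horizon:
+```math
+\max_\pi \mathbb{E}\left[\sum_{t=1}^T r(s_t, \pi(s_t)) \right]
+= - \min_\pi \mathbb{E}\left[\sum_{t=1}^T \sum_{r \in \pi(s_t)} \sum_{(i,j) \in r} d_{ij} \right]
+```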
+
+## Key Components
+
+### [`DynamicVehicleSchedulingBenchmark`](@ref)
+
+The main benchmark configuration with the following parameters:
+
+- `max_requests_per_epoch`: Maximum number of new customer requests per time step (default: 10)
+- `Δ_dispatch`: Time delay between decision and vehicle dispatch (default: 1.0)
+- `epoch_duration`: Duration of each decision time step (default: 1.0)
+- `two_dimensional_features`: Whether to use simplified 2D features instead of full feature set (default: false)
+
+### Instance Generation
+
+Problem instances are generated from static vehicle routing datasets and include:
+
+- **Customer locations**: Spatial coordinates for pickup/delivery points
+- **Depot location**: Central starting and ending point for all routes
+- **Travel times**: Distance/duration matrix between all location pairs
+- **Service requirements**: Time needed to serve each customer
+
+The dynamic version samples new customer arrivals from the static instance, drawing new customers by independently sampling their locations and service times.
+
+### Features
+
+The benchmark provides two feature representations:
+
+**Full Features** (14-dimensional):
+- Start times for postponable requests
+- End times (start + service time)
+- Travel time from depot to request
+- Travel time from request to depot  
+- Slack time until next time step
+- Quantile-based travel times to other requests (9 quantiles)
+
+**2D Features** (simplified):
+- Travel time from depot to request
+- Mean travel time to other requests
+
+## Benchmark Policies
+
+### Lazy Policy
+
+The lazy policy postpones all possible requests, serving only those that must be dispatched.
+
+### Greedy Policy  
+
+The greedy policy serves all pending requests as soon as they arrive, without considering future consequences. 
+
+## Decision-Focused Learning Policy
+
+```math
+\xrightarrow[\text{State}]{s_t}
+\fbox{Neural network $\varphi_w$}
+\xrightarrow[\text{Priorities}]{\theta}
+\fbox{Prize-collecting VSP}
+\xrightarrow[\text{Routes}]{a_t}
+```
+
+**Components**:
+
+1. **Neural Network** ``\varphi_w``: Takes current state features as input and predicts customer priorities ``\theta = (\theta_1, \ldots, \theta_n)``
+2. **Optimization Layer**: Solves the prize-collecting vehicle scheduling problem to determine optimal routes given the predicted priorities
+
+The neural network architecture adapts to the feature dimensionality:
+- **2D features**: `Dense(2 => 1)` followed by vectorization
+- **Full features**: `Dense(14 => 1)` followed by vectorization
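+
+As a rough sketch of the full-feature case, assuming the model is built with Flux.jl (the `Dense(14 => 1)` notation matches Flux's layer constructor), the network scores each request column and flattens the result into one priority per request. `n_requests` and the random feature matrix are illustrative placeholders, not benchmark data.
+
+```julia
+using Flux
+
+n_requests = 5                            # illustrative number of pending requests
+features = rand(Float32, 14, n_requests)  # one 14-dimensional column per request
+
+# 14×n matrix -> 1×n matrix -> length-n vector of priorities
+model = Chain(Dense(14 => 1), vec)
+θ = model(features)
+
+length(θ) == n_requests  # true
+```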
diff --git a/src/DynamicAssortment/DynamicAssortment.jl b/src/DynamicAssortment/DynamicAssortment.jl
index e3455bd..c943dba 100644
--- a/src/DynamicAssortment/DynamicAssortment.jl
+++ b/src/DynamicAssortment/DynamicAssortment.jl
@@ -142,8 +142,5 @@ function Utils.generate_policies(::DynamicAssortmentBenchmark)
 end
 
 export DynamicAssortmentBenchmark
-public generate_sample, generate_statistical_model, generate_maximizer
-public generate_environment, generate_policies
-public reset!, is_terminated, observe, step!
 
 end
diff --git a/src/DynamicVehicleScheduling/DynamicVehicleScheduling.jl b/src/DynamicVehicleScheduling/DynamicVehicleScheduling.jl
index 7421032..6b6c5b2 100644
--- a/src/DynamicVehicleScheduling/DynamicVehicleScheduling.jl
+++ b/src/DynamicVehicleScheduling/DynamicVehicleScheduling.jl
@@ -12,7 +12,7 @@ using InferOpt: LinearMaximizer
 using IterTools: partition
 using JSON
 using JuMP
-using Plots: plot, plot!, scatter!
+using Plots: plot, plot!, scatter!, @animate, Plots, gif
 using Printf: @printf
 using Random: Random, AbstractRNG, MersenneTwister, seed!, randperm
 using Requires: @require
diff --git a/src/DynamicVehicleScheduling/anticipative_solver.jl b/src/DynamicVehicleScheduling/anticipative_solver.jl
index e21ee53..863517f 100644
--- a/src/DynamicVehicleScheduling/anticipative_solver.jl
+++ b/src/DynamicVehicleScheduling/anticipative_solver.jl
@@ -51,7 +51,9 @@ function anticipative_solver(
     nb_epochs=typemax(Int),
     seed=get_seed(env),
 )
-    reset_env && reset!(env; reset_rng=true, seed)
+    if reset_env
+        reset!(env; reset_rng=true, seed)
+    end
 
     start_epoch = current_epoch(env)
     end_epoch = min(last_epoch(env), start_epoch + nb_epochs - 1)
@@ -73,7 +75,7 @@ function anticipative_solver(
 
     nb_nodes = length(customer_index)
     job_indices = 2:nb_nodes
-    epoch_indices = T#first_epoch:last_epoch
+    epoch_indices = T
 
     @variable(model, y[i = 1:nb_nodes, j = 1:nb_nodes, t = epoch_indices]; binary=true)
 
@@ -128,6 +130,7 @@ function anticipative_solver(
 
     optimize!(model)
 
+    @assert termination_status(model) == JuMP.MOI.OPTIMAL "Anticipative MIP did not solve to optimality! (status: $(termination_status(model)))"
     obj = JuMP.objective_value(model)
     epoch_routes = retrieve_routes_anticipative(
         value.(y), env, customer_index, epoch_indices
diff --git a/src/DynamicVehicleScheduling/instance.jl b/src/DynamicVehicleScheduling/instance.jl
index d65010c..535942b 100644
--- a/src/DynamicVehicleScheduling/instance.jl
+++ b/src/DynamicVehicleScheduling/instance.jl
@@ -25,13 +25,14 @@ function Instance(
     epoch_duration::Float64=1.0,
     two_dimensional_features::Bool=false,
 )
-    last_epoch = trunc(
-        Int,
-        (
-            maximum(static_instance.start_time) - minimum(static_instance.duration[1, :]) -
-            Δ_dispatch
-        ) / epoch_duration,
-    )
+    last_epoch =
+        trunc(
+            Int,
+            (
+                maximum(static_instance.start_time) -
+                minimum(static_instance.duration[1, :]) - Δ_dispatch
+            ) / epoch_duration,
+        ) - 1
     return Instance(;
         static_instance=static_instance,
         max_requests_per_epoch=max_requests_per_epoch,
diff --git a/src/DynamicVehicleScheduling/plot.jl b/src/DynamicVehicleScheduling/plot.jl
index adb0fa6..98defa2 100644
--- a/src/DynamicVehicleScheduling/plot.jl
+++ b/src/DynamicVehicleScheduling/plot.jl
@@ -1,7 +1,577 @@
-function plot_instance(env::DVSPEnv; kwargs...)
+function plot_instance(env::DVSPEnv; kwargs...)
     return plot_instance(env.instance.static_instance; kwargs...)
 end
 
+"""
+$TYPEDSIGNATURES
+
+Plot a given DVSPState showing depot, must-dispatch requests, and postponable requests.
+"""
+function plot_state(
+    state::DVSPState;
+    customer_markersize=6,
+    depot_markersize=8,
+    alpha_depot=0.8,
+    depot_color=:lightgreen,
+    must_dispatch_color=:red,
+    postponable_color=:lightblue,
+    show_axis_labels=true,
+    markerstrokewidth=0.5,
+    kwargs...,
+)
+    # Get coordinates from the state instance
+    coordinates = coordinate(state)
+    start_times = start_time(state)
+
+    # Extract x and y coordinates
+    x = [p.x for p in coordinates]
+    y = [p.y for p in coordinates]
+
+    # Create the plot
+    plot_args = Dict(
+        :legend => :topleft, :title => "DVSP State - Epoch $(state.current_epoch)"
+    )
+
+    if show_axis_labels
+        plot_args[:xlabel] = "x coordinate"
+        plot_args[:ylabel] = "y coordinate"
+    end
+
+    # Merge with kwargs
+    for (k, v) in kwargs
+        plot_args[k] = v
+    end
+
+    fig = plot(; plot_args...)
+
+    # Plot depot (always the first coordinate)
+    scatter!(
+        fig,
+        [x[1]],
+        [y[1]];
+        label="Depot",
+        markercolor=depot_color,
+        marker=:rect,
+        markersize=depot_markersize,
+        alpha=alpha_depot,
+        markerstrokewidth=markerstrokewidth,
+    )
+
+    # Plot must-dispatch customers
+    if sum(state.is_must_dispatch) > 0
+        must_dispatch_indices = findall(state.is_must_dispatch)
+        scatter!(
+            fig,
+            x[must_dispatch_indices],
+            y[must_dispatch_indices];
+            label="Must-dispatch requests",
+            markercolor=must_dispatch_color,
+            marker=:star5,
+            markersize=customer_markersize,
+            marker_z=start_times[must_dispatch_indices],
+            colormap=:plasma,
+            markerstrokewidth=markerstrokewidth,
+        )
+    end
+
+    # Plot postponable customers
+    if sum(state.is_postponable) > 0
+        postponable_indices = findall(state.is_postponable)
+        scatter!(
+            fig,
+            x[postponable_indices],
+            y[postponable_indices];
+            label="Postponable requests",
+            markercolor=postponable_color,
+            marker=:utriangle,
+            markersize=customer_markersize,
+            marker_z=start_times[postponable_indices],
+            colormap=:viridis,
+            markerstrokewidth=markerstrokewidth,
+        )
+    end
+
+    return fig
+end
+
+"""
+$TYPEDSIGNATURES
+
+Plot a given DVSPState with routes overlaid, showing depot, requests, and vehicle routes.
+Routes should be provided as a vector of vectors, where each inner vector contains the
+indices of locations visited by that route (excluding the depot).
+"""
+function plot_routes(
+    state::DVSPState,
+    routes::Vector{Vector{Int}};
+    route_colors=nothing,
+    route_linewidth=3,  # Increased from 2 to 3
+    route_alpha=0.7,
+    show_route_labels=true,
+    kwargs...,
+)
+    # Start with the basic state plot
+    fig = plot_state(state; kwargs...)
+
+    # Get coordinates for route plotting
+    coordinates = coordinate(state)
+    x = [p.x for p in coordinates]
+    y = [p.y for p in coordinates]
+
+    # Depot coordinates (always first)
+    x_depot = x[1]
+    y_depot = y[1]
+
+    # Default route colors if not provided
+    if isnothing(route_colors)
+        route_colors = [:blue, :purple, :orange, :brown, :pink, :gray, :olive, :cyan]
+    end
+
+    # Plot each route
+    for (route_idx, route) in enumerate(routes)
+        if !isempty(route)
+            # Create route path: depot -> customers -> depot
+            route_x = vcat(x_depot, x[route], x_depot)
+            route_y = vcat(y_depot, y[route], y_depot)
+
+            # Select color for this route
+            color = route_colors[(route_idx - 1) % length(route_colors) + 1]
+
+            # Plot the route with more visible styling
+            label = show_route_labels ? "Route $route_idx" : nothing
+            plot!(
+                fig,
+                route_x,
+                route_y;
+                # color=color,
+                linewidth=route_linewidth,
+                alpha=1.0,  # Make routes fully opaque
+                label=label,
+                linestyle=:solid,
+            )
+        end
+    end
+
+    return fig
+end
+
+"""
+$TYPEDSIGNATURES
+
+Plot a given DVSPState with routes overlaid. This version accepts routes as a single
+vector where routes are separated by depot visits (index 1).
+"""
+function plot_routes(state::DVSPState, routes::Vector{Int}; kwargs...)
+    # Convert single route vector to vector of route vectors
+    route_vectors = Vector{Int}[]
+    current_route = Int[]
+
+    for location in routes
+        if location == 1  # Depot visit indicates end of route
+            if !isempty(current_route)
+                push!(route_vectors, copy(current_route))
+                empty!(current_route)
+            end
+        else
+            push!(current_route, location)
+        end
+    end
+
+    # Add the last route if it doesn't end with depot
+    if !isempty(current_route)
+        push!(route_vectors, current_route)
+    end
+
+    return plot_routes(state, route_vectors; kwargs...)
+end
+
+"""
+$TYPEDSIGNATURES
+
+Plot a given DVSPState with routes overlaid. This version accepts routes as a BitMatrix
+where entry (i,j) = true indicates an edge from location i to location j.
+"""
+function plot_routes(state::DVSPState, routes::BitMatrix; kwargs...)
+    # Convert BitMatrix to vector of route vectors
+    n_locations = size(routes, 1)
+    route_vectors = Vector{Int}[]
+
+    # Find all outgoing edges from depot (location 1)
+    depot_destinations = findall(routes[1, :])
+
+    # For each destination from depot, reconstruct the route
+    for dest in depot_destinations
+        if dest != 1  # Skip self-loops at depot
+            route = Int[]
+            current = dest
+            push!(route, current)
+
+            # Follow the route until we return to depot (bounded to avoid cycles)
+            while length(route) < n_locations - 1
+                # Find next location (should be unique for valid routes)
+                next_locations = findall(routes[current, :])
+
+                # Filter out the depot for intermediate steps
+                non_depot_next = filter(x -> x != 1, next_locations)
+
+                if isempty(non_depot_next)
+                    # Must return to depot, route is complete
+                    break
+                elseif length(non_depot_next) == 1
+                    # Continue to next location
+                    current = non_depot_next[1]
+                    push!(route, current)
+                else
+                    # Multiple outgoing edges - this shouldn't happen in valid routes
+                    # but we'll take the first one
+                    current = non_depot_next[1]
+                    push!(route, current)
+                end
+            end
+
+            if !isempty(route)
+                push!(route_vectors, route)
+            end
+        end
+    end
+
+    return plot_routes(state, route_vectors; kwargs...)
+end
+
+"""
+$TYPEDSIGNATURES
+
+Plot multiple epochs side by side from a vector of DataSample objects.
+Each DataSample should contain an instance (DVSPState) and optionally y_true (routes).
+All subplots will use the same xlims and ylims to show the dynamics clearly.
+"""
+function plot_epochs(
+    data_samples::Vector{<:DataSample};
+    plot_routes_flag=true,
+    cols=nothing,
+    figsize=(1800, 600),
+    margin=0.05,
+    legend_margin_factor=0.15,
+    titlefontsize=14,
+    guidefontsize=12,
+    legendfontsize=11,
+    tickfontsize=10,
+    show_axis_labels=false,
+    show_colorbar=false,
+    kwargs...,
+)
+    n_epochs = length(data_samples)
+
+    if n_epochs == 0
+        error("No data samples provided")
+    end
+
+    # Determine grid layout
+    if isnothing(cols)
+        cols = min(n_epochs, 3)  # Default to max 3 columns
+    end
+    rows = ceil(Int, n_epochs / cols)
+
+    # Calculate global xlims and ylims from all states
+    all_coordinates = []
+    for sample in data_samples
+        if !isnothing(sample.instance)
+            coords = coordinate(sample.instance)
+            append!(all_coordinates, coords)
+        end
+    end
+
+    if isempty(all_coordinates)
+        error("No valid coordinates found in data samples")
+    end
+
+    xlims = (
+        minimum(p.x for p in all_coordinates) - margin,
+        maximum(p.x for p in all_coordinates) + margin,
+    )
+
+    # Add extra margin at the top for legend space
+    y_min = minimum(p.y for p in all_coordinates) - margin
+    y_max = maximum(p.y for p in all_coordinates) + margin
+    y_range = y_max - y_min
+    legend_margin = y_range * legend_margin_factor
+
+    ylims = (y_min, y_max + legend_margin)
+
+    # Calculate global color limits for consistent scaling across subplots
+    all_start_times = []
+    for sample in data_samples
+        if !isnothing(sample.instance)
+            times = start_time(sample.instance)
+            append!(all_start_times, times)
+        end
+    end
+
+    clims = if !isempty(all_start_times)
+        (minimum(all_start_times), maximum(all_start_times))
+    else
+        (0.0, 1.0)  # Default range
+    end
+
+    # Create subplots
+    plots = []
+
+    for (i, sample) in enumerate(data_samples)
+        state = sample.instance
+
+        if isnothing(state)
+            # Create empty plot if no state
+            fig = plot(;
+                xlims=xlims,
+                ylims=ylims,
+                title="Epoch $i (No Data)",
+                titlefontsize=titlefontsize,
+                guidefontsize=guidefontsize,
+                tickfontsize=tickfontsize,
+                legend=false,
+                kwargs...,
+            )
+        else
+            # Plot with or without routes
+            if plot_routes_flag && !isnothing(sample.y_true)
+                fig = plot_routes(
+                    state,
+                    sample.y_true;
+                    xlims=xlims,
+                    ylims=ylims,
+                    clims=clims,
+                    colorbar=false,
+                    title="Epoch $(state.current_epoch)",
+                    titlefontsize=titlefontsize,
+                    guidefontsize=guidefontsize,
+                    legendfontsize=legendfontsize,
+                    tickfontsize=tickfontsize,
+                    show_axis_labels=show_axis_labels,
+                    markerstrokewidth=0.5,
+                    show_route_labels=false,
+                    kwargs...,
+                )
+            else
+                fig = plot_state(
+                    state;
+                    xlims=xlims,
+                    ylims=ylims,
+                    clims=clims,
+                    colorbar=false,
+                    title="Epoch $(state.current_epoch)",
+                    titlefontsize=titlefontsize,
+                    guidefontsize=guidefontsize,
+                    legendfontsize=legendfontsize,
+                    tickfontsize=tickfontsize,
+                    show_axis_labels=show_axis_labels,
+                    markerstrokewidth=0.5,
+                    kwargs...,
+                )
+            end
+        end
+
+        push!(plots, fig)
+    end
+
+    # Calculate dynamic figure size if not specified
+    if figsize == (1800, 600)  # Using default size
+        plot_width = 600 * cols
+        plot_height = 500 * rows
+        figsize = (plot_width, plot_height)
+    end
+
+    # Combine plots in a grid layout with optional shared colorbar
+    if show_colorbar
+        combined_plot = plot(
+            plots...;
+            layout=(rows, cols),
+            size=figsize,
+            link=:both,
+            colorbar=:right,
+            clims=clims,
+        )
+    else
+        combined_plot = plot(
+            plots...; layout=(rows, cols), size=figsize, link=:both, clims=clims
+        )
+    end
+
+    return combined_plot
+end
+
+"""
+$TYPEDSIGNATURES
+
+Plot side by side only the epochs of `data_samples` selected by `epoch_indices`.
+"""
+function plot_epochs(
+    data_samples::Vector{<:DataSample}, epoch_indices::Vector{Int}; kwargs...
+)
+    filtered_samples = data_samples[epoch_indices]
+    return plot_epochs(filtered_samples; kwargs...)
+end
+
+"""
+$TYPEDSIGNATURES
+
+Save an animated GIF of states and routes over epochs to `filename`; return the animation.
+Each epoch yields a state frame, followed by a routes frame when it has routes.
+"""
+function animate_epochs(
+    data_samples::Vector{<:DataSample};
+    filename="dvsp_animation.gif",
+    fps=1,
+    figsize=(800, 600),
+    margin=0.1,
+    legend_margin_factor=0.2,
+    titlefontsize=16,
+    guidefontsize=14,
+    legendfontsize=12,
+    tickfontsize=11,
+    show_axis_labels=true,
+    kwargs...,
+)
+    n_epochs = length(data_samples)
+
+    if n_epochs == 0
+        error("No data samples provided")
+    end
+
+    # Calculate global limits for consistent scaling
+    all_coordinates = []
+    for sample in data_samples
+        if !isnothing(sample.instance)
+            coords = coordinate(sample.instance)
+            append!(all_coordinates, coords)
+        end
+    end
+
+    if isempty(all_coordinates)
+        error("No valid coordinates found in data samples")
+    end
+
+    xlims = (
+        minimum(p.x for p in all_coordinates) - margin,
+        maximum(p.x for p in all_coordinates) + margin,
+    )
+
+    # Add extra margin at the top for legend space
+    y_min = minimum(p.y for p in all_coordinates) - margin
+    y_max = maximum(p.y for p in all_coordinates) + margin
+    y_range = y_max - y_min
+    legend_margin = y_range * legend_margin_factor
+    ylims = (y_min, y_max + legend_margin)
+
+    # Calculate global color limits
+    all_start_times = []
+    for sample in data_samples
+        if !isnothing(sample.instance)
+            times = start_time(sample.instance)
+            append!(all_start_times, times)
+        end
+    end
+
+    clims = if !isempty(all_start_times)
+        (minimum(all_start_times), maximum(all_start_times))
+    else
+        (0.0, 1.0)
+    end
+
+    # Helper function to check if routes exist and are non-empty
+    function has_routes(routes)
+        if isnothing(routes)
+            return false
+        elseif routes isa Vector{Vector{Int}}
+            return any(!isempty(route) for route in routes)
+        elseif routes isa Vector{Int}
+            return !isempty(routes)
+        elseif routes isa BitMatrix
+            return any(routes)
+        else
+            return false
+        end
+    end
+
+    # Create frame plan: determine which epochs have routes
+    frame_plan = []
+    for (epoch_idx, sample) in enumerate(data_samples)
+        # Always add state frame
+        push!(frame_plan, (epoch_idx, :state))
+
+        # Add routes frame only if routes exist
+        if has_routes(sample.y_true)
+            push!(frame_plan, (epoch_idx, :routes))
+        end
+    end
+
+    total_frames = length(frame_plan)
+
+    # Create animation with dynamic frame plan
+    anim = @animate for frame_idx in 1:total_frames
+        epoch_idx, frame_type = frame_plan[frame_idx]
+        sample = data_samples[epoch_idx]
+        state = sample.instance
+
+        if isnothing(state)
+            # Empty frame for missing data
+            plot(;
+                xlims=xlims,
+                ylims=ylims,
+                title="Epoch $epoch_idx (No Data)",
+                titlefontsize=titlefontsize,
+                guidefontsize=guidefontsize,
+                tickfontsize=tickfontsize,
+                legend=false,
+                size=figsize,
+                kwargs...,
+            )
+        else
+            if frame_type == :routes
+                # Show state with routes
+                plot_routes(
+                    state,
+                    sample.y_true;
+                    xlims=xlims,
+                    ylims=ylims,
+                    clims=clims,
+                    title="Epoch $(state.current_epoch) - Routes Dispatched",
+                    titlefontsize=titlefontsize,
+                    guidefontsize=guidefontsize,
+                    legendfontsize=legendfontsize,
+                    tickfontsize=tickfontsize,
+                    show_axis_labels=show_axis_labels,
+                    markerstrokewidth=0.5,
+                    show_route_labels=false,
+                    size=figsize,
+                    kwargs...,
+                )
+            else # frame_type == :state
+                # Show state only
+                plot_state(
+                    state;
+                    xlims=xlims,
+                    ylims=ylims,
+                    clims=clims,
+                    title="Epoch $(state.current_epoch) - Available Requests",
+                    titlefontsize=titlefontsize,
+                    guidefontsize=guidefontsize,
+                    legendfontsize=legendfontsize,
+                    tickfontsize=tickfontsize,
+                    show_axis_labels=show_axis_labels,
+                    markerstrokewidth=0.5,
+                    size=figsize,
+                    kwargs...,
+                )
+            end
+        end
+    end
+
+    # Save as GIF
+    gif(anim, filename; fps=fps)
+
+    return anim
+end
+
 # """
 # $TYPEDSIGNATURES
 
diff --git a/test/dynamic_assortment.jl b/test/dynamic_assortment.jl
index 4d9db08..b057c44 100644
--- a/test/dynamic_assortment.jl
+++ b/test/dynamic_assortment.jl
@@ -272,7 +272,7 @@ end
     b = DynamicAssortmentBenchmark(N=5, d=2, K=3, max_steps=20)
 
     # Generate test data
-    dataset = generate_dataset(b, 5; seed=0)
+    dataset = generate_dataset(b, 10; seed=0)
     environments = generate_environments(b, dataset)
 
     # Get policies
@@ -284,8 +284,8 @@ end
     @test greedy.name == "Greedy"
 
     # Test policy evaluation
-    r_expert, d = evaluate_policy!(expert, environments)
-    r_greedy, _ = evaluate_policy!(greedy, environments)
+    r_expert, d = evaluate_policy!(expert, environments, 10)
+    r_greedy, _ = evaluate_policy!(greedy, environments, 10)
 
     @test length(r_expert) == length(environments)
     @test length(r_greedy) == length(environments)
diff --git a/test/dynamic_vsp_plots.jl b/test/dynamic_vsp_plots.jl
new file mode 100644
index 0000000..cc7c962
--- /dev/null
+++ b/test/dynamic_vsp_plots.jl
@@ -0,0 +1,40 @@
+@testitem "Dynamic VSP Plots" begin
+    using DecisionFocusedLearningBenchmarks.DynamicVehicleScheduling
+    const DVSP = DecisionFocusedLearningBenchmarks.DynamicVehicleScheduling
+    using Plots
+
+    # Create test benchmark and data (similar to scripts/a.jl)
+    b = DynamicVehicleSchedulingBenchmark(; two_dimensional_features=true)
+    dataset = generate_dataset(b, 3)
+    environments = generate_environments(b, dataset; seed=0)
+    env = environments[1]
+
+    # Test basic plotting functions
+    fig1 = DVSP.plot_instance(env)
+    @test fig1 isa Plots.Plot
+
+    instance = dataset[1].instance
+    scenario = generate_scenario(b, instance; seed=0)
+    v, y = generate_anticipative_solution(b, env, scenario; nb_epochs=3, reset_env=true)
+
+    fig2 = DVSP.plot_epochs(y)
+    @test fig2 isa Plots.Plot
+
+    policies = generate_policies(b)
+    lazy = policies[1]
+    _, d = evaluate_policy!(lazy, env)
+    fig3 = DVSP.plot_routes(d[1].instance, d[1].y_true)
+    @test fig3 isa Plots.Plot
+
+    # Test animation
+    temp_filename = tempname() * ".gif"
+    try
+        anim = DVSP.animate_epochs(y; filename=temp_filename, fps=1)
+        @test anim isa Plots.AnimatedGif || anim isa Plots.Animation
+        @test isfile(temp_filename)
+    finally
+        if isfile(temp_filename)
+            rm(temp_filename)
+        end
+    end
+end