
Commit 778e7ec

committed
Corrected some bugs and added documentation
1 parent 9cfc83b commit 778e7ec

11 files changed

Lines changed: 217 additions & 101 deletions

File tree

Project.toml

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,9 +3,6 @@ uuid = "2fbe496a-299b-4c81-bab5-c44dfc55cf20"
33
authors = ["Members of JuliaDecisionFocusedLearning"]
44
version = "0.4.0"
55

6-
[workspace]
7-
projects = ["docs", "test"]
8-
96
[deps]
107
Colors = "5ae59095-9a9b-59fe-a467-6f913c188581"
118
Combinatorics = "861a8166-3701-5b0c-9a16-15d98fcdc6aa"
@@ -23,6 +20,7 @@ Ipopt = "b6b21f68-93f8-5de0-b562-5493be1d77c9"
2320
IterTools = "c8e1da08-722c-5040-9ed9-7db0dc04731e"
2421
JSON = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
2522
JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
23+
JuliaFormatter = "98e50ef6-434e-11e9-1051-2b60c6c9e899"
2624
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
2725
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
2826
Metalhead = "dbeba491-748d-5e0e-a39e-b530a07fa0cc"
@@ -54,6 +52,7 @@ Ipopt = "1.6"
5452
IterTools = "1.10.0"
5553
JSON = "1"
5654
JuMP = "1.22"
55+
JuliaFormatter = "1.0.62"
5756
LaTeXStrings = "1.4.0"
5857
LinearAlgebra = "1"
5958
Metalhead = "0.9.4"
@@ -68,3 +67,6 @@ SparseArrays = "1"
6867
Statistics = "1"
6968
StatsBase = "0.34.4"
7069
julia = "1.10"
70+
71+
[workspace]
72+
projects = ["docs", "test"]

docs/src/api.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,18 @@ Modules = [DecisionFocusedLearningBenchmarks.FixedSizeShortestPath]
7272
Public = false
7373
```
7474

75+
## Maintenance
76+
77+
```@autodocs
78+
Modules = [DecisionFocusedLearningBenchmarks.Maintenance]
79+
Private = false
80+
```
81+
82+
```@autodocs
83+
Modules = [DecisionFocusedLearningBenchmarks.Maintenance]
84+
Public = false
85+
```
86+
7587
## Portfolio Optimization
7688

7789
```@autodocs

docs/src/benchmarks/maintenance.md

Lines changed: 107 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,107 @@
1+
# Maintenance problem with resource constraint
2+
3+
The Maintenance problem with resource constraint is a sequential decision-making benchmark where an agent must repeatedly decide which components to maintain over time. The goal is to minimize total expected cost while accounting for independent degradation of components and limited maintenance capacity.
4+
5+
6+
## Problem Description
7+
8+
### Overview
9+
10+
In this benchmark, a system consists of $N$ identical components, each of which degrades through $n$ discrete states. State $1$ means the component is new; state $n$ means it has failed. At each time step, the agent can maintain up to $K$ components.
11+
12+
This forms an endogenous multistage stochastic optimization problem, where the agent must plan maintenance actions over the horizon.
13+
14+
### Mathematical Formulation
15+
16+
The maintenance problem can be formulated as a finite-horizon Markov Decision Process (MDP) with the following components:
17+
18+
**State Space** $\mathcal{S}$: At time step $t$, the state $s_t \in \{1, \ldots, n\}^N$ gives the degradation state of each component.
19+
20+
**Action Space** $\mathcal{A}$: The action at time $t$ is the set of components that are maintained at time $t$:
21+
```math
22+
a_t \subseteq \{1, 2, \ldots, N\} \text{ such that } |a_t| \leq K
23+
```
24+
### Transition Dynamics
25+
26+
The state transitions depend on whether a component is maintained or not:
27+
28+
For each component $i$ at time $t$:
29+
30+
- **Maintained component** ($i \in a_t$):
31+
32+
$$
33+
s_{t+1}^i = 1 \quad \text{(perfect maintenance)}
34+
$$
35+
36+
- **Unmaintained component** ($i \notin a_t$):
37+
38+
$$
39+
s_{t+1}^i =
40+
\begin{cases}
41+
\min(s_t^i + 1, n) & \text{with probability } p,\\
42+
s_t^i & \text{with probability } 1-p.
43+
\end{cases}
44+
$$
45+
46+
Here, $p$ is the degradation probability, $s_t^i$ is the current state of component $i$, and $n$ is the maximum (failed) state.
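The transition rule above can be sketched in a few lines of Julia. This is a minimal illustration of the documented formula, not the package's implementation: `transition` is a hypothetical helper, and the action is assumed to be given as a set of component indices.

```julia
using Random

# Hypothetical helper (not part of the package): apply one transition step.
# `state[i]` lies in 1:n; `maintained` is the set a_t of maintained indices.
function transition(rng::AbstractRNG, state::Vector{Int}, maintained, n::Int, p::Float64)
    next_state = copy(state)
    for i in eachindex(state)
        if i in maintained
            next_state[i] = 1                     # perfect maintenance
        elseif rand(rng) < p
            next_state[i] = min(state[i] + 1, n)  # degrade by one level, capped at n
        end
    end
    return next_state
end

rng = MersenneTwister(0)
s = [1, 3, 2]
s_next = transition(rng, s, Set([2]), 3, 0.2)  # component 2 is reset to state 1
```

Setting `p = 1.0` or `p = 0.0` makes the step deterministic, which is convenient for checking the rule by hand.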
47+
48+
---
49+
50+
### Cost Function
51+
52+
The immediate cost at time $t$ is:
53+
54+
$$
55+
c(s_t, a_t) = \Big( c_m \cdot |a_t| + c_f \cdot \#\{ i : s_t^i = n \} \Big)
56+
$$
57+
58+
Where:
59+
60+
- $c_m$ is the maintenance cost per component.
61+
- $|a_t|$ is the number of components maintained.
62+
- $c_f$ is the failure cost per failed component.
63+
- $\#\{ i : s_t^i = n \}$ counts the number of components in the failed state.
64+
65+
This formulation captures the total cost for maintaining components and penalizing failures.
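As a worked example of the cost formula, the sketch below evaluates it for one state and action. `immediate_cost` is a hypothetical helper written for illustration; its keyword defaults simply reuse the benchmark's default costs.

```julia
# Hypothetical helper mirroring c(s_t, a_t) = c_m * |a_t| + c_f * #{i : s_t^i = n}.
immediate_cost(state, maintained, n; c_m=3.0, c_f=10.0) =
    c_m * length(maintained) + c_f * count(==(n), state)

# Three components with n = 3: components 2 and 3 have failed, component 1 is maintained.
cost = immediate_cost([1, 3, 3], [1], 3)  # 3.0 * 1 + 10.0 * 2 = 23.0
```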
66+
67+
**Objective**: Find a policy $\pi: \mathcal{S} \to \mathcal{A}$ that minimizes the expected cumulative cost:
68+
```math
69+
\min_\pi \mathbb{E}\left[\sum_{t=1}^T c(s_t, \pi(s_t)) \right]
70+
```
71+
72+
**Terminal Condition**: The episode terminates after $T$ time steps, with no terminal reward.
73+
74+
## Key Components
75+
76+
### [`MaintenanceBenchmark`](@ref)
77+
78+
The main benchmark configuration with the following parameters:
79+
80+
- `N`: number of components (default: 2)
81+
- `K`: maximum number of components that can be maintained simultaneously (default: 1)
82+
- `n`: number of degradation states per component (default: 3)
83+
- `p`: degradation probability (default: 0.2)
84+
- `c_f`: failure cost (default: 10.0)
85+
- `c_m`: maintenance cost (default: 3.0)
86+
- `max_steps`: number of time steps per episode (default: 80)
87+
88+
### Instance Generation
89+
90+
Each problem instance includes:
91+
92+
- **Starting State**: A random starting degradation state, drawn uniformly from $\{1, \ldots, n\}$, for each component.
93+
94+
### Environment Dynamics
95+
96+
The environment tracks:
97+
- Current time step
98+
- Current degradation state of each component
99+
100+
**State Observation**: Agents observe a normalized feature vector containing the degradation state of each component.
101+
102+
## Benchmark Policies
103+
104+
### Greedy Policy
105+
106+
The greedy policy maintains components that are in the last two degradation states (levels $n-1$ and $n$), up to the maintenance capacity $K$. It provides a simple baseline.
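A rule of this kind could look like the sketch below. It is an illustration under stated assumptions, not the package's policy: `greedy_maintenance` is a hypothetical name, and the choice to prioritize the most degraded components when more than `K` qualify is an assumption, since the description does not say how candidates are ranked. The result is a `BitVector`, matching the action type accepted by `maintain!` and `step!`.

```julia
# Hypothetical sketch of a greedy rule: pick components in states n-1 or n, at most K of them.
function greedy_maintenance(state::Vector{Int}, n::Int, K::Int)
    candidates = findall(>=(n - 1), state)
    # Assumption: when more than K components qualify, the most degraded go first.
    sort!(candidates; by=i -> state[i], rev=true)
    chosen = falses(length(state))
    for i in first(candidates, K)
        chosen[i] = true
    end
    return chosen
end

action = greedy_maintenance([1, 3, 2], 3, 1)  # only the failed component 2 is selected
```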
107+

src/DynamicAssortment/environment.jl

Lines changed: 12 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -199,17 +199,21 @@ Features observed by the agent at current step, as a concatenation of:
199199
All features are normalized by dividing by 10.
200200
201201
State
202+
Returned as a tuple:
203+
- `env.features`: the current feature matrix (feature vector for all items).
204+
- `env.purchase_history`: the purchase history over the most recent steps.
202205
"""
203206
function Utils.observe(env::Environment)
204207
delta_features = env.features[2:3, :] .- env.instance.starting_hype_and_saturation
205-
features = vcat(
206-
env.features,
207-
env.d_features,
208-
delta_features,
209-
ones(1, item_count(env)) .* (env.step / max_steps(env) * 10),
210-
) ./ 10
211-
212-
state = (env.features, env.purchase_history)
208+
features =
209+
vcat(
210+
env.features,
211+
env.d_features,
212+
delta_features,
213+
ones(1, item_count(env)) .* (env.step / max_steps(env) * 10),
214+
) ./ 10
215+
216+
state = (copy(env.features), copy(env.purchase_history))
213217

214218
return features, state
215219
end
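A plausible reason for returning copies from `observe` is aliasing: in Julia, arrays are passed by reference, so a returned field keeps changing as the environment mutates it. The sketch below shows the effect on a plain vector; the variable names are illustrative and do not come from the package.

```julia
history = [1, 2, 3]

aliased = history         # same underlying array as `history`
snapshot = copy(history)  # independent copy taken at this point in time

push!(history, 4)         # later in-place mutation, as an environment step would do

# `aliased` now also contains 4, while `snapshot` still holds the earlier contents.
```

Note that `copy` is shallow, which is sufficient for arrays of numbers; nested mutable containers would need `deepcopy`.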

src/Maintenance/Maintenance.jl

Lines changed: 15 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ using Combinatorics: combinations
1515
$TYPEDEF
1616
1717
Benchmark for a standard maintenance problem with resource constraints.
18-
Components are identical and degrade idependently over time.
18+
Components are identical and degrade independently over time.
1919
A high cost is incurred for each component that reaches the final degradation level.
2020
A cost is also incurred for maintaining a component.
2121
The number of simultaneous maintenance operations is limited by a maintenance capacity constraint.
@@ -39,6 +39,14 @@ struct MaintenanceBenchmark <: AbstractDynamicBenchmark{true}
3939
c_m::Float64
4040
"number of steps per episode"
4141
max_steps::Int
42+
43+
function MaintenanceBenchmark(N, K, n, p, c_f, c_m, max_steps)
44+
@assert K <= N "number of maintained components $K > number of components $N"
45+
@assert K >= 0 && N >= 0 "number of components and maintenance capacity should be non-negative"
46+
@assert 0 <= p <= 1 "degradation probability $p is not in [0, 1]"
47+
# ...
48+
return new(N, K, n, p, c_f, c_m, max_steps)
49+
end
4250
end
4351

4452
"""
@@ -49,26 +57,15 @@ end
4957
p=0.2
5058
c_f=10.0,
5159
c_m=3.0,
52-
max_steps=10,
60+
max_steps=80,
5361
)
5462
5563
Constructor for [`MaintenanceBenchmark`](@ref).
5664
By default, the benchmark has 2 components, maintenance capacity 1, number of degradation levels 3,
57-
degradation probability 0.2, failure cost 10.0, maintenance cost 3.0, 10 steps per episode, and is exogenous.
65+
degradation probability 0.2, failure cost 10.0, maintenance cost 3.0, 80 steps per episode, and is exogenous.
5866
"""
59-
60-
function MaintenanceBenchmark(;
61-
N=2,
62-
K=1,
63-
n=3,
64-
p=0.2,
65-
c_f=10.0,
66-
c_m=3.0,
67-
max_steps=80,
68-
)
69-
return MaintenanceBenchmark(
70-
N, K, n, p, c_f, c_m, max_steps
71-
)
67+
function MaintenanceBenchmark(; N=2, K=1, n=3, p=0.2, c_f=10.0, c_m=3.0, max_steps=80)
68+
return MaintenanceBenchmark(N, K, n, p, c_f, c_m, max_steps)
7269
end
7370

7471
# Accessor functions
@@ -90,9 +87,7 @@ $TYPEDSIGNATURES
9087
9188
Outputs a data sample containing an [`Instance`](@ref).
9289
"""
93-
function Utils.generate_sample(
94-
b::MaintenanceBenchmark, rng::AbstractRNG=MersenneTwister(0)
95-
)
90+
function Utils.generate_sample(b::MaintenanceBenchmark, rng::AbstractRNG)
9691
return DataSample(; instance=Instance(b, rng))
9792
end
9893

@@ -105,7 +100,7 @@ The model is a small neural network with one hidden layer no activation function
105100
function Utils.generate_statistical_model(b::MaintenanceBenchmark; seed=nothing)
106101
Random.seed!(seed)
107102
N = component_count(b)
108-
return Chain(Dense(N => N), Dense(N => N), vec)
103+
return Chain(Dense(N => N), Dense(N => N), vec)
109104
end
110105

111106
"""

src/Maintenance/environment.jl

Lines changed: 8 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -26,40 +26,33 @@ $TYPEDSIGNATURES
2626
Creates an [`Environment`](@ref) from an [`Instance`](@ref) of the maintenance benchmark.
2727
"""
2828
function Environment(instance::Instance; seed=0, rng::AbstractRNG=MersenneTwister(seed))
29-
degradation_state = starting_state(instance)
30-
env = Environment(;
31-
instance,
32-
step=1,
33-
degradation_state,
34-
rng=rng,
35-
seed=seed,
36-
)
29+
degradation_state = copy(starting_state(instance))
30+
env = Environment(; instance, step=1, degradation_state, rng=rng, seed=seed)
3731
Utils.reset!(env; reset_rng=true)
3832
return env
3933
end
4034

41-
component_count(env::Environment) = component_count(env.instance)
35+
component_count(env::Environment) = component_count(env.instance)
4236
maintenance_capacity(env::Environment) = maintenance_capacity(env.instance)
4337
degradation_levels(env::Environment) = degradation_levels(env.instance)
4438
degradation_probability(env::Environment) = degradation_probability(env.instance)
4539
failure_cost(env::Environment) = failure_cost(env.instance)
4640
maintenance_cost(env::Environment) = maintenance_cost(env.instance)
47-
max_steps(env::Environment) = max_steps(env.instance)
41+
max_steps(env::Environment) = max_steps(env.instance)
4842
starting_state(env::Environment) = starting_state(env.instance)
4943

50-
5144
"""
5245
$TYPEDSIGNATURES
5346
Draw random degradations for all components.
5447
"""
55-
5648
function degrad!(env::Environment)
57-
N = component_count(env)
49+
N = component_count(env)
5850
n = degradation_levels(env)
5951
p = degradation_probability(env)
52+
rng = env.rng
6053

6154
for i in 1:N
62-
if env.degradation_state[i] < n && rand() < p
55+
if env.degradation_state[i] < n && rand(rng) < p
6356
env.degradation_state[i] += 1
6457
end
6558
end
@@ -71,9 +64,8 @@ end
7164
$TYPEDSIGNATURES
7265
Maintain components.
7366
"""
74-
7567
function maintain!(env::Environment, maintenance::BitVector)
76-
N = component_count(env)
68+
N = component_count(env)
7769

7870
for i in 1:N
7971
if maintenance[i]
@@ -99,12 +91,10 @@ $TYPEDSIGNATURES
9991
Compute degradation cost.
10092
"""
10193
function degradation_cost(env::Environment)
102-
N = component_count(env)
10394
n = degradation_levels(env)
10495
return failure_cost(env) * count(==(n), env.degradation_state)
10596
end
10697

107-
10898
"""
10999
$TYPEDSIGNATURES
110100
@@ -163,5 +153,3 @@ function Utils.step!(env::Environment, maintenance::BitVector)
163153
env.step += 1
164154
return cost
165155
end
166-
167-

src/Maintenance/instance.jl

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -6,9 +6,9 @@ Instance of the maintenance problem.
66
# Fields
77
$TYPEDFIELDS
88
"""
9-
@kwdef struct Instance{B<:MaintenanceBenchmark}
9+
@kwdef struct Instance{MaintenanceBenchmark}
1010
"associated benchmark"
11-
config::B
11+
config::MaintenanceBenchmark
1212
"starting degradation states"
1313
starting_state::Vector{Int}
1414
end
@@ -21,16 +21,16 @@ Generates an instance with random starting degradation states uniformly in [1, n
2121
function Instance(b::MaintenanceBenchmark, rng::AbstractRNG)
2222
N = component_count(b)
2323
n = degradation_levels(b)
24-
starting_state = rand(rng, 1:n, N)
25-
return Instance(; config=b, starting_state)
24+
starting_state = rand(rng, 1:n, N)
25+
return Instance(; config=b, starting_state=starting_state)
2626
end
2727

2828
# Accessor functions
29-
component_count(b::Instance) = component_count(b.config)
29+
component_count(b::Instance) = component_count(b.config)
3030
maintenance_capacity(b::Instance) = maintenance_capacity(b.config)
3131
degradation_levels(b::Instance) = degradation_levels(b.config)
3232
degradation_probability(b::Instance) = degradation_probability(b.config)
3333
failure_cost(b::Instance) = failure_cost(b.config)
3434
maintenance_cost(b::Instance) = maintenance_cost(b.config)
35-
max_steps(b::Instance) = max_steps(b.config)
35+
max_steps(b::Instance) = max_steps(b.config)
3636
starting_state(b::Instance) = b.starting_state
