policy for restless multi-armed bandits
C52920
concept
A policy for restless multi-armed bandits is a decision rule that selects, at each decision time, which arms to activate when every arm's state keeps evolving whether or not the arm is played. The aim is to optimize a long-term performance criterion, such as cumulative reward, even though unplayed arms do not stay frozen as they would in the classical (rested) bandit setting.
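As a rough illustration of the definition above, the following minimal sketch simulates two-state restless arms under a generic "activate the top `m` arms by index" rule. Everything in it is an assumption made for illustration: the function names (`step_arm`, `top_m_policy`, `simulate`), the transition probabilities, the reward (the state of each active arm), and the placeholder index, which is simply the current state (a myopic rule). It is not the Whittle index; a Whittle-style policy would plug a different `index_fn` into the same selection step.

```python
import random


def step_arm(state, active, p_active, p_passive, rng):
    """Advance one two-state arm by one step.

    The arm moves under p_active if it was played and under p_passive
    otherwise, so its state changes either way (the "restless" property).
    """
    p = p_active if active else p_passive
    return 1 if rng.random() < p[state][1] else 0


def top_m_policy(states, index_fn, m):
    """Return the set of the m arms with the largest index values."""
    ranked = sorted(
        range(len(states)),
        key=lambda i: index_fn(i, states[i]),
        reverse=True,
    )
    return set(ranked[:m])


def simulate(n_arms=5, m=2, horizon=1000, seed=0):
    """Run the policy and return the average reward per step."""
    rng = random.Random(seed)
    # Illustrative transition matrices (assumed): row = current state,
    # column = next state. Playing an arm makes state 1 more likely.
    p_passive = [[0.9, 0.1], [0.4, 0.6]]
    p_active = [[0.5, 0.5], [0.1, 0.9]]
    states = [0] * n_arms
    total = 0
    for _ in range(horizon):
        # Placeholder index (assumed): the arm's current state.
        active = top_m_policy(states, lambda i, s: s, m)
        # Assumed reward: an active arm pays its current state (0 or 1).
        total += sum(states[i] for i in active)
        # Every arm transitions, whether it was active or passive.
        states = [
            step_arm(states[i], i in active, p_active, p_passive, rng)
            for i in range(n_arms)
        ]
    return total / horizon
```

Calling `simulate()` returns the average per-step reward for this toy setup; swapping the lambda for another index function changes the policy without touching the rest of the loop.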
All labels observed (1)
| Label | Occurrences |
|---|---|
| policy for restless multi-armed bandits (canonical) | 1 |
Instances (1)
| Instance | Via concept surface |
|---|---|
| Whittle index | — |