A row of slot machines in Las VegasIn probability theory and machine learning, the multi-armed bandit problem (sometimes called the K-[1] or N-armed bandit problem[2]) is named from imagining a gambler at a row of slot machines (sometimes known as "one-armed bandits"), who has to decide which machines to play, how many times to play each machine and in which order to play them, and whether to cont


{{#tags}}- {{label}}
{{/tags}}