Causal contextual bandits with targeted interventions

Date: 15th Nov 2021

Time: 11:00 AM

Venue: Online Meeting


Details

We study a contextual bandit setting in which the learning agent can perform interventions on targeted subsets of the population, in addition to possessing qualitative causal side-information. This novel formalism captures intricacies of real-world scenarios such as software product experimentation, where targeted experiments can be conducted. However, it fundamentally changes the set of options available to the agent compared to standard contextual bandit settings, necessitating new techniques. This is also the first work to integrate causal side-information into a contextual bandit setting in which the agent aims to learn a policy mapping contexts to arms (as opposed to merely identifying one best arm). We propose a new algorithm and show empirically that it outperforms baselines in experiments using purely synthetic data as well as in real-world-inspired experiments. We also prove a bound on regret that theoretically guarantees performance.
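To make the setting concrete, the following is a minimal illustrative sketch (not the proposed algorithm, whose details are not given in this abstract). It assumes a hypothetical toy problem with three contexts and two arms, where a "targeted intervention" means experimenting with a chosen arm on a chosen context subset, and the agent learns a policy mapping each context to its best arm. All names and numbers below are invented for illustration.

```python
import random

random.seed(0)

# Hypothetical toy setup: 3 contexts (e.g., user segments), 2 arms.
# The true mean reward of each arm depends on the context.
TRUE_MEANS = {
    0: [0.2, 0.8],
    1: [0.7, 0.3],
    2: [0.9, 0.4],
}

def targeted_intervention(context, arm):
    """Run an experiment on a targeted context subset: pull `arm`
    for that context and observe a Bernoulli reward."""
    return 1.0 if random.random() < TRUE_MEANS[context][arm] else 0.0

def learn_policy(budget_per_pair=200):
    """Uniform-exploration sketch: estimate each (context, arm) mean
    reward via targeted interventions, then map each context to the
    arm with the highest estimated mean."""
    estimates = {c: [0.0, 0.0] for c in TRUE_MEANS}
    for c in TRUE_MEANS:
        for a in (0, 1):
            total = sum(targeted_intervention(c, a)
                        for _ in range(budget_per_pair))
            estimates[c][a] = total / budget_per_pair
    # The learned policy is a mapping from contexts to arms,
    # not just a single best arm.
    return {c: max((0, 1), key=lambda a: estimates[c][a])
            for c in TRUE_MEANS}

policy = learn_policy()
print(policy)
```

Unlike this uniform-exploration baseline, the work presented in the talk additionally exploits qualitative causal side-information to decide which targeted interventions are most informative.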

Speakers

Chandrasekar Subramanian (CS19D010)

Computer Science and Engineering