Relational Contextual Bandits in Real-World User Interaction

Date: 17th May 2021

Time: 02:00 PM

Venue: Google Meet (see link).

PAST EVENT

Details

Contextual bandit algorithms have shown great promise in real-world user interaction problems such as recommendation, A/B testing, clinical trials, and personalization. Most existing work is in the propositional domain and assumes that the data are independent and identically distributed (IID). However, most environments are inherently relational and characterized by complex relational structures, such as social networks and interaction graphs. Rich meta-information is also available in the interactions and in attribute-value relationships, and incorporating these relations often helps the model learn a better representation. Statistical Relational Learning (SRL) is an emerging research area that builds on the idea of representation, modelling, and learning in the relational domain. Expressing relations as first-order logic predicates yields models that are more interpretable and explainable than conventional machine learning models.

This work proposes a novel online contextual bandit framework in the relational domain, titled 'Relational Boosted Bandits' (RB2). RB2 uses gradient boosting of relational regression trees to estimate the context-reward relationship, and the resulting estimates can be combined with various exploration techniques. Using a simple addition operation, the boosted trees can be re-represented as a single interpretable tree. This work also proposes a parameter-free sampling algorithm for the relational domain. By sampling intelligently, the algorithm needs fewer samples and achieves faster convergence and faster model improvement. Experiments are conducted on benchmark real-world relational datasets, on tasks such as link prediction, relational classification, and user interaction, and on some simulated datasets. RB2 significantly outperforms the baseline models across benchmarks. Informed sampling significantly reduces the number of data samples by focusing on the data regions essential to the convergence of the model.
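
To make the idea of a boosted reward estimator inside a bandit loop concrete, here is a minimal sketch. It is not the RB2 implementation: it is propositional rather than relational, it uses one-split regression stumps on a single numeric context instead of relational regression trees, and it uses epsilon-greedy as one possible exploration technique. The reward function, all function names, and all parameter values are assumptions made for illustration.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression stump (threshold, left_value, right_value)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    if best is None:
        # All contexts are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict(stumps, x, lr=0.5):
    """Boosted prediction: the (shrunken) sum of the individual stump outputs."""
    return sum(lr * (left if x <= t else right) for t, left, right in stumps)


def fit_boosted(xs, ys, rounds=5, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    stumps = []
    for _ in range(rounds):
        residuals = [y - predict(stumps, x, lr) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return stumps


def true_reward(x, arm):
    """Hypothetical environment: arm 0 pays off for x < 0.5, arm 1 otherwise."""
    return 1.0 if (x < 0.5) == (arm == 0) else 0.0


def run_bandit(steps=300, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit with one boosted reward model per arm."""
    rng = random.Random(seed)
    n_arms = 2
    history = [([], []) for _ in range(n_arms)]  # (contexts, rewards) per arm
    models = [[] for _ in range(n_arms)]
    total = 0.0
    for _ in range(steps):
        x = rng.randrange(10) / 10  # discrete context in {0.0, ..., 0.9}
        unexplored = any(not contexts for contexts, _ in history)
        if unexplored or rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: predict(models[a], x))
        reward = true_reward(x, arm) + rng.gauss(0, 0.1)
        history[arm][0].append(x)
        history[arm][1].append(reward)
        models[arm] = fit_boosted(*history[arm])  # refit the chosen arm only
        total += reward
    return models, total / steps


if __name__ == "__main__":
    _, average_reward = run_bandit()
    print(f"average reward: {average_reward:.3f}")
```

Because the boosted prediction is just a sum of per-stump outputs, the ensemble in this toy setting can be collapsed into a single piecewise-constant function by adding the stump values in each region, which loosely mirrors the abstract's remark about re-representing the boosted trees as a single tree via addition.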

Speakers

Ashutosh Kakadiya (CS18S013)

Computer Science and Engineering