Q-learning-based Actor-Critic Framework for Automatic Question Generation


Date: 11th Apr 2022

Time: 11:00 AM

Venue: Google Meet (see link).


Details

Existing approaches to Automatic Question Generation (AQG) train sequence-to-sequence models in a supervised setup to generate questions from given passages and answers. However, such methods suffer from exposure bias and from a mismatch between the measures used for training and those used for evaluation. Several works have used reinforcement learning techniques, such as the Policy Gradient (PG) based REINFORCE algorithm, to address some of these issues and fine-tune the model on a specific metric. However, these techniques rely on a global reward function: the model must wait until the entire sequence has been generated before it can update its parameters, so each update carries only coarse, sequence-level guidance.
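To make the global-reward limitation concrete, here is a minimal sketch of the textbook REINFORCE surrogate loss for a single sampled question. The function name, the optional `baseline` argument, and the example numbers are illustrative assumptions, not details taken from the talk.

```python
import math

def reinforce_loss(token_log_probs, sequence_reward, baseline=0.0):
    """Textbook REINFORCE surrogate loss for one sampled sequence.

    token_log_probs: log-probability the model assigned to each sampled token.
    sequence_reward: a single scalar (for example, a metric score) that is
        only available once the whole sequence has been generated.
    baseline: optional variance-reduction term (hypothetical default of 0).
    """
    advantage = sequence_reward - baseline
    # Every token's log-probability is scaled by the same scalar, so the
    # update cannot distinguish good tokens from bad ones within the sequence.
    return -advantage * sum(token_log_probs)

# Two sampled tokens with probabilities 0.5 and 0.25, and one global reward.
loss = reinforce_loss([math.log(0.5), math.log(0.25)], sequence_reward=0.4)
```

Because the reward multiplies the sum of all token log-probabilities, no gradient signal exists until the last token is produced.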

In our work, we address these inherent issues in text generation problems like AQG by introducing a Q-learning-based actor-critic framework that uses stepwise rewards based on fluency and semanticity to aid sequence-to-sequence models in question generation. However, the high-dimensional discrete action space arising from the large vocabulary makes it challenging to apply Q-learning-based methods to large-scale, real-world text generation problems. To address this issue, we use a two-step training procedure consisting of supervised pre-training of the actor followed by Q-learning-based joint training of the actor and critic. We also address the global reward problem by training a critic that generates appropriate Q-values for question sub-sequences, making it possible to use partially generated sequences to update the model parameters.
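As a rough illustration of how a critic can provide a learning signal for partial sequences, the sketch below computes standard one-step Q-learning targets from stepwise rewards. The abstract does not state the exact update rule used in the framework, so the target form, the discount factor `gamma`, the squared-error critic loss, and all function names here are assumptions.

```python
def q_learning_targets(step_rewards, next_q_values, gamma=0.99):
    """One-step Q-learning targets for each prefix of a generated question.

    step_rewards[t]: stepwise reward for emitting token t (for example, a
        fluency or semantic score; the scoring itself is not modeled here).
    next_q_values[t]: critic Q-values over candidate next tokens in the
        state reached after token t; an empty list marks the final step.
    gamma: discount factor (hypothetical value).
    """
    targets = []
    for reward, next_q in zip(step_rewards, next_q_values):
        # Bootstrap from the best next action; no bootstrap at the end.
        bootstrap = max(next_q) if next_q else 0.0
        targets.append(reward + gamma * bootstrap)
    return targets

def critic_loss(predicted_q, targets):
    """Mean squared error between the critic's Q-values and the targets."""
    return sum((q - y) ** 2 for q, y in zip(predicted_q, targets)) / len(targets)

# A two-token question over a toy two-word vocabulary.
targets = q_learning_targets([0.1, 0.2], [[0.5, 1.0], []], gamma=0.5)
loss = critic_loss([0.6, 0.0], targets)
```

Each prefix gets its own target, so the parameters can be updated before the full question has been generated, in contrast to a single sequence-level reward.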

We present empirical results from experiments on the SQuAD dataset to establish the efficacy of our proposed framework. We also analyze the questions generated by our approach and compare them against those generated by the supervised-learning-based baseline to show that the framework produces better-quality questions in terms of fluency and semanticity.

Speakers

Debargha Bhattacharjee

Computer Science and Engineering