Generating Tailored Multimodal Fragments from Documents and Videos
Date: 25th Mar 2022
Time: 03:00 PM
Venue: Hybrid, CRC 302 and virtual
Details
Multimodal content is central to digital communication and has been shown to increase user engagement, making it indispensable in today's digital economy. Image-text combinations are a common form of multimodal content seen across digital forums, e.g. banners, online ads, and social posts, and have been shown to be effective for both communication and cognition. The choice of a particular image-text combination is dictated by the information to be represented, the strength of the image and text modalities in representing that information, and the requirements of the underlying task. In this talk, I will walk through some of our recent work on the automatic synthesis of such multimodal fragments to generate teasers for an article, to answer questions on a multimodal document, and to enable effective navigation of long videos.
Speakers
Balaji Vasan
Robert Bosch Center for Data Science and Artificial Intelligence