Wednesday 21 April 2021, 1.35pm to 2.25pm
Speaker(s): Dima Damen, Associate Professor (Reader) at the Department of Computer Science, Visual Information Laboratory, University of Bristol
This talk argues for a fine(r)-grained perspective on human-object interactions in video sequences captured from an egocentric perspective (i.e. first-person footage).
Using multi-modal input (appearance, motion, audio, language), I will present approaches for determining skill or expertise from video sequences [CVPR 2019], assessing action ‘completion’, i.e. when an interaction is attempted but not completed [BMVC 2018], few-shot learning [CVPR 2021], dual-domain [CVPR 2020], as well as multi-modal fusion using vision, audio and language [CVPR 2021, CVPR 2020, ICCV 2019]. See all project details.
Location: Zoom (online)