Scaling Up Sign Spotting Through Sign Language Dictionaries
Published version
Peer-reviewed
Repository URI
Repository DOI
Change log
Authors
Abstract
Abstract: The focus of this work is *sign spotting* – given a video of an isolated sign, our task is to identify *whether* and *where* it has been signed in a continuous, co-articulated sign language video. To achieve this sign spotting task, we train a model using multiple types of available supervision by: (1) *watching* existing footage which is sparsely labelled using mouthing cues; (2) *reading* associated subtitles (readily available translations of the signed content) which provide additional *weak supervision*; (3) *looking up* words (for which no co-articulated labelled examples are available) in visual sign language dictionaries to enable novel sign spotting. These three tasks are integrated into a unified learning framework using the principles of Noise Contrastive Estimation and Multiple Instance Learning. We validate the effectiveness of our approach on low-shot sign spotting benchmarks. In addition, we contribute a machine-readable British Sign Language (BSL) dictionary dataset of isolated signs, BslDict, to facilitate study of this task. The dataset, models and code are available at our project page.
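The abstract mentions training under the principles of Noise Contrastive Estimation. As a rough illustration only (not the paper's exact formulation), an NCE-style objective scores one positive match against sampled negatives and pushes the positive's similarity up relative to them; the function below is a minimal, self-contained sketch with hypothetical inputs:

```python
import math

def info_nce_loss(pos_sim, neg_sims, temperature=0.1):
    """Minimal NCE/InfoNCE-style loss: cross-entropy over one positive
    similarity and a list of negative similarities (all hypothetical
    scalars, e.g. cosine similarities between embeddings)."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    m = max(logits)  # subtract max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability assigned to the positive pair.
    return -(logits[0] - log_denom)

# A well-separated positive yields a near-zero loss; a poorly
# scored positive yields a large one.
low = info_nce_loss(pos_sim=0.9, neg_sims=[0.1, 0.0, -0.2])
high = info_nce_loss(pos_sim=0.0, neg_sims=[0.9])
```

In the paper's setting the similarities would come from learned embeddings of dictionary signs and continuous-video segments; here they are plain numbers purely to show the shape of the objective.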
Description
Keywords
Journal Title
Conference Name
Journal ISSN
1573-1405