Synthetically Supervised Feature Learning for Scene Text Recognition
Authors
Liu, Y
Wang, Z
Jin, H
Wassell, I
Publication Date
2018
Journal Title
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Conference Name
European Conference on Computer Vision
ISSN
0302-9743
ISBN
9783030012274
Publisher
Springer International Publishing
Volume
11209 LNCS
Pages
449-465
Type
Conference Object
This Version
AM
Citation
Liu, Y., Wang, Z., Jin, H., & Wassell, I. (2018). Synthetically Supervised Feature Learning for Scene Text Recognition. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11209 LNCS, 449-465. https://doi.org/10.1007/978-3-030-01228-1_27
Abstract
We address the problem of image feature learning for scene text recognition. The image features in the state-of-the-art methods are learned from large-scale synthetic image datasets. However, most methods only rely on outputs of the synthetic data generation process, namely realistically looking images, and completely ignore the rest of the process. We propose to leverage the parameters that lead to the output images to improve image feature learning. Specifically, for every image out of the data generation process, we obtain the associated parameters and render another “clean” image that is free of select distortion factors that are applied to the output image. Because of the absence of distortion factors, the clean image tends to be easier to recognize than the original image which can serve as supervision. We design a multi-task network with an encoder-discriminator-generator architecture to guide the feature of the original image toward that of the clean image. The experiments show that our method significantly outperforms the state-of-the-art methods on standard scene text recognition benchmarks in the lexicon-free category. Furthermore, we show that without explicit handling, our method works on challenging cases where input images contain severe geometric distortion, such as text on a curved path.
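The pairing idea in the abstract (reuse the generation parameters of each synthetic image to derive a "clean" counterpart with select distortion factors removed) can be illustrated with a minimal Python sketch. This is not the authors' code: the `RenderParams` fields, the `NEUTRAL` table, and the choice of which factors to reset are all hypothetical, and actual image rendering is omitted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RenderParams:
    """Hypothetical parameters of a synthetic text renderer (illustrative names)."""
    text: str
    font: str
    rotation_deg: float = 0.0
    blur_sigma: float = 0.0
    noise_level: float = 0.0
    background: str = "plain"


# Neutral value for each distortion factor that may be removed.
# Which factors to remove is an assumption made for this sketch.
NEUTRAL = {
    "rotation_deg": 0.0,
    "blur_sigma": 0.0,
    "noise_level": 0.0,
    "background": "plain",
}


def clean_params(params, factors=tuple(NEUTRAL)):
    """Return a copy of params with the selected distortion factors reset."""
    return replace(params, **{name: NEUTRAL[name] for name in factors})


def make_training_pair(params):
    """Pair the original parameters with their clean counterpart.

    A real pipeline would render both parameter sets to images; only the
    parameter bookkeeping is shown here.
    """
    return params, clean_params(params)


original = RenderParams(
    "EXIT", "Arial",
    rotation_deg=12.0, blur_sigma=1.5, noise_level=0.2, background="street",
)
distorted, clean = make_training_pair(original)
```

The text content and font are kept identical across the pair, so the two renderings would differ only in the factors that were reset.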
Identifiers
External DOI: https://doi.org/10.1007/978-3-030-01228-1_27
This record's URL: https://www.repository.cam.ac.uk/handle/1810/286387
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved