
dc.contributor.author: Gal, Yarin (en)
dc.contributor.author: Chen, Yutian (en)
dc.contributor.author: Ghahramani, Zoubin (en)
dc.date.accessioned: 2015-08-28T11:46:50Z
dc.date.available: 2015-08-28T11:46:50Z
dc.date.issued: 2015 (en)
dc.identifier.citation: Proceedings of The 32nd International Conference on Machine Learning 2015, 645–654. (en)
dc.identifier.issn: 2640-3498
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/250393
dc.description.abstract: Multivariate categorical data occur in many applications of machine learning. One of the main difficulties with these vectors of categorical variables is sparsity. The number of possible observations grows exponentially with vector length, but dataset diversity might be poor in comparison. Recent models have gained significant improvement in supervised tasks with this data. These models embed observations in a continuous space to capture similarities between them. Building on these ideas we propose a Bayesian model for the unsupervised task of distribution estimation of multivariate categorical data. We model vectors of categorical variables as generated from a non-linear transformation of a continuous latent space. Non-linearity captures multi-modality in the distribution. The continuous representation addresses sparsity. Our model ties together many existing models, linking the linear categorical latent Gaussian model, the Gaussian process latent variable model, and Gaussian process classification. We derive inference for our model based on recent developments in sampling based variational inference. We show empirically that the model outperforms its linear and discrete counterparts in imputation tasks of sparse data.
dc.description.sponsorship: YG is supported by the Google European fellowship in Machine Learning.
dc.language: English (en)
dc.language.iso: en (en)
dc.publisher: Microtome Publishing
dc.rights: Attribution-NonCommercial 2.0 UK: England & Wales (*)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/2.0/uk/ (*)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/2.0/uk/
dc.source: Attribution-NonCommercial 2.0 UK: England & Wales
dc.title: Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data (en)
dc.type: Article
dc.description.version: This is the final version of the article. It first appeared from Microtome Publishing via http://jmlr.org/proceedings/papers/v37/gala15.html (en)
prism.publicationDate: 2015 (en)
prism.publicationName: Journal of Machine Learning Research (en)
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved (en)
rioxxterms.licenseref.startdate: 2015 (en)
dc.contributor.orcid: Ghahramani, Zoubin [0000-0002-7464-6475]
rioxxterms.type: Journal Article/Review (en)
dc.identifier.url: http://jmlr.org/proceedings/papers/v37/gala15.html (en)
rioxxterms.freetoread.startdate: 2015-08-28



Except where otherwise noted, this item's licence is described as Attribution-NonCommercial 2.0 UK: England & Wales