Q-PrOP: Sample-efficient policy gradient with an off-policy critic
Publication Date
2017-01-01
Journal Title
5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings
Type
Conference Object
This Version
AM
Citation
Gu, S., Lillicrap, T., Ghahramani, Z., Turner, R., & Levine, S. (2017). Q-PrOP: Sample-efficient policy gradient with an off-policy critic. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings. https://doi.org/10.17863/CAM.21294
Sponsorship
EPSRC (EP/M026957/1)
EPSRC (EP/J012300/1)
EPSRC (via University of Sheffield) (EP/N014162/1)
Alan Turing Institute (EP/N510129/1)
Identifiers
This record's DOI: https://doi.org/10.17863/CAM.21294
This record's URL: https://www.repository.cam.ac.uk/handle/1810/274195
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved