Head-Related Transfer Function Upsampling Using an Autoencoder-Based Generative Adversarial Network With Evaluation Framework

Accepted version
Peer-reviewed

Abstract

Accurate head-related transfer functions (HRTFs) are essential for delivering realistic 3D audio experiences. However, obtaining personalized, high-resolution HRTFs for individual users is a time-consuming and costly process, typically requiring extensive acoustic measurements. To address this, spatial upsampling techniques have been developed to estimate high-resolution HRTFs from sparse, low-resolution acoustic measurements. This paper presents a novel approach that leverages the spherical harmonic domain and an autoencoder generative adversarial network to tackle the HRTF upsampling problem. Comprehensive evaluations are conducted using both perceptual models and objective spectral metrics to validate the accuracy and realism of the upsampled HRTFs. The results show that the proposed approach outperforms traditional barycentric interpolation in terms of log-spectral distortion, particularly in extreme sparsity scenarios involving fewer than 12 measurements. These results suggest that the proposed autoencoder generative adversarial network approach can create high-quality, high-resolution HRTFs from only a few acoustic measurements, helping pave the way for more accessible personalized spatial audio across a range of applications.
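
For readers unfamiliar with the log-spectral distortion metric named in the abstract, the following is a minimal sketch of one commonly used definition: the root-mean-square, across frequency bins, of the decibel ratio between a reference and an estimated HRTF magnitude for a single direction. The function name `log_spectral_distortion` and its argument names are hypothetical, and the record does not state the exact formulation (frequency range, averaging over directions and ears) used in the paper, so this should be read as an illustration under those assumptions rather than as the authors' implementation.

```python
import math


def log_spectral_distortion(h_ref, h_est):
    """Return the log-spectral distortion in dB for one direction.

    h_ref, h_est: equal-length sequences of non-zero HRTF values
    (real magnitudes or complex bins), one entry per frequency bin.
    Assumed definition: RMS over bins of 20*log10(|h_ref| / |h_est|).
    """
    if len(h_ref) != len(h_est) or len(h_ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Squared dB error in each frequency bin.
    squared_db_errors = [
        (20.0 * math.log10(abs(ref) / abs(est))) ** 2
        for ref, est in zip(h_ref, h_est)
    ]

    # Root of the mean squared dB error across bins.
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))
```

Under this definition, identical spectra give a distortion of 0 dB, and lower values indicate a closer spectral match between the upsampled and measured HRTFs. A full evaluation would typically average this quantity over all measured directions and both ears.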

Description

Journal Title

Journal of the Audio Engineering Society

Conference Name

Journal ISSN

1549-4950

Volume Title

73

Publisher

Audio Engineering Society

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International