Domain-Specific Analog Physical Computing Accelerators
Abstract
This dissertation motivates the fast and efficient generation of non-uniform random variates in hardware with two applications: Monte Carlo simulation and Bayesian inference. Using simulations and real-world empirical measurements, it shows that software non-uniform random number generation is slow and inefficient, and discusses why this is the case. It proposes offloading non-uniform random number generation from the digital electronic processor, leaving the processor free to perform other computations. It shows that samples from a Gaussian distribution with arbitrary mean and standard deviation can be produced by applying a simple transform to a source of Gaussian samples. Our new hardware architecture can then produce samples from an arbitrary one-dimensional probability distribution by decomposing it into a mixture of Gaussians using kernel density estimation.
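The two steps described above can be illustrated in software. The following is a minimal sketch, not the dissertation's hardware method: it assumes the "simple transform" is a shift and scale of a standard Gaussian sample, and it assumes an equal-weight Gaussian kernel density estimate with a normal-reference rule-of-thumb bandwidth. The function names, the example data, and the use of Python's `random.gauss` as a stand-in for the hardware Gaussian source are all hypothetical.

```python
import random
import statistics


def scale_gaussian(z, mean, std):
    """Map a standard Gaussian sample z ~ N(0, 1) to N(mean, std**2)."""
    return mean + std * z


def sample_mixture(kernel_centres, bandwidth, n, rng):
    """Draw n samples from an equal-weight mixture of Gaussians.

    Each mixture component is centred on one data point and has standard
    deviation `bandwidth`, which is exactly a Gaussian kernel density estimate.
    """
    samples = []
    for _ in range(n):
        centre = rng.choice(kernel_centres)  # pick one kernel uniformly at random
        z = rng.gauss(0.0, 1.0)              # stand-in for the hardware Gaussian source
        samples.append(scale_gaussian(z, centre, bandwidth))
    return samples


rng = random.Random(0)

# Hypothetical bimodal target data that a single Gaussian cannot represent.
data = ([rng.gauss(-2.0, 0.5) for _ in range(200)]
        + [rng.gauss(3.0, 1.0) for _ in range(200)])

# Assumed bandwidth choice: normal-reference rule of thumb.
bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)

draws = sample_mixture(data, bandwidth, 20000, rng)
print("data mean:", statistics.mean(data), "sample mean:", statistics.mean(draws))
```

In this sketch the only random primitives are a uniform choice of kernel and one standard Gaussian sample per output, which is why a single Gaussian source plus a programmable shift and scale is enough to approximate an arbitrary one-dimensional distribution.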
To illustrate the novelty of our approach compared to the existing literature, this dissertation presents terminology to describe non-uniform random number generators and a taxonomy that categorizes state-of-the-art hardware non-uniform random number generators by the physical process and measurement hardware they use. We focus on the hardware implementation of the measurement mechanism because it places a fundamental limit on the speed of generation. Based on this analysis, we show that field-programmable gate array-based non-uniform random number generators produce the highest sample rates on average. The fastest generator in our study uses a photodiode, but it is an outlier for that architecture. This suggests that a field-programmable gate array-based or optical design is most likely to produce the fastest generator. The majority of hardware uniform random number generators obtain their randomness from a non-uniform random physical process and then transform or truncate the output of that process to produce a uniform distribution. Hardware uniform random number generators are therefore subject to the same fundamental speed constraints as hardware non-uniform random number generators. This suggests that our non-uniform random number generation architecture also has the potential to produce a uniform random number generator that is faster than the state of the art.
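As one illustration of how a non-uniform physical process can be transformed into a uniform one, the sketch below applies the probability integral transform to Gaussian samples. This is an assumption made for illustration: the generators surveyed in the dissertation may use other transforms or truncation, and the noise mean and standard deviation used here are hypothetical values.

```python
import math
import random


def gaussian_to_uniform(x, mean, std):
    """Pass x through the Gaussian CDF; the result is uniform on (0, 1)
    when x really is drawn from N(mean, std**2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


rng = random.Random(1)

# Stand-in for measured Gaussian noise with hypothetical parameters.
noise = [rng.gauss(1.2, 0.3) for _ in range(50000)]
uniform = [gaussian_to_uniform(x, 1.2, 0.3) for x in noise]

# Ten-bin histogram: each bin should hold roughly a tenth of the samples.
counts = [0] * 10
for u in uniform:
    counts[min(int(u * 10), 9)] += 1
print(counts)
```

The transform is only as good as the assumed mean and standard deviation: if the physical noise drifts, the output is no longer uniform, which is one reason the measurement stage matters for both kinds of generator.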
After describing why a hardware non-uniform random number generator is desirable, this dissertation presents a new non-uniform random number generator architecture with the potential for greater speed and efficiency than state-of-the-art non-uniform random number generation methods. We present results from two implementations of the architecture, both of which sample a random physical process that varies in time. The physical processes that the system samples are (1) microelectromechanical system (MEMS) sensor noise and (2) electron tunneling noise. This dissertation shows that the mean and standard deviation of both the MEMS- and electron tunneling noise-based programmable random variate accelerators depend on their temperature and supply voltage, and proposes an architecture to compensate for this effect. This dependence is important when designing a non-uniform random number generator based on electronic noise: the generator must either measure and compensate for the temperature or be kept in a constant-temperature environment. Chapter 5 concludes our discussion of programmable non-uniform random number generators.
Chapters 6 to 8 of this dissertation focus on accelerating Fourier transform and convolution operations using an optical computing accelerator built solely for that purpose. Chapter 6 describes the theory of Fourier transforms and the empirical results of experiments that use the theory. Chapter 7 presents a theoretical benchmark analysis of 27 end-to-end applications that would benefit from running on an optical Fourier transform and convolution accelerator. We find that the optical accelerator produces a speedup greater than 10× for only two applications (pure Fourier transforms and convolutions). We built a prototype optical Fourier transform accelerator using off-the-shelf hardware to illustrate the data movement bottleneck that occurs in any computing accelerator that moves data between analog and digital computing devices. Chapter 8 proposes a new computer architecture which we call memory in aperture (Mia). Mia mitigates the data movement bottleneck by placing hybrid analog-digital memories in the physical address space of the digital electronic processor that the computing system inevitably contains. Mia will reduce the data movement bottleneck in any computer architecture that frequently moves data between analog and digital computing devices. Mia holds particular future promise for interfacing analog neuromorphic computing architectures and quantum computers with digital electronic processors.
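A single accelerator can serve both Fourier transforms and convolutions because of the convolution theorem: a circular convolution equals the inverse Fourier transform of the pointwise product of the two Fourier transforms. The sketch below checks this numerically in software. It is an illustration only, using a naive discrete Fourier transform as a stand-in for the optical hardware; the function names and the example signal and kernel are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform.

    The inverse transform includes the 1/n normalisation factor.
    """
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve_direct(a, b):
    """Circular convolution computed directly from its definition."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def circular_convolve_fourier(a, b):
    """Circular convolution via the convolution theorem:
    transform both inputs, multiply pointwise, transform back."""
    fa, fb = dft(a), dft(b)
    product = [p * q for p, q in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


# Hypothetical zero-padded signal and two-tap averaging kernel.
signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

direct = circular_convolve_direct(signal, kernel)
via_fourier = circular_convolve_fourier(signal, kernel)
print(direct)
print([round(v, 9) for v in via_fourier])
```

Note that the Fourier route needs three transforms and a pointwise multiply. When the transforms run on an analog device and the multiply on a digital one, each step implies a transfer between the two, which is where the data movement bottleneck described above arises.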