P2-05: White Box Search Over Audio Synthesizer Parameters
Yuting Yang (Princeton University)*, Zeyu Jin (Adobe Research), Adam Finkelstein (Princeton University), Connelly Barnes (Adobe Research)
Subjects (starting with primary): Applications -> music composition, performance, and production ; MIR and machine learning for musical acoustics -> applications of machine learning to musical acoustics ; MIR and machine learning for musical acoustics -> applications of musical acoustics to signal synthesis ; MIR fundamentals and methodology -> music signal processing
Presented In Person: 4-minute short-format presentation
Synthesizer parameter inference searches for a set of patch connections and parameters to generate audio that best matches a given target sound. Such optimization tasks benefit from access to accurate gradients. However, typical audio synthesizers incorporate components with discontinuities (such as sawtooth or square waveforms, or a categorical search over discrete parameters like the choice among such waveforms) that thwart conventional automatic differentiation (AD). AD libraries in frameworks like TensorFlow and PyTorch typically ignore discontinuities, providing incorrect gradients at such locations. Thus, state-of-the-art parameter inference methods avoid differentiating the synth directly and resort to workarounds such as genetic search or neural proxies. Instead, we adapt and extend recent computer graphics methods for differentiable rendering to directly differentiate the synth as a white-box program, and thereby optimize its parameters using gradient descent. We evaluate our framework using a generic FM synth with ADSR, noise, and IIR filters, adapting its parameters to match a variety of target audio clips. Our method outperforms baselines in both quantitative and qualitative evaluations.
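The AD failure the abstract describes can be reproduced in a few lines. The following is a minimal, illustrative PyTorch sketch, not the authors' code; the sawtooth helper, frequencies, and sample rate are assumptions for illustration. Autograd assigns torch.floor a zero gradient everywhere, so the gradient of a naive sawtooth oscillator with respect to its frequency omits the contribution of the moving jump locations, which is the kind of incorrect gradient at discontinuities that the paper's white-box differentiation is designed to handle.

```python
# Minimal sketch (not the authors' code) of the AD failure described above:
# autograd treats torch.floor as having zero gradient, so the derivative of a
# sawtooth with respect to frequency ignores the moving discontinuities.
import torch

def sawtooth(freq, t):
    # Naive sawtooth in [-1, 1]; the floor() term is where the jumps live.
    phase = freq * t
    return 2.0 * (phase - torch.floor(phase)) - 1.0

t = torch.linspace(0.0, 1.0, 16000)        # 1 second at 16 kHz (assumed rate)
freq = torch.tensor(220.0, requires_grad=True)

target = sawtooth(torch.tensor(440.0), t)  # target one octave above the init
loss = torch.mean((sawtooth(freq, t) - target) ** 2)
loss.backward()

# Since d(floor)/d(phase) = 0 under autograd, the reported gradient comes only
# from the smooth ramp of the phase term and drops the delta-like terms at each
# jump, i.e. the gradient is incorrect at (and because of) the discontinuities.
print(freq.grad)
```

In practice such biased gradients can stall or misdirect descent on parameter-matching losses, which is why prior work fell back on genetic search or neural proxies rather than differentiating the synth directly.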