Audio Deepdream: Optimizing raw audio with convolutional networks.
International Society for Music Information Retrieval Conference, Google Brain
Adam Roberts, Cinjon Resnick, Diego Ardila, Doug Eck
The hallucinatory images of DeepDream opened up the floodgates for a recent wave of
artwork generated by neural networks. In this work, we take first steps toward applying
this technique to audio. We believe a key to solving this problem is training a deep neural
network to perform a music perception task on raw audio. To that end, we follow
van den Oord et al. and train a network to predict embeddings that are themselves
the output of a collaborative filtering model. A
key difference is that we learn features directly from the raw audio, which creates
a chain of differentiable functions from raw audio to high-level features. We then
use gradient descent through the network to optimize the audio itself, extracting
samples of "dreamed" audio.
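
To make the last step concrete, the following is a minimal sketch of optimizing a raw waveform by gradient descent through a differentiable feature extractor. It is not the architecture or objective used in this work: the one-layer "network" (`forward`), its random untrained `filters`, the squared-error objective against a `target` feature vector, and the names `loss_and_grad` and `dream` are all assumptions chosen for illustration. The only point it shows is that, because every stage from samples to features is differentiable, the gradient can be carried all the way back to the audio samples.

```python
import numpy as np


def forward(audio, filters):
    """Toy stand-in for the network: 1-D convolution, ReLU, mean-pool over time."""
    kernel_size = filters.shape[1]
    # windows[i, j] == audio[i + j]; shape (num_frames, kernel_size)
    windows = np.lib.stride_tricks.sliding_window_view(audio, kernel_size)
    pre = windows @ filters.T                      # (num_frames, num_filters)
    features = np.maximum(pre, 0.0).mean(axis=0)   # (num_filters,)
    return features, pre


def loss_and_grad(audio, filters, target):
    """Squared distance to a target feature vector (an assumed objective),
    plus its gradient with respect to the raw audio samples."""
    features, pre = forward(audio, filters)
    diff = features - target
    loss = 0.5 * float(np.sum(diff ** 2))

    num_frames, kernel_size = pre.shape[0], filters.shape[1]
    # Backpropagate through the mean-pool and the ReLU.
    d_pre = (diff / num_frames) * (pre > 0)        # (num_frames, num_filters)
    # Backpropagate through the convolution to each window position.
    d_windows = d_pre @ filters                    # (num_frames, kernel_size)
    # Scatter window gradients back onto the samples they came from.
    grad = np.zeros_like(audio)
    for j in range(kernel_size):
        grad[j:j + num_frames] += d_windows[:, j]
    return loss, grad


def dream(audio, filters, target, steps=300, lr=1.0):
    """Plain gradient descent on the audio; the filters stay fixed."""
    audio = audio.copy()
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(audio, filters, target)
        losses.append(loss)
        audio -= lr * grad
    return audio, losses


# Hypothetical usage: start from quiet noise and move toward the features
# of a 440 Hz reference tone (sample rate of 16 kHz assumed).
rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 16))                 # untrained, illustrative only
t = np.arange(1024) / 16000.0
reference = np.sin(2 * np.pi * 440.0 * t)
target, _ = forward(reference, filters)
start = 0.01 * rng.normal(size=1024)
dreamed, losses = dream(start, filters, target)
```

In a real setting the hand-written backward pass would be replaced by automatic differentiation over a trained deep network, and the objective could instead maximize chosen activations as in image DeepDream; the sketch keeps everything explicit so the path from loss to waveform is visible.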