Translating Speech into Birdsong, or, Deep Dream Generator with Sound?


Hello! I’m a composer and a DPhil in Music candidate. I’m searching for a collaborator for a project that
aims to explore the musical and artistic potential of AI.
My question is: can we do something like Google’s Deep Dream Generator (see example below) but
with sound, and could this be realised in real time, in the form of an interactive installation? More
specifically, just as DDG can feed the objects it recognises in an image back into the image itself, I’d like
to transform live speech or instrumental sounds by translating them into birdsong. As a sound very
familiar to us and full of cultural associations, birdsong seems like a fruitful and diverse sound type to
begin toying with; one recognisable even after extreme transformations.
From an aesthetic perspective I’m particularly interested in the interfaces between humanity, technology,
and nature. Through art and music, I’d like to explore questions of machine agency, how we can
appreciate machines’ experience of the world, and the interferences between these machine perceptions of
reality and our own; I’d especially like to see what happens when they’re mixed and confused together.
I’m fascinated by technologies that replicate nature but that are slightly imperfect, glitching, surreal, or
even somewhat psychedelic (like the image below). In this sense I’m interested in working on something
more sci-fi (e.g. Black Mirror) or experimental (Stockhausen or Squarepusher) in spirit, rather than
something that’s technically flawless.
The idea is in a very germinal and open state. I’d be happy to meet with anyone interested and grateful
for any advice on possible ways forward.

Nicholas Moroz - DPhil in Music (Composition) –

Image: Hieronymus Bosch. Detail from The Garden of Earthly Delights (central panel). Processed with Deep Dream Generator.
Clockwise from top left: Dream 0, 1, 30, 65.