For the past two years I've been developing an audio synthesis engine
named Nemi. Sound is produced by non-linear feedback networks which
have been evolved using genetic algorithms, because the equations these networks
represent are too complex to craft by hand.
Even simple voices are astoundingly complex, and it is nearly impossible
to determine what a given network will produce beforehand. Key state, note,
and velocity can all dramatically alter the character of the sound.
Most networks produce only noise or silence. Only after several generations of
refinement do usable instruments emerge.
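To make the idea concrete, here is a toy sketch of evolving a small non-linear feedback network. It is not Nemi's code: the three-node topology, the tanh non-linearity, the crude "keeps moving" fitness score, and the names render, fitness, and evolve are all assumptions chosen for illustration.

```python
import math
import random

SIZE = 3  # hypothetical network size; node 0 is used as the audio output


def render(weights, n_samples, excite=1.0):
    """Run a tiny recurrent network and return node 0's output samples."""
    state = [0.0] * SIZE
    out = []
    for n in range(n_samples):
        drive = excite if n == 0 else 0.0  # a single impulse acts as "note on"
        new_state = []
        for i in range(SIZE):
            total = drive if i == 0 else 0.0
            for j in range(SIZE):
                total += weights[i * SIZE + j] * state[j]
            new_state.append(math.tanh(total))  # the non-linearity
        state = new_state
        out.append(state[0])
    return out


def fitness(samples):
    """Crude score: mean sample-to-sample change over the second half.

    Silence and constant (DC) output score 0. It cannot tell a pleasant
    tone from noise; a real score would need far more care.
    """
    tail = samples[len(samples) // 2:]
    return sum(abs(b - a) for a, b in zip(tail, tail[1:])) / (len(tail) - 1)


def evolve(generations=20, pop_size=16, seed=1):
    """Keep the best quarter each generation and refill with mutated copies."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-2.0, 2.0) for _ in range(SIZE * SIZE)]
           for _ in range(pop_size)]

    def score(genome):
        return fitness(render(genome, 256))

    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        parents = pop[: pop_size // 4]
        children = [[w + rng.gauss(0.0, 0.2) for w in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return pop[0], history
```

Calling evolve() returns the best weight list found and the best score per generation. Because the parents survive unchanged, that score never decreases, although with a fitness this simple the "best" network may still be little more than noise.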
The Nemi sound engine can be driven by MIDI, TCP connections, or
data files. Beowulf clustering works but is still kludgy at this point, and
each machine must launch its own daemon beforehand.
Realtime audio is achieved by timing execution speed for each "K" or control
cycle, then cutting (if necessary) the longest, non-sustaining voices.
I'm getting good results with 440 K cycles per second, which works out to
a little more than 2 ms of latency (1/440 s is about 2.27 ms). That is good enough for what I do.
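The budget arithmetic and the voice-cutting step can be sketched as follows. This is only an illustration under assumptions: the Voice fields, the cull function, and the reading of "longest" as "longest-running" are not taken from Nemi itself. Only the 440 cycles per second figure comes from the text.

```python
from dataclasses import dataclass

K_RATE = 440  # control cycles per second, as stated in the text


@dataclass
class Voice:
    name: str
    age_cycles: int    # how long the voice has been sounding (assumed field)
    sustaining: bool   # True while the key is still held (assumed field)
    cost_s: float      # measured render time for the last cycle (assumed field)


def cycle_budget(k_rate=K_RATE):
    """Seconds available per control cycle: 1/440 s, about 2.27 ms."""
    return 1.0 / k_rate


def cull(voices, budget_s):
    """Drop the longest-running non-sustaining voices until the cost fits.

    Sustaining voices are never cut, so the result can still exceed the
    budget if held notes alone are too expensive.
    """
    kept = list(voices)
    total = sum(v.cost_s for v in kept)
    candidates = sorted((v for v in kept if not v.sustaining),
                        key=lambda v: v.age_cycles, reverse=True)
    for victim in candidates:
        if total <= budget_s:
            break
        kept.remove(victim)
        total -= victim.cost_s
    return kept
```

In a real engine the cost would come from timing the previous K cycle, and cull would run once per cycle before rendering the next one.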
I've included a few clips in mp3 and ogg format to illustrate.
Evil Machine.ogg
Illuminati5.mp3
Midi6.mp3
For info on instrument modeling and waveguides visit Planet CCRMA.
Another project I've been working on is Enscribe, which transforms
color photographs into audio watermarks. Check it out.
Another interesting field is that of audio resynthesis. A Fourier Transform
decomposes data (audio, visual, or any other kind) into discrete
frequency components. I ran into interesting problems while trying to create
a noise removal filter...
JPEG image files use a sister of the Fourier Transform called the Discrete
Cosine Transform, or DCT, to achieve high compression ratios. If the compression
is set too high, the image starts to resemble a mosaic of blurry blocks.
The blur and the wavy lines around edges (the Gibbs phenomenon) are caused by
a loss of low-amplitude spectra. A similar effect can be produced with
audio, creating a smeared sound image. If the smear is delayed it resembles
reverb. This is exactly what a poor implementation of an MPEG encoder
would produce.
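The ringing caused by losing low-amplitude spectra is easy to reproduce in one dimension. The sketch below is an illustration only: it uses a plain DFT rather than the DCT, a 64-sample square wave as the "edge", and a hypothetical drop_small helper that zeroes every coefficient weaker than a chosen fraction of the strongest one.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform, fine for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def drop_small(spec, keep_ratio):
    """Zero every bin whose magnitude is below keep_ratio times the peak."""
    peak = max(abs(c) for c in spec)
    return [c if abs(c) >= keep_ratio * peak else 0j for c in spec]


# A hard edge: 32 samples at +1 followed by 32 samples at -1.
square = [1.0] * 32 + [-1.0] * 32

# Discard everything weaker than 10% of the strongest bin, then rebuild.
lossy = idft(drop_small(dft(square), 0.1))

# The rebuilt wave rings around the edges and overshoots the original level.
overshoot = max(lossy) - 1.0
```

Printing lossy shows the flat tops replaced by ripples that are largest next to each edge, which is the one-dimensional counterpart of the wavy lines around edges in an over-compressed image.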
I've included some sample program code to illustrate audio programming here. It shows the basics of opening an OSS
audio device in full duplex with open and ioctl.
There's also an example use of a POSIX timer and some simple FFT code.
It tries to remove noise by parabolically
cutting frequency spectra below a certain threshold in real time. Bump
the cutoff threshold up a few orders of magnitude and you get
those "fading" artifacts.
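One plausible reading of "parabolically cutting" is a gain that falls off with the square of a bin's magnitude once it drops below the threshold. The sketch below assumes exactly that; the function name parabolic_gate and the gain formula are guesses for illustration, not the sample program's actual code.

```python
def parabolic_gate(spec, threshold):
    """Attenuate weak frequency bins with a quadratic (parabolic) gain curve.

    Bins at or above the threshold pass unchanged. A bin with magnitude m
    below the threshold is scaled by (m / threshold) ** 2, so the weakest
    bins, which are assumed to be mostly noise, are cut the hardest.
    """
    if threshold <= 0:
        return list(spec)  # nothing to cut
    out = []
    for c in spec:
        mag = abs(c)
        gain = 1.0 if mag >= threshold else (mag / threshold) ** 2
        out.append(c * gain)
    return out
```

The gate would sit between the forward and inverse FFT of each frame. With a threshold a few orders of magnitude too high, even the loudest bins land on the steep part of the curve, so the whole signal is squashed and decaying notes die away early, which fits the "fading" description.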
Some "code in process" that tries to remap frequencies (heterodyning) in
realtime is here, but it's not pretty.
Here are some examples of the heterodyne work (sorry, they are low volume):
Piano with remapping from 100 bins above
Piano with sqrt of 2 remapping
Square root remapping, flamenco guitar
Square root regular guitar
Square root horns and strings
Cheesy synth to the 3.1 power
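The page does not say exactly how these remappings are defined, so the following is only a guess at the general shape: each bin of a frame's positive-frequency spectrum is moved to a new index given by a mapping function. The name remap_bins and the three example mappings are hypothetical.

```python
def remap_bins(spec, mapping):
    """Move bin k to round(mapping(k)); bins landing on the same index are summed.

    spec is the positive-frequency half of one frame's spectrum. Bins that
    map outside the frame are dropped.
    """
    out = [0j] * len(spec)
    for k, c in enumerate(spec):
        dest = int(round(mapping(k)))
        if 0 <= dest < len(spec):
            out[dest] += c
    return out


# Hypothetical mappings in the spirit of the examples listed above.
def shift_up_100(k):
    return k + 100        # fixed offset in bins


def square_root(k):
    return k ** 0.5       # compresses the spectrum toward low bins


def power_3_1(k):
    return k ** 3.1       # stretches it, pushing most bins out of range
```

For example, remap_bins(frame, square_root) folds many high bins onto a few low ones, which is one way such a remapping could change the character of an instrument so drastically.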
All materials © 2002-2008 Jason Downer