Caveat: I can’t find my original research, so this is a re-creation using an old brain with faulty memory. I can remember phone numbers from 1972, but not what I did last Tuesday. What’s up with that?
Anyway, way back during the last millennium, I began using MIDI to create music. At first, I used a software sequencer and the Atari 520ST, with good-enough results, eventually adding a keyboard to the process (which made things oh-so-much easier), and finally moving on to the PC when the Atari gave up the ghost.
Toward the end of the 90s, I started playing around with math-based music, notably Paul Whalley’s QuasiFractal Composer and Lars Kindermann’s Music In The Numbers programs. I created some lovely pieces using both of them, “Carpool Mom With A Secret Life” and “Cats In High Places”, for instance. (Available at Older Works - Persongo)
All that started me thinking about the math of MIDI. MIDI data values, at least in the original version of MIDI, are limited to seven bits, i.e. the 0-127 range. And the pixel values of digital images are similarly constrained to 0-255 per RGB channel, so each R, G, or B value spans roughly double the MIDI range.
The math is simple – just divide by two and round up or down. And since a pixel is actually a set of three values – red, green, and blue (even in greyscale or black-and-white images) – each can be extracted from a pixel separately.
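The halving step above can be sketched in a few lines. This is a minimal illustration, not MIDImage's actual code; the function name and the sample pixel are my own:

```python
def pixel_to_midi(r, g, b):
    """Halve each 0-255 RGB channel value to fit the 0-127 MIDI range."""
    return tuple(channel // 2 for channel in (r, g, b))

# One pixel yields three candidate MIDI note numbers:
print(pixel_to_midi(224, 172, 105))  # -> (112, 86, 52)
```

Integer division rounds down, so the maximum channel value of 255 maps exactly to the top of the MIDI range, 127.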
So…would using pixel values to make MIDI notes result in something resembling music, or just sound like a big mess? And is there a relationship between a “pretty” picture and “pretty” music?
[Here’s where the lost research into what makes an image pleasing to the eye would go, if it hadn’t been lost. But it’s lost. So, it’s not here.]
There were, and still are, programs which convert the “frequency” of an image’s pixels to sonic frequencies. Interesting, but not usually what most would call “music”. For my purposes, I needed to constrain the MIDI values extracted from an image to musical scales like Major, Minor, Lydian, Locrian, Phrygian, Pentatonic, etc.
It was easier than I had expected. I just needed to create arrays of “acceptable” notes for a given scale and match them against the MIDI values the program had extracted. If a value matched, it was kept; if it didn’t, it was discarded or kept as a musical “rest”. Adjacent same-values could be set to be combined into longer notes or kept as separate notes.
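That filtering step might look something like this sketch. The pitch-class sets are the standard ones for the named scales, but the function names and the rest-as-`None` convention are my own assumptions, not MIDImage's internals:

```python
# Pitch classes (note number mod 12) belonging to each scale, rooted on C.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}
C_MINOR_PENTATONIC = {0, 3, 5, 7, 10}

def filter_to_scale(midi_values, scale):
    """Keep notes whose pitch class is in the scale; others become rests (None)."""
    return [n if n % 12 in scale else None for n in midi_values]

def merge_repeats(notes):
    """Combine adjacent identical notes into (note, duration-in-steps) pairs."""
    merged = []
    for n in notes:
        if merged and merged[-1][0] == n:
            merged[-1] = (n, merged[-1][1] + 1)
        else:
            merged.append((n, 1))
    return merged

notes = filter_to_scale([112, 86, 52, 52, 61], C_MAJOR)
print(notes)                 # [112, 86, 52, 52, None] -- 61 is C#, not in C major
print(merge_repeats(notes))  # [(112, 1), (86, 1), (52, 2), (None, 1)]
```

The same arrays could be transposed by adding an offset before the modulo, which is one way to support scales rooted on any note.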
Since MIDI information also contains note length and velocity (loudness) values, the same 0-255 pixel values, once converted, could be used to set those, too.
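A hedged sketch of that idea: one channel scaled into the 1-127 velocity range, another bucketed into a handful of note lengths. The tick values and the four-way bucketing are illustrative choices of mine, not MIDImage's actual mapping:

```python
def pixel_to_velocity(value):
    """Scale a 0-255 pixel value into the 1-127 MIDI velocity range."""
    return max(1, value // 2)

# Sixteenth, eighth, quarter, half note, assuming 480 MIDI ticks per quarter.
DURATIONS = [120, 240, 480, 960]

def pixel_to_duration(value):
    """Bucket a 0-255 pixel value into one of four note lengths."""
    return DURATIONS[value // 64]

print(pixel_to_velocity(172))  # -> 86
print(pixel_to_duration(172))  # 172 // 64 = 2 -> 480 ticks (a quarter note)
```

Clamping velocity to a minimum of 1 avoids accidental silent notes, since a Note On with velocity 0 is treated as a Note Off in MIDI.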
What I did find was that images with subtle changes in color and intensity were more “musical” than images with wider ranges in color or brightness, and photographs worked much better than illustrations or geometric designs. A photo of a person “sounded” better than a cubist painting.
Skin is especially good for music, since tonal changes are more likely to result in notes which don’t vary too widely or have abrupt changes.
The original version of the program was called “The Loxound Musical Pixelator”, but it was soon changed to “MIDImage”, a portmanteau of MIDI and Image, and has gone through three major versions since its inception in 1999 with additional features and enhancements over the years.
It has a number of users across the world and, I’m proud to say, has even seen use in university settings in their experimental music classes.
MIDImage was initially designed as a jumping off point, to be used to create a base for more traditional composition, but I soon found myself liking the raw output so much that I’ve left a lot of the created music as-is, with only minor editing for length or changing a note here or there.
If you want to go exploring, see The Door Into Summer and TDIS Music. Most of the music in both places was created with MIDImage, either alone or in combination with traditional composition methods. It’s also up on Spotify, iTunes, Amazon, etc.