The color of music: why not discover Python's ecosystem the right way?

It all began on a nice sunny summer day on #python-fr

Well, not that sunny, but a story should always begin on a cheerful note. A sizeable minority of the beginners who come to #python-fr (usually from PHP) pick a first project to discover Python that falls into one of the following categories:
  • recoding a network client/server without using twisted or tornado;
  • recoding an HTML parser without using an HTML parser (yes, with regexps).
When we tell them this is wrong, we are at best ignored, and in some exceptional cases flamed for stifling their creativity.

How do we tell them that reinventing the square wheel is not standing on the giants' shoulders?

Well, this is so disheartening that I asked myself: what would I do as a first project? And since I sux at deciphering music, I told myself: how hard can it be to make a sonogram?

If you know a little music, you know that a note is associated with a frequency (the fundamental). Every instrument, when you play a note, produces a strong stationary vibration at the fundamental frequency, plus additional stationary vibrations at k * the fundamental frequency but of lower amplitude, called harmonics. So I want to see the music in a colourful way, to be able to tell which notes are played by finding the lowest frequency with the biggest amplitude. No rocket science: the occidental tempered modes are a gift from the Greeks before they went broke. (We never paid them for this gift, but who cares, they don't have IP lawyers.)
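Since a note is just a frequency, the whole tempered scale follows from one reference pitch: each of the twelve semitones in an octave multiplies the frequency by the twelfth root of two. A small sketch of that relation (the helper names are mine, not from the post):

```python
# Equal temperament: each semitone step multiplies the frequency by
# 2 ** (1/12).  A4 = 440 Hz is the usual reference (the recording discussed
# later in this post actually uses 442 Hz).
A4 = 440.0

def note_freq(semitones_from_a4):
    """Frequency of the note n semitones above (negative: below) A4."""
    return A4 * 2 ** (semitones_from_a4 / 12)

def harmonics(fundamental, count=4):
    """The k * fundamental overtones mentioned above, for k = 1..count."""
    return [k * fundamental for k in range(1, count + 1)]
```

Twelve semitones up doubles the frequency (one octave), which is exactly why the harmonics at 2x, 4x, ... of the fundamental land back on the same note name.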

How hard is it to make a spectrogram in python?

Let me google it for you:
spectrogram music python, choose: the Stack Overflow link.
I chose audiolab instead of waves because I had read the author's article on Planet Python a few days earlier.

So you are redirected to a page that gives you the basis for a first piece of code and some explanations.

The Fast Fourier Transform became outdated this very year (after years of good service) as a way of transforming a time-dependent series into a frequency series. But the Fourier transform itself is still a must.

The truth is you have to compute a frequency spectrum, which is not a Fourier transform but closely related (I'll skip the theory of tempered distributions). As with a radar, you correlate the signal with itself over a time window to get a power representation (the autocorrelation). There is a nice trick with Fourier that avoids very costly and boring overlap integrals over time. As I have not opened a signal processing book in 20 years, I strongly advise you, if interested, to double-check what I am saying. I am pretty much a school failure, so I must have made a mistake somewhere.
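That Fourier trick can be checked numerically: by the Wiener–Khinchin theorem, the power spectrum is the Fourier transform of the autocorrelation, so a single FFT replaces the costly overlap integrals. A small numpy sketch (and, as said above, double-check the theory yourself):

```python
import numpy as np

# Wiener-Khinchin, discrete circular form: |FFT(x)|**2 equals the FFT of
# the circular autocorrelation of x.  Demonstrated on random data.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)

power = np.abs(np.fft.fft(x)) ** 2

# The same quantity computed the slow, direct way: correlate the signal
# with shifted copies of itself, then transform.
autocorr = np.array([np.dot(x, np.roll(x, -k)) for k in range(len(x))])
power_via_autocorr = np.fft.fft(autocorr)

# Both routes agree up to rounding; the imaginary part is numerical noise.
assert np.allclose(power, power_via_autocorr.real)
```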

I did not like the old code on macdev because it left me more than 20 lines of code to write. And I told myself, «it would be very surprising if there were no ready-made class for spectrograms in python». So I reread the Stack Overflow answer carefully (googling is fine, reading is a must). And I discovered that spectrograms are directly included in matplotlib.

Plus, my first result was crappy, so I googled: improve resolution spectrogram python. And there I found the basis for a fine piece of code, because the parameters were explained.
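Assuming matplotlib's built-in specgram, with NFFT setting the window length (hence the frequency resolution, Fs / NFFT) and noverlap the overlap between successive windows (hence the time resolution), a minimal example could look like this; the pure 440 Hz test tone is mine, not from the post:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# One second of a pure A (440 Hz) sampled at CD rate.
Fs = 44100
t = np.arange(0, 1.0, 1 / Fs)
signal = np.sin(2 * np.pi * 440 * t)

# Bigger NFFT -> finer frequency bins; bigger noverlap -> finer time steps.
spectrum, freqs, times, im = plt.specgram(signal, NFFT=4096, Fs=Fs,
                                          noverlap=2048)
plt.xlabel("time (s)")
plt.ylabel("frequency (Hz)")
plt.savefig("spectrogram.png")

peak = freqs[spectrum[:, 0].argmax()]  # strongest bin of the first window
```

With NFFT=4096 at 44100 Hz the frequency bins are about 10.8 Hz wide, so the peak lands within one bin of 440 Hz.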

I did not even have to code or scratch my head to get a working program. Did I learn something? Yes: always check that your problem does not already have a solution before coding head first. And code less.

The fun in coding is about adding your own salt

Well: spectrograms are fine, but for decoding music, reading raw frequencies sux. So here came my improvement, one that will not interest physicists: replacing the numerical scale with readable note names.

Step 1: copy-paste an HTML table of note names vs frequencies from the internet into a file.
Step 2: add it to the graph.
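The two steps above can be sketched like this; to keep the sketch self-contained, the note table is generated in code (equal temperament, A4 = 440 Hz) instead of copy-pasted from HTML, and a synthetic A major arpeggio stands in for the real audio file:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def note_table(semitone_range=24):
    """(name, frequency) pairs around A4 = 440 Hz in equal temperament."""
    return [(f"{NAMES[n % 12]}{4 + (n + 9) // 12}", 440.0 * 2 ** (n / 12))
            for n in range(-semitone_range, semitone_range + 1)]

# Step 1 stand-in: generate the table instead of parsing a copy-pasted file.
table = note_table()

# Synthetic arpeggio (A4, C#5, E5) instead of a real recording.
Fs = 44100
t = np.arange(0, 2.0, 1 / Fs)
chunk = len(t) // 3
played = [440.0, 440.0 * 2 ** (4 / 12), 440.0 * 2 ** (7 / 12)]
signal = np.concatenate([np.sin(2 * np.pi * f * t[:chunk]) for f in played])

spectrum, freqs, times, im = plt.specgram(signal, NFFT=4096, Fs=Fs,
                                          noverlap=2048)
plt.ylim(200, 1000)

# Step 2: replace the raw frequency ticks with note names.
visible = [(name, f) for name, f in table if 200 <= f <= 1000]
plt.yticks([f for _, f in visible], [name for name, _ in visible])
plt.xlabel("time (s)")
plt.savefig("notes.png")
```

The only real change from the stock spectrogram is the plt.yticks call: the y axis keeps its linear frequency positions, but the labels become note names.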

Here is my code:

And here is the result:
The Köln Concert, Part II, by Keith Jarrett, from seconds 10 to 20.

It is a piano instrumental: you clearly see the harmonics, the notes (no, I joke, the low frequencies are a mess to read), and the tempo. My note scale is not the same as Keith's because I loaded a scale based on A = 440 Hz where it should be A = 442 Hz. You can guess the arpeggio at offset = 1.8 seconds.

Well, we python devs are sometimes bashed because what we do is «too easy». Well, I find it pathetic when someone wets his pants pissing out thousands of lines of code when you can get the same result in 20 lines.

AC/DC, Back in Black, from 5 to 10 seconds.
A multi-instrumental song makes it a pain to read the notes. You may however guess an E3 bass line at a tempo of 60 bpm beginning at offset = 6 seconds.

Why didn't I make a logarithmic scale? Because spectrograms are not compatible with log scales in matplotlib. That's the end of it.

So what is the conclusion?

If you are beginning in python, rather than re-walking well-trodden paths, for which you'll invariably code a simple, neat and wrong solution, explore the bleeding edge and do some fun stuff; you'll discover that we need new ideas and that you can help.

For instance in this case:
  • you could make a matplotlib application where you can zoom into part of the whole song in another window;
  • you could try to code a colormap in matplotlib that supports a logarithmic scale;
  • you could try to add some filtering to erase the harmonics;
  • you could add a bpm detector and play with the bpm to improve the resolution, and show the beat in the xticks for better readability;
  • you could play with the window function and the color scale to improve the resolution;
  • you could make a 3D graph with the frequency bins ...

There is always room for improvements as long as you target the right problems.

Well, the computer industry is an amazing field to work in because there are always new problems to solve. Beginning in computing should not be about badly mimicking old boring problems (parsing HTML with regexps, network daemons, parallel computing) but about finding your place at the borders of an imaginary land. And remember, if like me you are beginning in python: always stand on the giants' shoulders; you'll see further than if you stay a dwarf in a crowd of giants.
