**Starting with this article** I am introducing a series on a topic that is original, at least for me and this blog: **firstly** because of the **programming language** I am using to show my idea, and **secondly** because of the subject, **melody modification**, or, to be more precise, **sound processing using Python**.

I have already written a thesis on this subject in my native language (Polish), so I decided to start this series to make the topic available to a worldwide audience.

## Theory

First of all, we need to introduce a **few key concepts** so that we can later turn them into a working application. We need to understand two domains: **sound** and **signal processing**.

### Sound

Every **sound** that we can hear **is really a wave** that propagates through the air. It is a **change in air pressure over time**.

Those changes in pressure (amplitudes) **can be measured and saved to a file on the computer** (a WAVE file, for instance). A WAVE file stores the amplitudes directly, as a sequence of samples.
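To make the idea of "amplitudes saved in a file" concrete, here is a minimal sketch that uses only the Python standard library. It writes a short 440 Hz tone to a temporary 16-bit mono WAVE file and reads the amplitudes back. The sampling rate, tone, and file name are assumptions chosen for the demo, not values from my thesis.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 8000   # samples per second (assumed for this demo)
FREQ_HZ = 440.0      # an A4 tone
N_SAMPLES = 800      # 0.1 s of audio

# Build the 16-bit integer amplitudes of a sine wave.
samples = [int(20000 * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE))
           for n in range(N_SAMPLES)]

path = os.path.join(tempfile.mkdtemp(), "tone.wav")

# Save the amplitudes as a mono, 16-bit WAVE file.
with wave.open(path, "wb") as wav_out:
    wav_out.setnchannels(1)
    wav_out.setsampwidth(2)          # 2 bytes = 16 bits per sample
    wav_out.setframerate(SAMPLE_RATE)
    wav_out.writeframes(struct.pack(f"<{N_SAMPLES}h", *samples))

# Read the file back: the raw bytes are just the amplitudes again.
with wave.open(path, "rb") as wav_in:
    frame_rate = wav_in.getframerate()
    raw = wav_in.readframes(wav_in.getnframes())

amplitudes = list(struct.unpack(f"<{len(raw) // 2}h", raw))
print(frame_rate, len(amplitudes), amplitudes[:5])
```

The list read from the file is identical to the list that was written, which is all that "a WAVE file contains the amplitudes" means here.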

### Signal processing

#### FT & FFT

OK, we have skimmed the topic of sound, but how do we **find out what frequency we are hearing at a particular moment in time?** The first step is a brilliant method called the **Fourier Transform (FT)** or, to be more precise, its **computer-suited algorithmic version called the Fast Fourier Transform (FFT)**. A naive, direct computation of the transform is **really inefficient** (on the order of N² operations for N samples, versus N log N for the FFT) and would take too much time on real sound data, so **that is where the FFT comes in**. The history of the **FT** and the **FFT** is a great idea for a different article, though.

Thanks to Wikipedia we can visualize the idea of the **FT** on charts.

The first figure shows the already familiar **amplitude over time** chart for a sample function.

The second figure presents **the result of the Fourier Transform operator** (mathematically speaking, the FT is a linear operator) applied to the sample input shown in the first chart. As a result we get the **frequencies that make up the input wave and their power in dB**.
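The difference between the naive transform and the FFT can be sketched in a few lines. The `naive_dft` helper below is a hypothetical name of mine: it evaluates the discrete Fourier transform directly with an N × N matrix, while `np.fft.fft` is NumPy's FFT. Both should give the same numbers; only the amount of work differs.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of the discrete Fourier transform."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x))
    k = n.reshape(-1, 1)
    # Each row of the matrix is one complex sinusoid e^{-2*pi*i*k*n/N}.
    return np.exp(-2j * np.pi * k * n / len(x)) @ x

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)   # any test signal will do

slow = naive_dft(signal)            # N * N multiplications
fast = np.fft.fft(signal)           # FFT: roughly N * log2(N) operations
results_match = np.allclose(slow, fast)
print(results_match)
```

For 256 samples the difference is invisible, but a few seconds of audio already contain tens of thousands of samples, and the N × N matrix quickly becomes impractical.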
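The same kind of chart can be reproduced numerically. This is a small sketch under my own assumptions (an 8000 Hz sampling rate and a made-up wave built from two sine tones), not the function from the Wikipedia figure. It computes the spectrum with NumPy and expresses each frequency's strength in dB relative to the strongest one.

```python
import numpy as np

sr = 8000                       # sampling rate in Hz (assumed)
t = np.arange(sr) / sr          # one second of time stamps

# Two sine waves: a strong 440 Hz tone and a ten times weaker 1000 Hz tone.
wave_sum = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.1 * np.sin(2 * np.pi * 1000 * t)

spectrum = np.abs(np.fft.rfft(wave_sum))            # magnitude per frequency bin
freqs = np.fft.rfftfreq(len(wave_sum), d=1 / sr)    # frequency of each bin in Hz

# dB relative to the strongest component, which therefore sits at 0 dB.
power_db = 20 * np.log10(np.maximum(spectrum, 1e-10) / spectrum.max())

# The two strongest bins are the two tones we put in.
strongest = freqs[np.argsort(power_db)[-2:]]
print(sorted(strongest))
```

The weaker tone has one tenth of the amplitude, so it shows up 20 dB below the stronger one.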

#### STFT

We have the frequencies of the entire sample, but **we still don’t know what frequency (note) the sound has at a certain point in time**. For that problem we have to use another brilliant method with a cool acronym: the **Short Time Fourier Transform (STFT)**. **The STFT allows us to calculate the frequency and its power in dB at a certain point in time.**

**A note**, for instance **C7**, **is a sound wave** with a frequency of **2093.00 Hz**. The table available at https://pages.mtu.edu/~suits/notefreqs.html lists the frequencies of most of the notes that a human ear can hear.

**The conclusion is that having the frequency and its power at each point in time allows us to reproduce the melody of the entire sample!**

The figure above depicts the **result of the STFT** applied to **a sound sample**: a spectral analysis (it is a different sample, not the same wave as in the two previous figures). The green line, labeled **F0**, is the melody computed by the pYIN algorithm. **By modifying the frequencies at certain points in time we can change the melody.** Then we just need to use the **iSTFT** (the **inverse operation**) to obtain the modified sound.
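The idea behind the STFT is simply "cut the signal into short overlapping windows and run an FFT on each one". Below is a minimal NumPy sketch of that idea. The window length, hop size, and the two-note test signal are assumptions of mine, and a real library such as librosa does this with more care (padding, centering).

```python
import numpy as np

sr = 8000                                   # sampling rate in Hz (assumed)
t = np.arange(sr // 2) / sr                 # half a second per note
# A tiny "melody": 440 Hz (A4) followed by 880 Hz (A5).
y = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t)])

n_fft, hop = 1024, 256                      # window length and step, in samples
window = np.hanning(n_fft)

# Slice the signal into overlapping windows and transform each one.
starts = range(0, len(y) - n_fft + 1, hop)
frames = np.array([y[s:s + n_fft] * window for s in starts])
stft = np.fft.rfft(frames, axis=1)          # shape: (n_frames, n_fft // 2 + 1)

freqs = np.fft.rfftfreq(n_fft, d=1 / sr)
times = (np.array(list(starts)) + n_fft / 2) / sr    # centre of each window in s
dominant = freqs[np.argmax(np.abs(stft), axis=1)]    # loudest frequency per frame

print(dominant[0], dominant[-1])
```

The first frames report a frequency close to 440 Hz and the last ones close to 880 Hz. The values are not exact because each bin is about 7.8 Hz wide (`sr / n_fft`), which is the usual trade-off between time and frequency resolution.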
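The values in that table follow a simple rule in twelve-tone equal temperament: every semitone multiplies the frequency by the twelfth root of two, with A4 tuned to 440 Hz. The helper below, `note_to_frequency`, is a hypothetical function of mine that sketches this rule for notes written with a single-digit octave, such as `C7` or `A#3`.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_to_frequency(note: str, a4_hz: float = 440.0) -> float:
    """Frequency of a note such as 'C7' or 'A#3' in equal temperament."""
    name, octave = note[:-1], int(note[-1])
    # MIDI-style note number: C-1 is 0, so A4 is 69.
    midi_number = NOTE_NAMES.index(name) + 12 * (octave + 1)
    # Each semitone away from A4 scales the frequency by 2 ** (1 / 12).
    return a4_hz * 2 ** ((midi_number - 69) / 12)

c7 = note_to_frequency("C7")
print(round(c7, 2))
```

For C7 this gives the 2093.00 Hz quoted above (after rounding to two decimal places).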
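The "transform, edit, transform back" round trip can be sketched with plain NumPy. Note that this is not the melody modification method itself: the `stft` and `istft` helpers are simplified stand-ins of mine, and the edit is deliberately trivial (silencing everything above 660 Hz in a signal made of a 440 Hz and an 880 Hz tone). It only shows the mechanics of changing the STFT and getting sound back.

```python
import numpy as np

def stft(y, n_fft=1024, hop=256):
    """Hann-windowed spectra of overlapping frames, one row per frame."""
    window = np.hanning(n_fft)
    starts = range(0, len(y) - n_fft + 1, hop)
    return np.array([np.fft.rfft(y[s:s + n_fft] * window) for s in starts])

def istft(frames, n_fft=1024, hop=256):
    """Overlap-add the inverse FFT of every frame back into one signal."""
    window = np.hanning(n_fft)
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, spectrum in enumerate(frames):
        s = i * hop
        out[s:s + n_fft] += np.fft.irfft(spectrum, n=n_fft) * window
        norm[s:s + n_fft] += window ** 2
    # Undo the two windowings; the guard avoids dividing by zero at the edges.
    return out / np.maximum(norm, 1e-8)

sr, n_fft = 8000, 1024                       # assumed demo parameters
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 880 * t)

frames = stft(y, n_fft=n_fft)
freqs = np.fft.rfftfreq(n_fft, d=1 / sr)
frames[:, freqs > 660] = 0                   # the "edit": drop the 880 Hz tone
rebuilt = istft(frames, n_fft=n_fft)

# Inspect the middle of the result, away from the poorly covered edges.
interior = rebuilt[n_fft:-n_fft]
spectrum = np.abs(np.fft.rfft(interior * np.hanning(len(interior))))
interior_freqs = np.fft.rfftfreq(len(interior), d=1 / sr)
peak_hz = interior_freqs[np.argmax(spectrum)]
print(round(peak_hz))
```

After the round trip only the 440 Hz tone is left. Changing a melody needs a smarter edit than zeroing bins, but the surrounding steps stay the same.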

#### Fundamental frequency

One last piece of theory: what is the **fundamental frequency** labeled **F0**?

According to Wikipedia, *“the fundamental is the musical pitch of a note that is perceived as the lowest partial present.”* In our terms, **pitch** can be mapped to **frequency**. Looking at *Fig. 3*, we can see the brightest area spanning from around *0.06 s* to *0.36 s*, the same area that the F0 line marks. **This is the lowest frequency of all the harmonics at a given point in time.** All the higher harmonics (F1, F2) can also be seen in the figure as brighter stripes (higher power in dB) at higher frequencies.
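The fundamental is not necessarily the loudest frequency, which is why a dedicated estimator is needed. The sketch below uses a plain autocorrelation, a much simpler technique than the pYIN algorithm behind the figure. The test tone is my own assumption: a 200 Hz fundamental with a louder harmonic at 400 Hz and a quieter one at 600 Hz.

```python
import numpy as np

sr = 8000                         # sampling rate in Hz (assumed)
t = np.arange(2000) / sr          # a quarter of a second
f0_true = 200.0

# Fundamental plus two harmonics; the 400 Hz harmonic is the loudest partial.
y = (0.5 * np.sin(2 * np.pi * f0_true * t)
     + 1.0 * np.sin(2 * np.pi * 2 * f0_true * t)
     + 0.3 * np.sin(2 * np.pi * 3 * f0_true * t))

# Loudest frequency in the spectrum: this is NOT the fundamental here.
spectrum = np.abs(np.fft.rfft(y))
loudest_hz = np.fft.rfftfreq(len(y), d=1 / sr)[np.argmax(spectrum)]

# Autocorrelation: the signal repeats after one period of the fundamental,
# so the strongest peak (outside lag 0) sits at that period.
ac = np.correlate(y, y, mode="full")[len(y) - 1:]
min_lag = sr // 1000              # ignore pitches above 1000 Hz
max_lag = sr // 50                # ignore pitches below 50 Hz
best_lag = min_lag + int(np.argmax(ac[min_lag:max_lag]))
f0_estimate = sr / best_lag

print(loudest_hz, f0_estimate)
```

The spectrum peaks at 400 Hz, yet the whole waveform repeats every 40 samples, which corresponds to the 200 Hz fundamental.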

### Implementation

That is a lot of theory behind this concept. Luckily, there is a wonderful Python library that does the math for us, so we can focus on our idea instead of implementing everything by hand (that would be fun too, though). **librosa** is a rich library for music and audio analysis. We just need to install Python and then install librosa. I am using JetBrains’ PyCharm, which makes this quite easy, but you are free to choose your favorite IDE.

#### PyCharm

Once you have **PyCharm** installed, go to **File -> Settings**.

Then choose **Python Interpreter** and click the **‘plus’** sign. Type in ‘librosa’ and choose the latest version (0.8.0 at the time of writing). Click **Install Package**, then **OK**, wait for the installation to finish and you are good to go.

Two other libraries **should be** installed to show a plot: ‘matplotlib’ and ‘numpy’. Repeat the steps above and install the newest version of each.

#### Testing librosa

Create a new .py file and write the code below.

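    import librosa
    from librosa import display
    import matplotlib.pyplot as plt
    import numpy as np

    fig, ax = plt.subplots()

    # Load librosa's trumpet example: y holds the samples, sr the sampling rate.
    y, sr = librosa.load(librosa.ex('trumpet'))

    # The STFT returns complex numbers; their absolute values are the magnitudes.
    stft_absolute_values = np.abs(librosa.stft(y))

    # Convert the magnitudes to dB relative to the loudest value and draw them.
    img = display.specshow(librosa.amplitude_to_db(stft_absolute_values, ref=np.max),
                           y_axis='log', x_axis='time', ax=ax)
    ax.set_title('Power spectrogram')
    fig.colorbar(img, ax=ax, format="%+2.0f dB")
    plt.show()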

Run the code and voilà! We have quickly transformed sound waves into frequencies and created a really neat diagram using Python and librosa. Good job.

This is where I stop. In the **next part** I am going to show you an actual implementation of the **melody modification code** and the conversion of samples using the STFT and back again, and explain thoroughly how to read, understand and modify the results of the Fourier Transform. Hope to see you there!

Should you have any questions or remarks, feel free to reach out to me!
