Speech/Music classification of audio files using machine learning techniques.


This dataset was downloaded from the Marsyas website. It is the well-known GTZAN music/speech dataset. A direct download link is this.

From Marsyas' website:

A dataset which was collected for the purposes of music/speech discrimination. The dataset consists of 120 tracks, each 30 seconds long. Each class (music/speech) has 60 examples. The tracks are all 22050Hz Mono 16-bit audio files in .wav format.
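
As a quick sanity check after downloading, the format described above (22050 Hz, mono, 16-bit, about 30 seconds) can be verified per file. The following is a minimal sketch using only Python's standard `wave` module. The helper names `describe_wav` and `matches_description` and the half-second duration tolerance are assumptions made for illustration; they are not part of the dataset or of Marsyas.

```python
import sys
import wave

# Values taken from the dataset description quoted above.
EXPECTED_RATE = 22050      # Hz
EXPECTED_CHANNELS = 1      # mono
EXPECTED_BITS = 16         # bits per sample
EXPECTED_DURATION_S = 30.0


def describe_wav(path):
    """Return basic format information for a PCM .wav file (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "bits": wav.getsampwidth() * 8,  # sample width is reported in bytes
            "duration_s": wav.getnframes() / rate,
        }


def matches_description(info, tolerance_s=0.5):
    """Check a describe_wav() result against the quoted description.

    tolerance_s is an assumed slack on the track length, not a documented value.
    """
    return (
        info["sample_rate"] == EXPECTED_RATE
        and info["channels"] == EXPECTED_CHANNELS
        and info["bits"] == EXPECTED_BITS
        and abs(info["duration_s"] - EXPECTED_DURATION_S) <= tolerance_s
    )


if __name__ == "__main__":
    # Pass one or more .wav paths on the command line.
    for wav_path in sys.argv[1:]:
        info = describe_wav(wav_path)
        status = "ok" if matches_description(info) else "unexpected format"
        print(f"{wav_path}: {info} -> {status}")
```

Running the script with the paths of the extracted `.wav` files prints one line per file, which makes it easy to spot a track that was truncated or resampled before any features are computed.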