Cleanup and fixes, Add new README, Add LICENSE

7 years ago · c0343e12a7
8 changed files with 90 additions and 47 deletions
--- a/LICENSE.md
+++ b/LICENSE.md
@ -0,0 +1,7 @@
 "THE BEER-WARE LICENSE"
 Copyright (c) 2018 Apostolof, charaldp, tsakonag
 As long as you retain this notice you can do whatever you want with this stuff.
 If we meet some day, and you think this stuff is worth it, you can buy us a
 beer in return.
--- a/classifier/README.md
+++ b/classifier/README.md
@ -0,0 +1,77 @@
 # Technology of the sound and image, AUTH
 > Speech/Music classification of audio files using machine learning techniques.
 ## Clone
 Clone this repo to your local machine using git:
 ```bash
 git clone https://github.com/laserscout/THE-Assignment.git
 ```
 ## Dependencies
 To avoid Python 2/3 incompatibilities and other code breakage, we recommend running the scripts in this repository inside a virtual environment created with Python's `venv` module (or any other tool you prefer), as described below:
 ```bash
 cd THE-Assignment/classifier/
 python3 -m venv myenv
 source myenv/bin/activate
 pip install -U scikit-learn
 pip install --upgrade pandas
 pip install numpy
 pip install seaborn
 pip install scipy
 pip install essentia
 ```
 ## Feature extraction
 The file `feature_extraction/feature_extractor` is a python module that uses the open-source library [Essentia](http://essentia.upf.edu/documentation/index.html) to extract audio features from an audio file in the path specified in the first parameter and save the features' values to a json file in the path specified in the second parameter.
 The module can be imported or executed as a script using the following command:
 ```bash
 python feature_extractor.py <audio_file_path> <extracted_features_file_path> <audio_file_sample_rate>
 ```
 A python script is also provided for batch feature extraction. It can be executed using the following command:
 ```bash
 python batch_feature_extractor.py <audio_files_directory> <feature_files_directory> <audio_files_sample_rate>
 ```
 ## Data preprocessing
 The file `preprocessing/data_preprocessing` is a python module that uses the open-source library [scikit-learn](https://scikit-learn.org/stable/) to apply several data preprocessing techniques to the previously extracted data.
 The module can be imported or executed as a script using the following command:
 ```bash
 python data_preprocessing.py <music_data_directory> <speech_data_directory>
 ```
 ## Model training
 The file `training/model_training` is a python module that uses the open-source library [scikit-learn](https://scikit-learn.org/stable/) to train several different models and one ensemble (Random Forest).
 The module can be imported or executed as a script using the following command:
 ```bash
 python model_training.py <dataset_pickle> <model_selection>
 ```
 Where:
 - *dataset_pickle* is the pandas pickle (.pkl) file of the dataset dataframe saved on the disk. This file is generated by the data_preprocessing module.
 - *model_selection* is a string denoting which model the script should use. It can be one of svm (SVM model), dtree (Decision tree), nn (Multi-layer Perceptron), bayes (Naive Bayes), rndForest (Random Forest).
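The two parameters above could be validated before any training starts. The following is a minimal sketch using the standard library's `argparse`, not the way `model_training.py` actually reads its arguments (the script uses `sys.argv` directly). The `MODEL_CHOICES` mapping and the `parse_args` helper are hypothetical names introduced for illustration; only the five model identifiers come from the list above.

```python
import argparse

# Model identifiers as listed in the README; this dict itself is an
# illustrative assumption, not part of the repository.
MODEL_CHOICES = {
    'svm': 'SVM model',
    'dtree': 'Decision tree',
    'nn': 'Multi-layer Perceptron',
    'bayes': 'Naive Bayes',
    'rndForest': 'Random Forest',
}


def parse_args(argv):
    """Parse <dataset_pickle> <model_selection>, rejecting unknown model names."""
    parser = argparse.ArgumentParser(description='Train a speech/music classifier')
    parser.add_argument('dataset_pickle',
                        help='path to the pandas pickle (.pkl) file')
    parser.add_argument('model_selection', choices=sorted(MODEL_CHOICES),
                        help='which model to train')
    return parser.parse_args(argv)


# Example invocation with made-up arguments
args = parse_args(['dataset.pkl', 'rndForest'])
print(args.dataset_pickle, '->', MODEL_CHOICES[args.model_selection])
```

With `choices` set, an unsupported name such as `knn` makes `argparse` exit with a usage message instead of failing later inside the training code.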
 ## Pipelines (putting it all together)
 An example of how to use all the provided modules and functions can be found in the file `pipeline.py`.
 ## Support
 Reach out to us:
 - [apostolof's email](mailto:apotwohd@gmail.com "apotwohd@gmail.com")
 - [christina284's email](mailto:christtk@auth.gr "christtk@auth.gr")
 - [laserscout's email](mailto:frankgou@auth.gr "frankgou@auth.gr")
 ## License
 [![Beerware License](https://img.shields.io/badge/license-beerware%20%F0%9F%8D%BA-blue.svg)](https://github.com/laserscout/THE-Assignment/blob/master/LICENSE.md)
--- a/classifier/feature_extraction/README.md
+++ b/classifier/feature_extraction/README.md
@ -1,20 +0,0 @@
 # Feature extraction
 The file `feature_extractor` is a python module that uses the open-source library [Essentia](http://essentia.upf.edu/documentation/index.html) to extract audio features from an audio file in the path specified in the first parameter and save the features' values to a json file in the path specified in the second parameter.
 The module can be imported or executed as a script using one of the following commands
 `python feature_extractor.py <audio_file_path> <extracted_features_file_path> <audio_file_sample_rate>`
 or
 `python3 feature_extractor.py <audio_file_path> <extracted_features_file_path> <audio_file_sample_rate>`
 A python script is also provided for a batch feature extraction. The script can be executed using one of the following commands:
 `python batch_feature_extractor.py <audio_files_directory> <feature_files_directory> <audio_files_sample_rate>`
 or
 `python3 batch_feature_extractor.py <audio_files_directory> <feature_files_directory> <audio_files_sample_rate>`
 **Dependencies:**
 - essentia
 - numpy
 - scipy
 All dependencies are available both for python2 and python3 versions and can all be installed using the commands `pip install <package_name>` or `pip3 install <package_name>` for python2 and python3 respectively.
--- a/classifier/preprocessing/README.md
+++ b/classifier/preprocessing/README.md
@ -1,14 +0,0 @@
 # Data preprocessing
 The file `data_preprocessing` is a python module that uses the open-source library [scikit-learn](https://scikit-learn.org/stable/) to perform several data preprocessing techniques to the data previously extracted.
 The module can be imported or executed as a script using one of the following commands
 `python data_preprocessing.py <music_data_directory> <speech_data_directory>`
 or
 `python3 data_preprocessing.py <music_data_directory> <speech_data_directory>`
 **Dependencies:**
 - scikit-learn
 - numpy
 All dependencies are available both for python2 and python3 versions and can all be installed using the commands `pip install <package_name>` or `pip3 install <package_name>` for python2 and python3 respectively.
--- a/classifier/training/model_training.py
+++ b/classifier/training/model_training.py
@ -1,4 +1,5 @@
 import numpy as np
 import pandas as pd
 class bcolors:
 	BLUE = '\033[94m'
@ -119,9 +120,7 @@ print(bcolors.BLUE + 'model_training loaded' + bcolors.ENDC)
 # Enables executing the module as a standalone script
 if __name__ == "__main__":
 	import sys
-	dataset = np.load(sys.argv[1] + 'dataset.npy')
+	dataset = pd.read_pickle(sys.argv[1])
-	target = np.load(sys.argv[1] + 'target.npy')
+	target = dataset.pop('target')
 	featureKeys = np.load(sys.argv[1] + 'featureKeys.npy')
-	# simpleTrain(dataset, target)
+	kFCrossValid(dataset.values, target, sys.argv[2])
 	kFCrossValid(dataset, target, 'svm')
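The new loading path in this hunk can be illustrated in isolation. The sketch below assumes, as the added lines suggest, that the pickle holds a pandas DataFrame whose `target` column carries the class labels. The `load_dataset` helper and the toy values are hypothetical. It also shows that the feature names remain available as the DataFrame's column labels after the `pop`, which may make a separate `featureKeys.npy` file unnecessary (an assumption, not something the commit states).

```python
import os
import tempfile

import pandas as pd


def load_dataset(pickle_path):
    """Load a pickled DataFrame and split it into features and labels."""
    dataset = pd.read_pickle(pickle_path)
    # pop() removes the column from the DataFrame in place and returns it
    target = dataset.pop('target')
    return dataset, target


# Toy dataset standing in for the output of data_preprocessing (made-up values)
toy = pd.DataFrame({
    'feature_a': [0.1, 0.4, 0.9],
    'feature_b': [1.0, 0.5, 0.2],
    'target': [0, 1, 1],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'dataset.pkl')
    toy.to_pickle(path)
    features, target = load_dataset(path)

# The remaining column labels are the feature names
feature_keys = list(features.columns)
print(feature_keys, features.values.shape)
```

Passing `features.values` rather than the DataFrame hands the training code a plain NumPy array, which matches the `dataset.values` call added in the hunk.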
--- a/report/3.features_and_preprocessing.tex
+++ b/report/3.features_and_preprocessing.tex
@ -63,7 +63,7 @@ To spectral complexity ή αλλιώς η φασματική πολυπλοκό
 Speech signals have a characteristic peak in energy modulation around 4 Hz, the syllabic rate. To model this property, the following procedure is used\footnote{\href{https://www.irit.fr/recherches/SAMOVA/FeaturesExtraction.html\#me4hz}{IRIT/SAMoVA - 4 Hz modulation energy}, last accessed: \today}: the signal is segmented into frames, the Mel Frequency Spectrum coefficients are extracted, and the energy is computed in 40 perceptual channels. This energy is then filtered with a band-pass filter centered at 4 Hz. The energy is summed over all channels and normalized by the mean energy of each frame. The modulation is given by the normalized sum of the filtered energy. Speech contains more modulation than music.
 \vspace{1em}
-Figures [\ref{featureTable:table1}] and [\ref{featureTable:table2}] show, by way of example, some of the computed features and how effective they are at separating the classes.
+Figure [\ref{featureTable:table1}] shows, by way of example, some of the computed features and how effective they are at separating the classes.
 \begin{figure}[h]
 \centering
@ -71,12 +71,6 @@ To spectral complexity ή αλλιώς η φασματική πολυπλοκό
 \caption{Plots of pairwise combinations of four features}
 \label{featureTable:table1}
 \end{figure}
 \begin{figure}[ht]
 \centering
 \includegraphics[width=0.7\textwidth]{res/figure_2.png}
 \caption{Plots of pairwise combinations of four features}
 \label{featureTable:table2}
 \end{figure}
 \section{Data preprocessing}
@ -109,7 +103,7 @@ To spectral complexity ή αλλιώς η φασματική πολυπλοκό
 In the end, it was decided to use only scaling, because normalization had no effect and the gain in classification speed from the feature-reduction methods above was not large enough, compared with the loss in accuracy, to justify using them in the implementation. They did nevertheless help identify the dominant features.
-Next, in an attempt to further understand and rank the features by their discriminative power, all of them were isolated and their accuracy was checked one by one. The results showed that, ultimately, no single feature is capable of giving a satisfactory accuracy on its own. Even if we take the best feature in terms of accuracy and try it in combination with the next best ones, accuracy increases slightly but not enough. Finally, if the procedure is repeated once more, there is again a small increase in accuracy, which nevertheless remains quite far from the accuracy achieved using all the features.
+Next, in an attempt to further understand and rank the features by their discriminative power, all of them were isolated and checked for accuracy one by one. The results showed that, ultimately, no single feature is capable of giving a satisfactory accuracy on its own. Even if we take the best feature in terms of accuracy and try it in combination with the next best ones, accuracy increases slightly but not enough. Finally, if the procedure is repeated once more, there is again a small increase in accuracy, which nevertheless remains quite far from the accuracy achieved using all the features.
 \begin{table}[H]
 	\centering
--- a/report/main.pdf
+++ b/report/main.pdf
--- a/report/res/figure_2.png
+++ b/report/res/figure_2.png