# Data preprocessing

The file `data_preprocessing.py` is a Python module that uses the open-source library [scikit-learn](https://scikit-learn.org/stable/) to apply several preprocessing techniques to the previously extracted data.
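
As an illustration of the kind of step scikit-learn provides, here is a minimal sketch that standardizes a feature matrix with `StandardScaler`. The specific techniques used by the module are not listed here, so the choice of standardization, the random placeholder features, and the names `music_features`, `speech_features`, `X` and `y` are all assumptions made for the example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical placeholder data: rows are audio clips, columns are extracted features.
rng = np.random.RandomState(0)
music_features = rng.normal(loc=5.0, scale=2.0, size=(20, 4))
speech_features = rng.normal(loc=-1.0, scale=0.5, size=(20, 4))

# Stack both classes and build labels (assumed encoding: 0 = music, 1 = speech).
X = np.vstack([music_features, speech_features])
y = np.concatenate([np.zeros(len(music_features)), np.ones(len(speech_features))])

# Standardize each feature column to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

Keeping the fitted `scaler` around makes it possible to apply the same transformation to new data later with `scaler.transform`.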

The module can be imported, or executed as a script with one of the following commands:
`python data_preprocessing.py <music_data_directory> <speech_data_directory>`
or
`python3 data_preprocessing.py <music_data_directory> <speech_data_directory>`
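
A file that works both as an importable module and as a script usually guards its entry point with `if __name__ == "__main__"`. The sketch below shows that pattern with the two directory arguments from the commands above. The function names `parse_directories` and `main` are hypothetical and may not match the actual module.

```python
import sys

USAGE = "usage: python data_preprocessing.py <music_data_directory> <speech_data_directory>"


def parse_directories(argv):
    """Return (music_dir, speech_dir) from an argv-style list, or raise ValueError."""
    if len(argv) != 3:
        raise ValueError(USAGE)
    return argv[1], argv[2]


def main(argv):
    music_dir, speech_dir = parse_directories(argv)
    # Placeholder: the real module would load and preprocess the data here.
    print("music data: " + music_dir)
    print("speech data: " + speech_dir)


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    try:
        main(sys.argv)
    except ValueError as err:
        sys.stderr.write(str(err) + "\n")
```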

**Dependencies:**
- scikit-learn
- numpy

All dependencies are available for both Python 2 and Python 3 and can be installed with `pip install <package_name>` or `pip3 install <package_name>`, respectively.
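
To confirm that the dependencies are installed for the interpreter in use, a short check like the following can help. It is only a sketch: the helper name `check_dependencies` is hypothetical. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib


def check_dependencies(names=("sklearn", "numpy")):
    """Map each module name to True if it can be imported, False otherwise."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


print(check_dependencies())
```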