Exercise 2 for the course "Parallel and Distributed Systems" of the THMMY department at AUTH.

The project includes six versions of a kNN algorithm implementation:
Serial - space optimized
Serial - time optimized
MPI parallel - blocking communications - space optimized
MPI parallel - blocking communications - time optimized
MPI parallel - non-blocking communications - space optimized
MPI parallel - non-blocking communications - time optimized
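
For reference, the result that a kNN search produces can be sketched in a few lines of Python. This is a minimal brute-force illustration and is not taken from the project sources. The function name knn_brute_force, the use of squared Euclidean distance, 0-based indices, and the exclusion of each point from its own neighbour list are all assumptions made for the example.

```python
import heapq


def knn_brute_force(points, k):
    """Return, for each point, the indices of its k nearest other points.

    Assumptions (not from the project): squared Euclidean distance,
    0-based indices, and a point is never its own neighbour.
    """
    result = []
    for i, p in enumerate(points):
        # Distance from point i to every other point j.
        dists = []
        for j, q in enumerate(points):
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(p, q))
            dists.append((d, j))
        # Keep the k smallest distances, ordered from nearest to farthest.
        nearest = heapq.nsmallest(k, dists)
        result.append([j for _, j in nearest])
    return result
```

Each output row lists neighbour IDs sorted by increasing distance, which is the kind of ordering an IDX reference file would be compared against.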
The project folder also includes some test files and execution results (in the stats folder).
The testFiles folder contains a dataset of 60000 points with 30 dimensions each,
as well as three IDX files extracted from Matlab that store the correctly sorted
IDs. These files follow the naming convention numberOfPoints_k, where
numberOfPoints and k are the number of points and the value of k that were used
when running the Matlab script.
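
The naming convention above makes it possible to recover the run parameters from a file name. The following Python sketch shows one way to do that. The helper name parse_idx_name is hypothetical, and so is the example file name 1000_10.csv used in the usage note. Stripping a file extension is also an assumption, since the convention does not say whether the files have one.

```python
import os


def parse_idx_name(path):
    """Split an IDX file name of the form numberOfPoints_k into two ints.

    Any directory part and any file extension are ignored (assumption).
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    num_points, k = stem.split("_")
    return int(num_points), int(k)
```

For example, parse_idx_name("testFiles/1000_10.csv") would return (1000, 10), which are the numPoints and k values to pass when running the program against that file.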
To run any version, first run make. Then copy testFiles/data.bin and one of the
IDX test files into the folder that contains the built executable. Finally, run with:
mpiexec -np numTasks ./prog.out numPoints numDimensions k data.bin idxFileName
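
When running many configurations, the command line above can be assembled programmatically. The sketch below only builds the argument list in the documented order and does not launch anything. The function name build_command is hypothetical, and the values in the usage note are illustrative rather than measured configurations.

```python
def build_command(num_tasks, num_points, num_dimensions, k,
                  idx_file, data_file="data.bin"):
    """Build the mpiexec argument list in the order given in this README."""
    return [
        "mpiexec", "-np", str(num_tasks),
        "./prog.out",
        str(num_points), str(num_dimensions), str(k),
        data_file, idx_file,
    ]
```

A call such as build_command(4, 1000, 30, 10, "1000_10.csv") yields a list that can be passed to subprocess.run from the folder containing prog.out.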
To extract a new IDX file from Matlab, run knn on the dataset and export the
IDX variable to an ODS/Excel spreadsheet. Then open the generated file and save it as CSV.
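
A CSV file produced this way can be checked against computed results with a short script. The Python sketch below assumes one row per point with comma-separated neighbour IDs, which this README does not confirm. The names load_idx_rows and count_mismatches are hypothetical. Matlab indices are 1-based, so the index_base parameter converts them to 0-based values; set it to 0 if no conversion is wanted.

```python
import csv


def load_idx_rows(lines, index_base=1):
    """Parse CSV lines into lists of 0-based neighbour IDs.

    `lines` is any iterable of strings, such as an open text file.
    Empty rows and empty cells are skipped.
    """
    rows = []
    for record in csv.reader(lines):
        cells = [c for c in record if c.strip()]
        if not cells:
            continue
        # int(float(...)) also accepts cells saved as "3.0".
        rows.append([int(float(c)) - index_base for c in cells])
    return rows


def count_mismatches(expected, actual):
    """Count the rows whose neighbour lists differ."""
    if len(expected) != len(actual):
        raise ValueError("row counts differ")
    return sum(1 for e, a in zip(expected, actual) if e != a)
```

Passing an open file, for example load_idx_rows(open(path, newline="")), returns the reference rows, and count_mismatches reports how many points have a neighbour list that differs from the reference.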