Exercise 2 for the course "Parallel and Distributed Systems" of the THMMY department at AUTH.
Latest commit: f3ef16d3eb (Squash commit) by Apostolos Fanakis, 6 years ago.

The project includes six versions of a k-nearest-neighbors (kNN) implementation:

- Serial, space optimized
- Serial, time optimized
- MPI parallel, blocking communications, space optimized
- MPI parallel, blocking communications, time optimized
- MPI parallel, non-blocking communications, space optimized
- MPI parallel, non-blocking communications, time optimized
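
For orientation, the following is a minimal brute-force kNN sketch in Python. It is not the code of any of the six versions; the function name `knn_brute_force` is hypothetical, and excluding each point from its own neighbor list is an assumption that may not match the reference results.

```python
import math

def knn_brute_force(points, k):
    """Return, for each point, the indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Euclidean distance from point i to every other point j.
        # Assumption: a point is not counted as its own neighbor.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # ascending by distance, ties broken by index
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
result = knn_brute_force(points, 2)
print(result)
```

This version recomputes every pairwise distance and keeps a full distance list per point, so it only illustrates what the output looks like, not how the space or time optimized variants organize their work.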

The project folder also includes test files (testFiles folder) and execution results (stats folder).

The testFiles folder contains a dataset of 60000 points with 30 dimensions each, as well as three IDX files extracted from MATLAB that store the correctly sorted IDs. The IDX files follow the naming convention numberOfPoints_k, where numberOfPoints and k are the number of points and the value of k used when running the MATLAB script.
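
As a small illustration of the numberOfPoints_k convention, the sketch below recovers both values from a file name. The helper `parse_idx_name` and the file name `1000_5.csv` are hypothetical; the actual IDX files may use different values or a different extension.

```python
import os

def parse_idx_name(path):
    """Split a 'numberOfPoints_k' file name into (number_of_points, k)."""
    # Drop any directory part and any extension before splitting.
    stem = os.path.splitext(os.path.basename(path))[0]
    points_text, k_text = stem.split("_")
    return int(points_text), int(k_text)

# Hypothetical file name, used only to show the convention.
num_points, k = parse_idx_name("testFiles/1000_5.csv")
print(num_points, k)
```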

To run any version, first run `make`. Then copy `testFiles/data.bin` and one of the IDX test files into that version's folder. Finally, run it with: `mpiexec -np numTasks ./prog.out numPoints numDimensions k data.bin idxFileName`
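
The argument order above can be sketched as a small Python helper that assembles the command line. The helper `build_command`, the task count of 4, k = 10, and the file name `60000_10.csv` are all illustrative assumptions; only the argument order comes from the run instructions.

```python
import shlex

def build_command(num_tasks, num_points, num_dimensions, k,
                  idx_file, data_file="data.bin"):
    """Assemble the mpiexec invocation in the documented argument order."""
    return [
        "mpiexec", "-np", str(num_tasks), "./prog.out",
        str(num_points), str(num_dimensions), str(k),
        data_file, idx_file,
    ]

# Hypothetical values: 4 MPI tasks, k = 10, and an assumed IDX file name.
cmd = build_command(4, 60000, 30, 10, "60000_10.csv")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the version's folder once `make` has produced `prog.out`.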

To generate a new IDX file from MATLAB, run knn on the dataset and export the IDX variable to an ODS/Excel spreadsheet. Then open the generated file and save it as CSV.
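
Once the CSV exists, it can be compared against computed neighbor IDs. The sketch below assumes one row per point with k comma-separated IDs, which is a guess about the exported layout; the helpers `load_idx_csv` and `count_mismatched_rows` are hypothetical. Note that MATLAB indices are 1-based, so an offset may be needed when comparing against 0-based IDs.

```python
import csv
import io

def load_idx_csv(stream):
    """Read rows of neighbor IDs; assumes one row per point."""
    rows = []
    for row in csv.reader(stream):
        if not row:
            continue  # skip blank lines
        # Spreadsheets may write integers as "3.0", so parse via float.
        rows.append([int(float(cell)) for cell in row if cell.strip()])
    return rows

def count_mismatched_rows(expected, actual):
    """Count points whose neighbor lists differ from the reference."""
    return sum(1 for e, a in zip(expected, actual) if e != a)

# In-memory stand-in for a saved CSV file (made-up sample values).
sample = io.StringIO("2,3\n1,3.0\n\n1,2\n")
expected = load_idx_csv(sample)
computed = [[2, 3], [1, 3], [2, 1]]
mismatches = count_mismatched_rows(expected, computed)
print(mismatches)
```

To read a real file, replace the `io.StringIO` stand-in with `open(path, newline="")`.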