From cf015ef415758f5209cd3c6620af67ece5fa62b7 Mon Sep 17 00:00:00 2001
From: anapt
Date: Sun, 28 Jan 2018 20:29:38 +0200
Subject: [PATCH 1/2] deleted

---
 mean_shift_cuda/s4_loceye_time.txt | 137 -------------------------
 mean_shift_serial/s4_serial_loceye.txt | 116 ---------------------
 2 files changed, 253 deletions(-)
 delete mode 100644 mean_shift_cuda/s4_loceye_time.txt
 delete mode 100644 mean_shift_serial/s4_serial_loceye.txt

diff --git a/mean_shift_cuda/s4_loceye_time.txt b/mean_shift_cuda/s4_loceye_time.txt
deleted file mode 100644
index 9127610..0000000
--- a/mean_shift_cuda/s4_loceye_time.txt
+++ /dev/null
@@ -1,137 +0,0 @@
-Device chosen is "GeForce GTX 1070"
-Device has 15 multi processors and compute capability 6.1
-Max threads per block supported are 1024
-
-Reading dataset and labels...
-Done.
-
-Device memory allocation wall clock time = 0.092249
-
-calculate_kernel_matrix_kernel called with:
-dimBlock.x = 32, dimBlock.y = 32
-dimGrid.x = 157, dimGrid.y = 157
-
-calculate_denominator called with:
-dimBlock.x = 1024, dimBlock.y = 1
-dimGrid.x = 5, dimGrid.y = 1
-
-shift_points_kernel called with:
-dimBlock.x = 512, dimBlock.y = 2
-dimGrid.x = 10, dimGrid.y = 1
-
-Recursion n. 0, error 927692.420199
-Recursion n. 1, error 726832.071041
-Recursion n. 2, error 581943.008045
-Recursion n. 3, error 477910.173261
-Recursion n. 4, error 396205.409103
-Recursion n. 5, error 335504.131558
-Recursion n. 6, error 293282.465763
-Recursion n. 7, error 255931.074369
-Recursion n. 8, error 217176.502908
-Recursion n. 9, error 184225.597806
-Recursion n. 10, error 156900.657670
-Recursion n. 11, error 139244.876747
-Recursion n. 12, error 123863.788594
-Recursion n. 13, error 110606.661038
-Recursion n. 14, error 97241.407806
-Recursion n. 15, error 85097.097975
-Recursion n. 16, error 72834.204110
-Recursion n. 17, error 61189.351790
-Recursion n. 18, error 57114.776420
-Recursion n. 19, error 52113.903356
-Recursion n. 20, error 46683.503554
-Recursion n. 21, error 45627.257398
-Recursion n. 22, error 45462.962391
-Recursion n. 23, error 43617.801926
-Recursion n. 24, error 40957.621436
-Recursion n. 25, error 39169.454275
-Recursion n. 26, error 36642.554737
-Recursion n. 27, error 33234.170852
-Recursion n. 28, error 31251.037548
-Recursion n. 29, error 30550.469179
-Recursion n. 30, error 30200.632861
-Recursion n. 31, error 30105.126757
-Recursion n. 32, error 29497.004654
-Recursion n. 33, error 26733.326716
-Recursion n. 34, error 21718.883294
-Recursion n. 35, error 16688.390032
-Recursion n. 36, error 13392.435100
-Recursion n. 37, error 12081.463254
-Recursion n. 38, error 12013.260151
-Recursion n. 39, error 12125.640867
-Recursion n. 40, error 11979.901812
-Recursion n. 41, error 11861.625809
-Recursion n. 42, error 12699.745511
-Recursion n. 43, error 15836.123874
-Recursion n. 44, error 21830.150525
-Recursion n. 45, error 25973.448245
-Recursion n. 46, error 23114.136003
-Recursion n. 47, error 19656.849824
-Recursion n. 48, error 16376.259816
-Recursion n. 49, error 12821.108251
-Recursion n. 50, error 10245.687625
-Recursion n. 51, error 9512.017920
-Recursion n. 52, error 10503.986327
-Recursion n. 53, error 12893.633245
-Recursion n. 54, error 16395.473470
-Recursion n. 55, error 19662.055425
-Recursion n. 56, error 19394.169985
-Recursion n. 57, error 14735.790724
-Recursion n. 58, error 9736.876327
-Recursion n. 59, error 6673.528841
-Recursion n. 60, error 5378.600020
-Recursion n. 61, error 5284.264364
-Recursion n. 62, error 5872.926699
-Recursion n. 63, error 6832.238864
-Recursion n. 64, error 7984.739309
-Recursion n. 65, error 9126.007027
-Recursion n. 66, error 9953.932568
-Recursion n. 67, error 10204.319105
-Recursion n. 68, error 9864.246602
-Recursion n. 69, error 9020.797079
-Recursion n. 70, error 7649.327959
-Recursion n. 71, error 5901.336946
-Recursion n. 72, error 4179.350770
-Recursion n. 73, error 2789.661686
-Recursion n. 74, error 1798.661942
-Recursion n. 75, error 1138.260267
-Recursion n. 76, error 713.324040
-Recursion n. 77, error 444.743371
-Recursion n. 78, error 276.540458
-Recursion n. 79, error 171.704910
-Recursion n. 80, error 106.530024
-Recursion n. 81, error 66.066664
-Recursion n. 82, error 40.963588
-Recursion n. 83, error 25.395950
-Recursion n. 84, error 15.743686
-Recursion n. 85, error 9.759711
-Recursion n. 86, error 6.050105
-Recursion n. 87, error 3.750486
-Recursion n. 88, error 2.324944
-Recursion n. 89, error 1.441247
-Recursion n. 90, error 0.893441
-Recursion n. 91, error 0.553852
-Recursion n. 92, error 0.343339
-Recursion n. 93, error 0.212840
-Recursion n. 94, error 0.131942
-Recursion n. 95, error 0.081793
-Recursion n. 96, error 0.050705
-Recursion n. 97, error 0.031433
-Recursion n. 98, error 0.019487
-Recursion n. 99, error 0.012081
-Recursion n. 100, error 0.007490
-Recursion n. 101, error 0.004645
-Recursion n. 102, error 0.002881
-Recursion n. 103, error 0.001788
-Recursion n. 104, error 0.001110
-Recursion n. 105, error 0.000691
-Recursion n. 106, error 0.000431
-Recursion n. 107, error 0.000271
-Recursion n. 108, error 0.000172
-Recursion n. 109, error 0.000112
-Recursion n. 110, error 0.000075
-
-Copying between device and host wall clock time = 4.297452
-
-Total number of recursions = 110
-Mean Shift wall clock time = 7.432588
diff --git a/mean_shift_serial/s4_serial_loceye.txt b/mean_shift_serial/s4_serial_loceye.txt
deleted file mode 100644
index 974f89f..0000000
--- a/mean_shift_serial/s4_serial_loceye.txt
+++ /dev/null
@@ -1,116 +0,0 @@
-Iteration n. 0, error 927827.679145
-Iteration n. 1, error 726816.223326
-Iteration n. 2, error 581769.204949
-Iteration n. 3, error 477408.630077
-Iteration n. 4, error 395485.897206
-Iteration n. 5, error 334651.158957
-Iteration n. 6, error 292079.617208
-Iteration n. 7, error 254134.878622
-Iteration n. 8, error 215114.115728
-Iteration n. 9, error 182607.082276
-Iteration n. 10, error 156266.959549
-Iteration n. 11, error 139994.419331
-Iteration n. 12, error 125521.301757
-Iteration n. 13, error 112218.794486
-Iteration n. 14, error 98203.683241
-Iteration n. 15, error 85490.183638
-Iteration n. 16, error 73443.000140
-Iteration n. 17, error 62609.489556
-Iteration n. 18, error 59077.977003
-Iteration n. 19, error 53892.807510
-Iteration n. 20, error 47565.861958
-Iteration n. 21, error 45535.865588
-Iteration n. 22, error 44789.582377
-Iteration n. 23, error 42402.349216
-Iteration n. 24, error 39130.442990
-Iteration n. 25, error 37194.415972
-Iteration n. 26, error 35206.437543
-Iteration n. 27, error 32203.737761
-Iteration n. 28, error 29549.317563
-Iteration n. 29, error 27893.877946
-Iteration n. 30, error 27707.173303
-Iteration n. 31, error 28305.702063
-Iteration n. 32, error 28536.722112
-Iteration n. 33, error 27381.782682
-Iteration n. 34, error 24461.926511
-Iteration n. 35, error 21388.206521
-Iteration n. 36, error 19411.085140
-Iteration n. 37, error 18062.429515
-Iteration n. 38, error 16313.720166
-Iteration n. 39, error 14149.621211
-Iteration n. 40, error 12735.640987
-Iteration n. 41, error 12904.542590
-Iteration n. 42, error 14638.353297
-Iteration n. 43, error 18306.190364
-Iteration n. 44, error 23544.214839
-Iteration n. 45, error 23553.140641
-Iteration n. 46, error 18392.083676
-Iteration n. 47, error 14694.879614
-Iteration n. 48, error 12225.420016
-Iteration n. 49, error 10739.847211
-Iteration n. 50, error 10534.182216
-Iteration n. 51, error 11687.348948
-Iteration n. 52, error 14062.339499
-Iteration n. 53, error 17369.101524
-Iteration n. 54, error 20559.831905
-Iteration n. 55, error 21136.519253
-Iteration n. 56, error 17377.395549
-Iteration n. 57, error 11721.238164
-Iteration n. 58, error 7238.102387
-Iteration n. 59, error 4490.416936
-Iteration n. 60, error 3033.925265
-Iteration n. 61, error 2436.691557
-Iteration n. 62, error 2353.336915
-Iteration n. 63, error 2533.497324
-Iteration n. 64, error 2856.349865
-Iteration n. 65, error 3287.585287
-Iteration n. 66, error 3830.184434
-Iteration n. 67, error 4497.835721
-Iteration n. 68, error 5294.470384
-Iteration n. 69, error 6181.873374
-Iteration n. 70, error 7025.865491
-Iteration n. 71, error 7550.297363
-Iteration n. 72, error 7412.828189
-Iteration n. 73, error 6485.358048
-Iteration n. 74, error 5057.608511
-Iteration n. 75, error 3602.756868
-Iteration n. 76, error 2419.456728
-Iteration n. 77, error 1570.460802
-Iteration n. 78, error 1000.682242
-Iteration n. 79, error 631.345226
-Iteration n. 80, error 396.219332
-Iteration n. 81, error 247.948883
-Iteration n. 82, error 154.922317
-Iteration n. 83, error 96.716125
-Iteration n. 84, error 60.351123
-Iteration n. 85, error 37.650098
-Iteration n. 86, error 23.485083
-Iteration n. 87, error 14.648430
-Iteration n. 88, error 9.136461
-Iteration n. 89, error 5.698500
-Iteration n. 90, error 3.554205
-Iteration n. 91, error 2.216796
-Iteration n. 92, error 1.382645
-Iteration n. 93, error 0.862377
-Iteration n. 94, error 0.537881
-Iteration n. 95, error 0.335487
-Iteration n. 96, error 0.209251
-Iteration n. 97, error 0.130515
-Iteration n. 98, error 0.081405
-Iteration n. 99, error 0.050775
-Iteration n. 100, error 0.031670
-Iteration n. 101, error 0.019754
-Iteration n. 102, error 0.012321
-Iteration n. 103, error 0.007685
-Iteration n. 104, error 0.004794
-Iteration n. 105, error 0.002991
-Iteration n. 106, error 0.001866
-Iteration n. 107, error 0.001165
-Iteration n. 108, error 0.000727
-Iteration n. 109, error 0.000455
-Iteration n. 110, error 0.000285
-Iteration n. 111, error 0.000179
-Iteration n. 112, error 0.000114
-Iteration n. 113, error 0.000073
-Total iterations = 113
-Mean Shift wall clock time = 78.257193

From a2f1cd9fae2e103a350cbd105d3a2d277a8b736e Mon Sep 17 00:00:00 2001
From: Apostolof
Date: Sun, 28 Jan 2018 20:45:57 +0200
Subject: [PATCH 2/2] Readme update

---
 README.md | 29 +++++++++++++++++++++++----
 mean_shift_cuda/meanshift_gpu_utils.h | 2 +-
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index f370845..59547db 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,4 @@
+
 # Mean-shift

 [Mean-shift] is a mathematical procedure, adopted in algorithms, designed in the 70's by Fukunaga and Hostetler. The algorithm is used for:
@@ -8,9 +9,15 @@

 ## Repository

-This repository provides a serial implementation of the algorithm in C language, as well as the parallel equivalent in CUDA. The project was undertaken as part of the "Parallel and distributed systems" course of AUTH university.
+This repository provides a serial implementation of the algorithm in C language, as well as two versions of the parallel equivalent in CUDA, with and without the usage of shared memory. The project was undertaken as part of the "Parallel and distributed systems" course of AUTH university.
+
+A [Gaussian] kernel was used for the weighting function. The code was tested for different data sets and information regarding the execution time and correctness were extracted. In addition, the two versions of the parallel algorithm were tested and compared.
+
+## Dependencies
+
+For the serial algorithm only a compiler is needed (e.g. gcc).

-A [Gaussian] kernel was used for the weighting function. The code was tested for different data sets and information regarding the execution time and correctness were extracted. In addition, two versions of the parallel algorithm were tested and compared, with and without the usage of shared memory respectively.
+To compile the parallel versions, the standard CUDA toolkit installation instructions for the intended platform should be followed beforehand as described [here].

 ## Compilation

@@ -22,7 +29,20 @@ $ make

 ## Usage

-blah blah, arguments needed etc
+Run the code with the command:
+```sh
+$ ./meanshift h e N D Pd Pl
+```
+where:
+
+ 1. **h** is the desirable variance
+ 2. **e** is the min distance, between two points, that is taken into account in computations
+ 3. **N** is the number of points
+ 4. **D** is the number of dimensions of each point
+ 5. **Pd** is the path of the dataset file
+ 6. **Pl** is the path of the labels file
+ 7. **--verbose** | **-v** is an optional flag to enable execution information output
+ 8. **--output** | **-o** is an optional flag to enable points output in each iteration

 **Free Software, Hell Yeah!**

@@ -30,4 +50,5 @@ blah blah, arguments needed etc

 [//]: # (Links)
  [Mean-shift]:
- [Gaussian]:
\ No newline at end of file
+ [Gaussian]:
+ [here]:
diff --git a/mean_shift_cuda/meanshift_gpu_utils.h b/mean_shift_cuda/meanshift_gpu_utils.h
index 83f9784..ced2ce5 100644
--- a/mean_shift_cuda/meanshift_gpu_utils.h
+++ b/mean_shift_cuda/meanshift_gpu_utils.h
@@ -27,7 +27,7 @@ extern cudaDeviceProp device_properties;
 void set_GPU();

 //Function meanshift recursively shifts original points according to the mean-shift algorithm saving
-//the result to shiftedPoints, h is the desirable deviation
+//the result to shiftedPoints, h is the desired deviation
 int meanshift(double **original_points, double ***shifted_points, int h);

 //Function init_device_memory allocates memory for necessary arrays in the device