1)
Message boards :
Development :
Doublevector always uses SSE path instead of AVX
(Message 2115)
Posted 27 Nov 2020 by Xavier Wallece
I'm looking into it and see that it runs several sets. Each set runs for about 13 seconds, after which a new set begins. I'm going to test the AVX/SSE DoubleVector code in the pc algorithm part, but it is already heavily optimized. I also understand why AVX does not always help: the parts of the algorithm it has to go through are small, so it cannot stretch its legs.
2)
Message boards :
Number crunching :
Compiling for AVX-512
(Message 2106)
Posted 25 Nov 2020 by Xavier Wallece
After reading this topic, I started reviewing the code. I suggest you first compile the program as-is and run test_run2.sh, so that the file compare reports the files as equal.
I suggest you then make a copy of test_run2.sh (and of bin/pc) and change the parameter value '2470' to something large like 100000. The program then runs longer, so you should see the difference better. The output files will not be correct, since the input file does not contain that many entries. Make a backup of the file bin/pc, because each make overwrites it.
For a quick and dirty tryout you need to do the following:
Look in the simd folder (under src). You will notice that there are 4 code paths: NEON, scalar, SSE and AVX.
In the file AvxDoubleVectorTraits.hpp in the src/simd folder, change VectorSize from 32 to 64 and DataAlignment from 32 to 64.
Replace AVX functions like _mm256_add_pd with the AVX-512 ones like _mm512_add_pd. See Intel's reference guide to do so: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX_512&text=_pd&expand=127
Do not forget to change the sum method. It adds only 4 variables (AVX) instead of 8 (AVX-512).
Also change DoubleVector.hpp (see http://gene.disi.unitn.it/test/forum_thread.php?id=302 ). In 2017 AVX was not very performant on Intel Haswell, so someone changed it back to SSE. This is the reason that the AVX code is only as fast as the SSE code.
After you have done that, change the Makefile. The functions you will be using appear to be AVX512F ones (see the Intel guide above), which are for Intel Skylake and Ice Lake. When you compile the program you will see whether it compiles for SSE or AVX. If it compiles for SSE, add the "-mavx -mfma -mavx2 " parameters to ARCH in the Makefile.
Compile the program and run it via test_run2.sh. If the results are equal, you can try running it with a larger set. Looking forward to seeing your results.
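To illustrate the point about the sum method and the compile flags, here is a minimal sketch in C++. It is not taken from the pc-boinc sources: the names simd_path, sum_doubles and demo are made up for this example, and the real DoubleVector traits classes are organised differently. It relies only on the macros __SSE2__, __AVX__ and __AVX512F__, which GCC and Clang define when the matching -m flags are active (-mavx512f for the last one), so it shows which path a given set of flags actually selects and how many doubles each path adds per step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#if defined(__SSE2__) || defined(__AVX__) || defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical helper: reports which SIMD path the compiler flags selected.
inline const char* simd_path() {
#if defined(__AVX512F__)
    return "AVX-512";   // 8 doubles per 64-byte vector
#elif defined(__AVX__)
    return "AVX";       // 4 doubles per 32-byte vector
#elif defined(__SSE2__)
    return "SSE2";      // 2 doubles per 16-byte vector
#else
    return "scalar";
#endif
}

// Hypothetical helper: sums n doubles using the widest path compiled in.
inline double sum_doubles(const double* data, std::size_t n) {
    std::size_t i = 0;
    double total = 0.0;
#if defined(__AVX512F__)
    __m512d acc = _mm512_setzero_pd();
    for (; i + 8 <= n; i += 8)
        acc = _mm512_add_pd(acc, _mm512_loadu_pd(data + i));
    double lane[8];
    _mm512_storeu_pd(lane, acc);
    for (int k = 0; k < 8; ++k) total += lane[k];  // 8 lanes, not 4
#elif defined(__AVX__)
    __m256d acc = _mm256_setzero_pd();
    for (; i + 4 <= n; i += 4)
        acc = _mm256_add_pd(acc, _mm256_loadu_pd(data + i));
    double lane[4];
    _mm256_storeu_pd(lane, acc);
    for (int k = 0; k < 4; ++k) total += lane[k];
#elif defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();
    for (; i + 2 <= n; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(data + i));
    double lane[2];
    _mm_storeu_pd(lane, acc);
    for (int k = 0; k < 2; ++k) total += lane[k];
#endif
    for (; i < n; ++i) total += data[i];  // leftover elements, or scalar path
    return total;
}

// Small demonstration: prints the selected path and checks a known sum.
inline void demo() {
    const double values[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    const double s = sum_doubles(values, 10);
    std::printf("path=%s sum=%g\n", simd_path(), s);
    assert(s == 55.0);
}
```

Calling demo() from a small main and rebuilding with different flags (for example none, -mavx, or -mavx512f) should change the printed path while the sum stays the same. If the path still reads SSE2 after editing the Makefile, the new flags are probably not reaching the compiler.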
3)
Message boards :
Development :
Doublevector always uses SSE path instead of AVX
(Message 2105)
Posted 25 Nov 2020 by Xavier Wallece
Good day. Today, after looking into https://bitbucket.org/francesco-asnicar/pc-boinc/src/master/ and compiling the code myself in Ubuntu, I noticed that the DoubleVector always uses the SSE code and not the AVX code.
It seems that Haswell was not very performant with AVX in 2017, and therefore it was disabled. I ran the compiled version as follows:
bin/pc input/tile2.txt output/output2.txt 0.05 1 100000 0
The AVX version's real time is 22 seconds, the SSE version's real time is 32 seconds, under Ubuntu in Hyper-V. This only helps if the 5th parameter (100000) is large enough. If it is small, as in the test_run.sh scripts, it makes almost no difference. Could someone verify that the compiled gene_pcim_v1.11_win64__avx.exe is running in SSE instead of AVX mode?