Compiling for AVX-512

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2092 - Posted: 19 Nov 2020, 17:58:42 UTC

I applied for a Covid-19 research license to get the Intel compiler. Waiting to hear back.

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 489
Credit: 25,018,775
RAC: 12,995
Italy
Message 2093 - Posted: 19 Nov 2020, 18:28:30 UTC - in response to Message 2092.

I applied for a Covid-19 research license to get the Intel compiler. Waiting to hear back.

I suggest using the latest free gcc on Linux (if you want to play with it). I cannot help you with the Intel compiler, but others probably can. You can find the source code of the application here: https://bitbucket.org/francesco-asnicar/pc-boinc/

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2094 - Posted: 19 Nov 2020, 19:14:20 UTC

I checked and gcc comes as a default in LM 20:
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

I also installed libraries: sudo apt-get install build-essential gdb

I'm assuming there's no need to edit or modify your code in any way. So now I'm learning to use gcc.

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2095 - Posted: 19 Nov 2020, 19:29:42 UTC - in response to Message 2093.
Last modified: 19 Nov 2020, 20:06:48 UTC

find the source code of the application here: https://bitbucket.org/francesco-asnicar/pc-boinc/

Thanks for that. BTW, libbz2-dev is now in the Ubuntu 20.04 repositories.
Following the instructions but I don't know the lingo, e.g. what does "clone this repository in the folder BOINC_dev/boinc/samples/" mean???
I think my problem is knowing what "this" refers to.
Next line, "Assuming you now have this repository inside BOINC_dev/boinc/samples/pc-boinc/"
Does "this" mean this download? https://bitbucket.org/francesco-asnicar/pc-boinc/downloads/

Edit: DLed that file and extracted it. Changed folder name to pc-boinc.
"you can compile it using the provided scripts inside the src folder"
These scripts might as well be in Klingon. I'll look for gcc tutorials and see if I can figure out what to do next.

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2096 - Posted: 24 Nov 2020, 20:12:25 UTC

It appears that after compiling I'll need to test it using the BOINC Anonymous Platform. https://boinc.berkeley.edu/wiki/Anonymous_platform
I ginned up an app_info.xml file by emulating the TN-Grid client_state.xml info:

<app_info>
    <app>
        <name>gene_pcim</name>
    </app>
    <file_info>
        <name>gene_pcim_v1.10_linux64__avx512</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>gene_pcim</app_name>
        <version_num>110</version_num>
        <api_version>7.9.0</api_version>
        <plan_class>avx512</plan_class>
        <flops>5784707816.501904</flops>
        <avg_ncpus>1.000000</avg_ncpus>
        <file_ref>
            <file_name>gene_pcim_v1.10_linux64__avx512</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>
If this looks ok I'll give it a whirl.
Do I need to delete the other executables and only have the avx512 executable in my project folder or will they just be ignored???

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 489
Credit: 25,018,775
RAC: 12,995
Italy
Message 2097 - Posted: 25 Nov 2020, 12:16:54 UTC - in response to Message 2096.
Last modified: 25 Nov 2020, 12:19:45 UTC

It seems to me (but I may be wrong) that you don't need to define the plan_class; instead you should define the platform (like x86_64-pc-linux-gnu).

When switching from regular use to anonymous platform, I suggest the following: wait until there are no workunits in the cache (or abort them all), exit boinc, copy the app_info.xml and the executable to the proper place (check the executable's x bit, chmod a+x), then start boinc.
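
The copy-and-chmod part of these steps can be sketched in shell as follows. This is only a sketch under assumptions: the function name install_anon_app is made up for illustration, and the location of the project directory depends on how BOINC was installed, so check your own system. It should be run only once the cache is empty and the client has been stopped.

```shell
# Hypothetical helper: copy a custom executable and app_info.xml into a
# BOINC project directory and make the executable runnable.
#   $1 = project directory, $2 = executable, $3 = app_info.xml
install_anon_app() {
    proj_dir="$1"; exe="$2"; app_info="$3"
    [ -d "$proj_dir" ] || { echo "no such directory: $proj_dir" >&2; return 1; }
    cp "$exe" "$app_info" "$proj_dir"/ || return 1
    # "check its x bit": grant execute permission to all users
    chmod a+x "$proj_dir/$(basename "$exe")"
}
```

The executable's file name has to match the name given in the file_info and file_ref entries of app_info.xml, otherwise the client will not find it.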

Hope it works, I'm curious about it.

EDIT: which compiler switch did you use? -march=skylake-avx512 ?

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2098 - Posted: 25 Nov 2020, 15:16:14 UTC - in response to Message 2097.

it seems to me (but I may be wrong) that you don't need to define the plan_class, instead you should define the platform (like <platform>x86_64-pc-linux-gnu</platform>)
I don't know but I think BOINC learns everything it needs about my CPU in my client_state.xml <host_info> section.

When switching from regular use to anonymous platform I suggest to: wait until there are no workunits in the cache (or abort them all), exit boinc, copy the app_info.xml and the executable in the proper place (check its x bit, chmod a+x), start boinc.
This is where my head starts spinning. In client_state.xml I have these 3 statements:
<download_url>http://gene.disi.unitn.it/test/download/gene_pcim_v1.10_linux64__avx</download_url>
<download_url>http://gene.disi.unitn.it/test/download/gene_pcim_v1.10_linux64__fma</download_url>
<download_url>http://gene.disi.unitn.it/test/download/gene_pcim_v1.10_linux64__sse2</download_url>
Which makes me think I have to have something like them with __avx512 to get new WUs. So I was planning on testing with WUs I already have DLed.

EDIT: which compiler switch did you use? -march=skylake-avx512 ?
Yes, these are the two switches I came across that it seems I need:
-march=skylake-avx512 -ftree-vectorize
But then not all AVX-512 CPUs are Skylake. E.g., i9-10980XE is formerly known as Cascade Lake: https://ark.intel.com/content/www/us/en/ark/products/198017/intel-core-i9-10980xe-extreme-edition-processor-24-75m-cache-3-00-ghz.html
Apparently they're called the Core X-series but there's not a switch for Core X. Also there's Knights Landing which I do not have. So I guess I just have to try them. If not all my AVX-512 CPUs work then I could try -march=cascadelake.
There are so many switches that I don't know whether I need any others.
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
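
One way to see what a given -march value actually turns on is to ask gcc for the macros it predefines for that target. A minimal sketch, assuming gcc on an x86-64 host; the function name march_avx512_macros is made up for illustration:

```shell
# Hypothetical helper: list the AVX-512 feature macros that gcc predefines
# for a given -march value ($1). Only the preprocessor runs; nothing is built.
march_avx512_macros() {
    "${CC:-gcc}" -march="$1" -dM -E -x c /dev/null \
        | grep -o '__AVX512[A-Z0-9_]*__' \
        | sort -u
}
```

Running it once with skylake-avx512 and once with cascadelake and comparing the two lists shows what the second adds. Going by the GCC page linked above, skylake-avx512 already covers AVX512F, CD, BW, DQ and VL, and cascadelake is the same set plus AVX512VNNI, so a skylake-avx512 build should also run on Cascade Lake parts.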

I'm still trying to figure out my questions in message 2095. I hope to focus on this today.

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 489
Credit: 25,018,775
RAC: 12,995
Italy
Message 2099 - Posted: 25 Nov 2020, 16:00:57 UTC - in response to Message 2098.
Last modified: 25 Nov 2020, 16:02:48 UTC

About the 3 lines in client_state.xml: they are there because your computer supports all three versions of the application (sse2, avx, fma). BOINC tries all of them and should eventually choose the fastest one.

I guess that at the moment you shouldn't specify avx512 in the app_info (by using app_info you override the server logic for distributing applications). The name of the executable doesn't matter. A name like gene_pcim_v1.10_linux64__avx is simply useful because you can infer its characteristics from it.

Also, it seems to me that avx512 is actually a family of AVX512-something instruction subsets. I don't know which one would be best to target, probably avx512f. There is some discussion about it here: https://github.com/BOINC/boinc/issues/3180. Is your BOINC client able to detect avx512?
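
Independently of what the BOINC client reports, on Linux the kernel's view of the supported AVX-512 subsets is visible in the flags line of /proc/cpuinfo. A small sketch (the function name is hypothetical):

```shell
# Hypothetical helper: print each distinct avx512* flag found in a
# cpuinfo-style file ($1, defaulting to /proc/cpuinfo), one per line.
list_avx512_flags() {
    grep -o 'avx512[a-z0-9_]*' "${1:-/proc/cpuinfo}" | sort -u
}
```

Calling list_avx512_flags with no argument reads the running machine. An empty result means the kernel reports no AVX-512 support; a Skylake-X class CPU would be expected to list at least avx512f, avx512cd, avx512bw, avx512dq and avx512vl.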

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2100 - Posted: 25 Nov 2020, 16:38:03 UTC

I looked at that AVX-512 list and guessed that some could be excluded, e.g. AI, neural nets & Galois Fields. Beyond that I'm willing to try them all if we have a way to test and tell them apart. But we seem to have only 4 to choose from: F, VL, DQ & BW. CD is only for Xeon Phi. I'll try them in this order: F, BW, DQ & VL.

BTW, I don't think I've ever seen BOINC settle on just one. If I sort BoincTasks on (WU) Name and look for clusters running on the same computer with 2 or 3 instruction sets running I can't tell their speeds apart. One WU to another can differ by hours. Even within the parts of a WU I doubt they're the same, e.g.
175168_Hs_T119704-RHBDD1_wu-1_1606188238821
175168_Hs_T119704-RHBDD1_wu-2_1606188238821
175168_Hs_T119704-RHBDD1_wu-3_1606188238821
etc...
That's why having a Calibration Standard WU could be nice. It could be DLed from https://bitbucket.org/francesco-asnicar/pc-boinc/downloads/ but never ULed, just note the completion time for comparison. Then use BOINC Anonymous Platform to test versions one by one.

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2101 - Posted: 25 Nov 2020, 18:28:20 UTC

Ok, I edited makefile by adding:

# AVX512, 64-bit
ARCH += -march=skylake-avx512 -mtune=skylake-avx512 -mavx512f -mpopcnt -maes -mpclmul -m64 -ftree-vectorize
Then in a terminal I ran:
/home/aurum/BOINC_dev/boinc/samples/pc-boinc/src/linux64_build.sh
Lots of lines scrolled by, terminal closed and I can't find an executable. It added a link to libstdc++.a that when I try to open it says archive type not supported (I have Linux Mint 20 Ubuntu 20.04 Focal).

What's next???

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 489
Credit: 25,018,775
RAC: 12,995
Italy
Message 2102 - Posted: 25 Nov 2020, 18:30:42 UTC - in response to Message 2100.
Last modified: 25 Nov 2020, 18:33:20 UTC

Regarding your previous message: You may look at this small validation suite for the gene pc-im application (a small and normal input file, with the outputs as reference). Hope it helps.

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 489
Credit: 25,018,775
RAC: 12,995
Italy
Message 2103 - Posted: 25 Nov 2020, 18:43:13 UTC - in response to Message 2101.

The build script you were using first tries to compile the needed BOINC libraries (which is not so easy). You can skip this part by using the ones I compiled some time ago (although they are not the latest): get them from https://gene.disi.unitn.it/test/files/boinc_libs-x32-x64.7z. After that, move into the pc-boinc/src directory and issue a "make" from there. I don't expect you to succeed on the first try (the executable, if built, will land in pc-boinc/bin).

Ok, I edited makefile by adding:
# AVX512, 64-bit
ARCH += -march=skylake-avx512 -mtune=skylake-avx512 -mavx512f -mpopcnt -maes -mpclmul -m64 -ftree-vectorize
Then in a terminal I ran:
/home/aurum/BOINC_dev/boinc/samples/pc-boinc/src/linux64_build.sh
Lots of lines scrolled by, terminal closed and I can't find an executable. It added a link to libstdc++.a that when I try to open it says archive type not supported (I have Linux Mint 20 Ubuntu 20.04 Focal).

What's next???

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2104 - Posted: 25 Nov 2020, 19:35:35 UTC

I've been reading the 11,041 line config.log file and it keeps trying different things. I suspect part of the problem may be that when I cloned BOINC I got the dev version and not the 7.16.6 release:
| #define PACKAGE_NAME "BOINC"
| #define PACKAGE_TARNAME "boinc"
| #define PACKAGE_VERSION "7.17.0"

git clone https://github.com/BOINC/boinc boinc

How can I clone 7.16.6???

I'll look at your zip.

Xavier Wallece
Joined: 25 May 20
Posts: 3
Credit: 1,970,884
RAC: 7,210
Message 2106 - Posted: 25 Nov 2020, 23:29:20 UTC
Last modified: 25 Nov 2020, 23:31:35 UTC

After reading this topic, I started reviewing the code.

I suggest you first compile the program as-is.
Run test_run2.sh and check that the file compare reports the files as equal.

I suggest you then make a copy of test_run2.sh (and bin/pc) and change the parameter value '2470' to something large like 100000.
The program then runs longer, so you should see the difference more clearly. The output files will not be correct, since the input file does not contain that many entries.

Make a backup of the file bin/pc. Each make will overwrite this file.

For a quick and dirty tryout you need to:

Look in the SIMD folder (under src); you will notice that there are 4 code paths: NEON, scalar, SSE and AVX.
Edit the file AvxDoubleVectorTraits.hpp in the src/simd folder:

Change VectorSize from 32 to 64
and DataAlignment from 32 to 64.

Replace AVX functions like _mm256_add_pd with their AVX-512 counterparts like _mm512_add_pd.
See Intel's reference guide to do so:

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX_512&text=_pd&expand=127

Also, do not forget to change the sum method. It adds up only 4 values (AVX), but with AVX-512 it needs to add up 8.

Also change DoubleVector.hpp (see http://gene.disi.unitn.it/test/forum_thread.php?id=302 ).
In 2017 AVX did not perform very well on Intel Haswell, so someone changed it back to SSE. This is the reason the AVX code is only as fast as the SSE code.



line 71: typedef AvxDoubleVector DoubleVectorLong;
line 72: typedef SseDoubleVector DoubleVector;
line 73: #elif defined (__SSE2__)

change line 72 to
typedef AvxDoubleVector DoubleVector;
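
Before hand-editing the traits file as described above, it may help to list which 256-bit intrinsics and vector types it actually uses, so that none is missed when switching to the 512-bit forms. A rough sketch; the helper name is made up, and the file to pass is the src/simd/AvxDoubleVectorTraits.hpp mentioned above:

```shell
# Hypothetical helper: count each distinct AVX (256-bit) intrinsic or
# vector type that appears in a source file ($1).
list_avx_intrinsics() {
    grep -o -E '_mm256_[a-z0-9_]+|__m256[di]?' "$1" | sort | uniq -c
}
```

Most names map mechanically (_mm256_add_pd to _mm512_add_pd, __m256d to __m512d), but not every 256-bit intrinsic has a same-named 512-bit form (horizontal adds, for instance), so each entry in the list is worth checking against the Intel guide.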



After you have done that, change the Makefile.

The functions involved appear to be AVX512F ones (see the Intel guide above), as found on Intel Skylake and Ice Lake.
When you compile the program you will see whether it compiles for SSE or AVX. If it compiles for SSE, add the "-mavx -mfma -mavx2" parameters to ARCH in the Makefile.

Compile the program and run it via test_run2.sh. If the results are equal, you can try running it with a larger set.

Looking forward to seeing your results.

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2110 - Posted: 26 Nov 2020, 12:25:08 UTC
Last modified: 26 Nov 2020, 12:49:59 UTC

I suggest you first compile the program as-is.
Ok, to me "as-is" means do not change anything just run it. Tried that and the only thing in ./BOINC_dev/boinc/samples/pc-boinc/bin is a.txt, a placeholder. Next, I uncommented AVX in ./pc-boinc/src/Makefile:
# SSE2, 64-bit
ARCH += -march=core2 -mtune=core2 -m64
# AVX, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -m64
# AVX+FMA, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -mfma -m64
# AVX2+FMA, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -mfma -mavx2 -m64
And still got nothing.
It adds a Link to Archive: /home/aurum/BOINC_dev/boinc/samples/pc-boinc/src/libstdc++.a
/usr/lib/gcc/x86_64-linux-gnu/9/libstdc++.a
Suspect it's trying to tell me I'm missing a library. I searched the Synaptic Package Manager (SPM) for libstdc++ and it listed 374 packages with 6 of them installed:
lib32stdc++-9-dev
libstdc++-9-dev
libstdc++6
libstdc++6:i386
libx32stdc++-9-dev
libx32stdc++6
So I installed lib32stdc++-10-dev and SPM added libstdc++-10-dev. Deleted the Link to Archive and tried again. Still got nothing and the Link to Archive is back.
Any suggestions what I'm missing???
Edit: I bet if someone that knows what they're doing looked at my config.log output they could see what I need to fix or install. It's 11401 lines and has so many comments like:
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
configure:4093: $? = 0
configure:4082: gcc -V >&5
gcc: error: unrecognized command line option '-V'
gcc: fatal error: no input files
compilation terminated.
configure:4093: $? = 1
configure:4082: gcc -qversion >&5
gcc: error: unrecognized command line option '-qversion'; did you mean '--version'?
gcc: fatal error: no input files
compilation terminated.
configure:4093: $? = 1

Keith Myers
Joined: 26 Jun 20
Posts: 33
Credit: 2,136,660
RAC: 8,320
United States
Message 2111 - Posted: 26 Nov 2020, 17:27:37 UTC

First thing you need to correct in your app_info is the application version.
You can't use an already existing app version that the project already distributes.

So change to app version 999 or something well out of range of what the project might ever release.

And change the application's name accordingly, of course.

My special gpu application I compiled for Einstein BRP on the Nano is versioned 999.

Keith Myers
Joined: 26 Jun 20
Posts: 33
Credit: 2,136,660
RAC: 8,320
United States
Message 2112 - Posted: 26 Nov 2020, 17:34:21 UTC - in response to Message 2104.

I've been reading the 11,041 line config.log file and it keeps trying different things. I suspect part of the problem may be that when I cloned BOINC I got the dev version and not the 7.16.6 release:
| #define PACKAGE_NAME "BOINC"
| #define PACKAGE_TARNAME "boinc"
| #define PACKAGE_VERSION "7.17.0"

git clone https://github.com/BOINC/boinc boinc

How can I clone 7.16.6???

I'll look at your zip.

Use the Tags branch at github. You can clone any point release of BOINC by going into the TAG tree.
https://github.com/BOINC/boinc/tree/client_release/7.16/7.16.6

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2113 - Posted: 26 Nov 2020, 18:02:59 UTC - in response to Message 2111.
Last modified: 26 Nov 2020, 18:04:01 UTC

First thing you need to correct in your app_info is the application version.
You can't use an already existing app version that the project already distributes.

So change to app version 999 or something well out of range of what the project might ever release.

And change the applications name accordingly of course
Liken unto so?
<app_info>
    <app>
        <name>gene_pcim</name>
    </app>
    <file_info>
        <name>gene_pcim_v6.66_linux64__avx512</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>gene_pcim</app_name>
        <version_num>666</version_num>
        <api_version>7.9.0</api_version>
        <plan_class>avx512</plan_class>
        <flops>5784707816.501904</flops>
        <avg_ncpus>1.000000</avg_ncpus>
        <file_ref>
            <file_name>gene_pcim_v6.66_linux64__avx512</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>

Keith Myers
Joined: 26 Jun 20
Posts: 33
Credit: 2,136,660
RAC: 8,320
United States
Message 2114 - Posted: 26 Nov 2020, 19:58:11 UTC - in response to Message 2113.

Yes. This is my Einstein app_info for example.


<app_info>
    <app>
        <name>einsteinbinary_BRP4</name>
    </app>
    <file_info>
        <name>einsteinbinary_cuda64</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einsteinbinary_cuda-db.dev</name>
    </file_info>
    <file_info>
        <name>einsteinbinary_cuda-dbhs.dev</name>
    </file_info>
    <file_info>
        <name>libcufft.so.8.0</name>
    </file_info>
    <file_info>
        <name>libcudart.so.8.0</name>
    </file_info>
    <app_version>
        <app_name>einsteinbinary_BRP4</app_name>
        <version_num>999</version_num>
        <api_version>7.2.2</api_version>
        <coproc>
            <type>CUDA</type>
            <count>1.0</count>
        </coproc>
        <file_ref>
            <file_name>einsteinbinary_cuda64</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einsteinbinary_cuda-db.dev</file_name>
            <open_name>db.dev</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name>einsteinbinary_cuda-dbhs.dev</file_name>
            <open_name>dbhs.dev</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name>libcufft.so.8.0</file_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name>libcudart.so.8.0</file_name>
            <copy_file/>
        </file_ref>
    </app_version>
</app_info>

Aurum
Joined: 18 Jul 18
Posts: 50
Credit: 90,448,874
RAC: 651,523
United States
Message 2116 - Posted: 27 Nov 2020, 16:12:25 UTC - in response to Message 2112.

Use the Tags branch at github. You can clone any point release of BOINC by going into the TAG tree.
https://github.com/BOINC/boinc/tree/client_release/7.16/7.16.6
I tried to no avail.
aurum@Rig-38:~$ git clone https://github.com/BOINC/boinc/tree/client_release/7.16/7.16.6 boinc
Cloning into 'boinc'...
fatal: repository 'https://github.com/BOINC/boinc/tree/client_release/7.16/7.16.6/' not found
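
The clone fails because that address is a GitHub web page (note the /tree/ part), not a repository URL. git can instead check out a tag while cloning, using --branch with the plain repository URL. A sketch, assuming the tag is named client_release/7.16/7.16.6 as the web address suggests; the function name and the DRY_RUN switch are made up for illustration:

```shell
# Hypothetical helper: clone the BOINC repository at a given tag ($1)
# into a destination directory ($2).
clone_boinc_tag() {
    tag="$1"; dest="$2"
    # With DRY_RUN set, only print the command instead of running it.
    # --branch accepts a tag name; --depth 1 skips the older history.
    ${DRY_RUN:+echo} git clone --depth 1 --branch "$tag" \
        https://github.com/BOINC/boinc "$dest"
}
```

It would be called as clone_boinc_tag client_release/7.16/7.16.6 boinc. In a clone that already exists, git checkout followed by the tag name should switch the working tree to that release.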


Copyright © 2021 CNR-TN & UniTN