Optimization

Message boards : Number crunching : Optimization

Misho
Joined: 7 Dec 16
Posts: 3
Credit: 104,201
RAC: 0
Bulgaria
Message 1067 - Posted: 20 May 2017, 7:40:29 UTC

Hello from me too.
I am new to this project and want to ask one thing: how can I limit which apps are downloaded to my BOINC client? I want only FMA or only AVX, but the client starts downloading all the apps: avx, sse2 and fma. What kind of app_config file must I use to get only the fma app?

koschi
Joined: 22 Oct 16
Posts: 24
Credit: 3,053,098
RAC: 39
Germany
Message 1068 - Posted: 20 May 2017, 19:47:45 UTC
Last modified: 20 May 2017, 19:48:46 UTC

The server initially sends you all 3 flavors. After you return some results for each, it decides which is best for you and will then usually send you WUs tagged for just one app version.

Please note, those are the default applications, not the latest ones updated by Daniel. Check back at http://gene.disi.unitn.it/test/forum_thread.php?id=135&postid=1031#1031 and download a copy from his link. If you have an AVX or FMA capable CPU, head straight for those apps. Each archive contains an app_info.xml, so once it is activated you get only WUs for that application.
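
For readers who have not seen one, the sketch below shows roughly what a minimal BOINC app_info.xml looks like and why it restricts the client to a single app version. It is only an illustration: the app name `gene_pcim`, the `.exe` suffix on the file name and the version number 102 are assumptions, not copied from Daniel's archives. Use the app_info.xml shipped in the archive rather than this one.

```python
import xml.etree.ElementTree as ET

# Layout of a minimal app_info.xml. It declares exactly one app version,
# so the client only asks for work that this one executable can run.
APP_INFO_TEMPLATE = """<app_info>
  <app>
    <name>{app}</name>
  </app>
  <file_info>
    <name>{exe}</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>{app}</app_name>
    <version_num>{version}</version_num>
    <file_ref>
      <file_name>{exe}</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def build_app_info(app: str, exe: str, version: int) -> str:
    """Fill in the template and check that the result is well-formed XML."""
    text = APP_INFO_TEMPLATE.format(app=app, exe=exe, version=version)
    ET.fromstring(text)  # raises ParseError if the XML is malformed
    return text


# All three values are assumed for illustration only.
APP_INFO = build_app_info("gene_pcim", "gene_pcim_v1.02_win64__fma.exe", 102)
print(APP_INFO)
```

The file only takes effect when it sits in the project's folder inside the BOINC data directory, next to the executable it names, and the client is restarted.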

Misho
Joined: 7 Dec 16
Posts: 3
Credit: 104,201
RAC: 0
Bulgaria
Message 1069 - Posted: 21 May 2017, 2:10:04 UTC - in response to Message 1068.

So, what is the difference between the default app that I download automatically (for example gene_pcim_v1.02_win64__fma) and the same fma app from the linked archive? If I saw correctly, both are version 1.02. Which one should I use for better performance?

Daniel
Volunteer developer
Joined: 19 Oct 16
Posts: 80
Credit: 2,202,886
RAC: 0
Poland
Message 1070 - Posted: 21 May 2017, 19:44:42 UTC - in response to Message 1069.

I asked rattorosso [Marche] to create apps for the remaining 3 ARM architectures: armv6_vfp, armv7_vfpv3 and aarch64. He sent them to me, and I uploaded them to the usual place: https://bitbucket.org/sirzooro/pc-boinc/downloads/. Feel free to download and test them too.

So, what is the difference between the default app that I download automatically (for example gene_pcim_v1.02_win64__fma) and the same fma app from the linked archive? If I saw correctly, both are version 1.02. Which one should I use for better performance?

Both internally specify the same version 1.02, but the version from this thread has extra optimizations, so it runs faster than the official one. Valterc is going to take it and release it as a new version of the official app at the end of May.

Misho
Joined: 7 Dec 16
Posts: 3
Credit: 104,201
RAC: 0
Bulgaria
Message 1071 - Posted: 22 May 2017, 4:05:43 UTC - in response to Message 1070.

Both internally specify the same version 1.02, but the version from this thread has extra optimizations, so it runs faster than the official one. Valterc is going to take it and release it as a new version of the official app at the end of May.


I tested on my PC and here are the results (for some reason the SSE2 app is the fastest one on my PC; CPU: i7 4790, 32GB RAM @1600MHz):
1. the official SSE2 app: Run time: 27 min 39 sec; CPU time: 27 min 30 sec;
2. the optimized SSE2 app: Run time: 28 min 17 sec; CPU time: 27 min 43 sec;

[AF>Le_Pommier] Jerome_C2005
Joined: 12 May 14
Posts: 4
Credit: 176,072
RAC: 1
Mexico
Message 1076 - Posted: 27 May 2017, 18:12:28 UTC

Hi

any news regarding the Mac OS version ?

Partygott
Joined: 8 Apr 17
Posts: 3
Credit: 3,160,712
RAC: 112
Germany
Message 1077 - Posted: 29 May 2017, 18:01:26 UTC

I ran the same 16 WUs three times on a Ryzen 1700 @ 3.8 GHz with SMT on.

avx: 28:30 - 28:48
sse: 28:56 - 29:14
fma: 29:43 - 30:01
standard: ~35 min

i5-3570 (3.2 GHz) standard: ~28 min

So there is a difference of ~18% per WU on the Ryzen, which is pretty nice.
But I wonder why the Ryzen isn't faster than the old i5, even with the optimized app and 600 MHz more clock speed. I have seen this on other projects too.
It seems that most apps from all projects are optimized for Intel.
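
The ~18% figure can be checked against the timings above. This is a minimal sketch that takes the midpoint of each reported range and compares it with the ~35 min standard run; using the midpoint is an assumption, since the post does not say how the percentage was computed.

```python
def to_minutes(mmss: str) -> float:
    """Convert a 'MM:SS' string to minutes."""
    minutes, seconds = mmss.split(":")
    return int(minutes) + int(seconds) / 60


STANDARD_MIN = 35.0  # "~35min" for the standard app on the Ryzen 1700

# Run-time ranges reported for the optimized apps on the Ryzen 1700.
RANGES = {
    "avx": ("28:30", "28:48"),
    "sse": ("28:56", "29:14"),
    "fma": ("29:43", "30:01"),
}


def saving_percent(app: str) -> float:
    """Time saved per WU versus the standard app, using the range midpoint."""
    low, high = RANGES[app]
    midpoint = (to_minutes(low) + to_minutes(high)) / 2
    return (STANDARD_MIN - midpoint) / STANDARD_MIN * 100


for app in RANGES:
    print(f"{app}: {saving_percent(app):.1f}% less time than standard")
```

With these inputs the avx build comes out at about 18%, and the sse and fma builds a few points lower.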

Jim1348
Joined: 29 Dec 16
Posts: 6
Credit: 144,359
RAC: 0
United States
Message 1078 - Posted: 30 May 2017, 0:58:15 UTC - in response to Message 1077.

But I wonder why the Ryzen isn't faster than the old i5, even with the optimized app and 600 MHz more clock speed. I have seen this on other projects too.
It seems that most apps from all projects are optimized for Intel.

It is more because with the i5-3570 you are using a full core, whereas with the Ryzen you are using a virtual core. So it is not bad at all, though you may be right that it is optimized for Intel.

europe64
Joined: 25 Oct 16
Posts: 1
Credit: 1,004,637
RAC: 0
Message 1082 - Posted: 1 Jun 2017, 12:40:06 UTC - in response to Message 1078.

hi,
the FPU of the AMD processor is slow

Jay
Joined: 2 Jun 17
Posts: 2
Credit: 269,368
RAC: 4
Message 1083 - Posted: 4 Jun 2017, 0:46:55 UTC - in response to Message 1068.

It appears that sse2 was downloaded initially, then it switched over to avx when it realized that those run better (I have an i7-4770). However, now it seems to have switched back to sse2? There also seems to be no sign of it running FMA...

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 320
Credit: 16,287,174
RAC: 4,127
Italy
Message 1084 - Posted: 5 Jun 2017, 10:17:40 UTC - in response to Message 1083.
Last modified: 5 Jun 2017, 10:18:00 UTC

It appears that sse2 was downloaded initially, then it switched over to avx when it realized that those run better (I have an i7-4770). However, now it seems to have switched back to sse2? There also seems to be no sign of it running FMA...

The FMA version is still in 'beta' because of AMD Ryzen problems. If you want to try it, you should enable 'receiving beta work' in your profile.

Jay
Joined: 2 Jun 17
Posts: 2
Credit: 269,368
RAC: 4
Message 1085 - Posted: 6 Jun 2017, 6:35:09 UTC - in response to Message 1084.

Thanks. I followed the instructions in message #1033 so now I'm running gene@home PC-IM (Opti v1.2) 1.02

Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 13
Credit: 661,697
RAC: 4,536
Italy
Message 1149 - Posted: 23 Oct 2017, 21:45:47 UTC - in response to Message 1085.

I see many users are running v1.03.
Is it faster than the (Opti v.1.2) 1.02 developed by Daniel?

Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 13
Credit: 661,697
RAC: 4,536
Italy
Message 1156 - Posted: 24 Oct 2017, 20:51:58 UTC - in response to Message 1149.

I see many users are running v1.03.
Is it faster than the (Opti v.1.2) 1.02 developed by Daniel?


Sorry, my bad. v1.03 is only for Linux.

Daniel
Volunteer developer
Joined: 19 Oct 16
Posts: 80
Credit: 2,202,886
RAC: 0
Poland
Message 1157 - Posted: 24 Oct 2017, 20:58:06 UTC - in response to Message 1156.

I see many users are running v1.03.
Is it faster than the (Opti v.1.2) 1.02 developed by Daniel?


Sorry, my bad. v1.03 is only for Linux.

Official 1.0x apps are the same as Opti v.1.1 ones. Opti v.1.2 apps are not officially added yet.

Beyond
Joined: 2 Nov 16
Posts: 24
Credit: 5,477,498
RAC: 24,123
United States
Message 1198 - Posted: 16 Nov 2017, 7:51:37 UTC - in response to Message 1076.

Hi

any news regarding the Mac OS version ?

Any news yet? valterc is waiting for the Mac version before he makes the new optimized app the default (I have no idea why that's a condition).

Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 13
Credit: 661,697
RAC: 4,536
Italy
Message 1220 - Posted: 11 Dec 2017, 16:24:43 UTC - in response to Message 1198.

Hi, I'm trying out my new R5 1600 at stock (3.4 GHz all-core) and I've seen a big difference in CPU time compared with other R5 1600s.
My tasks: around 3700 s
http://gene.disi.unitn.it/test/results.php?userid=512&offset=0&show_names=0&state=4&appid=
Other R5 1600s: around 2300-2700 s
http://gene.disi.unitn.it/test/results.php?hostid=16885&offset=0&show_names=0&state=4&appid=
http://gene.disi.unitn.it/test/results.php?hostid=18494&offset=0&show_names=0&state=4&appid=

That's almost a 40% difference, running the same app (v1.10).
I tried overclocking to 3.8 GHz, but the CPU time is almost the same.
How can I speed up my WUs?
Thanks

Krümel
Joined: 31 Oct 16
Posts: 15
Credit: 1,269,093
RAC: 519
Germany
Message 1221 - Posted: 11 Dec 2017, 18:14:42 UTC

What is your RAM speed?
On my R7 it makes a big difference between 2133 MHz and 3066 MHz (up to 30 minutes)

Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 13
Credit: 661,697
RAC: 4,536
Italy
Message 1222 - Posted: 11 Dec 2017, 21:49:57 UTC - in response to Message 1221.

What is your RAM speed?
On my R7 it makes a big difference between 2133 MHz and 3066 MHz (up to 30 minutes)


I have 1x8GB 2666 MHz C16, maybe I'll try to overclock it.

I have understood the "problem": if I set CPU usage to 50% (or turn off SMT), the 6 simultaneous WUs take around 2400 s to complete. Mystery solved.

I don't understand why WUs don't speed up when the CPU is at 3.8 GHz.
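
A rough way to compare the two settings above is by throughput rather than by time per WU. The sketch below assumes that 100% CPU usage on the R5 1600 (6 cores, 12 threads) means 12 WUs running at once, which is not stated explicitly in the post. The run times are the approximate figures quoted in this thread.

```python
SECONDS_PER_HOUR = 3600


def wus_per_hour(concurrent_wus: int, seconds_per_wu: float) -> float:
    """WUs completed per hour when `concurrent_wus` run side by side."""
    return concurrent_wus * SECONDS_PER_HOUR / seconds_per_wu


# Assumed: 12 concurrent WUs at ~3700 s each with SMT on (100% CPU usage).
smt_on = wus_per_hour(12, 3700)
# From the post: 6 concurrent WUs at ~2400 s each (50% usage or SMT off).
smt_off = wus_per_hour(6, 2400)

print(f"12 WUs at once: {smt_on:.1f} WUs/hour")
print(f" 6 WUs at once: {smt_off:.1f} WUs/hour")
print(f"ratio: {smt_on / smt_off:.2f}")
```

Under these assumptions the 12-task setting still finishes more WUs per hour, even though each individual WU takes longer.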

Daniel
Volunteer developer
Posts: 80
Credit: 2,202,886
RAC: 0
Poland
Message 1223 - Posted: 12 Dec 2017, 6:54:37 UTC - in response to Message 1222.
Last modified: 12 Dec 2017, 7:20:22 UTC

What is your RAM speed?
On my R7 it makes a big difference between 2133 MHz and 3066 MHz (up to 30 minutes)


I have 1x8GB 2666 MHz C16, maybe I'll try to overclock it.

I have understood the "problem": if I set CPU usage to 50% (or turn off SMT), the 6 simultaneous WUs take around 2400 s to complete. Mystery solved.

I don't understand why WUs don't speed up when the CPU is at 3.8 GHz.

The TN-Grid app is very memory-intensive. One person from my team wrote that on his Xeon 14c/28t (I do not know the exact model, probably an E5-2683 v3), 4 TN-Grid WUs consumed all available memory bandwidth. So once you hit this limit, increasing CPU speed will not help; the CPU will just wait for memory faster ;)

Edit: when you set CPU usage to 50%, each app instance can get data from memory faster (fewer instances compete for the same limited bandwidth), each instance can use more cache (which also helps with loading data), and CPU resources are not shared between two instances (SMT/HT does not double the speed; the gain is usually much smaller).

If you want to improve speeds, use the fastest possible memory, and overclock it if possible.
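
The effect described above can be illustrated with a toy model in which a WU takes as long as the slower of its compute time and its memory-transfer time. All numbers below are hypothetical and chosen only to show the shape of the behaviour. They are not measurements of the TN-Grid app, and the model ignores cache effects and SMT resource sharing.

```python
def wu_seconds(cycles: float, clock_hz: float, bytes_moved: float,
               total_bandwidth: float, concurrent_wus: int) -> float:
    """Toy estimate: a WU is limited by compute or by its share of bandwidth."""
    compute_time = cycles / clock_hz
    memory_time = bytes_moved / (total_bandwidth / concurrent_wus)
    return max(compute_time, memory_time)


# Hypothetical workload and machine, for illustration only.
CYCLES_PER_WU = 6e12   # CPU cycles of pure computation per WU
BYTES_PER_WU = 6e12    # bytes fetched from RAM per WU
BANDWIDTH = 20e9       # total memory bandwidth in bytes per second

t_stock = wu_seconds(CYCLES_PER_WU, 3.4e9, BYTES_PER_WU, BANDWIDTH, 12)
t_overclocked = wu_seconds(CYCLES_PER_WU, 3.8e9, BYTES_PER_WU, BANDWIDTH, 12)
t_faster_ram = wu_seconds(CYCLES_PER_WU, 3.4e9, BYTES_PER_WU, 25e9, 12)
t_half_load = wu_seconds(CYCLES_PER_WU, 3.4e9, BYTES_PER_WU, BANDWIDTH, 6)

print(f"12 WUs, 3.4 GHz:      {t_stock:.0f} s")
print(f"12 WUs, 3.8 GHz:      {t_overclocked:.0f} s")
print(f"12 WUs, faster RAM:   {t_faster_ram:.0f} s")
print(f" 6 WUs, 3.4 GHz:      {t_half_load:.0f} s")
```

In this model a higher CPU clock changes nothing while the memory term dominates, whereas more bandwidth or fewer concurrent WUs shortens each WU.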


Copyright © 2017 CNR-TN & UniTN