Posts by [B@P] Daniel
21) Message boards : Number crunching : Optimization (Message 1060)
Posted 4 May 2017 by Profile [B@P] Daniel
Hi Daniel, I have a question for you.

On my i5-6400 (Win7 64-bit) I receive both AVX and SSE2 WUs.
If I want to install your optimization v1.2, which version do I need to copy into the project folder?
- TN-Grid.windows-x86-64-avx-v1.2
- TN-Grid.windows-x86-64-sse2-v1.2
Can I install both?

Your CPU also supports FMA instructions, so you can try the FMA app version as well: http://www.cpu-world.com/CPUs/Core_i5/Intel-Core%20i5-6400.html. In general the FMA app version should be faster than the AVX one, which in turn is faster than the SSE one. However, on some CPUs the FMA version is for some reason a bit slower than the AVX one, so please try both.

It is possible to install several versions, but you would have to rename the pc.exe files and modify app_info.xml to list all app versions with their proper plan classes. The files I prepared are configured to run a single app version only.
22) Message boards : Number crunching : Optimization (Message 1039)
Posted 9 Apr 2017 by Profile [B@P] Daniel
Win10 X64
PC-IM v1.02 (sse2) 1600-1700 sec range with i7-5820k 4.2GHz
v1.2 SSE2 1750-1755 sec range

Linux X64 ubuntu 16.10LTS PC-IM v1.03 (fma) 2200-2250 sec average with xeon 2696 v3
v.1.2 (fma) 1940-1980 sec range

Same instruction set in each comparison, but different systems/OSes. So on Windows the new version took longer, but on Linux it was shorter.

Thanks for these numbers. I did my tests on Windows using the AVX version and it was faster for me. I suspect that the new SSE version is slower, but I have to perform additional tests to confirm this. I will let you know when I have some results.

I did extra benchmarks using 10 blocks from a VV WU instead of 1 as before. On my Windows machine the new SSE app gives results similar to the AVX app. I also benchmarked the 32-bit SSE app version, and that one was slower than the official SSE app. Maybe you downloaded the 32-bit app instead of the 64-bit one?
23) Message boards : Number crunching : Optimization (Message 1038)
Posted 9 Apr 2017 by Profile [B@P] Daniel
Win10 X64
PC-IM v1.02 (sse2) 1600-1700 sec range with i7-5820k 4.2GHz
v1.2 SSE2 1750-1755 sec range

Linux X64 ubuntu 16.10LTS PC-IM v1.03 (fma) 2200-2250 sec average with xeon 2696 v3
v.1.2 (fma) 1940-1980 sec range

Same instruction set in each comparison, but different systems/OSes. So on Windows the new version took longer, but on Linux it was shorter.

Thanks for these numbers. I did my tests on Windows using the AVX version and it was faster for me. I suspect that the new SSE version is slower, but I have to perform additional tests to confirm this. I will let you know when I have some results.
24) Message boards : Number crunching : Optimization (Message 1037)
Posted 9 Apr 2017 by Profile [B@P] Daniel
OK, I did that, but now get the message:

Message from server: Your app_info.xml file doesn't have a usable version of gene@home PC-IM.


EDIT:
Also, it won't download more work, since it says the computer has reached the daily quota of 1 task.

I suspect that the problem is caused by WUs which you downloaded using the official app and which are still considered to be in progress. Please try to delete app_info.xml, restart BOINC, wait until BOINC re-downloads all these WUs, then abort them all and install the optimized app again. Before aborting the tasks, please also suspend the project or set it to "no new tasks" to avoid downloading new WUs in place of the aborted ones.

BTW, i7-4770 also supports AVX and FMA, you can try these app versions too.
25) Message boards : Number crunching : Optimization (Message 1034)
Posted 9 Apr 2017 by Profile [B@P] Daniel
New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. To install it, follow these steps:
- finish or abort all existing tasks (they will be aborted after install automatically);
- stop BOINC;
- unpack selected version to project's directory (path like C:\Users\All Users\BOINC\projects\gene.disi.unitn.it_test\ on Windows, and /var/lib/boinc-client/projects/gene.disi.unitn.it_test on Linux);
- start BOINC again
After doing this, app name should change to "Gene Network Application (Opti v1.2)". You should also see message "Found app_info.xml; using anonymous platform" in event log for TN-Grid project.

I did all that, using the SSE2 version for Linux on my i7-4770, and got that message on reboot. But I am getting only errors.
http://gene.disi.unitn.it/test/results.php?hostid=6148

The error is "Permission denied". You need to run the following commands as root in the project dir to set the appropriate permissions. If you cannot switch to the root account using "su -", add "sudo " before each command.

chmod 755 pc
chown boinc.boinc pc


After you do this, app should start working. You do not need to restart BOINC again.
26) Message boards : Number crunching : Optimization (Message 1031)
Posted 9 Apr 2017 by Profile [B@P] Daniel
New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. To install it, follow these steps:
- finish or abort all existing tasks (they will be aborted after install automatically);
- stop BOINC;
- unpack selected version to project's directory (path like C:\Users\All Users\BOINC\projects\gene.disi.unitn.it_test\ on Windows, and /var/lib/boinc-client/projects/gene.disi.unitn.it_test on Linux);
- start BOINC again
After doing this, app name should change to "Gene Network Application (Opti v1.2)". You should also see message "Found app_info.xml; using anonymous platform" in event log for TN-Grid project.

This time I used Gray code (not Grey!) to optimize the app. This code is a number sequence with a special property: every two consecutive numbers differ by one bit only. The concept can be generalized in various ways. One of them is Gray code combinations, where every two consecutive subsets differ by one element only. Here is an example of the 3-combinations of a 5-element set, generated in Gray code order:

1 2 3
1 2 4
1 3 4
2 3 4
2 3 5
1 3 5
1 2 5
1 4 5
2 4 5
3 4 5


The TN-Grid Gene app uses a combinations generator, so I decided to replace it with a new Gray code combinations generator and exploit its special property to recalculate only the values which depend on the changed element. By doing so I reduced the total calculation time. The savings depend on the maximum L value and increase with it:
- some old organism stored as "test" data, max L=8: time reduced from 0.559s to 0.534s (4.4%);
- current organism (VV), max L=12: time reduced from 2.092s to 1.815s (13.2%);
- other old organism stored as "test2" data (it was probably ECM), max L=18: time reduced from 14.401s to 9.254s (35.7%).

If you are interested in algorithm details, you can check "Combinatorial Generation" by Frank Ruskey (page 129, algorithm 5.8), available at http://www.1stworks.com/ref/ruskeycombgen.pdf.
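
As a rough illustration of the idea described above, here is a minimal Python sketch. It uses the textbook "revolving door" recursion, which is an assumption on my side: it is not the app's actual code and not necessarily Ruskey's algorithm 5.8, and it may list the subsets in a different order than the example above. The per-element `weight` table and the function names are hypothetical; they only stand in for "a value that depends on the elements of the subset".

```python
def revolving_door(n, k):
    """Return all k-subsets of {1..n}; neighbours differ by one swapped element."""
    if k == 0:
        return [()]
    if k == n:
        return [tuple(range(1, n + 1))]
    # Subsets without n first, then subsets with n in reversed order.
    head = revolving_door(n - 1, k)
    tail = [c + (n,) for c in reversed(revolving_door(n - 1, k - 1))]
    return head + tail


def incremental_sums(combos, weight):
    """Update a per-subset total using only the element that changed."""
    total = sum(weight[e] for e in combos[0])
    sums = [total]
    for prev, cur in zip(combos, combos[1:]):
        (removed,) = set(prev) - set(cur)  # exactly one element leaves
        (added,) = set(cur) - set(prev)    # exactly one element enters
        total += weight[added] - weight[removed]
        sums.append(total)
    return sums


combos = revolving_door(5, 3)
weight = {e: e * e for e in range(1, 6)}  # hypothetical per-element cost
sums = incremental_sums(combos, weight)
for combo, s in zip(combos, sums):
    print(combo, s)
```

Each step touches two elements instead of recomputing all of them, which is why the saving grows with the subset size.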

The new app also checks whether the CPU supports the required instruction set, and will exit with an error message like "AVX instructions are not supported by your CPU!" if the CPU does not support it.
27) Message boards : Number crunching : Optimization (Message 1022)
Posted 2 Apr 2017 by Profile [B@P] Daniel
Is a GPU version still under consideration? I get the impression that it would work, with all the programming talent that Daniel (and others) bring to the project, but there may not be enough work to support it.

Where are we on that?

Yes, I am still going to create it. But first I would like to release new version of CPU app, it is almost ready.
28) Message boards : Number crunching : FMA problems (Ryzen and others?) (Message 1016)
Posted 28 Mar 2017 by Profile [B@P] Daniel
Any update on this? People on the hwbot forum say that ASUS released a new BIOS which resolved the problem for them. Did you have a chance to test it, or one for your mainboard if available?
29) Message boards : Number crunching : FMA problems (Ryzen and others?) (Message 1014)
Posted 23 Mar 2017 by Profile [B@P] Daniel
Both Windows and Linux apps are compiled using gcc. Linux app was compiled using gcc 4.8.5. Windows one was compiled with gcc 5.4.0, so its code probably is better optimized than Linux one. There are also some system-specific changes, they also may play role here.

New app has new code to decompress input file, and to filter out some output results. Code which performs actual calculations was not changed. So previous version most probably would crash on Ryzen too.

The current Windows apps were compiled by me too; Valterc asked me to do this. I downloaded the FMA app version from the TN-Grid server yesterday and verified that it is the same as the one I sent to him.
30) Message boards : Number crunching : FMA problems (Ryzen and others?) (Message 1009)
Posted 22 Mar 2017 by Profile [B@P] Daniel
I tried to disassemble the compiled binary and things got interesting. All crash reports here mention the following error:

Privileged Instruction (0xc0000096) at address 0x00000000004f5458


When I checked instruction at this address, I got "STI" which is a sensitive instruction according to https://support.microsoft.com/en-nz/help/114473/intel-privileged-and-sensitive-instructions. But when I tried to disassemble whole app, it turned out that address 0x00000000004f5458 is invalid - valid instruction starts one byte earlier, at 0x00000000004f5457. Instruction at this address is "vmovsd" - it is an AVX instruction. This instruction maps to line 160 in pc.cpp. It looks like Ryzen decided to jump to some invalid address and executed some random instruction there which turned out to be an STI instruction.

Valterc, do you know if Windows 64-bit FMA app works fine on other CPUs? I wonder if this problem affects Ryzen CPUs only, or all users with FMA-capable CPUs and 64-bit Windows.
31) Message boards : Number crunching : FMA problems (Ryzen and others?) (Message 1000)
Posted 20 Mar 2017 by Profile [B@P] Daniel
FMA app uses FMA3 instructions, they are supported by both AMD and Intel CPUs. FMA4 is supported by AMD only, so FMA app does not use them.

Error "Reason: Privileged Instruction" is interesting. This error is reported when user-space app tries to execute some kernel-space instruction. Maybe Ryzen incorrectly thinks that some FMA3 instruction is a kernel-space one and raises this error? I suspect that this is another FMA-related bug in Ryzen, so microcode update would be needed here.

I also read that a conflict between several antivirus programs or similar software may cause this. Do you use more than one such program?

Edit: please check if there is a BIOS update for your motherboard. If yes, please install it, especially if its release notes say that it provides a microcode update.
32) Message boards : Macintosh : Compile the new optimized app for Mac (Message 940)
Posted 7 Mar 2017 by Profile [B@P] Daniel
I stopped using this function in my last optimized app version. For some reason my compilers did not complain about this, strange. You can remove it.
33) Message boards : Number crunching : No available WU (Message 915)
Posted 15 Feb 2017 by Profile [B@P] Daniel
We are almost done writing the new application. We had to make a new one because changing the output file format implies that there would be no cross-validation between results made by this and the old one (a simple version change doesn't work in these cases).
We are testing it locally, we will keep you all updated.

I recall that some projects which wanted to update an app to a new version that was not compatible with the old one decided to stop generating new WUs and wait until all existing WUs were validated, and only then rolled out the new app version. Your WUs have quite a short deadline, so maybe it would be worth waiting a few more days until all (or most) of the existing WUs are returned?
34) Message boards : Number crunching : don't get any wus on android (Message 905)
Posted 11 Feb 2017 by Profile [B@P] Daniel
I tried to use Android NDK to compile TN-Grid app and was able to prepare 3 app versions: ARM 32-bit PIE for Android 5+, ARM 32-bit non-PIE for older versions and AARCH64 PIE (minimum Android version here is 5, so non-PIE version is not needed).

The AARCH64 version uses the hard float ABI, like the ARM Linux versions. However, for some reason I was not able to do the same for the ARM 32-bit apps: the linker complained that I cannot mix softfp and hard ABIs. According to https://developer.android.com/ndk/guides/standalone_toolchain.html softfp is critical for ABI compatibility. However, it is slower than using -mfloat-abi=hard, so these apps will be slower than their ARM Linux counterparts.

I have installed the AARCH64 version on my phone and successfully completed 17 WUs. I did not try the 32-bit version; unfortunately there are no WUs now.

The apps are uploaded in the same place as the other apps: https://bitbucket.org/sirzooro/pc-boinc/downloads. To install one, first determine which version you need. To do this, open the Event Log in BOINC Client and scroll down to the bottom. The first message will be like "Starting BOINC client version 7.4.53 for aarch64-android-linux-gnu". If you see "aarch64-android-linux-gnu" there, you should use the AARCH64 app. If you see "arm-android-linux-gnu", you need one of the apps for ARM (armv7a). Check which Android version you have. If it is Android 5 or newer, use the PIE version; if it is older, use non-PIE.
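
The selection rule above can be written down as a small Python sketch. The function names and the returned labels are made up for illustration (they are not the actual archive names on the download page); only the two platform strings and the Android 5 threshold come from the description above.

```python
def platform_from_log(line):
    """Extract the platform string from the BOINC start-up message."""
    return line.rsplit(" for ", 1)[1].strip()


def pick_build(platform, android_major):
    """Return a label for the app build matching platform and Android version."""
    if platform.startswith("aarch64-android"):
        return "AARCH64 PIE"  # minimum Android version is 5, PIE only
    if platform.startswith("arm-android"):
        return "ARM PIE" if android_major >= 5 else "ARM non-PIE"
    raise ValueError("unsupported platform: " + platform)


log_line = "Starting BOINC client version 7.4.53 for aarch64-android-linux-gnu"
print(pick_build(platform_from_log(log_line), 7))
```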

In order to install my app on Android first I copied it from Windows to phone using Windows Explorer. Then on phone I used Total Commander app to copy both files (app_info.xml and pc) to /data/data/edu.berkeley.boinc/client/projects/gene.disi.unitn.it_test/ dir. TC displayed some popup that it was using superuser permissions, so you may have to have rooted phone (I am not sure about this). I also used TC to change access permissions for these files (644 for app_info.xml, 755 for pc) and ownership (both user and group should be BOINC). Hint: check some other file there to get UID and GID values for user and groups. Android uses tons of users and groups, so finding proper ones using numbers instead of names will be faster. After doing this I had to restart BOINC (I restarted phone to do this).

Edit: most probably TN-Grid is not on project list in Android BOINC client now. To add it, go to the project list, then press and hold hamburger button (one with 3 horizontal lines) for a while, then enter http://gene.disi.unitn.it/test/ in box which will appear. You can also use account manager like BAM to do this.
35) Message boards : Number crunching : don't get any wus on android (Message 901)
Posted 10 Feb 2017 by Profile [B@P] Daniel
You could try to compile app for Android and install it using anonymous platform mechanism. Keep in mind that Android 5+ needs PIE app, and older version needs non-PIE.
36) Message boards : Number crunching : Output file size (and plans for the future) (Message 897)
Posted 10 Feb 2017 by Profile [B@P] Daniel
BTW I think running out of work is worse than having too big of an output file.


You have to think about the scale of things though.

Let's go with a user with 100 cores to offer (since I have around that many and that's how I did the math, since it was easy), and each WU only takes 30 minutes and makes that 6MB output file.

That's 1.2GB to upload every hour. ~29GB every day. ~864GB per month.
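
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB) and a 30-day month; the function name is made up for the example.

```python
def upload_mb_per_hour(cores, minutes_per_wu, output_mb):
    """Upload volume per hour when every core returns one result per WU."""
    wus_per_hour = cores * 60 / minutes_per_wu
    return wus_per_hour * output_mb


hourly_mb = upload_mb_per_hour(cores=100, minutes_per_wu=30, output_mb=6)
daily_gb = hourly_mb * 24 / 1000
monthly_gb = hourly_mb * 24 * 30 / 1000

print(f"{hourly_mb / 1000:.1f} GB/hour, {daily_gb:.1f} GB/day, {monthly_gb:.0f} GB/month")
```

Changing `cores` to 500 or scaling `minutes_per_wu` down for a faster GPU app shows how quickly the monthly figure reaches terabytes.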

For someone with a lot more compute power that could really scale out. It looks like you have ~500 cores, and while I'm sure that's spread out across projects, if you only ran this project and those Vv WUs you would end up using terabytes of upload per month... and I'm pretty sure your ISP would be giving you a call at that point.


Not to mention what it would be like for the computer at the other end of that. The server status shows ~13K tasks underway, so imagining that as the core count of all end users, scale that out. That's terabytes of transfer PER DAY.

Good point. Please also keep in mind that one day a GPU app may appear. It is hard to tell how much faster it may be, so let's assume it is from 10x to 50x. So for a given machine with one GPU the upload size will increase several times (the actual value depends on the CPU count and the GPU app speed).

BTW, user with that ~500 cores may want to switch them all here if he will be participating in some TN-Grid challenge. Assuming 32 cores per machine, it will be 16 physical machines. And each of them may have one or more top GPUs. So upload size may easily be doubled or even tripled.

It would be good to let user configure max result size or something like this too. Or run two apps, one for tasks with small results and another for big results. Users with limited upload speed or who pay for transfer bandwidth could use 1st one only.
37) Message boards : Number crunching : Output file size (and plans for the future) (Message 894)
Posted 9 Feb 2017 by Profile [B@P] Daniel
Maybe you could store the data in a binary format? Or use a more efficient compression algorithm like bz2 or even xz?

There is not a big difference between a binary format and the gzipped text we use. I just picked a random Vv output file, plain text, it is 1139829 lines, each line needs at least 4+1 bytes (indexes instead of gene names), a total of 5699145, which is almost the same as the gzipped version. On the other hand gzip is internally supported by boinc. A dramatic change of the output format would also mean a lot of changes in the post-processing. ... But we are thinking about a solution like removing from the output file all the interactions that are present only once.

Compression libraries provide an API which resembles the C file API, so conversion of existing code should be pretty straightforward. Here is an example of how to use libbz2: http://linux.math.tifr.res.in/manuals/html/manual_3.html#SEC34. You can also use the bzip2 command to uncompress the file first and then process it as usual.
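
To show what "resembles the file API" means in practice, here is a small sketch using Python's standard bz2 module instead of the C libbz2 calls from the link above (a deliberate swap for brevity). The helper names and the sample lines are invented for the example and do not reflect the real output format.

```python
import bz2
import os
import tempfile


def write_lines_bz2(path, lines):
    """Write text lines to a bzip2-compressed file, like a normal text file."""
    with bz2.open(path, "wt") as fh:
        for line in lines:
            fh.write(line + "\n")


def count_lines_bz2(path):
    """Read the compressed file line by line without unpacking it first."""
    with bz2.open(path, "rt") as fh:
        return sum(1 for _ in fh)


sample = ["1 2", "1 3", "2 3"]  # hypothetical result lines
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "result.txt.bz2")
    write_lines_bz2(sample_path, sample)
    line_count = count_lines_bz2(sample_path)
print(line_count)
```

The post-processing loop stays the same; only the call that opens the file changes.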
38) Message boards : Number crunching : Output file size (and plans for the future) (Message 890)
Posted 9 Feb 2017 by Profile [B@P] Daniel
Maybe you could store the data in a binary format? Or use a more efficient compression algorithm like bz2 or even xz?
39) Message boards : Number crunching : Unknown error number (0xffffffffc000001d) (Message 881)
Posted 7 Feb 2017 by Profile [B@P] Daniel
I could define a minimum os version for avx apps. This will for sure solve this problem. Will check this tomorrow.

Well. It's not so easy (I should make another platform/plan_class etc...) This is something that should be checked by the Boinc client... I don't know why it may report unsupported cpu features.

Does someone know how they handle this at Asteroids@home? (the only project I know that uses "explicit" avx apps)

In this case the CPU supports AVX but the OS does not. This must be checked in a different way; there is a separate assembler instruction for this.
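
For reference, the usual check works like this: CPUID leaf 1 reports in ECX whether the CPU has AVX (bit 28) and whether the OS has enabled XSAVE (bit 27, OSXSAVE); if both are set, the XGETBV instruction reads XCR0, where bits 1 and 2 tell whether the OS saves the SSE and AVX register state. The Python sketch below shows only the bit logic. It assumes the raw register values were obtained elsewhere (for example by native code), and the function name is hypothetical.

```python
AVX_BIT = 1 << 28         # CPUID.1:ECX, CPU supports AVX
OSXSAVE_BIT = 1 << 27     # CPUID.1:ECX, OS enabled XSAVE/XGETBV
XCR0_SSE_AVX_MASK = 0b110  # XCR0 bits 1 (SSE state) and 2 (AVX state)


def avx_usable(cpuid1_ecx, xcr0):
    """True only if both the CPU and the OS support AVX.

    xcr0 is the value returned by XGETBV with ECX=0; pass 0 when
    OSXSAVE is not set, because XGETBV must not be executed then.
    """
    if not cpuid1_ecx & AVX_BIT:
        return False
    if not cpuid1_ecx & OSXSAVE_BIT:
        return False
    return (xcr0 & XCR0_SSE_AVX_MASK) == XCR0_SSE_AVX_MASK
```

An AVX-capable CPU under an OS that does not save the AVX state gives False here, which matches the situation in this thread.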

I checked my projects and found that Einstein@Home also has separate AVX app.

I also checked how plan classes are configured, and found that the value for <os_version> is a regular expression. Something like this should do the trick:
Windows ([^7]|7 .*Service Pack)
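
A quick way to try this expression out is a few lines of Python. The OS strings below are hypothetical examples of what a client might report, and the sketch assumes search-style matching (the pattern may match anywhere in the string); how the server really applies the expression is not verified here.

```python
import re

# Pattern proposed above: any Windows except 7, or Windows 7 with a Service Pack.
OS_PATTERN = re.compile(r"Windows ([^7]|7 .*Service Pack)")


def avx_allowed(os_string):
    """Return True if the OS description passes the proposed filter."""
    return OS_PATTERN.search(os_string) is not None


samples = [
    "Microsoft Windows 7 Professional x64 Edition, (06.01.7600.00)",
    "Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00)",
    "Microsoft Windows 10 Professional x64 Edition",
]
for s in samples:
    print(avx_allowed(s), s)
```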
40) Message boards : Number crunching : Unknown error number (0xffffffffc000001d) (Message 870)
Posted 6 Feb 2017 by Profile [B@P] Daniel
All of these computers are behind firewalls and can't be reached by unrequested incoming connections. They are no http servers or something like that. This is why I have to drive to them when I have to make changes. I think this is safe enough.

Windows by default opens a few ports for file and printer sharing; please check that you block them too.

As the standard application chooses the processor-bounded module by itself, a check function could switch the application to SSE even on AVX-capable processors. This way the workunit will not crash. Of course, this can't be done for manually installed applications (which is not necessary because the user should know what he does when playing with optimized applications).

BOINC Client sends the list of CPU capabilities to the server, and the server uses it to select the app version which will be used. Additionally it can try to compute a few WUs with every version supported by a given CPU to find which one is the fastest for it. This could certainly be improved a bit, to check whether Win 7 has SP1 installed and not send the AVX app if it does not.
It is also possible to create an app which contains all 3 code versions and checks the CPU capabilities at start to select the appropriate one. However, creating such an app is more difficult, and performance tests of the different versions would also be more complicated. So for me a simple sanity check in the app that the required instruction set is available is more reasonable.

I updated the machine to SP1 but now I am getting errors that the workunits are committed to another platforms, maybe due to the newest standard application change.

This is resolved now, you can try again.

I'll be leaving TN Grid for now, maybe it is more stable and not changed constantly in a couple of months. If I have a single machine only to maintain, then this might be tolerable, but I haven't. I don't like projects which are modified frequently because of that.

Anyway, thanks for your support.

You can stick to the official app versions; BOINC Client will take care of them. Usually this works fine, except in rare situations like this missing SP1.




Copyright © 2022 CNR-TN & UniTN