21)
Message boards :
Number crunching :
Optimization
(Message 1060)
Posted 4 May 2017 by [B@P] Daniel Hi daniel, i have a question for you Your CPU also supports FMA instructions, so you can also try the FMA app version: http://www.cpu-world.com/CPUs/Core_i5/Intel-Core%20i5-6400.html. In general the FMA app version should be faster than the AVX one, which in turn is faster than the SSE one. However, on some CPUs the FMA version is for some reason a bit slower than the AVX one, so please try both. It is possible to install several versions side by side, but you would have to rename the pc.exe files and modify app_info.xml to list all app versions with the proper plan classes. The files I prepared are configured to run a single app version only. |
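As a rough illustration of what "specify all app versions with proper plan classes" could look like, here is a minimal Python sketch that generates an app_info.xml with two app versions. The app name, version number, file names and plan class names are all hypothetical placeholders, and the element layout follows my understanding of BOINC's anonymous platform format, so compare it with the app_info.xml shipped with the optimized app before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical values: the real app name, version number, executable names
# and plan class names must match what the project server expects.
APP_NAME = "example_app"
VERSIONS = [
    ("pc_avx.exe", "avx"),
    ("pc_fma.exe", "fma"),
]


def build_app_info(app_name, versions, version_num="100"):
    """Build an <app_info> tree with one <app_version> per (file, plan class)."""
    root = ET.Element("app_info")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Every executable is declared once in its own <file_info> element.
    for file_name, _plan_class in versions:
        info = ET.SubElement(root, "file_info")
        ET.SubElement(info, "name").text = file_name
        ET.SubElement(info, "executable")

    # Each renamed executable gets its own <app_version> with a plan class.
    for file_name, plan_class in versions:
        ver = ET.SubElement(root, "app_version")
        ET.SubElement(ver, "app_name").text = app_name
        ET.SubElement(ver, "version_num").text = version_num
        ET.SubElement(ver, "plan_class").text = plan_class
        ref = ET.SubElement(ver, "file_ref")
        ET.SubElement(ref, "file_name").text = file_name
        ET.SubElement(ref, "main_program")
    return root


app_info_xml = ET.tostring(build_app_info(APP_NAME, VERSIONS), encoding="unicode")
print(app_info_xml)
```

The point of the sketch is only the shape: one renamed executable per version, each tied to its own plan class.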
22)
Message boards :
Number crunching :
Optimization
(Message 1039)
Posted 9 Apr 2017 by [B@P] Daniel Win10 X64 I did extra benchmarks using 10 blocks from a VV WU instead of 1 as before. On my Windows machine the new SSE app gives results similar to the AVX app. I also benchmarked the 32-bit SSE app version, and that one was slower than the official SSE app. Maybe you downloaded the 32-bit app instead of the 64-bit one? |
23)
Message boards :
Number crunching :
Optimization
(Message 1038)
Posted 9 Apr 2017 by [B@P] Daniel Win10 X64 Thanks for these numbers. I did my tests on Windows using the AVX version and it was faster for me. I suspect that the new SSE version is slower, but I have to perform additional tests to confirm this. I will let you know when I have some results. |
24)
Message boards :
Number crunching :
Optimization
(Message 1037)
Posted 9 Apr 2017 by [B@P] Daniel OK, I did that, but now get the message: I suspect that the problem is caused by WUs which you downloaded using the official app and which are still considered in progress. Please try to delete app_info.xml, restart BOINC, wait until BOINC re-downloads all these WUs, then abort them all and install the optimized app again. Before aborting the tasks, please also suspend the project or set it to "no new tasks" to avoid downloading new WUs in place of the aborted ones. BTW, the i7-4770 also supports AVX and FMA, so you can try these app versions too. |
25)
Message boards :
Number crunching :
Optimization
(Message 1034)
Posted 9 Apr 2017 by [B@P] Daniel New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. In order to install it, do following steps: Error is "Permission denied". You need to run the following commands as root in the project directory to set the appropriate permissions. If you cannot switch to the root account using "su -", add "sudo " before each command.
chmod 755 pc
chown boinc:boinc pc
After you do this, the app should start working. You do not need to restart BOINC again. |
26)
Message boards :
Number crunching :
Optimization
(Message 1031)
Posted 9 Apr 2017 by [B@P] Daniel New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. In order to install it, do the following steps:
- finish or abort all existing tasks (they will be aborted automatically after the install);
- stop BOINC;
- unpack the selected version to the project's directory (a path like C:\Users\All Users\BOINC\projects\gene.disi.unitn.it_test\ on Windows, and /var/lib/boinc-client/projects/gene.disi.unitn.it_test on Linux);
- start BOINC again.
After doing this, the app name should change to "Gene Network Application (Opti v1.2)". You should also see the message "Found app_info.xml; using anonymous platform" in the event log for the TN-Grid project.
This time I used Gray code (not Grey!) to optimize the app. A Gray code is a number sequence with a special property: every two consecutive numbers differ by one bit only. This concept can be generalized in various ways. One generalization is Gray code combinations, where every two consecutive subsets differ by one element only. Here is an example of the 3-combinations of a 5-element set, generated in Gray code order:
1 2 3
1 2 4
1 3 4
2 3 4
2 3 5
1 3 5
1 2 5
1 4 5
2 4 5
3 4 5
The TN-Grid Gene app uses a combinations generator, so I decided to replace it with a Gray code combinations generator and exploit its special property to recalculate only the values which depend on the changed element. By doing so I reduced the total calculation time. The savings depend on the maximum L value and increase with it:
- some old organism stored as "test" data, max L=8: time reduced from 0.559s to 0.534s (4.4%);
- current organism (VV), max L=12: time reduced from 2.092s to 1.815s (13.2%);
- other old organism stored as "test2" data (it was probably ECM), max L=18: time reduced from 14.401s to 9.254s (35.7%).
If you are interested in the algorithm details, you can check "Combinatorial Generation" by Frank Ruskey (page 129, algorithm 5.8), available at http://www.1stworks.com/ref/ruskeycombgen.pdf. The new app also checks whether the CPU supports the required instruction set, and exits with an error message like "AVX instructions are not supported by your CPU!" if it does not. |
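To make the idea concrete, here is a small Python sketch. It uses the textbook recursive "revolving door" construction, which is a Gray code order for combinations but is not necessarily the order listed above, nor Ruskey's algorithm 5.8, nor what the app does. The weights and the running sum are made-up stand-ins for the per-element values the real app recalculates.

```python
def gray_combinations(n, k):
    """k-subsets of {1..n} in revolving-door order (recursive definition)."""
    if k == 0:
        return [()]
    if k == n:
        return [tuple(range(1, n + 1))]
    without_n = gray_combinations(n - 1, k)
    # Reversing the second half keeps the single-swap property at the seam.
    with_n = [c + (n,) for c in reversed(gray_combinations(n - 1, k - 1))]
    return without_n + with_n


def swapped_elements(prev, cur):
    """Return (removed, added) for two subsets that differ by one swap."""
    removed = set(prev) - set(cur)
    added = set(cur) - set(prev)
    if len(removed) != 1 or len(added) != 1:
        raise ValueError(f"{prev} -> {cur} is not a single-element swap")
    return removed.pop(), added.pop()


def incremental_sums(combos, weight):
    """Sum of weights per subset, updating only the swapped element."""
    total = sum(weight[x] for x in combos[0])
    sums = [total]
    for prev, cur in zip(combos, combos[1:]):
        removed, added = swapped_elements(prev, cur)
        total += weight[added] - weight[removed]
        sums.append(total)
    return sums


# Made-up weights, only to have something to accumulate.
weight = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}

combos = gray_combinations(5, 3)
sums = incremental_sums(combos, weight)

# The order listed in the post; it passes the same single-swap check.
POSTED_ORDER = [
    (1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4), (2, 3, 5),
    (1, 3, 5), (1, 2, 5), (1, 4, 5), (2, 4, 5), (3, 4, 5),
]
posted_sums = incremental_sums(POSTED_ORDER, weight)
```

Each step costs one subtraction and one addition instead of a full recomputation over the subset, which is why the savings grow with the subset size.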
27)
Message boards :
Number crunching :
Optimization
(Message 1022)
Posted 2 Apr 2017 by [B@P] Daniel Is a GPU version still under consideration? I get the impression that it would work, with all the programming talent that Daniel (and others) bring to the project, but there may not be enough work to support it. Yes, I am still going to create it. But first I would like to release a new version of the CPU app; it is almost ready. |
28)
Message boards :
Number crunching :
FMA problems (Ryzen and others?)
(Message 1016)
Posted 28 Mar 2017 by [B@P] Daniel Any update on this? People on the hwbot forum say that ASUS released a new BIOS which resolved the problem for them. Did you have a chance to test it, or the one for your mainboard if available? |
29)
Message boards :
Number crunching :
FMA problems (Ryzen and others?)
(Message 1014)
Posted 23 Mar 2017 by [B@P] Daniel Both the Windows and Linux apps are compiled using gcc. The Linux app was compiled using gcc 4.8.5. The Windows one was compiled with gcc 5.4.0, so its code is probably better optimized than the Linux one. There are also some system-specific changes, which may also play a role here. The new app has new code to decompress the input file and to filter out some output results. The code which performs the actual calculations was not changed, so the previous version would most probably crash on Ryzen too. The current Windows apps were compiled by me too; Valterc asked me to do this. I downloaded the FMA app version from the TN-Grid server yesterday and verified that it is the same as the one which I sent to him. |
30)
Message boards :
Number crunching :
FMA problems (Ryzen and others?)
(Message 1009)
Posted 22 Mar 2017 by [B@P] Daniel I tried to disassemble the compiled binary and things got interesting. All crash reports here mention the following error: Privileged Instruction (0xc0000096) at address 0x00000000004f5458 When I checked the instruction at this address, I got "STI", which is a sensitive instruction according to https://support.microsoft.com/en-nz/help/114473/intel-privileged-and-sensitive-instructions. But when I disassembled the whole app, it turned out that address 0x00000000004f5458 is not an instruction boundary: a valid instruction starts one byte earlier, at 0x00000000004f5457. The instruction at that address is "vmovsd", which is an AVX instruction. It maps to line 160 in pc.cpp. It looks like Ryzen jumped to an invalid address and executed whatever bytes it found there, which happened to decode as an STI instruction. Valterc, do you know if the Windows 64-bit FMA app works fine on other CPUs? I wonder if this problem affects Ryzen CPUs only, or all users with FMA-capable CPUs and 64-bit Windows. |
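One plausible reason why the byte after the start of a vmovsd decodes as STI can be sketched in Python. The post does not show the actual bytes in the binary, so the encoding below (a vmovsd load with a two-byte VEX prefix and no extended registers, e.g. "vmovsd xmm0, [rax]") is an assumption for illustration, not a dump of pc.exe.

```python
STI_OPCODE = 0xFB  # one-byte opcode of the privileged STI instruction


def vex2_second_byte(r_inverted=1, vvvv_inverted=0b1111, vector_length=0, pp=0b11):
    """Second byte of a two-byte VEX prefix, laid out as R vvvv L pp.

    Defaults: no register extension (R stored inverted), no extra source
    operand (vvvv stored inverted), scalar length, pp=0b11 (implied F2 prefix).
    """
    return (r_inverted << 7) | (vvvv_inverted << 3) | (vector_length << 2) | pp


# Assumed encoding of "vmovsd xmm0, [rax]":
#   0xC5  two-byte VEX escape
#   0xFB  R/vvvv/L/pp byte computed above
#   0x10  vmovsd load opcode
#   0x00  ModRM byte: xmm0, [rax]
encoded = bytes([0xC5, vex2_second_byte(), 0x10, 0x00])

# Starting to decode one byte late lands on the VEX payload byte.
first_byte_if_entered_one_late = encoded[1]
print(hex(first_byte_if_entered_one_late))
```

Under this assumption, execution resuming one byte into the instruction would see a byte identical to the STI opcode, which matches the disassembly observation above.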
31)
Message boards :
Number crunching :
FMA problems (Ryzen and others?)
(Message 1000)
Posted 20 Mar 2017 by [B@P] Daniel The FMA app uses FMA3 instructions, which are supported by both AMD and Intel CPUs. FMA4 is supported by AMD only, so the FMA app does not use it. The error "Reason: Privileged Instruction" is interesting. This error is reported when a user-space app tries to execute a kernel-space instruction. Maybe Ryzen incorrectly thinks that some FMA3 instruction is a kernel-space one and raises this error? I suspect that this is another FMA-related bug in Ryzen, so a microcode update would be needed here. I have also read that a conflict between several antivirus or similar programs may cause this. Do you run several such programs? Edit: please check if there is a BIOS update for your motherboard. If there is, please install it, especially if its release notes say that it provides a microcode update. |
32)
Message boards :
Macintosh :
Compile the new optimized app for Mac
(Message 940)
Posted 7 Mar 2017 by [B@P] Daniel I stopped using this function in my last optimized app version. For some reason my compilers did not complain about this, strange. You can remove it. |
33)
Message boards :
Number crunching :
No available WU
(Message 915)
Posted 15 Feb 2017 by [B@P] Daniel We are almost done writing the new application. We had to make a new one because changing the output file format implies that there would be no cross-validation between results made by this and the old one (a simple version change don't work in this cases) I recall that some projects which wanted to update their app to a new version that was not compatible with the old one decided to stop generating new WUs and wait until all existing WUs were validated, and only then rolled out the new app version. Your WUs have quite a short deadline, so maybe it would be worth waiting a few more days until all (or most) of the existing WUs are returned? |
34)
Message boards :
Number crunching :
don't get any wus on android
(Message 905)
Posted 11 Feb 2017 by [B@P] Daniel I tried to use the Android NDK to compile the TN-Grid app and was able to prepare 3 app versions: ARM 32-bit PIE for Android 5+, ARM 32-bit non-PIE for older versions, and AARCH64 PIE (the minimum Android version here is 5, so a non-PIE version is not needed). The AARCH64 version uses the hard-float ABI like the ARM Linux versions. However, I was not able to do the same for the ARM 32-bit apps: the linker complained that I cannot mix softfp and hard ABIs, so these apps use -mfloat-abi=softfp. According to https://developer.android.com/ndk/guides/standalone_toolchain.html this is critical for ABI compatibility. However, it is slower than using -mfloat-abi=hard, so these apps will be slower than their ARM Linux counterparts. I have installed the AARCH64 version on my phone and successfully completed 17 WUs. I did not try the 32-bit versions; unfortunately there are no WUs now. The apps are uploaded in the same place as the other apps: https://bitbucket.org/sirzooro/pc-boinc/downloads. To install one, first determine which version you need. To do this, open the Event Log in BOINC Client and scroll down to the bottom. The first message will look like "Starting BOINC client version 7.4.53 for aarch64-android-linux-gnu". If you see "aarch64-android-linux-gnu" there, you should use the AARCH64 app. If you see "arm-android-linux-gnu", you need one of the apps for ARM (armv7a). Check which Android version you have. If it is Android 5 or newer, use the PIE version; if it is older, use the non-PIE one. To install my app on Android, I first copied it from Windows to the phone using Windows Explorer. Then on the phone I used the Total Commander app to copy both files (app_info.xml and pc) to the /data/data/edu.berkeley.boinc/client/projects/gene.disi.unitn.it_test/ dir. TC displayed a popup saying that it was using superuser permissions, so you may need a rooted phone (I am not sure about this).
I also used TC to change the access permissions for these files (644 for app_info.xml, 755 for pc) and the ownership (both user and group should be BOINC). Hint: check some other file there to get the UID and GID values for the user and group. Android uses tons of users and groups, so finding the proper ones using numbers instead of names will be faster. After doing this I had to restart BOINC (I restarted the phone to do this). Edit: most probably TN-Grid is not on the project list in the Android BOINC client now. To add it, go to the project list, then press and hold the hamburger button (the one with 3 horizontal lines) for a while, then enter http://gene.disi.unitn.it/test/ in the box which appears. You can also use an account manager like BAM to do this. |
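The version selection described above can be written down as a short Python sketch. The function name and the returned labels are hypothetical (they are not the names of the downloadable files); only the two platform strings and the Android 5 threshold come from the post.

```python
def select_android_build(event_log_line, android_major):
    """Pick the app build from the BOINC start-up message and Android version.

    event_log_line is the "Starting BOINC client version ..." message;
    android_major is the major Android version as an integer.
    """
    if "aarch64-android-linux-gnu" in event_log_line:
        # AARCH64 requires Android 5+, so only a PIE build exists.
        return "AARCH64 PIE"
    if "arm-android-linux-gnu" in event_log_line:
        # Android 5 and newer need PIE executables; older versions need non-PIE.
        return "ARM PIE" if android_major >= 5 else "ARM non-PIE"
    raise ValueError("unrecognized BOINC platform in: " + event_log_line)


line = "Starting BOINC client version 7.4.53 for aarch64-android-linux-gnu"
print(select_android_build(line, 6))
```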
35)
Message boards :
Number crunching :
don't get any wus on android
(Message 901)
Posted 10 Feb 2017 by [B@P] Daniel You could try to compile the app for Android and install it using the anonymous platform mechanism. Keep in mind that Android 5+ needs a PIE app, and older versions need a non-PIE one. |
36)
Message boards :
Number crunching :
Output file size (and plans for the future)
(Message 897)
Posted 10 Feb 2017 by [B@P] Daniel BTW I think running out of work is worse than having to big of an output file. Good point. Please also keep in mind that one day a GPU app may appear. It is hard to tell how much faster it may be, so let's assume it is 10x to 50x faster. So for a given machine with one GPU the upload size will increase several times (the actual value depends on the CPU count and GPU app speed). BTW, the user with those ~500 cores may want to switch them all here if he participates in some TN-Grid challenge. Assuming 32 cores per machine, that is 16 physical machines, and each of them may have one or more top GPUs. So the upload size may easily be doubled or even tripled. It would also be good to let users configure a maximum result size or something like that. Or run two apps, one for tasks with small results and another for big results. Users with limited upload speed or who pay for transfer bandwidth could use the first one only. |
37)
Message boards :
Number crunching :
Output file size (and plans for the future)
(Message 894)
Posted 9 Feb 2017 by [B@P] Daniel Maybe you could store data in binary format? Or use a more effective compression algorithm like bz2 or even xz? Compression libraries provide an API which resembles the C file API, so converting the existing code should be pretty straightforward. Here is an example of how to use libbz2: http://linux.math.tifr.res.in/manuals/html/manual_3.html#SEC34. You can also use the bzip2 command to uncompress the file first and then process it as usual. |
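The "looks like a file API" point can be illustrated with Python's standard bz2 module, used here instead of the C libbz2 interface the post links to (which offers the analogous BZ2_bzopen, BZ2_bzread, BZ2_bzwrite and BZ2_bzclose calls). The file name and the rows are made up and do not reflect the project's actual output format.

```python
import bz2
import os
import tempfile

# Hypothetical output file and rows, only for demonstration.
path = os.path.join(tempfile.mkdtemp(), "results.txt.bz2")
rows = ["gene_a gene_b 0.91", "gene_a gene_c 0.42", "gene_b gene_c 0.77"]

# Writing: the same open/write/close pattern as a plain text file.
with bz2.open(path, "wt", encoding="ascii") as out:
    for row in rows:
        out.write(row + "\n")

# Reading: iterate line by line as with any text file; decompression is
# handled transparently by the file object.
with bz2.open(path, "rt", encoding="ascii") as src:
    read_back = [line.rstrip("\n") for line in src]

print(read_back)
```

Only the open call changes compared with uncompressed I/O, which is what makes converting existing code straightforward.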
38)
Message boards :
Number crunching :
Output file size (and plans for the future)
(Message 890)
Posted 9 Feb 2017 by [B@P] Daniel Maybe you could store data in binary format? Or use a more effective compression algorithm like bz2 or even xz? |
39)
Message boards :
Number crunching :
Unknown error number (0xffffffffc000001d)
(Message 881)
Posted 7 Feb 2017 by [B@P] Daniel I could define a minimum os version for avx apps. This will for sure solve this problem. Will check this tomorrow. In this case the CPU supports AVX but the OS does not. This must be checked in a different way; there is a separate assembler instruction for this. I checked my projects and found that Einstein@Home also has a separate AVX app. I also checked how plan classes are configured, and found that the value for <os_version> is a regular expression. Something like this should do the trick:
Windows ([^7]|7 .*Service Pack)
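A quick way to see what this pattern accepts is to try it on a few strings. The sample OS strings below are invented for illustration; how BOINC actually formats the OS name and version, and which regex dialect the server uses, are assumptions, so Python's re module is only an approximation here.

```python
import re

OS_VERSION_RE = re.compile(r"Windows ([^7]|7 .*Service Pack)")

# Hypothetical OS strings; the real strings reported by BOINC clients may differ.
samples = [
    "Microsoft Windows 7 Ultimate x64 Edition, Service Pack 1",
    "Microsoft Windows 7 Ultimate x64 Edition",
    "Microsoft Windows 10 Professional x64 Edition",
    "Microsoft Windows 8.1 Core x64 Edition",
    "Microsoft Windows Vista Home Premium x86 Edition",
]

matches = {s: OS_VERSION_RE.search(s) is not None for s in samples}
for s, ok in matches.items():
    print("match   " if ok else "no match", s)
```

Windows 7 without a service pack is the only sample rejected. Note that the pattern accepts any Windows name whose first character after "Windows " is not 7, so older releases such as Vista match as well.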
|
40)
Message boards :
Number crunching :
Unknown error number (0xffffffffc000001d)
(Message 870)
Posted 6 Feb 2017 by [B@P] Daniel All of these computers are behind firewalls and can't be reached by unrequested incoming connections. They are no http servers or something like that. This is why I have to drive to them when I have to make changes. I think this is safe enough. Windows by default opens a few ports for file and printer sharing; please check that you block them too. As the standard application chooses the processor-bounded module by itself, a check function could switch the application to SSE even on AVX-capable processors. This way the workunit will not crash. Of course, this can't be done for manually installed applications (which is not necessary because the user should know what he does when playing with optimized applications). BOINC Client sends a list of CPU capabilities to the server, and the server uses it to select the app version which will be used. Additionally it can try to compute a few WUs using every version supported by a given CPU to find which one is the fastest for it. This could certainly be improved a bit, to check whether Win 7 has SP1 installed and not send the AVX app if it does not. It is also possible to create an app which contains all 3 code versions and checks CPU capabilities at startup to select the appropriate one. However, creating such an app is more difficult, and performance tests of the different versions would be more complicated. So for me a simple sanity check in the app that the required instruction set is available is more reasonable. I updated the machine to SP1 but now I am getting errors that the workunits are committed to another platforms, maybe due to the newest standard application change. This is resolved now, you can try again. I'll be leaving TN Grid for now, maybe it is more stable and not changed constantly in a couple of months. If I have a single machine only to maintain, then this might be tolerable, but I haven't. I don't like projects which are modified frequently because of that. 
You can stick to the official app versions; BOINC Client will take care of them. Usually this works fine, except in rare situations like this missing SP1. |
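The kind of sanity check discussed in this thread has to cover both halves of the problem: the CPU advertising AVX and the OS having enabled the AVX register state. A minimal Python sketch of that decision logic follows. It works on plain integers; obtaining the real values needs the CPUID and XGETBV instructions from native code, which is not shown, and the function names are hypothetical rather than taken from the app.

```python
# Feature bits in ECX returned by CPUID leaf 1.
CPUID1_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE/XGETBV
CPUID1_ECX_AVX = 1 << 28      # CPU implements AVX

# State bits in XCR0 (read with XGETBV, index 0).
XCR0_SSE_STATE = 1 << 1       # XMM registers saved by the OS
XCR0_AVX_STATE = 1 << 2       # upper halves of YMM registers saved by the OS


def avx_usable(cpuid1_ecx, xcr0):
    """True only if the CPU has AVX and the OS preserves the AVX state."""
    if not cpuid1_ecx & CPUID1_ECX_AVX:
        return False  # CPU lacks AVX
    if not cpuid1_ecx & CPUID1_ECX_OSXSAVE:
        return False  # XGETBV unavailable, so the OS cannot have enabled AVX
    needed = XCR0_SSE_STATE | XCR0_AVX_STATE
    return (xcr0 & needed) == needed


def require_avx(cpuid1_ecx, xcr0):
    """Exit with an error message if AVX cannot be used (wording from the thread)."""
    if not avx_usable(cpuid1_ecx, xcr0):
        raise SystemExit("AVX instructions are not supported by your CPU!")


# Example: AVX-capable CPU on an OS that never enabled the AVX state,
# which is the "Windows 7 without SP1" situation described above.
cpu_with_avx = CPUID1_ECX_AVX | CPUID1_ECX_OSXSAVE
print(avx_usable(cpu_with_avx, xcr0=XCR0_SSE_STATE))
```

Checking only the CPUID AVX bit would pass on such a machine and the app would still crash with an illegal-instruction error, which is why the XCR0 test matters.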