Message boards : Number crunching : Optimization
Author | Message |
---|---|
Well, I only doubled the size of the workunits (starting at 2016-12-30, 100 'blocks' instead of 50) Well, I already had the intention of doing this... The 'problem' is that the size of the output file is almost the same regardless of the number of blocks, so there is no reason to have very short workunits (they just put more stuff into the database and cause more network traffic). About timing, I will wait until the beginning of February to deploy the new apps and also increase the workunit size. Before doing this, I also want to deploy a small batch of workunits related to another organism, just to check that everything is working well. | |
ID: 823 · Reply Quote | |
Impressive work. Thanks! It looks like the sse2+fma is only 0.659% faster. Is that even worth having another version? Looking at the user reporting for his AMD X8, his results show that the new fma app is actually running around 11% faster than the sse2 version. This is also what I'm seeing on my four AMD X8 CPUs. A useful increase. Once again, THANKS! | |
ID: 824 · Reply Quote | |
I found a host that is not able to run the linux x64 version because of missing shared libraries (http://gene.disi.unitn.it/test/show_host_detail.php?hostid=2990), too old kernel? (3.2.0-4-amd64). The error is version `GLIBC_2.15' not found, version `GLIBC_2.16' not found. | |
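A `GLIBC_x.y' not found` error refers to the version of the C library (glibc) installed on the host rather than to the kernel itself. The following is a minimal sketch, not part of the project's tooling, of how a host could be checked against the versions named in the error. The helper names `version_tuple` and `glibc_is_new_enough` are hypothetical, and the default requirement of 2.16 is taken from the error message above.

```python
import platform


def version_tuple(text):
    """Turn a dotted version like '2.16' into (2, 16) for numeric comparison."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_new_enough(found, required="2.16"):
    """Hypothetical helper: True if the found glibc version meets the requirement."""
    return version_tuple(found) >= version_tuple(required)


# platform.libc_ver() returns a (library, version) pair; both may be empty
# strings on systems that do not use glibc.
lib, ver = platform.libc_ver()
if lib == "glibc" and ver:
    print(f"glibc {ver}, new enough for the app: {glibc_is_new_enough(ver)}")
else:
    print("glibc version could not be determined on this system")
```

Comparing tuples of integers rather than strings matters here: as strings, "2.9" would sort after "2.16".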
ID: 826 · Reply Quote | |
I found a host that is not able to run the linux x64 version because of missing shared libraries (http://gene.disi.unitn.it/test/show_host_detail.php?hostid=2990), too old kernel? (3.2.0-4-amd64). The error is version `GLIBC_2.15' not found, version `GLIBC_2.16' not found. If ldd no longer shows these libs, it should be OK. Although I am a bit reluctant about doing this: this particular kernel version was used by Debian Wheezy, which is now past its End of Life. This means that there are no new updates for this system version, especially no security updates for new security holes. By not providing an app which works there, the user may get convinced to upgrade the system to a newer version which will be supported for the next few years. I played with the new app a bit, trying to optimize it more. It turned out that using AVX only for calculating square roots was slower than using SSE only. I also tried to use values from only one half of the matrix, but this slowed down the app too. So it does not make sense to apply any of these changes. I also tried to measure the run time of the app with SSE vectors on a Haswell CPU, compiled with different instruction sets: SSE2 20,766; AVX 19,933; FMA 20,163; AVX2 20,355. It turned out that the AVX version is faster than SSE2, probably thanks to some SSE3+ instructions or AVX used in code automatically vectorized by gcc. So this app version should be provided by the project. The FMA app is, to my surprise, slower than AVX and I do not have a good explanation for this now. The AVX2 version is also slower. It would be good if someone with a newer CPU like Skylake could perform some tests and post results here; maybe it will work better on such new CPUs. If not, the existing versions (SSE2, AVX, FMA) would be sufficient. I have uploaded new versions of the AVX and AVX2 apps for Linux and Windows, feel free to download and run them. ____________ | |
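To put the four timings in this post on a common scale, here is a small sketch that expresses each build's run time as a percentage gain over SSE2. It assumes the comma in the posted figures is a decimal separator; the unit is not stated in the post and does not matter for the ratios. The function name `gain_over_baseline` is made up for this illustration.

```python
# Run times as posted, with the decimal comma read as a decimal point (assumption).
run_times = {"SSE2": 20.766, "AVX": 19.933, "FMA": 20.163, "AVX2": 20.355}


def gain_over_baseline(times, baseline="SSE2"):
    """Percentage of run time saved by each build relative to the baseline build."""
    base = times[baseline]
    return {name: 100.0 * (base - t) / base for name, t in times.items()}


gains = gain_over_baseline(run_times)
for name, pct in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:5s} {pct:5.2f}% faster than SSE2")
```

On these numbers AVX saves about 4% of the run time, FMA about 2.9% and AVX2 about 2%, which matches the ordering described in the post.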
ID: 828 · Reply Quote | |
I have uploaded new versions of AVX and AVX2 apps for Linux and Windows, feel free to download and run them. Because your Windows avx version of 23rd of January was a bit slower than the sse2, I tried your newer avx version from yesterday. Average numbers of 8 tasks concurrently running on my i7 2600: sse2: elapsed 1:15:39, cpu 1:14:34, efficiency 98,579%; avx: elapsed 1:13:29, cpu 1:12:53, efficiency 99,186%. | |
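The comparison in this post can be recomputed from the averaged times. The sketch below assumes that "efficiency" means CPU time divided by elapsed time; the helper names are hypothetical. The recomputed values land close to, but not exactly on, the posted percentages, presumably because those were averaged per task rather than derived from the averaged times (an assumption).

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def efficiency(elapsed, cpu):
    """CPU time as a percentage of elapsed (wall clock) time."""
    return 100.0 * to_seconds(cpu) / to_seconds(elapsed)


# (elapsed, cpu) averages as posted for the i7 2600.
runs = {"sse2": ("1:15:39", "1:14:34"), "avx": ("1:13:29", "1:12:53")}

for name, (elapsed, cpu) in runs.items():
    print(f"{name}: efficiency {efficiency(elapsed, cpu):.3f}%")

sse2_elapsed = to_seconds(runs["sse2"][0])
avx_elapsed = to_seconds(runs["avx"][0])
saved = 100.0 * (sse2_elapsed - avx_elapsed) / sse2_elapsed
print(f"avx elapsed time is {saved:.2f}% shorter than sse2")
```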
ID: 829 · Reply Quote | |
i7 6700T @ 3 GHz, HTT on (8 WU at a time) | |
ID: 830 · Reply Quote | |
Which version did you use for your 3770k? | |
ID: 841 · Reply Quote | |
Which version did you use for your 3770k? 59 minutes it ended up being. The other variations crashed. FMA/AVX2 | |
ID: 842 · Reply Quote | |
Which version did you use for your 3770k? Your CPU supports instructions up to AVX: http://www.cpu-world.com/CPUs/Core_i7/Intel-Core%20i7-3770K.html. It does not have FMA or AVX2, so those apps will crash there. You can try the AVX version; it should work for you. You can also use CPU-Z to check this. ____________ | |
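On Linux the same check can be done without CPU-Z by reading the feature flags from `/proc/cpuinfo`. This is a rough sketch only: the mapping from app variant to required flags is an assumption made for illustration (the thread does not list the exact requirements of each build), and the function names are hypothetical.

```python
import os

# Assumed flag requirements per app variant; not confirmed by the project.
REQUIRED_FLAGS = {
    "sse2": {"sse2"},
    "avx": {"avx"},
    "fma": {"avx", "fma"},
    "avx2": {"avx", "avx2"},
}


def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def supported_variants(flags):
    """List the app variants whose assumed requirements are all present."""
    return [name for name, needed in REQUIRED_FLAGS.items() if needed <= flags]


if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as handle:
        print("Variants this CPU should run:", supported_variants(parse_cpu_flags(handle.read())))
```

For a CPU that reports `avx` but neither `fma` nor `avx2`, as described for the 3770K above, the sketch returns only the sse2 and avx variants.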
ID: 843 · Reply Quote | |
I think he had better try the SSE2 version. | |
ID: 846 · Reply Quote | |
It's about 49-55 minutes per WU using SSE2 v1.1 | |
ID: 849 · Reply Quote | |
Why am I getting SSE2 work units for my AVX-capable CPUs? Isn't that a waste of resources? | |
ID: 1019 · Reply Quote | |
Why am I getting SSE2 work units for my AVX-capable CPUs? Isn't that a waste of resources? At the beginning the server will send both apps (sse, avx), gathering statistics. After some time, if there is a clear winner you will just get that; if not, you will continue to get both. This means that there is not a big difference between running sse and avx on your computer. | |
ID: 1020 · Reply Quote | |
Is a GPU version still under consideration? I get the impression that it would work, with all the programming talent that Daniel (and others) bring to the project, but there may not be enough work to support it. | |
ID: 1021 · Reply Quote | |
Is a GPU version still under consideration? I get the impression that it would work, with all the programming talent that Daniel (and others) bring to the project, but there may not be enough work to support it. Yes, I am still going to create it. But first I would like to release a new version of the CPU app; it is almost ready. ____________ | |
ID: 1022 · Reply Quote | |
Outstanding, I will try the new CPU app on both Windows and Ubuntu as a baseline for the GPU app. | |
ID: 1023 · Reply Quote | |
Why am I getting SSE2 work units for my AVX-capable CPUs? Isn't that a waste of resources? Yes, that is what I thought. However, the reality is that my Core i7-4770K is getting exclusively sse2 units. Nothing else. No choice. In the recorded history of 456 units, it was sent an avx unit only once. Well, whatever. I just thought that avx units should be faster on this CPU. | |
ID: 1024 · Reply Quote | |
Why am I getting SSE2 work units for my AVX-capable CPUs? Isn't that a waste of resources? This may be something hidden inside the BOINC scheduler's decisions. I also have one i7-4770K running Windows (http://gene.disi.unitn.it/test/results.php?hostid=3241). It got some sse2 and avx work at the beginning; right now it gets only fma work (having opted to accept beta work in my profile). If I remember correctly, BOINC will repeat the 'performance test' after some time. | |
ID: 1025 · Reply Quote | |
New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. In order to install it, do following steps: 1 2 3
1 2 4
1 3 4
2 3 4
2 3 5
1 3 5
1 2 5
1 4 5
2 4 5
3 4 5
TN-Grid Gene app uses a combinations generator, so I decided to replace it with new Gray code combinations and exploit its special property to recalculate only the values which depend on the changed element. By doing so I reduced the total calculation time. The savings depend on the maximum L value and increase with it: - some old organism stored as "test" data, max L=8: time reduced from 0.559s to 0.534s (4.4%); - current organism (VV), max L=12: time reduced from 2.092s to 1.815s (13.2%); - other old organism stored as "test2" data (it was probably ECM), max L=18: time reduced from 14.401s to 9.254s (35.7%). If you are interested in the algorithm details, you can check "Combinatorial Generation" by Frank Ruskey (page 129, algorithm 5.8), available at http://www.1stworks.com/ref/ruskeycombgen.pdf. The new app also checks if the CPU supports the required instruction set, and will exit with an error message like "AVX instructions are not supported by your CPU!" if it does not. ____________ | |
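The idea of a Gray code for combinations can be illustrated with a short sketch. It uses the classic recursive "revolving door" construction, which is not necessarily the algorithm from the book cited above and is not the app's actual code, so the visiting order may differ from the listing in this post. The property is the same: each combination differs from the previous one by exactly one element swapped out and one swapped in. The running sum over a hypothetical `weights` list stands in for whatever per-combination value the app really computes.

```python
def revolving_door(n, k):
    """All k-subsets of {1..n} as sorted tuples, assuming 0 <= k <= n.

    Consecutive subsets differ by exchanging exactly one element.
    """
    if k == 0:
        return [()]
    if k == n:
        return [tuple(range(1, n + 1))]
    head = revolving_door(n - 1, k)                      # subsets without n
    tail = [c + (n,) for c in reversed(revolving_door(n - 1, k - 1))]
    return head + tail


def incremental_sums(weights, k):
    """Yield (combination, sum of its weights), updating the sum per step
    from the single exchanged element instead of re-adding all k terms."""
    previous, total = None, 0
    for comb in revolving_door(len(weights), k):
        if previous is None:
            total = sum(weights[i - 1] for i in comb)    # full sum only once
        else:
            (removed,) = set(previous) - set(comb)
            (added,) = set(comb) - set(previous)
            total += weights[added - 1] - weights[removed - 1]
        previous = comb
        yield comb, total


for comb, total in incremental_sums([3, 1, 4, 1, 5], 3):
    print(comb, total)
```

The set differences here are only for clarity; a real implementation would have the generator report which element changed, so that each step costs constant work. That would also fit the observation above that the savings grow with the maximum L.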
ID: 1031 · Reply Quote | |
New app version is ready! It is available at the same place as usual: https://bitbucket.org/sirzooro/pc-boinc/downloads/. In order to install it, do following steps: I did all that, using the SSE2 version for Linux on my i7-4770, and got that message on reboot. But I am getting only errors. http://gene.disi.unitn.it/test/results.php?hostid=6148 | |
ID: 1033 · Reply Quote | |