1)
Message boards :
Number crunching :
Gene application for GNU/Linux on ARM devices
(Message 729)
Posted 6 Jan 2017 by fractal

I have just read that AARCH64 CPUs have new NEON SIMD instructions with double precision support, so it should be possible to get an additional speed boost by using them. Probably it is time to get an Odroid C2 and play with it a bit :)

The Odroid C2 is a fantastic product. I love mine. Well made, well supported, solid performer. But there may be better AARCH64 SBCs if you can only have one. My main objection to the C2 is the lack of AES instructions. It only has the following extensions: fp asimd crc32

As for TN-Grid, here are tests on a C2.

me@odroid-c2:~/TN-Grid$ ./test_run.sh
-> pc_armv6zk_vfp
real 5m2.251s
user 4m51.740s
sys 0m0.080s
-> pc_armv7_vfpv3
real 4m35.022s
user 4m32.840s
sys 0m0.080s
-> pc_armv7_vfpv4
real 4m39.926s
user 4m37.720s
sys 0m0.100s
-> pc_armv8-a
real 5m14.590s
user 5m12.330s
sys 0m0.100s
The version compiled for the armv7 vfpv3 architecture is a bit faster than the armv8 version.

A less expensive alternative that does have the AES instructions is the pine64 board. It has the following extensions: fp asimd aes pmull sha1 sha2 crc32. My main issue with it is cooling, as it cannot run flat out without overheating. A simple stick-on heat sink from an RPi helps some, but not enough. It is also a bit slower than the C2. Performance on it looks like:

ubuntu@pine64:~/boinc/samples/TN-Grid$ ./test_run.sh
-> pc_armv6zk_vfp
real 6m31.015s
user 6m22.620s
sys 0m0.150s
-> pc_armv7_vfpv3
real 6m8.191s
user 5m59.540s
sys 0m0.440s
-> pc_armv7_vfpv4
real 6m12.538s
user 6m4.400s
sys 0m0.100s
-> pc_armv8-a
real 6m59.002s
user 6m50.040s
sys 0m0.210s
with vfpv3 enjoying a slight advantage as well.
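
To put a number on "a bit faster", the `real` times from the two runs can be converted to seconds and compared against the fastest build on each board. This is only a rough sketch in Python: the helper names `to_seconds` and `slowdown_vs_fastest` are made up for illustration, and the only data used are the `real` values copied from the output above.

```python
import re

def to_seconds(stamp):
    """Convert a `time`-style value such as '4m35.022s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# 'real' times copied from the two test_run.sh runs quoted in this post.
real_times = {
    "odroid-c2": {
        "pc_armv6zk_vfp": "5m2.251s",
        "pc_armv7_vfpv3": "4m35.022s",
        "pc_armv7_vfpv4": "4m39.926s",
        "pc_armv8-a": "5m14.590s",
    },
    "pine64": {
        "pc_armv6zk_vfp": "6m31.015s",
        "pc_armv7_vfpv3": "6m8.191s",
        "pc_armv7_vfpv4": "6m12.538s",
        "pc_armv8-a": "6m59.002s",
    },
}

def slowdown_vs_fastest(times):
    """Return, per build, how much longer it ran than the fastest build (in %)."""
    secs = {name: to_seconds(value) for name, value in times.items()}
    best = min(secs.values())
    return {name: (s / best - 1.0) * 100.0 for name, s in secs.items()}

for board, times in real_times.items():
    print(board)
    for name, pct in slowdown_vs_fastest(times).items():
        print(f"  {name}: +{pct:.1f}% vs fastest")
```

On these numbers the armv8-a build takes roughly 14% longer than the vfpv3 build on both boards, while vfpv4 is within about 2% of vfpv3. A single run per build is a small sample, so treat the smaller gaps with caution.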
2)
Message boards :
Number crunching :
Optimization
(Message 667)
Posted 20 Dec 2016 by fractal

It took a while, but tasks on my x5675 dropped from 2 1/2 hours to 1 1/2 hours with the SSE custom app. That CPU does not have any of the newer instructions.
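
One way to check which extensions a CPU reports on Linux is to read the `Features` line (ARM) or the `flags` line (x86) in /proc/cpuinfo. Below is a minimal Python sketch, assuming that file format. The function names `cpu_features` and `read_cpu_features` are made up for illustration, and the sample strings are the ARM feature lists quoted for the C2 and pine64.

```python
def cpu_features(cpuinfo_text):
    """Return the set of feature names found in /proc/cpuinfo-style text.

    Looks for lines keyed 'Features' (ARM) or 'flags' (x86).
    """
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features

def read_cpu_features(path="/proc/cpuinfo"):
    """Read the feature set of the machine this runs on (Linux only)."""
    with open(path) as fh:
        return cpu_features(fh.read())

# Sample input built from the extension lists quoted for the two ARM boards.
c2 = cpu_features("Features\t: fp asimd crc32")
pine64 = cpu_features("Features\t: fp asimd aes pmull sha1 sha2 crc32")

print("aes" in c2, "aes" in pine64)  # prints: False True
```

On an x86 box the same check would be something like `"avx" in read_cpu_features()`, which shows whether an app built for the newer instructions could run there at all.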