Message boards : Number crunching : SSE3 optimization and Android binary
Author | Message |
---|---|
Hi all,
WITH SSE3
bin/pc input/tile.txt output/output.txt 0.05 1 393 0
Loading: 0.164
computeStandardDeviations: 0.001
computeCorrelations: 0.350
pcAlgorithm, l 0: 0.013
pcAlgorithm, l 1: 0.698
pcAlgorithm, l 2: 0.349
pcAlgorithm, l 3: 0.046
pcAlgorithm, l 4: 0.015
pcAlgorithm, l 5: 0.004
pcAlgorithm, l 6: 0.001
pcAlgorithm, l 7: 0.000
pcAlgorithm, l 8: 0.000
Files output/output.txt and output/ref_output.txt are identical
WITHOUT SSE3
bin/pc input/tile.txt output/output.txt 0.05 1 393 0
Loading: 0.167
computeStandardDeviations: 0.001
computeCorrelations: 0.374
pcAlgorithm, l 0: 0.013
pcAlgorithm, l 1: 0.691
pcAlgorithm, l 2: 0.355
pcAlgorithm, l 3: 0.048
pcAlgorithm, l 4: 0.015
pcAlgorithm, l 5: 0.004
pcAlgorithm, l 6: 0.001
pcAlgorithm, l 7: 0.000
pcAlgorithm, l 8: 0.000
Files output/output.txt and output/ref_output.txt are identical
WITHOUT SSE3
bin/pc input/tile2.txt output/output2.txt 0.05 1 2470 0
Loading: 0.354
computeStandardDeviations: 0.001
computeCorrelations: 0.083
pcAlgorithm, l 0: 0.001
pcAlgorithm, l 1: 0.026
pcAlgorithm, l 2: 0.440
pcAlgorithm, l 3: 3.057
pcAlgorithm, l 4: 6.627
pcAlgorithm, l 5: 7.104
pcAlgorithm, l 6: 5.675
pcAlgorithm, l 7: 3.771
pcAlgorithm, l 8: 3.020
pcAlgorithm, l 9: 1.796
pcAlgorithm, l 10: 1.053
pcAlgorithm, l 11: 0.550
pcAlgorithm, l 12: 0.270
pcAlgorithm, l 13: 0.098
pcAlgorithm, l 14: 0.029
pcAlgorithm, l 15: 0.006
pcAlgorithm, l 16: 0.001
pcAlgorithm, l 17: 0.000
pcAlgorithm, l 18: 0.000
Files output/output2.txt and output/ref_output2.txt are identical
WITH SSE3:
bin/pc input/tile2.txt output/output2.txt 0.05 1 2470 0
Loading: 0.229
computeStandardDeviations: 0.001
computeCorrelations: 0.086
pcAlgorithm, l 0: 0.001
pcAlgorithm, l 1: 0.027
pcAlgorithm, l 2: 0.445
pcAlgorithm, l 3: 3.089
pcAlgorithm, l 4: 6.959
pcAlgorithm, l 5: 7.226
pcAlgorithm, l 6: 5.636
pcAlgorithm, l 7: 3.744
pcAlgorithm, l 8: 2.985
pcAlgorithm, l 9: 1.796
pcAlgorithm, l 10: 1.055
pcAlgorithm, l 11: 0.556
pcAlgorithm, l 12: 0.272
pcAlgorithm, l 13: 0.104
pcAlgorithm, l 14: 0.031
pcAlgorithm, l 15: 0.007
pcAlgorithm, l 16: 0.001
pcAlgorithm, l 17: 0.000
pcAlgorithm, l 18: 0.000
Files output/output2.txt and output/ref_output2.txt are identical
Actually, it looks like the version of pc provided by Tn-grid, although it only supports SSE2, runs as fast as the version I compiled for SSE3 using -march=core2 -mtune=core2 -m64 -msse3 -mssse3. Should I give up on optimizing "pc" for SSE3, or is there some option I can add to achieve a tangible improvement?
I take the occasion to ask someone more experienced than me whether it is easy to compile pc for Android-x86_64 and Android-arm. I would like to do it, and of course I would give the binary to anyone if I succeed. I compiled the Android toolchain from the BOINC package, but I do not know how I should modify the Makefile in the pc-boinc source so that it points to the Android libraries and compiler instead of the standard g++ compiler. Sorry for my confusion, but I am not a programmer; I just wanted to make an attempt so that Tn-grid's WUs can be crunched on Android devices too.
Since it looks like a very easy task for someone who knows how to do it: is there any volunteer who would compile tn-grid for Android in my place, or who could suggest the needed steps? I think it would be of great value for the whole community! | |
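To compare the builds at a glance, the per-phase timings in the logs above can be totalled. The helper below is only a sketch: `sum_timings` is a hypothetical name, and it assumes every timing line has the form `label: seconds`, as in the output printed by pc.

```shell
#!/bin/sh
# sum_timings (hypothetical helper): add up the seconds reported on
# "label: value" lines read from standard input. Lines that do not end
# in a plain decimal number (command lines, diff messages) are skipped.
sum_timings() {
    awk -F': ' 'NF == 2 && $2 ~ /^[0-9]+(\.[0-9]+)?$/ { total += $2 }
                END { printf "%.3f\n", total }'
}

# Example with the first three timing lines of the SSE3 run on tile.txt.
printf '%s\n' 'Loading: 0.164' \
              'computeStandardDeviations: 0.001' \
              'computeCorrelations: 0.350' | sum_timings
```

Adding up the full logs above by hand gives roughly 1.641 s with SSE3 versus 1.669 s without for tile.txt, and 34.250 s with SSE3 versus 33.962 s without for tile2.txt, so the two builds differ by less than 2% in either direction.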
ID: 1743 | |
The optimized versions (SSE2, AVX, FMA) of the application were built not only by giving the right directives to the compiler but also by writing some specific assembly code. Nevertheless the obtained speed gain was relatively small. The dramatic speed increase relative to the first version was mainly obtained by cleverly rewriting the source code. It seems to me that, because of the 'intrinsic' nature of our code, SIMD optimizations are not *very* useful. | |
ID: 1746 | |
OK, so regarding the SIMD optimization I'll stick with the package provided by the server. It's fine as long as I am using my (slow) CPU efficiently.
#!/bin/sh
set -e
#
# See: http://boinc.berkeley.edu/trac/wiki/AndroidBuildApp
#
# Script to compile a generic application on Android
export ANDROID_TC="${ANDROID_TC:-$HOME/android-tc}"
export ANDROIDTC="${ANDROID_TC_X86:-$ANDROID_TC/x86}"
export TCBINARIES="$ANDROIDTC/bin"
export TCINCLUDES="$ANDROIDTC/i686-linux-android"
export TCSYSROOT="$ANDROIDTC/sysroot"
export STDCPPTC="$TCINCLUDES/lib/libstdc++.a"
export PATH="$TCBINARIES:$TCINCLUDES/bin:$PATH"
export CC=i686-linux-android-gcc
export CXX=i686-linux-android-g++
export LD=i686-linux-android-ld
export CFLAGS="--sysroot=$TCSYSROOT -DANDROID -Wall -I$TCINCLUDES/include -O3 -fomit-frame-pointer -fPIE"
export CXXFLAGS="--sysroot=$TCSYSROOT -DANDROID -Wall -O3 -fomit-frame-pointer -fPIE"
export LDFLAGS="-L$TCSYSROOT/usr/lib -L$TCINCLUDES/lib -llog -fPIE -pie"
export GDB_CFLAGS="--sysroot=$TCSYSROOT -Wall -g -I$TCINCLUDES/include"
make clean
if [ -e "./configure" ]; then
./configure --host=i686-linux --prefix="$TCINCLUDES" --libdir="$TCINCLUDES/lib" --disable-shared --enable-static
fi
make
All the files called by the script are created by other scripts included in the directory. | |
ID: 1747 | |
Maybe we can try to compile it together? I have some experience with BOINC and with porting BOINC science projects to several platforms, but I have no Android device available and less experience with Android.
I stopped just because the script provided by BOINC assumes that your app has a configure script, while "pc" only has a script that calls g++ and make, and I do not know how to fix this. That is just my ignorance: I recognize that it is only a matter of typing the right commands (calling the C++ compiler from the Android toolchain) ;)
Did I understand you correctly that you were able to compile the BOINC API and library, and that linking the "pc" binary produces errors? For the macOS application, I copied it to boinc/samples/tngrid and compiled it there. If there are errors at the link stage (in the tngrid directory), can you please run "gmake -n" to see the command that would be issued for linking "pc"? | |
ID: 1771 | |
I built Android apps in the past; here is a link to my post with more details: http://gene.disi.unitn.it/test/forum_thread.php?id=158&postid=905#905
ARCH += -march=armv7-a -mtune=cortex-a7 -mfpu=vfpv4 -mfloat-abi=softfp
LDFLAGS += -Wl,--fix-cortex-a8
PIE ?= 0
$(info Using PIE=$(PIE))
ifeq ($(PIE),1)
CFLAGS += -fPIE
LDFLAGS += -fPIE -pie
BOINC_DIR = ../../_boinc32pie/
else
LDFLAGS += -fno-PIE -no-pie
BOINC_DIR = ../../_boinc32nonpie/
endif
TOOLPATH = ../../$(TOOLDIR)
CFLAGS = --sysroot=c:/tn-grid/android/_arm32/sysroot/ -DANDROID -DDECLARE_TIMEZONE -Ic:/tn-grid/android/_arm32/include/c++/4.9.x/arm-linux-androideabi/
LDFLAGS = --sysroot=c:/tn-grid/android/_arm32/sysroot/
CC = ../../_arm32/bin/arm-linux-androideabi-gcc
CXX = ../../_arm32/bin/arm-linux-androideabi-g++
From Makefile for 64-bit Android app:
CFLAGS = --sysroot=c:/tn-grid/android/_arm64/sysroot/ -DANDROID -DANDROID_64 -DDECLARE_TIMEZONE -fPIE
LDFLAGS = --sysroot=c:/tn-grid/android/_arm64/sysroot -fPIE -pie
CC = ../../_arm64/bin/aarch64-linux-android-gcc
CXX = ../../_arm64/bin/aarch64-linux-android-g++
BOINC_DIR = ../../_boinc64/
I used these Makefiles from Cygwin. I hope that this will help you. In the Makefile for the 32-bit app I had to use -mfloat-abi=softfp instead of -mfloat-abi=hard; this was required by Android. You can check whether -mfloat-abi=hard is possible now; otherwise the app will be slower than the corresponding ARM Linux app. | |
ID: 1780 | |
Thank you for your hints, Daniel.
# SSE2, 64-bit
#ARCH += -march=core2 -mtune=core2 -m64
# AVX, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -m64
# AVX+FMA, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -mfma -m64
# AVX2+FMA, 64-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -mfma -mavx2 -m64
# 32-bit, no SIMD
#ARCH += -m32 -mno-sse
# SSE2, 32-bit
#ARCH += -march=core2 -mtune=core2 -m32
# AVX, 32-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -m32
# AVX+FMA, 32-bit
#ARCH += -march=core2 -mtune=generic -msse4.2 -mpopcnt -maes -mpclmul -mavx -mfma -m32
#--- new ---
ARCH += -march=armv7-a -mtune=cortex-a7 -mfpu=vfpv4 -mfloat-abi=softfp
LDFLAGS += -Wl,--fix-cortex-a8
#PIE ?= 0
#$(info Using PIE=$(PIE))
#ifeq ($(PIE),1)
#CFLAGS += -fPIE
#LDFLAGS += -fPIE -pie
#BOINC_DIR = ../../_boinc32pie/
#else
LDFLAGS += -fno-PIE -no-pie
BOINC_DIR = /home/matteo/Software/boinc
#endif
#TOOLPATH = ../../$(TOOLDIR)
CFLAGS = --sysroot=/home/matteo/Software/android-ndk-r21/toolchains/llvm/prebuilt/linux-x86_64/sysroot/ -DANDROID -DDECLARE_TIMEZONE -Ic:"/home/matteo/Software/android-ndk-r21/toolchains/llvm/prebuilt/linux-x86_64/include/c++/4.9.x/"
LDFLAGS = --sysroot=/home/matteo/Software/android-ndk-r21/toolchains/llvm/prebuilt/linux-x86_64/sysroot/
CC = /home/matteo/Software/android-ndk-r21/toolchains/llvm/prebuilt/linux-x86_64/bin/clang
CXX = /home/matteo/Software/android-ndk-r21/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++
#--- old ---
#CC ?= gcc
#CXX ?= g++
#ARCH += -march=core2 -mtune=core2 -m64 -msse3 -mssse3
#BOINC_DIR ?= ../../..
BOINC_API_DIR ?= $(BOINC_DIR)/api
BOINC_LIB_DIR ?= $(BOINC_DIR)/lib
BOINC_ZIP_DIR ?= $(BOINC_DIR)/zip
BOINC_LIBS ?= $(BOINC_API_DIR)/libboinc_api.a $(BOINC_LIB_DIR)/libboinc.a
ifdef BOINC_STUB
BOINC_DIR = ../boinc_stub
BOINC_LIBS =
endif
FREETYPE_DIR = /usr/include/freetype2
CPPFLAGS += -I$(BOINC_DIR) -I$(BOINC_LIB_DIR) -I$(BOINC_API_DIR) -I$(BOINC_ZIP_DIR) -I$(FREETYPE_DIR) -Isimd
#CFLAGS += -c -O3 $(ARCH) -Wall -Wextra -pedantic -Werror $(VARIANTFLAGS) -MMD -MP
#CXXFLAGS += $(CFLAGS) -std=gnu++11
#LDFLAGS += $(ARCH) -L/usr/X11R6/lib -L.
LIBS ?= -static-libgcc -static-libstdc++ -pthread -Wl,-Bstatic -lbz2
CXXSOURCES = BoincFile.cpp Graph.cpp boinc_functions.cpp utility.cpp pc.cpp main.cpp
CSOURCES = erf.c
OBJECTS = $(CXXSOURCES:.cpp=.o) $(CSOURCES:.c=.o)
EXECUTABLE = ../bin/pc
all: $(EXECUTABLE)
$(EXECUTABLE): $(OBJECTS)
	$(CXX) $(LDFLAGS) $(OBJECTS) -o $@ $(LIBS) $(BOINC_LIBS)
.cpp.o:
	$(CXX) $(CPPFLAGS) $(CXXFLAGS) $< -o $@
.c.o:
	$(CC) $(CPPFLAGS) $(CFLAGS) $< -o $@
clean:
	rm -rf ../bin/$(EXECUTABLE) *.o *~ *.d
.PHONY: all clean
-include $(CXXSOURCES:.cpp=.d) $(CSOURCES:.c=.d)
Unfortunately, this does not work because a "config.h" file needed by "parse.h" is missing:
/home/matteo/Software/boinc/lib/parse.h:26:10: fatal error: 'config.h' file not found
This is weird because I am using the official BOINC source tree and it should be error free. Daniel, do you still have the binaries you compiled? Perhaps they are still usable and we don't have to redo the work ;) They are no longer present in the archive you linked in your old post. | |
ID: 1789 | |
Yes, this is a known issue in BOINC. You need to manually copy the config.h file from the BOINC source root directory to InstallPath/include/boinc after installing all files. Or, if you use files directly from the BOINC build directory, add /home/matteo/Software/boinc/ to the include paths. | |
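Before retrying the build, it may help to confirm whether config.h is present at all. The check below is a minimal sketch: `has_boinc_config` is a hypothetical helper, and the default path is simply the BOINC_DIR value from the Makefile posted above. To my understanding config.h is generated when BOINC's own `_autosetup` and `configure` scripts are run, so a freshly cloned tree would not contain it; treat that as an assumption to verify against your checkout.

```shell
#!/bin/sh
# has_boinc_config (hypothetical helper): report whether a BOINC source
# tree contains the config.h that lib/parse.h includes.
has_boinc_config() {
    boinc_dir=$1
    if [ -f "$boinc_dir/config.h" ]; then
        echo "found: $boinc_dir/config.h"
    else
        echo "missing: $boinc_dir/config.h" >&2
        return 1
    fi
}

# Example: the default path below is an assumption, adjust it to your tree.
has_boinc_config "${BOINC_DIR:-$HOME/Software/boinc}" ||
    echo "hint: generate config.h in the BOINC tree or copy it into an include path"
```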
ID: 1791 | |
I tried all day, but I am not able to do it. Now I receive:
BoincFile.cpp:22:18: fatal error: string: No such file or directory
#include <string>
^
compilation terminated.
Makefile:79: recipe for target 'BoincFile.o' failed
make: *** [BoincFile.o] Error 1
This is weird because <string> is a basic C++ standard library header.
I give up on the idea; I hope that someone more experienced than me can do it. In that case, my only suggestion is to build for all possible platforms: once you are able to compile, it is a matter of minutes to compile for several platforms, and it would add value to tn-grid. For example, at the moment no projects are able to run on android-x86. | |
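One possible cause, offered only as a guess: in the Makefile posted earlier the CXXFLAGS line is commented out, so the C++ compiler may be invoked without any sysroot or target and then cannot locate the C++ standard headers. NDK r19 and later ship per-target wrapper scripts (for example `armv7a-linux-androideabi21-clang++`) that set both. The sketch below builds such a wrapper path and, if the wrapper exists, prints its header search list. `ndk_clangxx`, the default NDK location and API level 21 are assumptions for illustration.

```shell
#!/bin/sh
# ndk_clangxx (hypothetical helper): build the path of an NDK clang++ wrapper.
#   $1 = NDK root, $2 = target triple, $3 = Android API level
ndk_clangxx() {
    echo "$1/toolchains/llvm/prebuilt/linux-x86_64/bin/$2$3-clang++"
}

# Assumed NDK location; override by exporting NDK before running.
NDK="${NDK:-$HOME/Software/android-ndk-r21}"
CXX=$(ndk_clangxx "$NDK" armv7a-linux-androideabi 21)
echo "CXX=$CXX"

# If the wrapper is installed, list the directories it searches for headers,
# to confirm that the C++ standard headers are reachable.
if [ -x "$CXX" ]; then
    "$CXX" -x c++ -E -v /dev/null 2>&1 |
        sed -n '/search starts here/,/End of search list/p'
fi
```

If the search list does not include a directory containing the `string` header, the include path or sysroot passed to the compiler is the first thing to check.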
ID: 1792 | |
The optimized versions (SSE2, AVX, FMA) of the application ...
Somewhat off-topic, but are there any plans to add a Windows FMA app? It would not affect my computer, but at least some of the Windows machines currently running the AVX app should be able to benefit, right? | |
ID: 1809 | |
The optimized versions (SSE2, AVX, FMA) of the application ...
We had problems with the v1.10 Windows FMA version (mainly for AMD CPUs, early BIOSes), so we decided not to build it when switching to v1.11; also, the speed increase was minimal (in some cases the AVX version performed better). I may try to rebuild it and put it on beta; I have to think about it. | |
ID: 1810 | |
Hi! | |
ID: 1831 | |
It looks like the bzip2 library was built with -D_FORTIFY_SOURCE, which is not supported by the C library on Android. You need to rebuild bzip2 with this flag disabled. Here is a related question on StackOverflow: | |
ID: 1832 | |
It looks like the bzip2 library was built with -D_FORTIFY_SOURCE, which is not supported by the C library on Android. You need to rebuild bzip2 with this flag disabled. Here is a related question on StackOverflow:
Thanks, I rebuilt bzip2, ran make with the compiled library, and it works fine!!
/home/juanro/StudioProjects/boinc-android/src/boinc/samples/pc-boinc/bin/pc: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped
Now, how can I test this binary to check that it works fine? | |
ID: 1834 | |
You have to put an app_info.xml file with the name of your executable in the project's directory and then restart BOINC. | |
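As an illustration of that step, the sketch below writes a minimal app_info.xml with a shell here-document. `write_app_info` is a hypothetical helper, and the application name and version number are placeholders: they have to match what the project server expects, which this thread does not state.

```shell
#!/bin/sh
# write_app_info (hypothetical helper): write a minimal app_info.xml.
#   $1 = project directory      $2 = application name (placeholder)
#   $3 = executable file name   $4 = version number (placeholder)
write_app_info() {
    cat > "$1/app_info.xml" <<EOF
<app_info>
  <app>
    <name>$2</name>
  </app>
  <file_info>
    <name>$3</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>$2</app_name>
    <version_num>$4</version_num>
    <file_ref>
      <file_name>$3</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
EOF
}

# Example with placeholder values; all four arguments are assumptions.
# write_app_info /path/to/BOINC/projects/PROJECT_DIR my_app_name pc 100
```

The executable itself has to sit in the same project directory as app_info.xml, and the client has to be restarted afterwards, as described above.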
ID: 1835 | |
You have to put an app_info.xml file with the name of your executable in the project's directory and then restart BOINC.
Thanks! I tried this and for the moment it works fine!!
Definitive build script android64_build.sh: https://bin.disroot.org/?cae7d50e268c2abd#4VXpES87gfvQFgW7QkHnCG5fyCTumuKC3hMYofNGPY2R
Definitive Makefile: https://bin.disroot.org/?81265057234777a3#9tJmVrkUwAWRc5dVyrFQf49LRj5sWDcwo4nZAV8pYZK8 (optimized for Cortex-A53; you can change it to another -mtune value listed in https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/AArch64-Options.html#aarch64-feature-modifiers)
I followed these steps:
1. Clone https://github.com/truboxl/boinc-android
2. Run 00_prepare_sources.sh to download the necessary files (BOINC, Android NDK...)
3. Clone https://bitbucket.org/francesco-asnicar/pc-boinc/src/master/ into /boinc-android/src/boinc/samples
4. Copy /boinc-android/src/boinc/android/build_boinc_arm64.sh to /boinc-android/src/boinc/samples/pc-boinc/src/android64_build.sh
5. Edit android64_build.sh to add --disable-client in configure, make pc-im...
6. Edit the Makefile to add ARCH, toolchains and some Android flags
| |
ID: 1838 | |
You have to put an app_info.xml file with the name of your executable in the project's directory and then restart BOINC.
AArch64 apps for Android are always PIE and have NEON enabled. | |
ID: 1844 | |
Hi all! | |
ID: 1845 | |
Hi,
ARCH += -march=native -mtune=native -Ofast -fprofile-use -fno-signed-zeros -fno-trapping-math -frename-registers -funroll-loops
I compiled the application twice. The first time I used -fprofile-generate and ran the application for a while; it generated some .gcda files in the src directory. These are profile files that allow the compiler to further optimize the binary so that it is closely tailored to your system. The second step was to use -fprofile-use in order to use the .gcda files previously generated. Actually, I didn't create very good profile files; I just ran the very quick tests included in the source tree four or five times. I'd like to share this binary with you: https://drive.google.com/file/d/1nGfHSGxbS3QXKp8dIPIqGh5HyJLeRn4g/view?usp=sharing I have seen several hosts on this project using CPUs similar to mine, so other users could use it to increase credit and science ;) | |
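The two-pass procedure described above could be scripted roughly as follows. This is a sketch under assumptions: `pgo_arch_flags` is a hypothetical helper, and it presumes the pc-boinc Makefile passes $(ARCH) to both the compile and the link step, as the commented-out CFLAGS and LDFLAGS lines in the Makefile posted earlier suggest. The actual make and pc invocations are left commented out because they need the pc-boinc sources and GCC.

```shell
#!/bin/sh
# pgo_arch_flags (hypothetical helper): print the ARCH flags for one pass
# of a profile-guided GCC build, using the options quoted in the post.
pgo_arch_flags() {
    base="-march=native -mtune=native -Ofast -fno-signed-zeros -fno-trapping-math -frename-registers -funroll-loops"
    case "$1" in
        generate) echo "$base -fprofile-generate" ;;
        use)      echo "$base -fprofile-use" ;;
        *)        echo "usage: pgo_arch_flags generate|use" >&2; return 1 ;;
    esac
}

# Intended sequence, run from the src directory:
# make clean && make ARCH="$(pgo_arch_flags generate)"   # pass 1: instrumented build
# (cd .. && bin/pc input/tile.txt output/output.txt 0.05 1 393 0)   # writes .gcda files
# make clean && make ARCH="$(pgo_arch_flags use)"        # pass 2: build using the profiles

# Show the flags each pass would receive.
pgo_arch_flags generate
pgo_arch_flags use
```

The clean target in the Makefile posted earlier removes only object and dependency files, so the .gcda profiles from pass 1 should survive the second `make clean`. A longer, more representative workload in between the passes would give the compiler better profile data than the quick tests.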
ID: 1883 | |
Nice, | |
ID: 1898 | |