
GCC accepts the -mcpu= (or -march=) flag to generate code optimized for a particular CPU type. There are many ARM variants, and GCC's ARM target also supports numerous FPU types.

Which -mcpu=/-march= and -mfpu= flags are the proper (native) ones to use when compiling C code on a Raspberry Pi?

nos

4 Answers


The Raspberry Pi's ARM core is an ARM1176JZF-S, so suitable flags are:

-march=armv6zk -mcpu=arm1176jzf-s -mfloat-abi=hard -mfpu=vfp

Drop -mfloat-abi=hard -mfpu=vfp on a soft-float distro.
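
As a minimal sketch of that rule, the shell function below prints the flag set for a given Debian architecture name. The function name pi_cflags is hypothetical, and it assumes the architecture string comes from dpkg --print-architecture, with armhf meaning hard float and anything else treated as soft float.

```shell
# Hypothetical helper: print the GCC flags suggested above for the
# Raspberry Pi's ARM1176JZF-S core.
# $1 is the Debian architecture name, e.g. from `dpkg --print-architecture`.
pi_cflags() {
    case "$1" in
        # Hard-float distro: keep the float ABI and FPU flags.
        armhf) echo "-march=armv6zk -mcpu=arm1176jzf-s -mfloat-abi=hard -mfpu=vfp" ;;
        # Assumption: any other architecture is a soft-float distro.
        *)     echo "-march=armv6zk -mcpu=arm1176jzf-s" ;;
    esac
}
```

It could then be used as: gcc $(pi_cflags "$(dpkg --print-architecture)") -o test test.c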

These flags can be found by running gcc -mcpu=native -march=native -Q --help=target on GCC >= 4.7.
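
The output of that command is long, so a small filter can help. The sketch below keeps only the CPU- and FPU-related lines; the function name filter_target_flags is hypothetical, and it assumes each option is printed on its own line beginning with the option name.

```shell
# Hypothetical helper: read `gcc -Q --help=target` output on standard input
# and keep only the lines for -march=, -mcpu=, -mtune=, -mfpu= and -mfloat-abi=.
filter_target_flags() {
    grep -E -e '^[[:space:]]*-m(arch|cpu|tune|fpu|float-abi)='
}
```

For example: gcc -mcpu=native -march=native -Q --help=target | filter_target_flags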

nos

This depends on which libraries you want to link your code against. If you are using the Raspbian images, the architecture is "armhf" and the compilation flags are:

-march=armv6
-mfpu=vfp
-mfloat-abi=hard

If you are compiling for the Debian "armel" architecture, the compilation flags are different.
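
One way to tell the two apart is the compiler's target triplet, which gcc -dumpmachine prints. The sketch below maps a triplet to its float calling convention; the function name float_abi_for_triplet is hypothetical, and it assumes Debian-style triplets ending in gnueabihf (armhf) or gnueabi (armel).

```shell
# Hypothetical helper: map a GCC target triplet to the float calling
# convention it implies. $1 is the triplet, e.g. from `gcc -dumpmachine`.
float_abi_for_triplet() {
    case "$1" in
        *eabihf) echo hard ;;     # e.g. arm-linux-gnueabihf (armhf)
        *eabi)   echo soft ;;     # e.g. arm-linux-gnueabi (armel)
        *)       echo unknown ;;  # not an ARM EABI triplet
    esac
}
```

For example: float_abi_for_triplet "$(gcc -dumpmachine)"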

Nakedible

I found this set to be optimal:

-march=armv6 -mfloat-abi=hard -mfpu=vfp

(Use -mfloat-abi=hard only on an armhf distro, of course.)

ikku

I can't speak to the -mfpu flag, but I'd suggest trying these first:

-march=native
-mcpu=native
-mtune=native

As stated in the GCC manual's section on ARM options:

-march=native causes the compiler to auto-detect the architecture of the build computer. At present, this feature is only supported on Linux, and not all architectures are recognized. If the auto-detect is unsuccessful the option has no effect.

-mcpu=native causes the compiler to auto-detect the CPU of the build computer. At present, this feature is only supported on Linux, and not all architectures are recognized. If the auto-detect is unsuccessful the option has no effect.

-mtune=native causes the compiler to auto-detect the CPU of the build computer. At present, this feature is only supported on Linux, and not all architectures are recognized. If the auto-detect is unsuccessful the option has no effect.

You can then add -Q -v to your GCC flags to see what optimizations are enabled and proceed to further optimizations if necessary.

This is the output of -march=native with a sample program on my Raspberry Pi (note that this GCC 4.6.3 build rejects the value with "bad value (native) for -march switch"):

#> gcc -march=native -Q -v test.c -o test
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/4.6/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.3-8+rpi1' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 4.6.3 (Debian 4.6.3-8+rpi1) 
COLLECT_GCC_OPTIONS='-march=native' '-Q' '-v' '-o' 'test' '-mfloat-abi=hard' '-mfpu=vfp'
 /usr/lib/gcc/arm-linux-gnueabihf/4.6/cc1 -v -imultilib . -imultiarch arm-linux-gnueabihf test.c -dumpbase test.c -march=native -mfloat-abi=hard -mfpu=vfp -auxbase test -version -o /tmp/cc1rCJ4W.s
cc1: error: bad value (native) for -march switch
GNU C (Debian 4.6.3-8+rpi1) version 4.6.3 (arm-linux-gnueabihf)
    compiled by GNU C version 4.6.3, GMP version 5.0.5, MPFR version 3.1.0-p10, MPC version 0.9
GGC heuristics: --param ggc-min-expand=38 --param ggc-min-heapsize=15522
ignoring nonexistent directory "/usr/local/include/arm-linux-gnueabihf"
ignoring nonexistent directory "/usr/lib/gcc/arm-linux-gnueabihf/4.6/../../../../arm-linux-gnueabihf/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/arm-linux-gnueabihf/4.6/include
 /usr/local/include
 /usr/lib/gcc/arm-linux-gnueabihf/4.6/include-fixed
 /usr/include/arm-linux-gnueabihf
 /usr/include
End of search list.
GNU C (Debian 4.6.3-8+rpi1) version 4.6.3 (arm-linux-gnueabihf)
    compiled by GNU C version 4.6.3, GMP version 5.0.5, MPFR version 3.1.0-p10, MPC version 0.9
GGC heuristics: --param ggc-min-expand=38 --param ggc-min-heapsize=15522
options passed:  -v -imultilib . -imultiarch arm-linux-gnueabihf test.c
 -march=native -mfloat-abi=hard -mfpu=vfp
options enabled:  -fauto-inc-dec -fbranch-count-reg -fcommon
 -fdelete-null-pointer-checks -fdwarf2-cfi-asm -fearly-inlining
 -feliminate-unused-debug-types -ffunction-cse -fgcse-lm -fident
 -finline-functions-called-once -fira-share-save-slots
 -fira-share-spill-slots -fivopts -fkeep-static-consts -fleading-underscore
 -fmath-errno -fmerge-debug-strings -fmove-loop-invariants -fpeephole
 -fprefetch-loop-arrays -freg-struct-return -fsched-critical-path-heuristic
 -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
 -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
 -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fshow-column
 -fsigned-zeros -fsplit-ivs-in-unroller -fstrict-volatile-bitfields
 -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert
 -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
 -ftree-parallelize-loops= -ftree-phiprop -ftree-pta -ftree-reassoc
 -ftree-scev-cprop -ftree-slp-vectorize -ftree-vect-loop-version
 -funit-at-a-time -fvar-tracking -fvar-tracking-assignments
 -fzero-initialized-in-bss -mglibc -mlittle-endian -msched-prolog

Execution times (seconds) TOTAL : 0.00 0.00 0.00 8 kB
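
Because the log above contains "bad value (native) for -march switch", a build script may want to probe whether the compiler accepts -march=native before relying on it. A minimal sketch, assuming the hypothetical function name pick_march, a CC environment variable naming the compiler (defaulting to gcc), and -march=armv6 as the fallback value:

```shell
# Hypothetical helper: print -march=native if the compiler accepts it,
# otherwise fall back to -march=armv6 (an assumed default for the Pi).
pick_march() {
    # -fsyntax-only checks a trivial program without writing any output file;
    # a compiler that rejects -march=native exits with a non-zero status.
    if "${CC:-gcc}" -march=native -fsyntax-only -x c - >/dev/null 2>&1 <<'EOF'
int main(void) { return 0; }
EOF
    then
        echo "-march=native"
    else
        echo "-march=armv6"
    fi
}
```

It could be used as: gcc $(pick_march) -mfloat-abi=hard -mfpu=vfp -o test test.c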

Avio