We have successfully used gcc 3.2.x on x86 and PPC, a recent Compaq C compiler for Alpha, version 6 of IBM's xlc compiler for AIX, Intel's icc versions 5-7, and Sun WorkShop cc version 6.
FFTW is likely to push compilers to their limits, however, and several compiler bugs have been exposed by FFTW. A partial list follows.
gcc 2.95.x for Solaris/SPARC produces incorrect code for the test program (workaround: recompile the libbench2 directory with -O2).
NetBSD/macppc 1.6 comes with a gcc version that also miscompiles the test program. (Please report a workaround if you know one.)
gcc 3.2.3 for ARM reportedly crashes during compilation. This bug is reportedly fixed in later versions of gcc.
Versions 8.0 and 8.1 of Intel's icc falsely claim to be gcc, so you should specify CC="icc -no-gcc"; this is automatic in FFTW 3.1. icc-8.0.066 reportedly produces incorrect code for FFTW 2.1.5, but this is fixed in version 8.1.
icc-7.1 compiler build 20030402Z appears to produce incorrect dependencies, causing the compilation to fail. icc-7.1 build 20030307Z appears to work fine. (Use icc -V to check which build you have.) As of 2003/04/18, build 20030402Z appears not to be available any longer on Intel's website, whereas the older build 20030307Z is available.
ranlib of GNU binutils 2.9.1 on Irix has been observed to corrupt the FFTW libraries, causing a link failure when FFTW is compiled. Since ranlib is completely superfluous on Irix, we suggest deleting it from your system and replacing it with a symbolic link to /bin/echo.
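The icc advice in the list above can be sketched as a small shell helper. This is only an illustration: the function name icc_cc_value is hypothetical and not part of FFTW, and it assumes you already know your icc version string (for example from icc -V).

```shell
# Hypothetical helper (not part of FFTW): choose a CC value for configure
# from an icc version string. Only versions 8.0 and 8.1 are treated as
# needing -no-gcc, following the FAQ text.
icc_cc_value() {
    case "$1" in
        8.0*|8.1*) echo "icc -no-gcc" ;;   # these versions claim to be gcc
        *)         echo "icc" ;;           # no extra flag suggested
    esac
}

# Example use (not run here):
#   ./configure CC="$(icc_cc_value 8.1)"
```

With FFTW 3.1 the configure script does this automatically, so a helper like this would only matter for FFTW 3.0.x.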
If support for SIMD instructions is enabled in FFTW, further compiler problems may appear:
gcc 3.4.[0123] for x86 produces incorrect SSE2 code for FFTW when -O2 (the best choice for FFTW) is used, causing FFTW to crash (make check crashes). This bug is fixed in gcc 3.4.4. On x86_64 (amd64/em64t), gcc 3.4.4 reportedly still has a similar problem, but this is fixed as of gcc 3.4.6.
gcc-3.2 for x86 produces incorrect SIMD code if -O3 is used. The same compiler produces incorrect SIMD code if no optimization is used, too. When using gcc-3.2, it is a good idea not to change the default CFLAGS selected by the configure script.
Some 3.0.x and 3.1.x versions of gcc on x86 may crash. The so-called gcc 2.96 shipped with RedHat 7.3 crashes when compiling SIMD code. In both cases, please upgrade to gcc-3.2 or later.
Intel's icc 6.0 misaligns SSE constants, but FFTW has a workaround. icc 8.x fails to compile FFTW 3.0.x because it falsely claims to be gcc; we believe this to be a bug in icc, but FFTW 3.1 has a workaround.
Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when compiling FFTW. This bug was reportedly fixed in VC++ 2005; alternatively, you could switch to the Intel compiler. VC++ 6.0 also reportedly produces incorrect code for the file reodft11e-r2hc-odd.c unless optimizations are disabled for that file.
gcc 2.95 on MacOS X miscompiles AltiVec code (fixed in later versions). gcc 3.2.x miscompiles AltiVec permutations, but FFTW has a workaround.
gcc 4.0.1 on MacOS for Intel crashes when compiling FFTW; a workaround is to compile one file without optimization: cd kernel; make CFLAGS=" " trig.lo.
gcc 4.1.1 reportedly crashes when compiling FFTW for MIPS; the workaround is to compile the file it crashes on (t2_64.c) with a lower optimization level.
gcc versions 4.1.2 to 4.2.0 for x86 reportedly miscompile FFTW 3.1's test program, causing make check to crash (gcc bug #26528). The bug was reportedly fixed in gcc version 4.2.1 and later. A workaround is to compile libbench2/verify-lib.c without optimization.
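Several of the workarounds above amount to rebuilding a single file without optimization. The trig.lo example does this by passing an empty CFLAGS, which also drops every other flag. A minimal sketch of a gentler variant follows, assuming you would rather keep the non-optimization flags; the helper name strip_opt_flags is hypothetical and not part of FFTW.

```shell
# Hypothetical helper (not part of FFTW): remove -O* options from a
# space-separated flag string and keep everything else.
strip_opt_flags() {
    out=""
    for flag in $1; do
        case "$flag" in
            -O*) ;;                            # drop any optimization level
            *)   out="${out:+$out }$flag" ;;   # keep all other flags
        esac
    done
    echo "$out"
}

# Example use, mirroring the trig.lo workaround (not run here):
#   cd kernel; make CFLAGS="$(strip_opt_flags "$CFLAGS")" trig.lo
```

Whether the remaining flags are safe with a given buggy compiler is untested here; if in doubt, the empty CFLAGS from the FAQ is the conservative choice.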
const.
make such as
"./fftw.h", line 88: warning: const is a keyword in ANSI C
This is the case when the configure script reports that const does not work:
checking for working const... (cached) no
You should be aware that Solaris comes with two compilers, namely, /opt/SUNWspro/SC4.2/bin/cc and /usr/ucb/cc. The latter compiler is non-ANSI. Indeed, it is a perverse shell script that calls the real compiler in non-ANSI mode. In order to compile FFTW, change your path so that the right cc is used.
To know whether your compiler is the right one, type cc -V. If the compiler prints "ucbcc", as in
ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
then the compiler is wrong. The right message is something like
cc: WorkShop Compilers 4.2 30 Oct 1996 C 4.2
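The check described above can be scripted. The following is a minimal sketch: the function name classify_cc_banner is hypothetical, and it only looks for the ucbcc prefix shown in the sample messages.

```shell
# Hypothetical helper (not part of FFTW): classify the first line that
# `cc -V` prints, based on the two sample banners in this FAQ entry.
classify_cc_banner() {
    case "$1" in
        ucbcc:*) echo "non-ansi" ;;   # the /usr/ucb/cc wrapper
        *)       echo "ansi" ;;       # anything else is assumed to be fine
    esac
}

# Example use (not run here; assumes the banner may go to stderr):
#   classify_cc_banner "$(cc -V 2>&1 | head -n 1)"
```

If the result is non-ansi, put the directory containing the right cc earlier in your PATH before rerunning configure.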
--enable-3dnow and --enable-k7?
--enable-k7 enables 3DNow! instructions on K7 processors (AMD Athlon and its variants). K7 support is provided by assembly routines generated by a special purpose compiler. As of fftw-3.2, --enable-k7 is no longer supported.
--enable-3dnow enables generic 3DNow! support using gcc builtin functions. This works on earlier AMD processors, but it is not as fast as our special assembly routines. As of fftw-3.1, --enable-3dnow is no longer supported.
configure script attempts to automatically guess which version to use.
The FFTW 3.1 configure script enables fma by default on PowerPC, Itanium, and PA-RISC, and disables it otherwise. You can force one or the other by using the --enable-fma or --disable-fma flag for configure.
Definitely use fma if you have a PowerPC-based system with gcc (or IBM xlc). This includes all GNU/Linux systems for PowerPC and the older PowerPC-based MacOS systems. Also use it on PA-RISC and Itanium with the HP/UX compiler.
Definitely do not use the fma version if you have an ia-32 processor (Intel, AMD, MacOS on Intel, etcetera).
For other architectures/compilers, the situation is not so clear. For example, ia-64 has the fma instruction, but gcc-3.2 appears not to exploit it correctly. Other compilers may do the right thing, but we have not tried them. Please send us your feedback so that we can update this FAQ entry.
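The default described for the FFTW 3.1 configure script can be sketched as a lookup from machine type to flag. This is only an illustration: the function name fma_configure_flag is hypothetical, and the patterns are assumed uname -m style names that may not match every system (HP/UX and MacOS report different strings). It also ignores the compiler, which matters on ia-64 as noted above.

```shell
# Hypothetical helper (not part of FFTW): suggest --enable-fma or
# --disable-fma from a machine name. The patterns are assumptions about
# typical `uname -m` output, not an exhaustive list.
fma_configure_flag() {
    case "$1" in
        ppc*|powerpc*|ia64|parisc*|hppa*) echo "--enable-fma" ;;
        *)                                 echo "--disable-fma" ;;
    esac
}

# Example use (not run here):
#   ./configure $(fma_configure_flag "$(uname -m)")
```

On ia-32 this yields --disable-fma, in line with the recommendation above.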
genfft, written in the Objective Caml dialect of ML. You do not need to know ML or to have an Objective Caml compiler in order to use FFTW.
genfft is provided with the FFTW sources, which means that you can play with the code generator if you want. In this case, you need a working Objective Caml system. Objective Caml is available from the Caml web page.
By default, FFTW configures its Fortran interface to work with the first compiler it finds, e.g. g77. To configure for a different, incompatible Fortran compiler foobar, use ./configure F77=foobar when installing FFTW. (In the case of g77, however, FFTW 3.x also includes an extra set of Fortran-callable routines with one less underscore at the end of identifiers, which should cover most other Fortran compilers on Linux at least.)
<complex> template class is bit-compatible with FFTW's complex-number format (see the FFTW manual for more details).
configure --enable-float. On a non-Unix system: edit config.h to #define the symbol FFTW_SINGLE (for FFTW 3.x). In both cases, you must then recompile FFTW. In FFTW 3, all FFTW identifiers will then begin with fftwf_ instead of fftw_.
The fftw-3.1 release supports --enable-k7. This option only works on 32-bit x86 machines that implement 3DNow!, including the AMD Athlon and the AMD Opteron in 32-bit mode. --enable-k7 does not work on AMD Opteron in 64-bit mode. Use --enable-sse for x86-64 machines.
FFTW supports 3DNow! by means of assembly code generated by a special-purpose compiler. It is hard to produce assembly code that works in both 32-bit and 64-bit mode.
Matteo Frigo and Steven G. Johnson / fftw@fftw.org