
19 Floats
 19.1 A sample run
 19.2 Methods
 19.3 High-precision-specific methods
 19.4 Complex arithmetic
 19.5 Interval-specific methods

19 Floats

Starting with version 4.5, GAP has built-in support for floating-point numbers in machine format, and allows packages to implement arbitrary-precision floating-point arithmetic in a uniform manner. For now, one such package, Float, exists; it is based on the arbitrary-precision routines in mpfr.

A word of caution: GAP deals primarily with algebraic objects, which can be represented exactly in a computer. Numerical imprecision means that floating-point numbers do not form a ring in the strict GAP sense, because addition is in general not associative ((1.0e-100+1.0)-1.0 is not the same as 1.0e-100+(1.0-1.0), in the default precision setting).

Most algorithms in GAP which require ring elements will therefore not be applicable to floating-point elements. In some cases, such a notion would not even make any sense (what is the greatest common divisor of two floating-point numbers?)
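
The non-associativity mentioned above can be checked directly. The following session is a minimal sketch, assuming the default 53-bit machine floats; it compares the two groupings rather than relying on how the results are printed.

```gap
gap> a := (1.0e-100 + 1.0) - 1.0;;   # 1.0e-100 is lost when added to 1.0
gap> b := 1.0e-100 + (1.0 - 1.0);;   # 1.0-1.0 is exactly zero, so 1.0e-100 survives
gap> a = b;
false
```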

19.1 A sample run

Floating-point numbers can be input into GAP in the standard floating-point notation:

gap> 3.14;
3.14
gap> last^2/6;
1.64327
gap> h := 6.62606896e-34;
6.62607e-34
gap> pi := 4*Atan(1.0);
3.14159
gap> hbar := h/(2*pi);
1.05457e-34

Floating-point numbers can also be created using Float, from strings or rational numbers, and can be converted back using String, Rat, and Int.

GAP allows rational and floating-point numbers to be mixed in the elementary operations +,-,*,/. However, floating-point numbers and rational numbers may not be compared. Conversions are performed using the creator Float:

gap> Float("3.1416");
3.1416
gap> Float(355/113);
3.14159
gap> Rat(last);
355/113
gap> Rat(0.33333);
1/3
gap> Int(1.e10);
10000000000
gap> Int(1.e20);
100000000000000000000
gap> Int(1.e30);
1000000000000000019884624838656

19.2 Methods

Floating-point numbers may be directly input, as in any usual mathematical software or language; with the exception that every floating-point number must contain a decimal digit. Therefore .1, .1e1, -.999 etc. are all valid GAP inputs.

Floating-point numbers so entered in GAP are stored as strings. They are converted to floating-point when they are first used. This means that, if the floating-point precision is increased, the constants are reevaluated to fit the new format.

Floating-point numbers may be followed by an underscore, as in 1._. This means that they are to be immediately converted to the current floating-point format. The underscore may be followed by a single letter, which specifies which format/precision to use. By default, GAP has a single floating-point handler, with fixed (53 bits) precision, and its format specifier is 'l' as in 1._l. Higher-precision floating-point computations are available via external packages, for example float.
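
As an illustration of the underscore notation, the following session is a small sketch that assumes only the default handler described above (format specifier 'l'); the variable names are arbitrary.

```gap
gap> x := 1._;;      # converted at once to the current floating-point format
gap> y := 1._l;;     # converted at once by the handler with specifier 'l'
gap> IsFloat(x) and IsFloat(y);
true
gap> x = y;
true
```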

A record, FLOAT (19.2-5), contains all relevant constants for the current floating-point format; see its documentation for details. Typical fields are FLOAT.MANT_DIG=53, the constant FLOAT.VIEW_DIG=6 specifying the number of digits to view, and FLOAT.PI for the constant π. The constants have the same name as their C counterparts, except for the missing initial DBL_ or M_.
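
The fields of FLOAT can be inspected like those of any record. The session below is a sketch assuming the default machine-float handler, for which the text above gives MANT_DIG=53 and VIEW_DIG=6; the last line merely restates the definition of EPSILON.

```gap
gap> FLOAT.MANT_DIG;
53
gap> FLOAT.VIEW_DIG;
6
gap> (1.0 + FLOAT.EPSILON) = 1.0;
false
```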

Floating-point numbers may be created using the single function Float (19.2-1), which accepts as arguments rational, string, or floating-point numbers. Floating-point numbers may also be created, in any floating-point representation, using NewFloat (19.2-1) as in NewFloat(IsIEEE754FloatRep,355/113), by supplying the category filter of the desired new floating-point number; or using MakeFloat (19.2-1) as in MakeFloat(1.0,355/113), by supplying a sample floating-point number.

Floating-point numbers may also be converted to other GAP formats using the usual commands Int (14.2-3), Rat (17.2-6), String (27.7-6).

Exact conversion to and from floating-point format may be done using external representations. The "external representation" of a floating-point number x is a pair [m,e] of integers, such that x=m*2^(-1+e-LogInt(AbsInt(m),2)). Conversion to and from external representation is performed as usual using ExtRepOfObj (79.8-1) and ObjByExtRep (79.8-1):

gap> ExtRepOfObj(3.14);
[ 7070651414971679, 2 ]
gap> ObjByExtRep(IEEE754FloatsFamily,last);
3.14
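
The formula relating the pair [m,e] to the value x can be evaluated exactly over the rationals. The following sketch assumes the default machine floats and reuses the pair shown above; the names ext, m, e and q are chosen here for illustration only.

```gap
gap> ext := ExtRepOfObj(3.14);;
gap> m := ext[1];; e := ext[2];;
gap> q := m * 2^(-1 + e - LogInt(AbsInt(m), 2));   # exact rational value of x
7070651414971679/2251799813685248
gap> ObjByExtRep(IEEE754FloatsFamily, [m, e]) = 3.14;
true
```

Here LogInt(AbsInt(m),2) is 52 and e is 2, so the exponent of 2 is -51.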

Computations with floating-point numbers never raise any error. Division by zero is allowed, and produces a signed infinity. Illegal operations, such as 0./0., produce NaN's (not-a-number); this is the only floating-point number x such that not EqFloat(x+0.0,x).

The IEEE754 standard requires NaN to be non-equal to itself. On the other hand, GAP requires every object to be equal to itself. To respect the IEEE754 standard, the function EqFloat (19.2-6) should be used instead of =.

The category a floating-point number belongs to can be checked using the filters IsFinite (30.4-2), IsPInfinity (19.2-9), IsNInfinity (19.2-9), IsXInfinity (19.2-9), IsNaN (19.2-9).
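
The behaviour of infinities and NaN described above can be sketched as follows, assuming the default machine floats. Results are tested with the filters instead of being printed, since the printed form of these special values is not specified here.

```gap
gap> x := 0.0/0.0;;          # an illegal operation, producing NaN
gap> IsNaN(x);
true
gap> EqFloat(x, x);          # IEEE754 semantics: NaN differs from itself
false
gap> EqFloat(1.5, 1.5);
true
gap> IsPInfinity(1.0/0.0);   # division by zero gives a signed infinity
true
gap> IsNInfinity(-1.0/0.0);
true
```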

Comparisons between floating-point numbers and rationals are explicitly forbidden. The rationale is that objects belonging to different families should in general not be comparable in GAP. Floating-point numbers are also approximations of real numbers, and don't follow the same rules; consider for example, using the default GAP implementation of floating-point numbers,

gap> 1.0/3.0 = Float(1/3);
true
gap> (1.0/3.0)^5 = Float((1/3)^5);
false

19.2-1 Float creators
‣ Float( obj )( function )
‣ NewFloat( filter, obj )( constructor )
‣ MakeFloat( sample, obj, obj )( operation )

Returns: A new floating-point number, based on obj

This function creates a new floating-point number.

If obj is a rational number, the new number is created with sufficient precision so that it can (usually) be converted back to the original number (see Rat (17.2-6)). For an integer, the precision, if unspecified, is chosen sufficient so that Int(Float(obj))=obj always holds, but at least 64 bits.

obj may also be a string, which may be of the form "3.14e0" or ".314e1" or ".314@1" etc.

An option may be passed to specify, in bits, a desired precision. The format is Float("3.14":PrecisionFloat:=1000) to create a 1000-bit approximation of 3.14.

In particular, if obj is already a floating-point number, then Float(obj:PrecisionFloat:=prec) creates a copy of obj with a new precision prec.

19.2-2 Rat
‣ Rat( f )( attribute )

Returns: A rational approximation to f

This command constructs a rational approximation to the floating-point number f. Of course, it is not guaranteed to return the original rational number f was created from, though it returns the most 'reasonable' one given the precision of f.

Two options control the precision of the rational approximation: In the form Rat(f:maxdenom:=md,maxpartial:=mp), the rational returned is such that the denominator is at most md and the partials in its continued fraction expansion are at most mp. The default values are maxpartial:=10000 and maxdenom:=2^(precision/2).
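
The option syntax can be sketched as below. The bounds 10 and 5 and the variable names are hypothetical values chosen only for illustration; no output is shown because the result depends on the continued fraction expansion computed for f.

```gap
f := Float(355/113);

# Default bounds, as described above.
r := Rat(f);

# Hypothetical bounds: ask for a small denominator ...
r_small := Rat(f : maxdenom := 10);

# ... or for small partial quotients in the continued fraction.
r_tame := Rat(f : maxpartial := 5);

Print([r, r_small, r_tame], "\n");
```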

19.2-3 Cyc
‣ Cyc( f[, degree] )( operation )

Returns: A cyclotomic approximation to f

This command constructs a cyclotomic approximation to the floating-point number f. Of course, it is not guaranteed to return the original number f was created from, though it returns the most 'reasonable' one given the precision of f. An optional argument degree specifies the maximal degree of the cyclotomic to be constructed.

The method used is LLL lattice reduction.

19.2-4 SetFloats
‣ SetFloats( rec[, bits][, install] )( function )

Installs a new interface to floating-point numbers in GAP, optionally with a desired precision bits in binary digits. The last optional argument install is a boolean value; if false, it only installs the eager handler and the precision for the floateans, without making them the default.

19.2-5 FLOAT
‣ FLOAT( global variable )

This record contains useful floating-point constants:

DECIMAL_DIG

Maximal number of useful digits;

DIG

Number of significant digits;

VIEW_DIG

Number of digits to print in short view;

EPSILON

Smallest number such that 1≠1+ϵ;

MANT_DIG

Number of bits in the mantissa;

MAX

Maximal representable number;

MAX_10_EXP

Maximal decimal exponent;

MAX_EXP

Maximal binary exponent;

MIN

Minimal positive representable number;

MIN_10_EXP

Minimal decimal exponent;

MIN_EXP

Minimal exponent;

INFINITY

Positive infinity;

NINFINITY

Negative infinity;

NAN

Not-a-number,

as well as mathematical constants E, LOG2E, LOG10E, LN2, LN10, PI, PI_2, PI_4, 1_PI, 2_PI, 2_SQRTPI, SQRT2, SQRT1_2.

19.2-6 EqFloat
‣ EqFloat( x, y )( operation )

Returns: Whether the floateans x and y are equal

This function compares two floating-point numbers, and returns true if they are equal, and false otherwise; with the exception that NaN is always considered to be different from itself.

19.2-7 PrecisionFloat
‣ PrecisionFloat( x )( attribute )

Returns: The precision of x

This function returns the precision, counted in number of binary digits, of the floating-point number x.

19.2-8 SignBit
‣ SignBit( x )( attribute )
‣ SignFloat( x )( attribute )

Returns: The sign of x.

The first function SignBit returns the sign bit of the floating-point number x: true if x is negative (including -0.) and false otherwise.

The second function SignFloat returns the integer -1 if x<0, 0 if x=0 and 1 if x>0.
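
The difference between the two functions can be sketched as follows, assuming the default machine floats: SignBit returns a boolean, SignFloat an integer.

```gap
gap> SignBit(-1.5);
true
gap> SignBit(1.5);
false
gap> List([-1.5, 0.0, 1.5], SignFloat);
[ -1, 0, 1 ]
```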

19.2-9 Infinity testers
‣ IsPInfinity( x )( property )
‣ IsNInfinity( x )( property )
‣ IsXInfinity( x )( property )
‣ IsFinite( x )( property )
‣ IsNaN( x )( property )

Returns true if the floating-point number x is respectively +∞, -∞, ±∞, finite, or 'not a number', such as the result of 0.0/0.0.

19.2-10 Standard mathematical operations
‣ Sin( f )( attribute )
‣ Cos( f )( attribute )
‣ Tan( f )( attribute )
‣ Sec( f )( attribute )
‣ Csc( f )( attribute )
‣ Cot( f )( attribute )
‣ Asin( f )( attribute )
‣ Acos( f )( attribute )
‣ Atan( f )( attribute )
‣ Sinh( f )( attribute )
‣ Cosh( f )( attribute )
‣ Tanh( f )( attribute )
‣ Sech( f )( attribute )
‣ Csch( f )( attribute )
‣ Coth( f )( attribute )
‣ Asinh( f )( attribute )
‣ Acosh( f )( attribute )
‣ Atanh( f )( attribute )
‣ Log( f )( operation )
‣ Log2( f )( attribute )
‣ Log10( f )( attribute )
‣ Log1p( f )( attribute )
‣ Exp( f )( attribute )
‣ Exp2( f )( attribute )
‣ Exp10( f )( attribute )
‣ Expm1( f )( attribute )
‣ CubeRoot( f )( attribute )
‣ Square( f )( attribute )
‣ Atan2( y, x )( operation )
‣ Hypothenuse( x, y )( operation )
‣ Ceil( f )( attribute )
‣ Floor( f )( attribute )
‣ Round( f )( attribute )
‣ Trunc( f )( attribute )
‣ FrExp( f )( attribute )
‣ LdExp( f, exp )( operation )
‣ AbsoluteValue( f )( attribute )
‣ Norm( f )( attribute )
‣ Frac( f )( attribute )
‣ SinCos( f )( attribute )
‣ Erf( f )( attribute )
‣ Zeta( f )( attribute )
‣ Gamma( f )( attribute )

These are the standard mathematical functions, each taking floating-point arguments.
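
A few of these operations are sketched below, assuming the default machine floats. The arguments are chosen so that the results are exactly representable, which allows them to be compared with = instead of being read off the six-digit short view.

```gap
gap> Hypothenuse(3.0, 4.0) = 5.0;
true
gap> Exp(0.0) = 1.0;
true
gap> Floor(2.7) = 2.0;
true
gap> AbsoluteValue(-2.5) = 2.5;
true
```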

19.3 High-precision-specific methods

GAP provides a mechanism for packages to implement new floating-point numerical interfaces. The following describes that mechanism; actual examples of packages are documented separately.

A package must create a record with fields (all optional)

creator

a function converting strings to floating-point;

eager

a character allowing immediate conversion to floating-point;

objbyextrep

a function creating a floating-point number out of a list [mantissa,exponent];

filter

a filter for the new floating-point objects;

constants

a record containing numerical constants, such as MANT_DIG, MAX, MIN, NAN.

The package must install methods Int, Rat, String for its objects, and creators NewFloat(filter,IsRat), NewFloat(filter,IsString).

It must then install methods for all arithmetic and numerical operations: SUM, Exp, ...

The user chooses that implementation by calling SetFloats (19.2-4) with the record as argument, and with an optional second argument requesting a precision in binary digits.
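
Selecting a package implementation might look as follows. This is only a sketch: it assumes that the optional float package is installed and that it exports a handler record named MPFR (an assumption; consult that package's manual for the records it actually provides).

```gap
# Assumption: the float package is available and defines the record MPFR.
if LoadPackage("float") = true then
    SetFloats(MPFR, 1000);               # request 1000 binary digits
    # Literals are converted on first use, hence after the call above.
    Print(PrecisionFloat(1.0), "\n");
fi;
```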

19.4 Complex arithmetic

Complex arithmetic may be implemented in packages, and is present in float. Complex numbers are treated as usual numbers; they may be input with an extra "i" as in -0.5+0.866i. They may also be created using NewFloat (19.2-1) with three arguments: the float filter, the real part, and the imaginary part.

Methods should then be implemented for Norm, RealPart, ImaginaryPart, ComplexConjugate, ...

19.4-1 Argument
‣ Argument( z )( attribute )

Returns the argument of the complex number z, namely the value Atan2(ImaginaryPart(z),RealPart(z)).

19.5 Interval-specific methods

Interval arithmetic may also be implemented in packages. Intervals are in fact efficient implementations of sets of real numbers. The only non-trivial issue is how they should be compared. The standard EQ tests if the intervals are equal; however, it is usually more useful to know if intervals overlap, or are disjoint, or are contained in each other.

Note the usual convention that intervals are compared as in [a,b]≤[c,d] if and only if a≤ c and b≤ d.

19.5-1 Sup
‣ Sup( x )( attribute )

Returns the supremum of the interval x.

19.5-2 Inf
‣ Inf( x )( attribute )

Returns the infimum of the interval x.

19.5-3 Mid
‣ Mid( x )( attribute )

Returns the midpoint of the interval x.

19.5-4 AbsoluteDiameter
‣ AbsoluteDiameter( x )( attribute )
‣ Diameter( x )( operation )

Returns the absolute diameter of the interval x, namely the difference Sup(x)-Inf(x).

19.5-5 RelativeDiameter
‣ RelativeDiameter( x )( attribute )

Returns the relative diameter of the interval x, namely (Sup(x)-Inf(x))/AbsoluteValue(Min(x)).

19.5-6 IsDisjoint
‣ IsDisjoint( x1, x2 )( operation )

Returns true if the two intervals x1, x2 are disjoint.

19.5-7 IsSubset
‣ IsSubset( x1, x2 )( operation )

Returns true if the interval x1 contains x2.

19.5-8 IncreaseInterval
‣ IncreaseInterval( x, delta )( operation )

Returns an interval with the same midpoint as x but absolute diameter increased by delta.

19.5-9 BlowupInterval
‣ BlowupInterval( x, ratio )( operation )

Returns an interval with the same midpoint as x but relative diameter increased by ratio.

19.5-10 BisectInterval
‣ BisectInterval( x )( operation )

Returns a list of two intervals whose union equals the interval x.
