‘-C[aefFmr]’ controls the degree of table compression and, more generally, trade-offs between small scanners and fast scanners.
A lone ‘-C’ specifies that the scanner tables should be compressed but neither equivalence classes nor meta-equivalence classes should be used.
%option align
‘-Ca’ (“align”) instructs flex to accept larger tables in the generated scanner in exchange for faster performance, because the elements of the tables are better aligned for memory access and computation. On some RISC architectures, fetching and manipulating longwords is more efficient than working with smaller-sized units such as shortwords. This option can quadruple the size of the tables used by your scanner.
%option ecs
‘-Ce’ directs flex to construct equivalence classes, i.e., sets of characters which have identical lexical properties (for example, if the only appearance of digits in the flex input is in the character class ‘[0-9]’, then the digits ‘0’, ‘1’, ..., ‘9’ will all be put in the same equivalence class). Equivalence classes usually give dramatic reductions in the final table/object file sizes (typically a factor of 2-5) and are pretty cheap performance-wise (one array look-up per character scanned).
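The cost of “one array look-up per character scanned” can be pictured with a small hand-written sketch. The tables below are illustrative assumptions made up for this example (the names ec_table, next_state and match_number are hypothetical) and are not the arrays that flex actually emits. The point is only that the transition table needs one column per equivalence class instead of one column per character.

```c
#include <stddef.h>

/* Illustrative only: hand-written tables for a scanner that recognizes a
 * run of digits. These are NOT the arrays flex generates. */
enum { EC_OTHER = 0, EC_DIGIT = 1, EC_COUNT = 2 };
enum { ST_JAM = -1, ST_START = 0, ST_NUMBER = 1, ST_COUNT = 2 };

/* Character -> equivalence class. Every digit shares one class; all other
 * characters default to EC_OTHER (0). */
static const unsigned char ec_table[256] = {
    ['0'] = EC_DIGIT, ['1'] = EC_DIGIT, ['2'] = EC_DIGIT, ['3'] = EC_DIGIT,
    ['4'] = EC_DIGIT, ['5'] = EC_DIGIT, ['6'] = EC_DIGIT, ['7'] = EC_DIGIT,
    ['8'] = EC_DIGIT, ['9'] = EC_DIGIT,
};

/* Transition table with EC_COUNT (2) columns rather than 256. */
static const int next_state[ST_COUNT][EC_COUNT] = {
    /* ST_START  */ { ST_JAM, ST_NUMBER },
    /* ST_NUMBER */ { ST_JAM, ST_NUMBER },
};

/* Returns the length of the run of digits at the start of s. */
size_t match_number(const char *s)
{
    int state = ST_START;
    size_t len = 0;

    for (;;) {
        unsigned char c = (unsigned char)s[len];
        int ec = ec_table[c];              /* the extra look-up per character */
        int next = next_state[state][ec];
        if (next == ST_JAM)
            break;                         /* no transition: stop scanning */
        state = next;
        ++len;
    }
    return len;
}
```

In this sketch the transition table shrinks from 2 x 256 entries to 2 x 2, at the price of the ec_table look-up on every character.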
‘-Cf’ specifies that the full scanner tables should be generated: flex should not compress the tables by taking advantage of similar transition functions for different states.
‘-CF’ specifies that the alternate fast scanner representation (described below under the ‘--fast’ flag) should be used. This option cannot be used with ‘--c++’.
%option meta-ecs
‘-Cm’ directs flex to construct meta-equivalence classes, which are sets of equivalence classes (or characters, if equivalence classes are not being used) that are commonly used together. Meta-equivalence classes are often a big win when using compressed tables, but they have a moderate performance impact (one or two if tests and one array look-up per character scanned).
%option read
‘-Cr’ causes the generated scanner to bypass use of the standard I/O library (stdio) for input. Instead of calling fread() or getc(), the scanner will use the read() system call, resulting in a performance gain which varies from system to system, but in general is probably negligible unless you are also using ‘-Cf’ or ‘-CF’. Using ‘-Cr’ can cause strange behavior if, for example, you read from yyin using stdio prior to calling the scanner (because the scanner will miss whatever text your previous reads left in the stdio input buffer). ‘-Cr’ has no effect if you define YY_INPUT() (see The Generated Scanner).
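The stdio pitfall mentioned for ‘-Cr’ can be reproduced without flex at all. The following is a minimal sketch, assuming a POSIX system and a stdio implementation that buffers pipe input (the usual case). The function name is hypothetical. It reads one character with getc() and then calls read() on the same descriptor, the way a ‘-Cr’ scanner would.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Writes six bytes into a pipe, consumes one of them through stdio, then
 * asks read() for the rest. Returns the number of bytes read() still sees,
 * or -1 on error. Hypothetical helper for illustration only. */
long bytes_left_for_read_after_getc(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], "abcdef", 6) != 6) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                 /* reader will see end-of-file after 6 bytes */

    FILE *in = fdopen(fds[0], "r");
    if (in == NULL) {
        close(fds[0]);
        return -1;
    }

    /* getc() returns 'a', but a buffering stdio pulls all available bytes
     * into its own buffer while doing so. */
    int first = getc(in);

    /* read() bypasses the stdio buffer, so "bcdef" is invisible to it. */
    char buf[16];
    ssize_t n = read(fileno(in), buf, sizeof buf);

    fclose(in);
    return first == 'a' ? (long)n : -1;
}
```

On typical implementations the function returns 0: the five unread bytes sit in the stdio buffer, which is exactly the text a ‘-Cr’ scanner would miss.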
The options ‘-Cf’ or ‘-CF’ and ‘-Cm’ do not make sense together: there is no opportunity for meta-equivalence classes if the table is not being compressed. Otherwise the options may be freely mixed, and are cumulative.
The default setting is ‘-Cem’, which specifies that flex should generate equivalence classes and meta-equivalence classes. This setting provides the highest degree of table compression. You can obtain faster-executing scanners at the cost of larger tables, with the following generally being true:
slowest & smallest
    -Cem
    -Cm
    -Ce
    -C
    -C{f,F}e
    -C{f,F}
    -C{f,F}a
fastest & largest
Note that scanners with the smallest tables are usually generated and compiled the quickest, so during development you will usually want to use the default, maximal compression.
‘-Cfe’ is often a good compromise between speed and size for production scanners.
%option full
specifies a fast scanner that uses the full table representation. No table compression is done and stdio is bypassed. The result is large but fast. This option is equivalent to ‘-Cfr’.
%option fast
specifies that the fast scanner table representation should be used (and stdio bypassed). This representation is about as fast as the full table representation (‘--full’), and for some sets of patterns will be considerably smaller (and for others, larger). In general, if the pattern set contains both keywords and a catch-all identifier rule, such as in the set:

    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;

then you’re better off using the full table representation. If only the identifier rule is present and you then use a hash table or some such to detect the keywords, you’re better off using ‘--fast’.
This option is equivalent to ‘-CFr’. It cannot be used with ‘--c++’.
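For the identifier-only approach, where keywords are detected after the scanner has matched the catch-all rule, a minimal sketch follows. It uses a binary search over a sorted table rather than a hash table, purely to keep the example short. The token names come from the rule set above. Their numeric values, the keyword table, and the helper classify_identifier() are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Token names are taken from the rules in the text; the numeric values are
 * arbitrary choices for this sketch. */
enum token { TOK_ID = 258, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *text;
    enum token  tok;
};

/* Must stay sorted by text, because bsearch() requires sorted input. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

/* bsearch() comparison: key is the identifier text, elem is a table entry. */
static int compare_keyword(const void *key, const void *elem)
{
    const struct keyword *kw = elem;
    return strcmp((const char *)key, kw->text);
}

/* Hypothetical helper: returns the keyword token for text, or TOK_ID if
 * text is not a keyword. */
enum token classify_identifier(const char *text)
{
    const struct keyword *kw =
        bsearch(text, keywords, sizeof keywords / sizeof keywords[0],
                sizeof keywords[0], compare_keyword);
    return kw ? kw->tok : TOK_ID;
}
```

With such a helper the pattern set reduces to the single rule

    [a-z]+    return classify_identifier(yytext);

which is the situation in which ‘--fast’ tends to pay off.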