Method sscanf()
- Method
sscanf
int sscanf(string data, string format, mixed ... lvalues)
- Description
The purpose of sscanf is to match the string data against a format
string and place the matching results into a list of variables. The
lvalues are destructively modified (which is only possible because
sscanf really is an opcode, rather than a Pike function) with the values
extracted from data according to the format specification. Only
the variables up to the last matching directive of the format string are
touched.
The format string can contain strings separated by special matching
directives like %d, %s, %c and %f. Every such
directive corresponds to one of the lvalues, in the order they are listed.
An lvalue is something that can be assigned to: the name of a variable
(including a local variable) or an index in an array, mapping or object.
It is because of these lvalues that sscanf cannot be implemented as a
normal function.
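As a minimal sketch of the lvalue forms listed above, the following scans into array elements and a mapping index instead of plain variables. The helper name scan_into_indices and the sample inputs are made up for illustration.

```pike
// Hypothetical helper: scans into indexed lvalues rather than plain variables.
array scan_into_indices()
{
  array(int) pair = ({ 0, 0 });
  mapping(string:string) m = ([]);

  // Array elements used as lvalues.
  sscanf("3,4", "%d,%d", pair[0], pair[1]);

  // A mapping index used as an lvalue.
  sscanf("name=pike", "name=%s", m["name"]);

  return ({ pair, m });
}

int main()
{
  write("%O\n", scan_into_indices());
  return 0;
}
```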
Whenever a percent character is found in the format string, a match is
performed, according to which operator and modifiers follow it:
"%b" | Reads a binary integer ("0101" makes 5).
|
"%d" | Reads a decimal integer ("0101" makes 101).
|
"%o" | Reads an octal integer ("0101" makes 65).
|
"%x" | Reads a hexadecimal integer ("0101" makes 257).
|
"%D" | Reads an integer that is either octal (leading zero),
hexadecimal (leading 0x) or decimal ("0101" makes
65).
|
"%c" | Reads one character and returns it as an integer
("0101" makes 48, or '0', leaving
"101" for later directives). Using the field width and
endianness modifiers, you can decode integers of any size and
endianness. For example "%-2c" decodes "0101"
into 12592, leaving "01" for later directives.
The sign modifier can be used to change the signedness of the
data, making "%+1c" decode "ä" into
-28.
|
"%n" | Returns the current character offset in data.
Note that any characters matching fields scanned with the
"!" modifier are removed from the count (see below).
|
"%f" | Reads a float ("0101" makes 101.0).
|
"%F" | Reads a float encoded according to the IEEE single precision
binary format ("0101" makes 6.45e-10,
approximately). Given a field width modifier of 8 (4 is the
default), the data will be decoded according to the IEEE
double precision binary format instead. (You will however
still get a float, unless your Pike was compiled with the
configure argument --with-double-precision.)
|
"%s" | Reads a string. If followed by %d, %s will only read non-numerical
characters. If followed by a %[], %s will only read characters not
present in the set. If followed by normal text, %s will match all
characters up to but not including the first occurrence of that text.
|
"%H" | Reads a Hollerith-encoded string, i.e. first reads the length
of the string and then that number of characters. The size and
byte order of the length descriptor can be modified in the
same way as %c. As an example "%2H" first reads
"%2c" and then the resulting number of characters.
|
"%[set]" | Matches a string containing a given set of characters (those given
inside the brackets). Ranges of characters can be defined by using
a minus character between the first and the last character to be
included in the range. Example: %[0-9H] means any digit or 'H'.
Note that sets that include the character '-' must have it first
(not possible in complemented sets, see below) or last in the brackets
to avoid having a range defined. Sets including the character ']' must
list this first too. If both '-' and ']' should be included
then put ']' first and '-' last. It is not possible to make a range
that ends with ']'; make the range end with '\' instead and put ']'
at the beginning of the set. Likewise it is generally not possible
to have a range start with '-'; make the range start with '.' instead
and put '-' at the end of the set. If the first character after the
[ bracket is '^' (%[^set]), and this character does not begin a
range, it means that the set is complemented, which is to say that
any character except those inside brackets is matched. To include '-'
in a complemented set, it must be put last, not first. To include '^'
in a non-complemented set, it can be put anywhere but first, or be
specified as a range ("^-^").
|
"%{format%}" | Repeatedly matches 'format' as many times as possible and assigns an
array of arrays with the results to the lvalue.
|
"%O" | Matches a Pike constant, such as a string or an integer (currently only
integer, string and character constants are functional).
|
"%%" | Matches a single percent character (hence this is how you quote the %
character to just match, and not start an lvalue matcher directive).
|
|
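The sketch below exercises a few of the operators from the table: the integer directives on the same "0101" input the table uses, and a %{format%} repetition built around a %[set] directive. The helper names scan_bases and scan_pairs and the "a=1,b=2," input are assumptions made for illustration.

```pike
// Hypothetical helper: scans the same input with each integer directive.
array(int) scan_bases(string s)
{
  int b, d, o, x, D;
  sscanf(s, "%b", b);  // binary
  sscanf(s, "%d", d);  // decimal
  sscanf(s, "%o", o);  // octal
  sscanf(s, "%x", x);  // hexadecimal
  sscanf(s, "%D", D);  // base chosen from the prefix
  return ({ b, d, o, x, D });
}

// Hypothetical helper: collects "key=number," pairs with %{ ... %}.
array scan_pairs(string s)
{
  array pairs;
  sscanf(s, "%{%[a-z]=%d,%}", pairs);
  return pairs;
}

int main()
{
  // Per the table above, "0101" gives 5, 101, 65, 257 and 65.
  write("%O\n", scan_bases("0101"));
  // Each repetition contributes one inner array, e.g. ({ "a", 1 }).
  write("%O\n", scan_pairs("a=1,b=2,"));
  return 0;
}
```

Note that %{format%} consumes a single lvalue, whatever the number of directives inside it.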
Similar to sprintf, you may supply modifiers between the % character
and the operator, to slightly change its behaviour from the default:
"*" | The operator will only match its argument, without assigning any
variable.
|
number | You may define a field width by supplying a numeric modifier.
This means that the format should match that number of
characters in the input data, whether it is reading a string, an
integer or something else. For instance, "0101" scanned with the
format %2c reads an unsigned short, 12337, leaving
the final "01" for later operators.
|
"-" | Supplying a minus sign toggles the decoding to read the data encoded
in little-endian byte order, rather than the default network
(big-endian) byte order.
|
"+" | Interpret the data as a signed entity. In other words,
"%+1c" will read "\xFF" as -1 instead
of 255 , as "%1c" would have.
|
"!" | Ignore the matched characters with respect to any following
"%n" .
|
|
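The following sketch combines the modifiers above with %c and %s, reusing the values quoted in the tables. The helper names and the "colour: red" input are made up for illustration.

```pike
// Hypothetical helper: decodes the first two bytes in both byte orders.
array(int) two_byte_orders(string s)
{
  int big, little;
  sscanf(s, "%2c", big);      // network (big-endian) order
  sscanf(s, "%-2c", little);  // little-endian order
  return ({ big, little });
}

// Hypothetical helper: reads one byte as unsigned and as signed.
array(int) one_byte_signs(string s)
{
  int plain, sign;
  sscanf(s, "%1c", plain);
  sscanf(s, "%+1c", sign);
  return ({ plain, sign });
}

// Hypothetical helper: skips the key with "%*s" and keeps only the value.
string value_only(string s)
{
  string v;
  sscanf(s, "%*s: %s", v);
  return v;
}

int main()
{
  write("%O\n", two_byte_orders("0101"));    // 12337 and 12592, as stated above
  write("%O\n", one_byte_signs("\xFF"));     // 255 and -1, as stated above
  write("%O\n", value_only("colour: red"));  // only the part after ": "
  return 0;
}
```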
- Note
Sscanf does not use backtracking. It simply looks at the format string
up to the next % and tries to match that with the string. It then proceeds
to look at the next part. If a part does not match, sscanf immediately
returns the number of % directives matched so far. If this happens, the
lvalues for the % directives that were not matched will not be changed.
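To illustrate the early return described in this note, the sketch below scans two integers and reports the match count together with both lvalues. The helper name scan_two_ints and the sentinel value -1 are assumptions chosen for the example.

```pike
// Hypothetical helper: returns the match count and both lvalues.
array(int) scan_two_ints(string s)
{
  int a = -1, b = -1;  // -1 marks "not assigned by sscanf"
  int n = sscanf(s, "%d %d", a, b);
  return ({ n, a, b });
}

int main()
{
  // Both directives match: count 2, a is 12, b is 34.
  write("%O\n", scan_two_ints("12 34"));
  // The second %d fails on "apples": count 1, a is 12, b keeps -1.
  write("%O\n", scan_two_ints("12 apples"));
  return 0;
}
```

Comparing the return value with the number of directives you expected is the simplest way to detect a partial match.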
- Example
// a will be assigned "oo" and 1 will be returned
sscanf("foo", "f%s", a);
// a will be 4711 and b will be "bar", 2 will be returned
sscanf("4711bar", "%d%s", a, b);
// a will be 4711, 2 will be returned
sscanf("bar4711foo", "%*s%d", a);
// a will become "test", 2 will be returned
sscanf(" \t test", "%*[ \t]%s", a);
// Remove "the " from the beginning of a string
// If 'str' does not begin with "the " it will not be changed
sscanf(str, "the %s", str);
// It is also possible to declare a variable directly in the sscanf call;
// another reason for sscanf not to be an ordinary function:
sscanf("abc def", "%s %s", string a, string b);
- Returns
The number of directives matched in the format string. Note that a string
directive (%s or %[]) counts as a match even when it matches just the empty
string (which either directive may do).
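A short sketch of the empty-string case mentioned above: a %[a-z] directive that finds no letters still counts towards the return value. The helper name scan_word_then_int and the initial value "unset" are made up for illustration.

```pike
// Hypothetical helper: returns the match count and the scanned values.
array scan_word_then_int(string s)
{
  string word = "unset";
  int num;
  int n = sscanf(s, "%[a-z]%d", word, num);
  return ({ n, word, num });
}

int main()
{
  // Letters followed by digits: count 2, word is "abc".
  write("%O\n", scan_word_then_int("abc123"));
  // No letters before the digits: %[a-z] still counts, and word becomes "".
  write("%O\n", scan_word_then_int("123"));
  return 0;
}
```

Callers that need a non-empty string should therefore check the string itself as well as the count.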
- See also
sprintf, array_sscanf