Pike v8.0 release 1738

Method MIME.tokenize()


Method tokenize

array(string|int) tokenize(string header, int|void flags)

Description

A structured header field, as specified by RFC 822, is constructed from a sequence of lexical elements.

Parameter header

The header value to parse.

Parameter flags

An optional set of flags. Currently only one flag is defined:

TOKENIZE_KEEP_ESCAPES

Keep backslash-escapes in quoted-strings.
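
The effect of the flag can be sketched as follows. This is a minimal illustration, assuming the constant is reachable as MIME.TOKENIZE_KEEP_ESCAPES; the header value is made up for the example.

```pike
int main()
{
  // Illustrative header value: the quoted-string "a\"b", written here
  // as a Pike string literal.
  string header = "\"a\\\"b\"";

  // Default: backslash-escapes inside the quoted-string are decoded.
  write("%O\n", MIME.tokenize(header));

  // With the flag: backslash-escapes are kept in the returned string.
  write("%O\n", MIME.tokenize(header, MIME.TOKENIZE_KEEP_ESCAPES));
  return 0;
}
```

Keeping the escapes is mainly useful when the tokens are to be reassembled into a header later, since the original quoting can then be reproduced.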

The lexical elements parsed are:

individual special characters

quoted-strings

domain-literals

comments

atoms

This function will analyze a string containing the header value, and produce an array containing the lexical elements.

Individual special characters will be returned as characters (i.e. ints).

Quoted-strings, domain-literals and atoms will be decoded and returned as strings.

Comments are not returned in the array at all.
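
The rules above can be illustrated with a short sketch. The helper describe_tokens() and the header value are hypothetical and exist only for this example; the only library call used is MIME.tokenize() itself.

```pike
// Hypothetical helper: render a token list in readable form.
// Special characters arrive as ints, everything else as strings.
string describe_tokens(array(string|int) tokens)
{
  array(string) parts = ({});
  foreach (tokens, string|int tok) {
    if (intp(tok))
      parts += ({ sprintf("special %c", tok) });
    else
      parts += ({ sprintf("string %O", tok) });
  }
  return parts * ", ";
}

int main()
{
  // Illustrative Content-Type value with a quoted-string and a comment.
  string header = "text/plain; charset=\"us-ascii\" (a comment)";
  write("%s\n", describe_tokens(MIME.tokenize(header)));
  return 0;
}
```

Testing each element with intp() is the usual way to tell a special character apart from an atom or quoted-string, because both kinds share one array.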

Note

Because domain-literals are returned as strings, the result does not distinguish the domain-literal [127.0.0.1] from the quoted-string "[127.0.0.1]". This rarely matters in practice, since domain-literals are seldom used.
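
A minimal sketch of this ambiguity, relying only on the behaviour described in this note; the address is an arbitrary example.

```pike
int main()
{
  // A domain-literal and a quoted-string with the same text.
  array(string|int) literal = MIME.tokenize("[127.0.0.1]");
  array(string|int) quoted = MIME.tokenize("\"[127.0.0.1]\"");

  write("%O\n%O\n", literal, quoted);

  // equal() compares the two arrays element by element.
  write("same result: %s\n", equal(literal, quoted) ? "yes" : "no");
  return 0;
}
```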

The set of special characters is the one specified in RFC 1521 (i.e. "<", ">", "@", ",", ";", ":", "\", "/", "?", "="), not the one specified in RFC 822.

See also

MIME.quote(), tokenize_labled(), decode_words_tokenized_remapped().