- Method
set_unicode_encode_mode
int(0..1) set_unicode_encode_mode(int enable)
- Description
Enables or disables unicode encode mode.
In this mode, if the server supports UTF-8 and the connection
charset is latin1 (the default) or unicode, then big_query
handles wide unicode queries. Enabled by default.
Unicode encode mode works as follows: Eight bit strings are sent
as latin1 and wide strings are sent using utf8. big_query sends
SET character_set_client statements as necessary to update the
charset on the server side. If the server doesn't support that
then it fails, but the wide string query would fail anyway.
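The rule above can be modelled in a few lines. The following is a minimal Python sketch, not the actual big_query implementation: the names encode_query and statements_to_send are hypothetical, and the exact text of the SET statement is an assumption. Python's "latin-1" codec maps code points 0..255 directly to bytes, which stands in for an eight bit string here.

```python
def encode_query(query):
    """Return (client charset, encoded bytes) for one query string."""
    if all(ord(ch) < 256 for ch in query):
        # Eight bit string: every character fits in a single byte.
        return "latin1", query.encode("latin-1")
    # Wide string: at least one character is above 255.
    return "utf8", query.encode("utf-8")


def statements_to_send(queries, current="latin1"):
    """List the byte strings that would go to the server, in order.

    A SET character_set_client statement is added only when the
    charset needed for the next query differs from the current one.
    The statement text is an assumption made for illustration.
    """
    sent = []
    for query in queries:
        charset, payload = encode_query(query)
        if charset != current:
            sent.append(("SET character_set_client = " + charset).encode("ascii"))
            current = charset
        sent.append(payload)
    return sent
```

In this model a run of eight bit queries causes no extra statements; a switch is only paid for when a wide query follows an eight bit one or vice versa.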
To make this transparent, string literals with introducers (e.g.
_binary 'foo') are excluded from the UTF-8 encoding. This means
that big_query needs to do some superficial parsing of the query
when it is a wide string.
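To illustrate what such superficial parsing might look like, here is a simplified Python sketch. It is not the parser big_query uses: the regular expression and the name encode_wide_query are hypothetical, and passing the excluded literals through byte for byte is an assumption about what "excluded" means. It also does not notice an introducer that happens to appear inside another string literal or a comment.

```python
import re

# Hypothetical, simplified pattern: an introducer such as _binary or
# _latin1 (not preceded by an identifier character), optional
# whitespace, then a single-quoted literal that may contain backslash
# escapes or doubled quotes.
INTRODUCED_LITERAL = re.compile(
    r"(?<![A-Za-z0-9_$])_[A-Za-z0-9]+\s*'(?:[^'\\]|\\.|'')*'",
    re.DOTALL,
)


def encode_wide_query(query):
    """UTF-8 encode a wide query, leaving introduced literals as raw bytes."""
    parts = []
    pos = 0
    for match in INTRODUCED_LITERAL.finditer(query):
        # Text before the literal gets the normal UTF-8 encoding.
        parts.append(query[pos:match.start()].encode("utf-8"))
        # The literal is passed through one byte per character; this
        # raises UnicodeEncodeError if it contains a wide character.
        parts.append(match.group(0).encode("latin-1"))
        pos = match.end()
    parts.append(query[pos:].encode("utf-8"))
    return b"".join(parts)
```

With this sketch, the character 0xff inside _binary 'a\xff' stays a single byte, while the same character in an ordinary literal becomes the two byte UTF-8 sequence.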
- Returns
1 | Unicode encode mode is enabled.
0 | Unicode encode mode couldn't be enabled because an
    incompatible connection charset is set. You need to do
    set_charset ("latin1") or set_charset ("unicode") to enable it.
- Note
Note that this mode doesn't affect the MySQL system variable
character_set_connection, i.e. it will still be set to latin1
by default which means server functions like UPPER() won't handle
non-latin1 characters correctly in all cases.
To fix that, do set_charset ("unicode"). That will allow unicode
encode mode to work while utf8 is fully enabled at the server side.
Tip: If you enable utf8 on the server side, you need to send raw
binary strings as _binary'...'. Otherwise they will get UTF-8
encoded by the server.
- Note
When unicode encode mode is enabled and the connection charset
is latin1, the charset accepted by big_query is not quite Unicode
since latin1 is based on cp1252. The differences are in the range
0x80..0x9f where Unicode has control chars.
This small discrepancy is not present when the connection
charset is unicode.
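The discrepancy can be made concrete with Python's built-in codecs, used here only as a stand-in for the server's charset tables. The function name cp1252_differences is hypothetical. Python's cp1252 codec leaves a few bytes in this range undefined; the sketch skips them, and how the server treats those bytes is not shown here.

```python
def cp1252_differences():
    """Map each byte in 0x80..0x9f to its cp1252 character where that
    character differs from the Unicode code point of the same value."""
    diffs = {}
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # Undefined in Python's cp1252 codec; skipped in this sketch.
            continue
        if ch != chr(byte):
            diffs[byte] = ch
    return diffs


def same_outside_range():
    """Check that cp1252 and Unicode agree for 0x00..0x7f and 0xa0..0xff."""
    others = list(range(0x00, 0x80)) + list(range(0xA0, 0x100))
    return all(bytes([b]).decode("cp1252") == chr(b) for b in others)
```

For example, byte 0x80 is the euro sign (U+20AC) in cp1252, whereas U+0080 in Unicode is a control character.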
- See also
set_unicode_decode_mode, set_charset