UTF-8

Functions for operating on byte strings encoded as UTF-8.

Note

In many cases, it is better to convert to str, operate on the strings, then convert back to UTF-8. The str type can handle many of these operations itself. For those that it doesn’t (excluding control characters from length calculations, for instance), the code to do so with a str type is often simpler.
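The round trip described in this note can be sketched as follows. This is a minimal illustration, assuming Python 3 (where str is the text type and bytes is the byte type) and using only built-in methods, not anything from kitchen. The helper name upper_utf8 is hypothetical.

```python
def upper_utf8(data: bytes) -> bytes:
    """Uppercase UTF-8 encoded bytes by round-tripping through str.

    Hypothetical helper for illustration; it is not part of kitchen.
    """
    text = data.decode("utf-8")           # bytes -> str
    return text.upper().encode("utf-8")   # operate on str, then str -> bytes


raw = "café".encode("utf-8")
result = upper_utf8(raw)
```

Operating on the decoded str means the operation sees whole characters: "é" occupies two bytes in raw but is a single character once decoded.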

Warning

All of the functions in this module are deprecated. Most of them have been replaced with functions that operate on unicode values in kitchen.text.display. kitchen.text.utf8.utf8_valid() has been replaced with a function in kitchen.text.misc.

kitchen.text.utf8.utf8_text_fill(text, *args, **kwargs)

Deprecated Similar to textwrap.fill() but understands utf-8 strings and doesn’t screw up lists/blocks/etc.

Use kitchen.text.display.fill() instead.

kitchen.text.utf8.utf8_text_wrap(text, width=70, initial_indent='', subsequent_indent='')

Deprecated Similar to textwrap.wrap() but understands utf-8 data and doesn’t screw up lists/blocks/etc.

Use kitchen.text.display.wrap() instead.

kitchen.text.utf8.utf8_valid(msg)

Deprecated Detect if a string is valid utf-8

Use kitchen.text.misc.byte_string_valid_encoding() instead.
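For readers who only need the basic check, one common approach is to attempt a strict decode and catch the failure. The sketch below assumes Python 3 and uses only built-ins. The name is_valid_utf8 is hypothetical, and this is not necessarily how kitchen implements its own function.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


good = is_valid_utf8("く ku".encode("utf-8"))
bad = is_valid_utf8(b"\xff\xfe")  # 0xff can never appear in UTF-8
```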

kitchen.text.utf8.utf8_width(msg)

Deprecated Get the textual width of a utf-8 string

Use kitchen.text.display.textual_width() instead.
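Textual width here means the number of terminal columns a string occupies, which can differ from both its byte length and its character count. A rough approximation, assuming Python 3 and the standard unicodedata module, is sketched below. The name approx_width is hypothetical, and the rule used (wide and fullwidth characters count as 2, combining characters as 0, everything else as 1) is a simplification, not kitchen's actual algorithm.

```python
import unicodedata


def approx_width(text: str) -> int:
    """Approximate the number of display columns text occupies.

    Hypothetical helper; a simplification, not kitchen's implementation.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks attach to the previous character
        # East Asian Wide ("W") and Fullwidth ("F") characters take 2 columns
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


sample = "く ku"
columns = approx_width(sample)
```

For this sample, the character count is 4 and the UTF-8 byte length is 6, while the approximate display width is 5.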

kitchen.text.utf8.utf8_width_chop(msg, chop=None)

Deprecated Return a string chopped to a given textual width

Use textual_width_chop() and textual_width() instead:

>>> msg = 'く ku ら ra と to み mi'
>>> # Old way:
>>> utf8_width_chop(msg, 5)
(5, 'く ku')
>>> # New way: chop first, then measure the chopped text
>>> from kitchen.text.converters import to_bytes
>>> from kitchen.text.display import textual_width, textual_width_chop
>>> chopped = textual_width_chop(msg, 5)
>>> (textual_width(chopped), to_bytes(chopped))
(5, 'く ku')

kitchen.text.utf8.utf8_width_fill(msg, fill, chop=None, left=True, prefix='', suffix='')

Deprecated Pad a utf-8 string to fill a specified width

Use byte_string_textual_width_fill() instead.
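The idea of padding to a display width, rather than to a character count, can be sketched with the standard library alone. This assumes Python 3. The names display_width and pad_to_width are hypothetical, the width rule is a simplification, and the helper does not reproduce the chop, prefix, or suffix parameters of the kitchen functions.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate column count: wide/fullwidth = 2, combining = 0, else 1."""
    return sum(
        0 if unicodedata.combining(ch)
        else 2 if unicodedata.east_asian_width(ch) in ("W", "F")
        else 1
        for ch in text
    )


def pad_to_width(text: str, width: int, left: bool = True) -> str:
    """Pad text with spaces until it fills width columns (hypothetical helper).

    With left=True the text is left-justified (padding goes on the right).
    Text that is already wider than width is returned unchanged.
    """
    padding = " " * max(width - display_width(text), 0)
    return text + padding if left else padding + text


padded_left = pad_to_width("く", 4)
padded_right = pad_to_width("ab", 4, left=False)
```

Because "く" already occupies two columns, only two spaces are added to reach a width of 4, whereas str.ljust(4) would add three.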