7.3.38. normalize

Note

This command is an experimental feature.

This command may be changed in the future.

7.3.38.1. Summary

The normalize command normalizes text with the specified normalizer.

You do not need to create a table to use the normalize command. It is useful for checking the result of a normalizer.

7.3.38.2. Syntax

This command takes three parameters.

normalizer and string are required. flags is optional:

normalize normalizer
          string
          [flags=NONE]
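The same three parameters can also be passed by name. As a minimal sketch, assuming a Groonga HTTP server that exposes commands under /d/<command> (the host, the port, and the helper name build_normalize_url are illustrative and not part of this reference), a request could be assembled in Python like this:

```python
from urllib.parse import quote, urlencode


def build_normalize_url(normalizer, string, flags=None,
                        base="http://localhost:10041"):
    """Build a URL for the normalize command (hypothetical helper).

    Assumes Groonga's HTTP interface is reachable at `base`;
    adjust the host and port for your setup.
    """
    params = {"normalizer": normalizer, "string": string}
    if flags is not None:
        # flags is optional; omit it to use the default (NONE).
        params["flags"] = flags
    # quote_via=quote encodes spaces as %20 rather than "+".
    return base + "/d/normalize?" + urlencode(params, quote_via=quote)


print(build_normalize_url("NormalizerAuto", "aBcDe 123"))
```

Omitting flags leaves the parameter out of the request entirely, which mirrors the optional [flags=NONE] in the syntax above.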

7.3.38.3. Usage

Here is a simple example of the normalize command.

Execution example:

normalize NormalizerAuto "aBcDe 123"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "normalized": "abcde 123",
#     "types": [],
#     "checks": []
#   }
# ]
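A client would typically parse this response as JSON. The following Python sketch reads the response shown in the execution example; it assumes that the first element is the header (return code, start time, elapsed time) and the second is the normalized_text object. The variable names are illustrative.

```python
import json

# Response copied from the execution example above.
response_text = """
[
  [0, 1337566253.89858, 0.000355720520019531],
  {"normalized": "abcde 123", "types": [], "checks": []}
]
"""

header, body = json.loads(response_text)
# Assumption: the header holds return code, start time, elapsed time.
return_code, start_time, elapsed = header

if return_code == 0:
    print(body["normalized"])  # prints: abcde 123
```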

7.3.38.4. Parameters

This section describes the parameters of normalize.

7.3.38.4.1. Required parameters

There are two required parameters, normalizer and string.

7.3.38.4.1.1. normalizer

Specifies the normalizer name. The normalize command uses the normalizer with the specified name.

See Normalizers for the built-in normalizers.

Here is an example that uses the built-in NormalizerAuto normalizer.

TODO

If you want to use other normalizers, you need to register an additional normalizer plugin with the register command. For example, you can use a MySQL-compatible normalizer by registering groonga-normalizer-mysql.

7.3.38.4.1.2. string

Specifies the string that you want to normalize.

If you want to include spaces in string, you need to quote string with single quotes (') or double quotes (").

Here is an example that uses spaces in string.

TODO
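When a command line is generated by a program, the quoting can be done by a small helper. The following Python sketch wraps a value in double quotes; it assumes that a backslash escapes double quotes and backslashes inside a quoted argument, and quote_argument is a hypothetical name, not something Groonga provides.

```python
def quote_argument(value):
    """Wrap value in double quotes for a Groonga command line.

    Assumption: a backslash escapes double quotes and backslashes
    inside a quoted argument. Hypothetical helper, not part of Groonga.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


command = "normalize NormalizerAuto " + quote_argument("aBcDe 123")
print(command)  # prints: normalize NormalizerAuto "aBcDe 123"
```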

7.3.38.4.2. Optional parameters

There is one optional parameter, flags.

7.3.38.4.2.1. flags

Specifies options that customize normalization. You can specify multiple options separated by “|”. For example, REMOVE_BLANK|WITH_TYPES.

Here are the available flags.

Flag                          Description
NONE                          Just ignored.
REMOVE_BLANK                  TODO
WITH_TYPES                    TODO
WITH_CHECKS                   Outputs the position of each character before normalization. Each position is relative to the previous character.
REMOVE_TOKENIZED_DELIMITER    TODO
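A program that builds the flags value can join the names with “|” and reject names that are not listed above. This is a minimal Python sketch; join_flags and KNOWN_FLAGS are hypothetical names used only for illustration.

```python
# Flag names as listed above.
KNOWN_FLAGS = {
    "NONE",
    "REMOVE_BLANK",
    "WITH_TYPES",
    "WITH_CHECKS",
    "REMOVE_TOKENIZED_DELIMITER",
}


def join_flags(*flags):
    """Join flag names with "|" (hypothetical helper)."""
    unknown = [flag for flag in flags if flag not in KNOWN_FLAGS]
    if unknown:
        raise ValueError("unknown flags: " + ", ".join(unknown))
    # No flags at all corresponds to the default, NONE.
    return "|".join(flags) if flags else "NONE"


print(join_flags("REMOVE_BLANK", "WITH_TYPES"))  # prints: REMOVE_BLANK|WITH_TYPES
```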

Here is an example that uses REMOVE_BLANK.

TODO

Here is an example that uses WITH_TYPES.

TODO

Here is an example that uses WITH_CHECKS.

Execution example:

normalize NormalizerAuto " A  B   C" WITH_CHECKS
#[
#  [
#    0,
#    0.0,
#    0.0
#  ],
#  {
#    "normalized":" a   b    c",
#    "types": [
#
#    ],
#    "checks": [
#      3,
#      1,
#      3,
#      3,
#      1,
#      3,
#      3,
#      3,
#      1
#    ]
#  }
#]
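Because each entry in checks is relative to the previous character, a running sum is one way to turn the entries into offsets from the start of the original text. The Python sketch below applies that reading to the values from the example above; treating the entries as simple increments is an assumption, not something this reference states.

```python
from itertools import accumulate

# "checks" copied from the execution example above.
checks = [3, 1, 3, 3, 1, 3, 3, 3, 1]

# Assumption: each entry is an offset relative to the previous
# character, so a running sum gives offsets from the start of the
# original text.
absolute = list(accumulate(checks))
print(absolute)  # prints: [3, 4, 7, 10, 11, 14, 17, 20, 21]
```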

Here is an example that uses REMOVE_TOKENIZED_DELIMITER.

TODO

7.3.38.5. Return value

[HEADER, normalized_text]

HEADER

See Output format for details about HEADER.

normalized_text

normalized_text is an object that has the following attributes.

Name          Description
normalized    The normalized text.
types         An array of types of the characters in the normalized text. The N-th type is the type of the N-th character in normalized.
checks        An array of character positions before normalization. See WITH_CHECKS in flags.

7.3.38.6. See also