7.3.23. index_column_diff

7.3.23.1. Summary

New in version 9.0.1.

The index_column_diff command checks whether an index column is broken.

This command lets us find an index that is already broken. Normally, we don't notice a broken index until Groonga refers to, deletes, or updates it. However, using a broken index may make Groonga crash or return wrong search results, so we want to find it in advance. This command is useful in that case.

Note

This command may use a lot of memory and take a long time, depending on the size of the target index. Also, if we stop this command in the middle of execution, the target index may break. Therefore, we suggest executing this command on a standby system rather than on an active system.

7.3.23.2. Syntax

This command takes two parameters. All parameters are required:

index_column_diff table index_column

7.3.23.3. Usage

Here is an example that checks an index column in the database:

Execution example:

table_create Data TABLE_HASH_KEY ShortText
# [[0,1612416118.30525,0.003424882888793945],true]
table_create Terms TABLE_PAT_KEY ShortText \
  --default_tokenizer TokenNgram \
  --normalizer NormalizerNFKC130
# [[0,1612416136.049046,0.003507614135742188],true]
load --table Data
[
{"_key": "Hello World"},
{"_key": "Hello Groonga"}
]
# [[0,1612416155.418526,0.3676469326019287],2]
column_create \
  --table Terms \
  --name data_index \
  --flags COLUMN_INDEX|WITH_POSITION \
  --type Data \
  --source _key
# [[0,1612416424.515037,0.00576472282409668],true]
truncate Terms.data_index
# [[0,1612416439.925894,0.009646892547607422],true]
load --table Data
[
{"_key": "Good-by World"},
{"_key": "Good-by Groonga"}
]
# [[0,1612416450.429434,1.51789665222168],2]
index_column_diff Terms data_index
# [
#   [
#     0,
#     1612416577.921113,
#     0.006278038024902344
#   ],
#   [
#     {
#       "token": {
#         "id": 2,
#         "value": "hello"
#       },
#       "remains": [
#       ],
#       "missings": [
#         {
#           "record_id": 1,
#           "position": 0
#         },
#         {
#           "record_id": 2,
#           "position": 0
#         }
#       ]
#     },
#     {
#       "token": {
#         "id": 3,
#         "value": "world"
#       },
#       "remains": [
#       ],
#       "missings": [
#         {
#           "record_id": 1,
#           "position": 1
#         }
#       ]
#     },
#     {
#       "token": {
#         "id": 1,
#         "value": "groonga"
#       },
#       "remains": [
#       ],
#       "missings": [
#         {
#           "record_id": 2,
#           "position": 1
#         }
#       ]
#     }
#   ]
# ]

7.3.23.4. Parameters

This section describes all parameters.

7.3.23.4.1. table

Specifies the name of the table that contains the index column to check.

7.3.23.4.2. index_column

Specifies the name of the index column to check.

7.3.23.5. Return value

The index_column_diff command returns the result of checking the index column:

[HEADER, CHECK_RESULT]

HEADER

See Output format about HEADER.

CHECK_RESULT

This command compares the current content of the index column with the result of tokenizing the source data at the time the command is executed. It reports each difference per token in the following form:

{
  "token": {
    "id": TOKEN_ID,
    "value": TOKEN_VALUE
  },
  "remains": [
    {
      "record_id": RECORD_ID
    }
  ],
  "missings": [
    {
      "record_id": RECORD_ID,
      "position": POSITION
    }
  ]
}

If remains is not empty, a token that Groonga was supposed to delete still remains in the index.

If missings is not empty, a token that is supposed to be in the index has been deleted from it.
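The meaning of remains and missings can be pictured as two set differences. The following Python sketch is only an illustrative model, not Groonga's implementation: the function name diff_postings and the sample postings are hypothetical, and a posting is modeled as a (record_id, position) pair for a single token.

```python
# Illustrative model only: this is not how Groonga implements the check.
# A posting is modeled as a (record_id, position) pair for one token.

def diff_postings(indexed, expected):
    """Compare postings held by an index with postings from re-tokenizing."""
    # In the index, although re-tokenizing says they should not be there.
    remains = sorted(indexed - expected)
    # Produced by re-tokenizing, although the index does not hold them.
    missings = sorted(expected - indexed)
    return remains, missings


# Hypothetical postings for a single token.
indexed = {(3, 0)}            # what the index currently holds
expected = {(1, 0), (2, 0)}   # what tokenizing the source data produces

remains, missings = diff_postings(indexed, expected)
print("remains:", remains)
print("missings:", missings)
```

When both lists are empty for every token, the index agrees with the source data, which corresponds to an empty CHECK_RESULT.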

index_column_diff returns an empty result, as below, when the index is not broken:

index_column_diff --table table --name index_column
[[0,0.0,0.0],[]]
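A script that runs this check may want to reduce the response to a short report. The following Python sketch assumes the response is available as JSON text in the [HEADER, CHECK_RESULT] form described in this section; the function name summarize_diff is hypothetical, and how the response is obtained (HTTP, command line, and so on) is left out.

```python
import json


def summarize_diff(response_text):
    """Return one summary dict per broken token found in the response."""
    header, check_result = json.loads(response_text)
    # The first element of HEADER is the return code; 0 means success.
    if header[0] != 0:
        raise RuntimeError("index_column_diff failed: %r" % (header,))
    summary = []
    for entry in check_result:
        summary.append({
            "token": entry["token"]["value"],
            "remains": len(entry["remains"]),
            "missings": len(entry["missings"]),
        })
    return summary


# Shortened sample based on the execution example in the Usage section.
broken = """
[[0, 1612416577.921113, 0.006278038024902344],
 [{"token": {"id": 2, "value": "hello"},
   "remains": [],
   "missings": [{"record_id": 1, "position": 0},
                {"record_id": 2, "position": 0}]}]]
"""
healthy = "[[0,0.0,0.0],[]]"

print(summarize_diff(broken))
print(summarize_diff(healthy))
```

An empty list from summarize_diff means that no broken token was reported.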

7.3.23.5.1. TOKEN_ID

TOKEN_ID is the ID of a broken token.

7.3.23.5.2. TOKEN_VALUE

TOKEN_VALUE is the value of a broken token.

7.3.23.5.3. RECORD_ID

RECORD_ID is the ID of a record that includes a broken token.

7.3.23.5.4. POSITION

POSITION is the position where a broken token appears.