4.4. Various search conditions

Groonga can narrow down search results with a JavaScript-like syntax and sort them by a calculated value. It can also narrow down and sort search results by location information (latitude and longitude).

4.4.1. Narrow down & Full-text search by using syntax like JavaScript

The filter parameter of the select command accepts a search condition. It differs from the query parameter in one respect: the condition for the filter parameter must be written in a JavaScript-like syntax.

Execution example:

select --table Site --filter "_id <= 1" --output_columns _id,_key
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/"
#       ]
#     ]
#   ]
# ]

Let's look at the query above in detail. The condition specified for the filter parameter is:

_id <= 1

This query returns the records whose _id value is less than or equal to 1.

You can also use && for AND conditions and || for OR conditions.

Execution example:

select --table Site --filter "_id >= 4 && _id <= 6" --output_columns _id,_key
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         4,
#         "http://example.net/afr"
#       ],
#       [
#         5,
#         "http://example.org/aba"
#       ],
#       [
#         6,
#         "http://example.com/rab"
#       ]
#     ]
#   ]
# ]
select --table Site --filter "_id <= 2 || _id >= 7" --output_columns _id,_key
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/"
#       ],
#       [
#         2,
#         "http://example.net/"
#       ],
#       [
#         7,
#         "http://example.net/atv"
#       ],
#       [
#         8,
#         "http://example.org/gat"
#       ],
#       [
#         9,
#         "http://example.com/vdw"
#       ]
#     ]
#   ]
# ]

If you specify the query parameter and the filter parameter at the same time, the result contains only the records that meet both conditions.
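To make the comparison and boolean operators in the filter examples above concrete, here is a minimal Python sketch that applies the same three conditions to an in-memory copy of the nine Site records shown in the outputs. It only mimics the logic of the expressions and does not talk to Groonga; the helper name `select_ids` is made up for this illustration.

```python
# In-memory stand-in for the Site table. The _id and _key values are
# taken from the execution examples in this section.
SITE = [
    (1, "http://example.org/"),
    (2, "http://example.net/"),
    (3, "http://example.com/"),
    (4, "http://example.net/afr"),
    (5, "http://example.org/aba"),
    (6, "http://example.com/rab"),
    (7, "http://example.net/atv"),
    (8, "http://example.org/gat"),
    (9, "http://example.com/vdw"),
]


def select_ids(predicate):
    """Hypothetical helper: return the _id of every record matching predicate."""
    return [_id for _id, _key in SITE if predicate(_id)]


# --filter "_id <= 1"
le_1 = select_ids(lambda _id: _id <= 1)
# --filter "_id >= 4 && _id <= 6"  (&& corresponds to Python's `and`)
between = select_ids(lambda _id: _id >= 4 and _id <= 6)
# --filter "_id <= 2 || _id >= 7"  (|| corresponds to Python's `or`)
outside = select_ids(lambda _id: _id <= 2 or _id >= 7)

print(le_1, between, outside)
```

The three lists contain the same _id values as the corresponding execution examples (1 record, 3 records, and 5 records).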

4.4.2. Sort by using scorer

The select command accepts a scorer parameter, which is used to process each record in the full-text search results.

Like the filter parameter, this parameter accepts an expression written in a JavaScript-like syntax.

Execution example:

select --table Site --filter "true" --scorer "_score = rand()" --output_columns _id,_key,_score --sort_keys _score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         6,
#         "http://example.com/rab",
#         424238335
#       ],
#       [
#         9,
#         "http://example.com/vdw",
#         596516649
#       ],
#       [
#         7,
#         "http://example.net/atv",
#         719885386
#       ],
#       [
#         2,
#         "http://example.net/",
#         846930886
#       ],
#       [
#         8,
#         "http://example.org/gat",
#         1649760492
#       ],
#       [
#         3,
#         "http://example.com/",
#         1681692777
#       ],
#       [
#         4,
#         "http://example.net/afr",
#         1714636915
#       ],
#       [
#         1,
#         "http://example.org/",
#         1804289383
#       ],
#       [
#         5,
#         "http://example.org/aba",
#         1957747793
#       ]
#     ]
#   ]
# ]
select --table Site --filter "true" --scorer "_score = rand()" --output_columns _id,_key,_score --sort_keys _score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         4,
#         "http://example.net/afr",
#         783368690
#       ],
#       [
#         2,
#         "http://example.net/",
#         1025202362
#       ],
#       [
#         5,
#         "http://example.org/aba",
#         1102520059
#       ],
#       [
#         1,
#         "http://example.org/",
#         1189641421
#       ],
#       [
#         3,
#         "http://example.com/",
#         1350490027
#       ],
#       [
#         8,
#         "http://example.org/gat",
#         1365180540
#       ],
#       [
#         9,
#         "http://example.com/vdw",
#         1540383426
#       ],
#       [
#         7,
#         "http://example.net/atv",
#         1967513926
#       ],
#       [
#         6,
#         "http://example.com/rab",
#         2044897763
#       ]
#     ]
#   ]
# ]

‘_score’ is a pseudo column. The full-text search score is assigned to it. See Pseudo column for details about the ‘_score’ column.

In the above query, the condition of scorer parameter is:

_score = rand()

In this case, the full-text search score is overwritten with the return value of the rand() function.

The condition of sort_keys parameter is:

_score

This sorts the search results by _score in ascending order.

As a result, the order of the search results is randomized.
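The two steps described above (overwrite the score, then sort by it) can be sketched in Python as follows. This is only an illustration of the idea under simple assumptions: the records live in a plain dictionary, and `random.randrange` stands in for Groonga's rand() function, whose exact value range is not specified here.

```python
import random

# _id -> _key, copied from the execution examples in this section.
SITE = {
    1: "http://example.org/",
    2: "http://example.net/",
    3: "http://example.com/",
    4: "http://example.net/afr",
    5: "http://example.org/aba",
    6: "http://example.com/rab",
    7: "http://example.net/atv",
    8: "http://example.org/gat",
    9: "http://example.com/vdw",
}

# Scorer step: every matched record gets a random score
# (stand-in for `_score = rand()`; the range is an assumption).
scored = [
    {"_id": _id, "_key": key, "_score": random.randrange(2**31)}
    for _id, key in SITE.items()
]

# Sort step: `--sort_keys _score` sorts by _score in ascending order.
ordered = sorted(scored, key=lambda record: record["_score"])

for record in ordered:
    print(record["_id"], record["_key"], record["_score"])
```

Because the scores are random, each run prints the nine records in a different order, just as the two execution examples above return different orders.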

4.4.3. Narrow down & sort by using location information

Groonga can store location information (longitude and latitude) and use it both to narrow down and to sort search results.

Groonga provides two column types for storing location information: TokyoGeoPoint and WGS84GeoPoint. TokyoGeoPoint is for the Japanese geodetic system. WGS84GeoPoint is for the World Geodetic System.

Specify latitude and longitude in one of the following formats:

  • “[latitude in milliseconds]x[longitude in milliseconds]” (e.g.: “128452975x503157902”)

  • “[latitude in milliseconds],[longitude in milliseconds]” (e.g.: “128452975,503157902”)

  • “[latitude in degrees]x[longitude in degrees]” (e.g.: “35.6813819x139.7660839”)

  • “[latitude in degrees],[longitude in degrees]” (e.g.: “35.6813819,139.7660839”)

Let’s store the locations of two stations in Japan in the World Geodetic System: Tokyo station and Shinjyuku station. Tokyo station is at latitude 35 degrees 40 minutes 52.975 seconds and longitude 139 degrees 45 minutes 57.902 seconds. Shinjyuku station is at latitude 35 degrees 41 minutes 27.316 seconds and longitude 139 degrees 42 minutes 0.929 seconds. Thus, their locations in milliseconds are “128452975x503157902” and “128487316x502920929” respectively, and their locations in degrees are “35.6813819x139.7660839” and “35.6909211x139.7002581” respectively.

Let’s register the location information in milliseconds.
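The conversion used above is plain arithmetic: one degree is 60 minutes, one minute is 60 seconds, and one second is 1000 milliseconds, so one degree equals 3,600,000 milliseconds. The following Python sketch reproduces the values for the two stations. The function names are chosen for this illustration and are not part of Groonga.

```python
MSEC_PER_DEGREE = 3_600_000  # 60 minutes * 60 seconds * 1000 milliseconds


def dms_to_msec(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to whole milliseconds."""
    return round((degrees * 3600 + minutes * 60 + seconds) * 1000)


def msec_to_degrees(msec):
    """Convert milliseconds to decimal degrees."""
    return msec / MSEC_PER_DEGREE


# (latitude, longitude) in milliseconds for each station.
tokyo = (dms_to_msec(35, 40, 52.975), dms_to_msec(139, 45, 57.902))
shinjyuku = (dms_to_msec(35, 41, 27.316), dms_to_msec(139, 42, 0.929))

# "[latitude in milliseconds]x[longitude in milliseconds]" format.
tokyo_point = f"{tokyo[0]}x{tokyo[1]}"
shinjyuku_point = f"{shinjyuku[0]}x{shinjyuku[1]}"
print(tokyo_point)      # 128452975x503157902
print(shinjyuku_point)  # 128487316x502920929

# The same points in degrees, rounded to 7 decimal places.
tokyo_degrees = tuple(round(msec_to_degrees(v), 7) for v in tokyo)
print(tokyo_degrees)
```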

Execution example:

column_create --table Site --name location --type WGS84GeoPoint
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Site
[
 {"_key":"http://example.org/","location":"128452975x503157902"},
 {"_key":"http://example.net/","location":"128487316x502920929"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]
select --table Site --query "_id:1 OR _id:2" --output_columns _key,location
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "location",
#           "WGS84GeoPoint"
#         ]
#       ],
#       [
#         "http://example.org/",
#         "128452975x503157902"
#       ],
#       [
#         "http://example.net/",
#         "128487316x502920929"
#       ]
#     ]
#   ]
# ]

Then assign the geo distance calculated by the geo_distance function to _score in the scorer parameter.

Let’s show the geo distance from Akihabara station in Japan. In the World Geodetic System, Akihabara station is at latitude 35 degrees 41 minutes 55.259 seconds and longitude 139 degrees 46 minutes 27.188 seconds. Specify “128515259x503187188” as the second argument of the geo_distance function.

Execution example:

select --table Site --query "_id:1 OR _id:2" --output_columns _key,location,_score --scorer '_score = geo_distance(location, "128515259x503187188")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "location",
#           "WGS84GeoPoint"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "http://example.org/",
#         "128452975x503157902",
#         2054
#       ],
#       [
#         "http://example.net/",
#         "128487316x502920929",
#         6720
#       ]
#     ]
#   ]
# ]

As you can see, the geo distance between Tokyo station and Akihabara station is 2054 meters, and the geo distance between Akihabara station and Shinjyuku station is 6720 meters.
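As a rough cross-check of these distances, the following Python sketch computes great-circle distances with the haversine formula. It is not Groonga's algorithm: the formula and the Earth radius of 6,371,000 meters are assumptions made for this illustration, so the results are expected to land near the values above (within a few tens of meters) rather than match them exactly.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an assumption, not Groonga's constant
MSEC_PER_DEGREE = 3_600_000


def parse_msec_point(text):
    """Parse "[latitude in msec]x[longitude in msec]" into degrees."""
    lat_msec, lon_msec = text.split("x")
    return int(lat_msec) / MSEC_PER_DEGREE, int(lon_msec) / MSEC_PER_DEGREE


def haversine_m(point_a, point_b):
    """Great-circle distance in meters between two millisecond points."""
    lat1, lon1 = map(math.radians, parse_msec_point(point_a))
    lat2, lon2 = map(math.radians, parse_msec_point(point_b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


akihabara = "128515259x503187188"
to_tokyo = haversine_m("128452975x503157902", akihabara)
to_shinjyuku = haversine_m("128487316x502920929", akihabara)

print(round(to_tokyo), round(to_shinjyuku))
```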

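The return value of the geo_distance function can also be used for sorting by specifying the _score pseudo column in the sort_keys parameter. In the following example, the - prefix on _score sorts in descending order, so the farthest station comes first.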

Execution example:

select --table Site --query "_id:1 OR _id:2" --output_columns _key,location,_score --scorer '_score = geo_distance(location, "128515259x503187188")' --sort_keys -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "location",
#           "WGS84GeoPoint"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "http://example.net/",
#         "128487316x502920929",
#         6720
#       ],
#       [
#         "http://example.org/",
#         "128452975x503157902",
#         2054
#       ]
#     ]
#   ]
# ]

Groonga can also narrow down records to those within a specified distance, in meters, from a certain point.

To do so, use the geo_in_circle function in the filter parameter.

For example, search for the records located within 5000 meters of Akihabara station.

Execution example:

select --table Site --output_columns _key,location --filter 'geo_in_circle(location, "128515259x503187188", 5000)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "location",
#           "WGS84GeoPoint"
#         ]
#       ],
#       [
#         "http://example.org/",
#         "128452975x503157902"
#       ]
#     ]
#   ]
# ]
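The idea behind the circle filter above can be sketched in Python as a distance test against a radius. This sketch uses a simple flat-Earth (equirectangular) approximation and an assumed Earth radius of 6,371,000 meters; it illustrates the concept only and is not how Groonga implements geo_in_circle. The function names are made up for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius
MSEC_PER_DEGREE = 3_600_000


def to_radians(point):
    """Parse "[latitude in msec]x[longitude in msec]" into radians."""
    lat_msec, lon_msec = point.split("x")
    return (
        math.radians(int(lat_msec) / MSEC_PER_DEGREE),
        math.radians(int(lon_msec) / MSEC_PER_DEGREE),
    )


def approx_distance_m(point_a, point_b):
    """Approximate distance in meters; adequate for short distances."""
    lat1, lon1 = to_radians(point_a)
    lat2, lon2 = to_radians(point_b)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)


def in_circle(point, center, radius_m):
    """True if point lies within radius_m meters of center."""
    return approx_distance_m(point, center) <= radius_m


# Locations registered earlier in this section.
locations = {
    "http://example.org/": "128452975x503157902",  # Tokyo station
    "http://example.net/": "128487316x502920929",  # Shinjyuku station
}
akihabara = "128515259x503187188"

nearby = [key for key, loc in locations.items() if in_circle(loc, akihabara, 5000)]
print(nearby)
```

Only the record for Tokyo station passes the 5000 meter test, which agrees with the single record returned by the select command above.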

There is also the geo_in_rectangle function, which searches for points within a specified rectangular region.