4.4. Various search conditions
Groonga can narrow down search results with JavaScript-like syntax and sort them by a calculated value. Additionally, Groonga can narrow down and sort search results by using location information (latitude and longitude).
4.4.1. Narrow down & Full-text search by using syntax like JavaScript
The filter parameter of the select command accepts a search condition.
The filter parameter differs from the query parameter in one respect: its condition must be written in JavaScript-like syntax.
Execution example:
select --table Site --filter "_id <= 1" --output_columns _id,_key
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 1
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ]
# ],
# [
# 1,
# "http://example.org/"
# ]
# ]
# ]
# ]
Let's look at the above query in detail. The condition specified for the filter parameter is:
_id <= 1
In this case, the query returns the records whose _id value is less than or equal to 1.
Moreover, you can use && for AND search and || for OR search.
Execution example:
select --table Site --filter "_id >= 4 && _id <= 6" --output_columns _id,_key
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 3
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ]
# ],
# [
# 4,
# "http://example.net/afr"
# ],
# [
# 5,
# "http://example.org/aba"
# ],
# [
# 6,
# "http://example.com/rab"
# ]
# ]
# ]
# ]
select --table Site --filter "_id <= 2 || _id >= 7" --output_columns _id,_key
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 5
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ]
# ],
# [
# 1,
# "http://example.org/"
# ],
# [
# 2,
# "http://example.net/"
# ],
# [
# 7,
# "http://example.net/atv"
# ],
# [
# 8,
# "http://example.org/gat"
# ],
# [
# 9,
# "http://example.com/vdw"
# ]
# ]
# ]
# ]
If you specify the query parameter and the filter parameter at the same time, the result contains only the records that meet both conditions.
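The && and || conditions used above behave like ordinary boolean expressions that are evaluated once per record. The following is a minimal Python sketch of that idea, not Groonga code: SITES and select_ids are hypothetical names introduced here for illustration, and the records simply mirror the _id and _key values that appear in the outputs above.

```python
# Hypothetical in-memory copy of the Site table (values taken from the outputs above).
SITES = [
    {"_id": 1, "_key": "http://example.org/"},
    {"_id": 2, "_key": "http://example.net/"},
    {"_id": 3, "_key": "http://example.com/"},
    {"_id": 4, "_key": "http://example.net/afr"},
    {"_id": 5, "_key": "http://example.org/aba"},
    {"_id": 6, "_key": "http://example.com/rab"},
    {"_id": 7, "_key": "http://example.net/atv"},
    {"_id": 8, "_key": "http://example.org/gat"},
    {"_id": 9, "_key": "http://example.com/vdw"},
]


def select_ids(records, condition):
    """Return the _id of every record for which condition(record) is true."""
    return [record["_id"] for record in records if condition(record)]


# Counterpart of --filter "_id >= 4 && _id <= 6"
and_ids = select_ids(SITES, lambda r: r["_id"] >= 4 and r["_id"] <= 6)

# Counterpart of --filter "_id <= 2 || _id >= 7"
or_ids = select_ids(SITES, lambda r: r["_id"] <= 2 or r["_id"] >= 7)

print(and_ids)
print(or_ids)
```

The two lists contain the same _id values as the two select results above, which is a quick way to check a condition before passing it to the filter parameter.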
4.4.2. Sort by using scorer
The select command accepts the scorer parameter, which is used to process each record of the full-text search results.
Like the filter parameter, this parameter accepts an expression written in JavaScript-like syntax.
Execution example:
select --table Site --filter "true" --scorer "_score = rand()" --output_columns _id,_key,_score --sort_keys _score
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 9
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "_score",
# "Int32"
# ]
# ],
# [
# 6,
# "http://example.com/rab",
# 424238335
# ],
# [
# 9,
# "http://example.com/vdw",
# 596516649
# ],
# [
# 7,
# "http://example.net/atv",
# 719885386
# ],
# [
# 2,
# "http://example.net/",
# 846930886
# ],
# [
# 8,
# "http://example.org/gat",
# 1649760492
# ],
# [
# 3,
# "http://example.com/",
# 1681692777
# ],
# [
# 4,
# "http://example.net/afr",
# 1714636915
# ],
# [
# 1,
# "http://example.org/",
# 1804289383
# ],
# [
# 5,
# "http://example.org/aba",
# 1957747793
# ]
# ]
# ]
# ]
select --table Site --filter "true" --scorer "_score = rand()" --output_columns _id,_key,_score --sort_keys _score
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 9
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "_score",
# "Int32"
# ]
# ],
# [
# 4,
# "http://example.net/afr",
# 783368690
# ],
# [
# 2,
# "http://example.net/",
# 1025202362
# ],
# [
# 5,
# "http://example.org/aba",
# 1102520059
# ],
# [
# 1,
# "http://example.org/",
# 1189641421
# ],
# [
# 3,
# "http://example.com/",
# 1350490027
# ],
# [
# 8,
# "http://example.org/gat",
# 1365180540
# ],
# [
# 9,
# "http://example.com/vdw",
# 1540383426
# ],
# [
# 7,
# "http://example.net/atv",
# 1967513926
# ],
# [
# 6,
# "http://example.com/rab",
# 2044897763
# ]
# ]
# ]
# ]
_score is a pseudo column. The score of the full-text search is assigned to it. See Pseudo column for details about the _score column.
In the above query, the condition of the scorer parameter is:
_score = rand()
In this case, the score of the full-text search is overwritten by the return value of the rand() function.
The condition of the sort_keys parameter is:
_score
This means that the search results are sorted by _score in ascending order.
As a result, the order of the search results is randomized.
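The combination of the scorer and sort_keys parameters above amounts to "attach a random number to each record, then sort by that number". The following Python sketch illustrates only that idea and is not how Groonga implements it: shuffle_by_score and SITE_IDS are hypothetical names, and the 31-bit range of the random score is an assumption chosen to resemble the Int32 values in the output above.

```python
import random

# The nine _id values that appear in the output above.
SITE_IDS = list(range(1, 10))


def shuffle_by_score(ids, rng):
    """Give each record a random score, then sort by that score in ascending order."""
    # Counterpart of --scorer "_score = rand()" (the value range is an assumption).
    scored = [(rng.randrange(2**31), record_id) for record_id in ids]
    # Counterpart of --sort_keys _score (ascending order).
    scored.sort(key=lambda pair: pair[0])
    return [record_id for _score, record_id in scored]


rng = random.Random()
order = shuffle_by_score(SITE_IDS, rng)
print(order)
```

Every run prints the same nine _id values in a different order, just as the two executions of the select command above return different orders.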
4.4.3. Narrow down & sort by using location information
Groonga can store location information (latitude and longitude) and use it both to narrow down and to sort search results.
Groonga supports two column types for storing location information: TokyoGeoPoint and WGS84GeoPoint. TokyoGeoPoint is used for the Japanese geodetic system. WGS84GeoPoint is used for the world geodetic system.
Specify latitude and longitude in one of the following formats:
“[latitude in milliseconds]x[longitude in milliseconds]” (e.g.: “128452975x503157902”)
“[latitude in milliseconds],[longitude in milliseconds]” (e.g.: “128452975,503157902”)
“[latitude in degrees]x[longitude in degrees]” (e.g.: “35.6813819x139.7660839”)
“[latitude in degrees],[longitude in degrees]” (e.g.: “35.6813819,139.7660839”)
Let’s store the locations of two stations in Japan by using the world geodetic system (WGS). One is Tokyo station, the other is Shinjyuku station.
The latitude of Tokyo station is 35 degrees 40 minutes 52.975 seconds, and its longitude is 139 degrees 45 minutes 57.902 seconds.
The latitude of Shinjyuku station is 35 degrees 41 minutes 27.316 seconds, and its longitude is 139 degrees 42 minutes 0.929 seconds.
Thus, the locations in milliseconds are “128452975x503157902” and “128487316x502920929” respectively. The locations in degrees are “35.6813819x139.7660839” and “35.6909211x139.7002581” respectively.
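The conversion from degrees, minutes and seconds to the millisecond and degree notations can be reproduced with a few lines of arithmetic: one degree is 3600 seconds and one second is 1000 milliseconds. The following Python sketch is only an illustration; the function names are hypothetical and not part of Groonga.

```python
def dms_to_milliseconds(degrees, minutes, seconds):
    """Convert an angle in degrees/minutes/seconds to integer milliseconds of arc."""
    total_seconds = degrees * 3600 + minutes * 60 + seconds
    return round(total_seconds * 1000)


def milliseconds_to_degrees(milliseconds):
    """Convert milliseconds of arc to decimal degrees (1 degree = 3,600,000 ms)."""
    return milliseconds / 3_600_000


# Tokyo station: 35 deg 40 min 52.975 sec, 139 deg 45 min 57.902 sec
tokyo_lat = dms_to_milliseconds(35, 40, 52.975)
tokyo_lon = dms_to_milliseconds(139, 45, 57.902)

# Shinjyuku station: 35 deg 41 min 27.316 sec, 139 deg 42 min 0.929 sec
shinjyuku_lat = dms_to_milliseconds(35, 41, 27.316)
shinjyuku_lon = dms_to_milliseconds(139, 42, 0.929)

# "[latitude in milliseconds]x[longitude in milliseconds]" notation
tokyo_ms = f"{tokyo_lat}x{tokyo_lon}"
shinjyuku_ms = f"{shinjyuku_lat}x{shinjyuku_lon}"
print(tokyo_ms)
print(shinjyuku_ms)

# Decimal degrees rounded to 7 digits, as in the degree notation
print(round(milliseconds_to_degrees(tokyo_lat), 7),
      round(milliseconds_to_degrees(tokyo_lon), 7))
```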
Let’s register the location information in milliseconds.
Execution example:
column_create --table Site --name location --type WGS84GeoPoint
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Site
[
{"_key":"http://example.org/","location":"128452975x503157902"},
{"_key":"http://example.net/","location":"128487316x502920929"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]
select --table Site --query "_id:1 OR _id:2" --output_columns _key,location
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 2
# ],
# [
# [
# "_key",
# "ShortText"
# ],
# [
# "location",
# "WGS84GeoPoint"
# ]
# ],
# [
# "http://example.org/",
# "128452975x503157902"
# ],
# [
# "http://example.net/",
# "128487316x502920929"
# ]
# ]
# ]
# ]
Then, use the scorer parameter to assign the distance calculated by the geo_distance function to _score.
Let’s show the geo distance from Akihabara station in Japan. In the world geodetic system, the latitude of Akihabara station is 35 degrees 41 minutes 55.259 seconds, and its longitude is 139 degrees 46 minutes 27.188 seconds. Specify “128515259x503187188” as the second argument of the geo_distance function.
Execution example:
select --table Site --query "_id:1 OR _id:2" --output_columns _key,location,_score --scorer '_score = geo_distance(location, "128515259x503187188")'
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 2
# ],
# [
# [
# "_key",
# "ShortText"
# ],
# [
# "location",
# "WGS84GeoPoint"
# ],
# [
# "_score",
# "Int32"
# ]
# ],
# [
# "http://example.org/",
# "128452975x503157902",
# 2054
# ],
# [
# "http://example.net/",
# "128487316x502920929",
# 6720
# ]
# ]
# ]
# ]
As you can see, the geo distance between Tokyo station and Akihabara station is 2054 meters, and the geo distance between Akihabara station and Shinjyuku station is 6720 meters.
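These distances can be sanity-checked outside Groonga. The following Python sketch uses the haversine formula on a sphere with an assumed mean Earth radius of 6,371,000 meters. This is not necessarily the calculation that geo_distance performs, so the results are expected to be close to, but not identical to, the values above. The names parse_point and haversine_meters are hypothetical.

```python
import math

EARTH_RADIUS_METERS = 6_371_000  # assumed mean Earth radius, not taken from Groonga


def parse_point(text):
    """Parse "[latitude in milliseconds]x[longitude in milliseconds]" into degrees."""
    latitude_ms, longitude_ms = text.split("x")
    return int(latitude_ms) / 3_600_000, int(longitude_ms) / 3_600_000


def haversine_meters(point_a, point_b):
    """Great-circle distance in meters between two (latitude, longitude) pairs in degrees."""
    lat_a, lon_a = (math.radians(value) for value in point_a)
    lat_b, lon_b = (math.radians(value) for value in point_b)
    half_chord = (
        math.sin((lat_b - lat_a) / 2) ** 2
        + math.cos(lat_a) * math.cos(lat_b) * math.sin((lon_b - lon_a) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(half_chord))


akihabara = parse_point("128515259x503187188")
tokyo = parse_point("128452975x503157902")
shinjyuku = parse_point("128487316x502920929")

tokyo_distance = haversine_meters(tokyo, akihabara)
shinjyuku_distance = haversine_meters(shinjyuku, akihabara)
print(round(tokyo_distance), round(shinjyuku_distance))
```

Both printed values should differ from 2054 and 6720 by only a few meters, which is consistent with a different Earth model or approximation being used.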
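The return value of the geo_distance function can also be used for sorting by specifying the _score pseudo column in the sort_keys parameter. In the following example, -_score sorts the results by distance in descending order.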
Execution example:
select --table Site --query "_id:1 OR _id:2" --output_columns _key,location,_score --scorer '_score = geo_distance(location, "128515259x503187188")' --sort_keys -_score
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 2
# ],
# [
# [
# "_key",
# "ShortText"
# ],
# [
# "location",
# "WGS84GeoPoint"
# ],
# [
# "_score",
# "Int32"
# ]
# ],
# [
# "http://example.net/",
# "128487316x502920929",
# 6720
# ],
# [
# "http://example.org/",
# "128452975x503157902",
# 2054
# ]
# ]
# ]
# ]
Groonga can also narrow down search results to the records located within a specified number of meters from a certain point.
In such a case, use the geo_in_circle function in the filter parameter.
For example, search for the records that exist within 5000 meters of Akihabara station.
Execution example:
select --table Site --output_columns _key,location --filter 'geo_in_circle(location, "128515259x503187188", 5000)'
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 1
# ],
# [
# [
# "_key",
# "ShortText"
# ],
# [
# "location",
# "WGS84GeoPoint"
# ]
# ],
# [
# "http://example.org/",
# "128452975x503157902"
# ]
# ]
# ]
# ]
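The result above can be understood as "keep every record whose distance from the center is at most the radius". The following Python sketch models that idea with the haversine formula and an assumed Earth radius of 6,371,000 meters; it is not Groonga's implementation of geo_in_circle, and the names in_circle, distance_meters and SITE_LOCATIONS are hypothetical.

```python
import math

EARTH_RADIUS_METERS = 6_371_000  # assumed mean Earth radius, not taken from Groonga

# The two records loaded above, with locations in milliseconds.
SITE_LOCATIONS = {
    "http://example.org/": "128452975x503157902",
    "http://example.net/": "128487316x502920929",
}


def to_radians(text):
    """Parse "[latitude ms]x[longitude ms]" into (latitude, longitude) in radians."""
    return tuple(math.radians(int(part) / 3_600_000) for part in text.split("x"))


def distance_meters(point_a, point_b):
    """Haversine distance in meters between two points given in the ms notation."""
    lat_a, lon_a = to_radians(point_a)
    lat_b, lon_b = to_radians(point_b)
    half_chord = (
        math.sin((lat_b - lat_a) / 2) ** 2
        + math.cos(lat_a) * math.cos(lat_b) * math.sin((lon_b - lon_a) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_METERS * math.asin(math.sqrt(half_chord))


def in_circle(point, center, radius_meters):
    """Return True when point lies within radius_meters of center."""
    return distance_meters(point, center) <= radius_meters


# Counterpart of --filter 'geo_in_circle(location, "128515259x503187188", 5000)'
nearby = [
    key
    for key, location in SITE_LOCATIONS.items()
    if in_circle(location, "128515259x503187188", 5000)
]
print(nearby)
```

Only the record for Tokyo station passes the 5000 meter check, because Shinjyuku station is more than 6000 meters from Akihabara station.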
There is also the geo_in_rectangle function, which is used to search for the points located within a specified rectangular region.