Filtering and Boosting
Filtering#
In filtering, the expression must return a boolean value for each item. If the returned value is true, the item passes the filter; if the value is false, the item does not pass and is discarded. The value is computed from the property values of the individual items.
Consider the following table of items in a sample movie-recommendation domain:
Name string | Year int | Director string | Genres set | Parental-Advisory boolean |
---|---|---|---|---|
Pulp Fiction | 1994 | Quentin Tarantino | {"Crime Fiction","Drama", "Thriller" } | true |
King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure" } | false |
Fight Club | 1999 | David Fincher | {"Drama", "Existentialism"} | true |
The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | { "Adventure", "Fantasy", "Action" } | false |
The Dark Knight | 2008 | Christopher Nolan | {"Superhero","Drama" ,"Action", "Adventure", "Thriller", "Crime Fiction"} | false |
Silence of the Lambs | 1991 | Jonathan Demme | {"Crime Fiction", "Drama", "Thriller", "Horror" } | true |
Dead Alive | 1992 | Peter Jackson | {"Horror", "Comedy"} | true |
… and 10000 other movies |
Example 1#
By default, when items are to be recommended to a given user, the recommender selects any items that seem relevant to the user. However, it may be your policy not to recommend items with the Parental-Advisory flag set. In that case, you may use the following simple ReQL filtering expression:
not 'Parental-Advisory'
Then the recommender may only choose from the following movies:
Name string | Year int | Director string | Genres set | Parental-Advisory boolean |
---|---|---|---|---|
King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure" } | false |
The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | { "Adventure", "Fantasy", "Action" } | false |
The Dark Knight | 2008 | Christopher Nolan | {"Superhero","Drama" ,"Action", "Adventure", "Thriller", "Crime Fiction"} | false |
… and 5926 other movies |
Example 2#
If you want to allow only items without Parental-Advisory that were directed by Peter Jackson (for example, because a user selected such a filter on your site), you can use:
(not 'Parental-Advisory') and ('Director' == "Peter Jackson")
You can access the value of a property by putting the name of the property in single quotes. Strings are enclosed in double quotes.
Only the following items can be recommended:
Name string | Year int | Director string | Genres set | Parental-Advisory boolean |
---|---|---|---|---|
King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure" } | false |
The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | { "Adventure", "Fantasy", "Action" } | false |
Example 3#
As another example, suppose that a user has entered the Thriller section of your system’s catalog. Then you surely wish to recommend thrillers, ignoring the fact that the user usually likes comedies.
As Genres is a set property, you can use the in operator to check whether Thriller is listed in the item’s genres:
"Thriller" in 'Genres'
Only the following items pass the filter:
Name string | Year int | Director string | Genres set | Parental-Advisory boolean |
---|---|---|---|---|
Pulp Fiction | 1994 | Quentin Tarantino | {"Crime Fiction","Drama", "Thriller" } | true |
The Dark Knight | 2008 | Christopher Nolan | {"Superhero","Drama" ,"Action", "Adventure", "Thriller", "Crime Fiction"} | false |
Silence of the Lambs | 1991 | Jonathan Demme | {"Crime Fiction", "Drama", "Thriller", "Horror" } | true |
… and 3141 other movies |
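The three filters above can be emulated in plain Python to see which rows survive. This is only a sketch of the filtering semantics, not ReQL itself and not the recommender's implementation; the names MOVIES and apply_filter are hypothetical, and only four rows of the sample table are included.

```python
# Four rows taken from the sample movie table (illustration only).
MOVIES = [
    {"Name": "Pulp Fiction", "Director": "Quentin Tarantino",
     "Genres": {"Crime Fiction", "Drama", "Thriller"}, "Parental-Advisory": True},
    {"Name": "King Kong", "Director": "Peter Jackson",
     "Genres": {"Action", "Drama", "Adventure"}, "Parental-Advisory": False},
    {"Name": "The Dark Knight", "Director": "Christopher Nolan",
     "Genres": {"Superhero", "Drama", "Action", "Adventure", "Thriller", "Crime Fiction"},
     "Parental-Advisory": False},
    {"Name": "Dead Alive", "Director": "Peter Jackson",
     "Genres": {"Horror", "Comedy"}, "Parental-Advisory": True},
]


def apply_filter(items, predicate):
    """Hypothetical helper: keep the names of items for which the predicate is true."""
    return [item["Name"] for item in items if predicate(item)]


# Example 1: not 'Parental-Advisory'
example_1 = apply_filter(MOVIES, lambda m: not m["Parental-Advisory"])

# Example 2: (not 'Parental-Advisory') and ('Director' == "Peter Jackson")
example_2 = apply_filter(
    MOVIES,
    lambda m: (not m["Parental-Advisory"]) and m["Director"] == "Peter Jackson",
)

# Example 3: "Thriller" in 'Genres'
example_3 = apply_filter(MOVIES, lambda m: "Thriller" in m["Genres"])
```

Each predicate returns a boolean per item, which mirrors the rule that a filter either passes or discards an item.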
Handling deleted items#
Filtering offers an elegant way of handling deleted or obsolete items in the catalog. In many situations, some items become unavailable and hence should not be recommended anymore. Considering interaction data, however, such items may still be important for the recommender. For example, the recommender may find out that users who liked an item x that is no longer available will probably like another item, y, which is still available. Therefore, it is undesirable to simply delete x, which would also delete all the related interactions in cascade.
With filtering, you may handle item deletes using the following scheme:
- Create a dedicated item property, such as deleted, of type boolean (the implicit value for all items will be null, which is OK).
- For deleted items, set the value of deleted to true.
- For recommendations, use the following filter: not 'deleted'
- If the item becomes available again, you may set deleted to false.
Such a mechanism can easily be extended to control availability over different regions, customer licenses, etc.
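The scheme works because null is falsy, so items that never had deleted set still pass the filter. A minimal Python sketch of that behaviour follows; None stands in for ReQL null, and the catalog and the helper passes_not_deleted are hypothetical.

```python
def passes_not_deleted(item):
    """Emulates the filter: not 'deleted' (None plays the role of null)."""
    return not item.get("deleted")


# Hypothetical catalog: x was deleted, y never had the property set,
# z was deleted earlier and then made available again.
catalog = [
    {"itemId": "x", "deleted": True},
    {"itemId": "y", "deleted": None},
    {"itemId": "z", "deleted": False},
]

recommendable = [item["itemId"] for item in catalog if passes_not_deleted(item)]
```

Item x stays in the catalog together with its interactions; it is only excluded from the candidates.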
Boosting#
In advanced applications, besides filtering, you may wish to boost the recommendation rates of some items. In contrast to filtering, where items may be completely blocked, boosting tells the recommender to prefer some items over others. By default, it is the task of the recommender itself to select the items which are the most relevant. However, it may be your policy to purposefully bias the recommender toward your business goals.
For example, considering the above table of movies, one may wish to promote the movies which are new and were filmed after 2000, especially if they were filmed after 2005. Then the following boosting query can handle that:
if 'Year' <= 2000 then 1 else (if 'Year' <= 2005 then 1.5 else 2)
As you can see, boosting expressions return numbers rather than booleans as in the case of filtering. Specifically, they provide the items with coefficients by which the internal scores determined by the recommender will be multiplied.
The boosting coefficients assigned by the query are shown in the following table:
Name string | Year int | Boosting |
---|---|---|
Pulp Fiction | 1994 | 1.0 |
King Kong | 2005 | 1.5 |
Fight Club | 1999 | 1.0 |
The Lord of the Rings: The Return of the King | 2003 | 1.5 |
The Dark Knight | 2008 | 2.0 |
Silence of the Lambs | 1991 | 1.0 |
Dead Alive | 1992 | 1.0 |
… and 10000 other movies |
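To make the multiplication concrete, the sketch below recomputes the coefficients from the table in Python and applies them to made-up scores. The internal scores are purely hypothetical (the recommender's real scores are not exposed in this document), as are the names boost and internal_scores.

```python
def boost(year):
    """Emulates: if 'Year' <= 2000 then 1 else (if 'Year' <= 2005 then 1.5 else 2)."""
    if year <= 2000:
        return 1.0
    return 1.5 if year <= 2005 else 2.0


# Hypothetical internal scores, chosen only to illustrate the effect of boosting.
internal_scores = {
    "Pulp Fiction": (1994, 0.8),
    "King Kong": (2005, 0.5),
    "The Dark Knight": (2008, 0.5),
}

# Final score = internal score multiplied by the boosting coefficient.
boosted = {
    name: score * boost(year) for name, (year, score) in internal_scores.items()
}

# Items ordered from the highest boosted score to the lowest.
ranking = sorted(boosted, key=boosted.get, reverse=True)
```

With these assumed scores, The Dark Knight overtakes Pulp Fiction even though its unboosted score is lower.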
Examples#
Exclude some items from recommendations using their IDs#
Use filter:
'itemId' not in {"item-127", "item-756", "item-568"}
The three items will not be recommended.
Re-rank a given set of items#
Use filter (usually sent as a dynamic parameter of the API recommendation request):
'itemId' in {"item-42", "item-77", "item-1992"}
- The recommendation engine will re-rank the given set of items for the particular user.
- Thousands of IDs may be provided in the query.
Recommend related items by the same manufacturer#
Suppose that the items have a string property manufacturer and the user is viewing a product detail page.
If you want to give the user related items using Recommend Items to Item, but restrict them to the manufacturer of the currently viewed item, use filter:
'manufacturer' == context_item["manufacturer"]
The context_item function retrieves property values of the item that is currently viewed by the user.
Boost items that were published in last 24 hours#
Suppose that the items have a timestamp property published_date. Then you can use this booster:
if 'published_date' >= now() - 24 * 60 * 60 then 2 else 1
The now function returns the current UTC timestamp in seconds, and 24 * 60 * 60 is the number of seconds in 24 hours.
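The arithmetic of this booster can be checked with a small Python sketch. It assumes timestamps are plain numbers of seconds, as stated above; the function recency_boost and its now parameter are hypothetical and exist only so that the example is deterministic.

```python
import time

DAY_SECONDS = 24 * 60 * 60  # number of seconds in 24 hours


def recency_boost(published_date, now=None):
    """Emulates: if 'published_date' >= now() - 24 * 60 * 60 then 2 else 1."""
    current = time.time() if now is None else now
    return 2 if published_date >= current - DAY_SECONDS else 1


# An item published one hour ago is boosted; one published two days ago is not.
reference_now = 1_700_000_000
fresh = recency_boost(reference_now - 3600, now=reference_now)
stale = recency_boost(reference_now - 2 * DAY_SECONDS, now=reference_now)
```

Because the comparison is >=, an item published exactly 24 hours ago still receives the boost.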
Up-sell#
Suppose that the items have a double property price. Slightly boost items that are more expensive than the currently viewed one with the following booster in Recommend Items to Item:
if 'price' > context_item["price"] then 1.2 else 1
Recommend only items available in user’s city#
- Suppose that the items have a set property cities. It contains the cities in which the items are available.
- Suppose that the users have a string property city giving the city where each user lives.
To recommend only items available in a user’s city, use filter:
context_user["city"] in 'cities'
The context_user function retrieves property values of the user for whom you request the recommendations.
Recommend only items from subscribed topics#
Assume your site contains articles about various topics. The users can choose which topics are interesting for them in order to receive personalized newsletters with articles from these topics.
The subscribed topics are stored for each user in a user property of type set called subscribed_topics.
Each article can belong to multiple topics, stored in the topics item property of type set.
The following filter allows only items that contain at least one matching topic to be recommended:
size(context_user["subscribed_topics"] & 'topics') > 0
The & operator returns the intersection of the two sets; the item passes only if this intersection is non-empty.
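Python sets use the same & symbol for intersection, so the filter translates almost literally. The sketch below is only an emulation; the user and article dictionaries and the function passes_topic_filter are hypothetical.

```python
def passes_topic_filter(user, item):
    """Emulates: size(context_user["subscribed_topics"] & 'topics') > 0."""
    return len(user["subscribed_topics"] & item["topics"]) > 0


# Hypothetical user and articles.
user = {"subscribed_topics": {"science", "travel"}}
articles = [
    {"itemId": "article-1", "topics": {"science", "politics"}},
    {"itemId": "article-2", "topics": {"sports"}},
    {"itemId": "article-3", "topics": set()},
]

allowed = [a["itemId"] for a in articles if passes_topic_filter(user, a)]
```

An article with no topics at all never passes, because its intersection with any set is empty.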
Value Types and Operators#
ReQL supports the following value types:
- int – signed integer (currently 64bit),
- double – double-precision floating-point number (IEEE 754 compliant),
- timestamp – UTC timestamp, similar to double,
- string – sequence of Unicode characters,
- boolean – binary data type of two possible values: true or false,
- set – unordered collection of values,
- array – ordered collection of values.
Except for the set and array types, all of the types include the special value null, which corresponds to the fact that null is an allowed and also the default value for item / user property values in the API.
Numbers#
Notation#
Expression | Equivalent | Comment |
---|---|---|
0123.000 | 123.0 | Leading and trailing zeros are ignored. |
1.23e+3 | 1230.0 | Exponential notation may be used. |
1e9 | 1000000000 | Using simple exponential notation for huge numbers. |
123E-2 | 1.23 | Negative exponents may also be used. The case of the exponent letter does not matter. |
Operations#
Expression | Result | Comment |
---|---|---|
1 + 2 | 3 | Addition. |
1 + 2 + 3 + 4 | 10 | Chain of additions. |
1 - 2 | -1 | Subtraction. |
1 - 2 - 3 - 4 | -9 | Chain of subtractions. |
-(1 + 2) | -3 | Unary minus. |
2 * 3 | 6 | Multiplication. |
1 + 2 * 3 - 4 | 3 | Standard operator precedence. |
(1 + 2) * (3 - (4 + 5)) | -18 | Bracketing. |
10 / 5 | 2.0 | Division. |
1 / 2 | 0.5 | Division always results in a double, even if the operands are integers! |
5 / 0 | NaN | If the divisor is 0, the result is NaN. |
9 % 4 | 1 | Modulo division. |
3.14 % 2.5 | 0.64 | Modulo division also works for doubles. |
5 % 0 | NaN | If the divisor is 0, the result is NaN. |
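Python's / operator also always produces a float, but it raises an exception on a zero divisor instead of returning NaN. The sketch below wraps both operators to follow the table rows above; reql_div and reql_mod are hypothetical names, and the behaviour for negative operands is not described in this document, so it is not modelled here.

```python
import math


def reql_div(a, b):
    """Division as in the table: always a double, NaN for a zero divisor."""
    if b == 0:
        return math.nan
    return a / b


def reql_mod(a, b):
    """Modulo as in the table (non-negative operands only), NaN for a zero divisor."""
    if b == 0:
        return math.nan
    return a % b


half = reql_div(1, 2)       # 0.5, even though both operands are integers
undefined = reql_div(5, 0)  # NaN instead of an exception
```

Since NaN never compares equal to anything, use math.isnan to test for it.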
Comparison#
Expression | Result | Comment |
---|---|---|
1 < 2.0 | true | Integers, doubles, and timestamps may be compared using standard comparison operators. |
1 < 2 <= 2 == 2 != 1 >= 1 > 0 | true | Comparison operators may be arbitrarily chained. |
1 < 2 <= 2 == 3 != 1 >= 1 > 0 | false | A chain of comparisons returns false if any of its comparisons does not hold. |
2 == 2.0 | true | In comparison, there is no difference between integers, doubles, and timestamps. |
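Python happens to chain comparisons in the same way, which makes it convenient for checking the rows above: a chain holds only if every adjacent pair holds. The following is a Python illustration of that reading, not ReQL; the variable names are hypothetical.

```python
# The two chains from the table, written as Python chained comparisons.
chain_true = 1 < 2 <= 2 == 2 != 1 >= 1 > 0
chain_false = 1 < 2 <= 2 == 3 != 1 >= 1 > 0

# The first chain expanded into a conjunction of pairwise comparisons.
expanded = (1 < 2) and (2 <= 2) and (2 == 2) and (2 != 1) and (1 >= 1) and (1 > 0)

# Integers and doubles compare by value.
mixed_equal = 2 == 2.0
```

In the second chain only the pair 2 == 3 fails, and that alone makes the whole chain false.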
Strings#
Notation#
Expression | Comment |
---|---|
"foo" | String constants are enclosed in double quotes. |
"" | Empty string. |
"she said \"hello\"" | Double quotes must be escaped. |
"she said 'hello'" | Single quotes needn’t be escaped. |
Comparison#
Expression | Result | Comment |
---|---|---|
"foo" == "foo" | true | Strings are compared for equality with ==. |
"Alice" != null | true | Strings can be compared to null. |
"Alice" < "Bob" | true | Strings are ordered in lexicographic order. |
"Alice" < "Bob" < "Carol" < "Dan" | true | Comparisons may be chained arbitrarily. |
"Alice" < "Bob" <= "Carol" != "Dan" | true | Comparisons in the chain may be of different types. |
"Alice" < "Bob" >= "Carol" != "Dan" | false | All the comparisons must hold for the chain to return true. |
"Alice" < 5 | error | Strings are only comparable with strings. |
"Alice" ~ "A[a-z]+" | true | Strings can be matched with regular expressions (regex). |
Containment#
Expression | Result | Comment |
---|---|---|
"ice" in "Alice" | true | |
"Ice" in "Alice" | false | Containment test is case sensitive. |
"ice" not in "Alice" | false | |
"" in "abc" | true | Empty string is contained in every string. |
"abc" in "" | false | No non-empty string is contained in empty string. |
5 in "abc" | error | Both operands must be strings for string containment testing. |
Concatenation#
Expression | Result | Comment |
---|---|---|
"foo" + "bar" | "foobar" | Strings can be concatenated using the + operator. |
"" + "foo" + "" | "foo" | Empty string is neutral element for concatenation. |
"foo" + 123 | "foo123" | Strings can be concatenated with integers. |
"foo" + 123.0 | "foo123.0" | Strings can be concatenated with numbers. |
Indexing and Slicing#
Expression | Result | Comment |
---|---|---|
"abcd"[1] | "b" | A character in the string can be accessed by its index (starting from 0). |
"abcd"[10] | "" | Requesting an index outside the string boundaries results in an empty string. |
"abcd"[-1] | "d" | Negative indices are interpreted as counting from the end of the string. |
"abcd"[1:3] | "bc" | It is possible to get a sub-string between two indices. |
"abcd"[1:] | "bcd" | If the second index is omitted, all characters until the end of the string are returned. |
Sets#
Notation#
Expression | Comment |
---|---|
{} | Empty set. |
{1, 2, 3} | Set containing three integers. |
{1, 2.0, false, "foo", null} | Sets may contain values of different types. This is an extension to sets in the API, which may only contain strings. |
{{1,2}, {2,3}} | Sets may be nested. |
Properties#
Expression | Result | Comment |
---|---|---|
{ 1, 1, 1, 2 } | { 1, 2 } | Sets only contain unique elements. |
{ 1, 1.0 } | { 1.0 } | Integers, doubles, and timestamps are merged. |
{ {1,2}, {2,1} } | { {1,2} } | Merging also works for nested sets. |
Value Containment#
Expression | Result | Comment |
---|---|---|
2 in { 1, 2, 3 } | true | Using in to test whether a value is contained in a set. |
4 not in { 1, 2, 3 } | true | The not in operator tests that a value is not contained in a set. |
2.0 in { 1, 2, 3 } | true | There is no difference between integers, doubles, and timestamps when testing containment. |
"2" in { 1, 2, 3 } | false | There is a difference between numbers and strings. |
{ 1, 2 } in { 1, 2, 3 } | false | |
{ 1, 2 } in { {1,2}, {3,4} } | true | |
Comparison#
Expression | Result | Comment |
---|---|---|
{ 1, 2 } < { 1, 2, 3 } | true | Using < to test for a proper subset. |
{ 1, 2 } < { 1, 2 } | false | No set is a proper subset of itself. |
{} < { 1, 2 } | true | Empty set is a proper subset of every non-empty set. |
{} < {} | false | Empty set is not a proper subset of itself. |
{ 1, 2 } <= { 1, 2, 3 } | true | Using <= to test for a subset. |
{ 1, 2 } <= { 1, 2 } | true | Every set is a subset of itself. |
{ 1, 2 } == { 1, 2 } | true | |
{ 1, 2 } != { 1, 2 } | false | |
{ 1, 2, 3 } >= { 1, 2 } | true | |
{ 1, 2 } >= { 1, 2 } | true | Every set is a superset of itself. |
{ 1, 2, 3 } > { 1, 2 } | true | |
{ 1, 2 } > { 1, 2 } | false | A non-empty set is not a proper superset of itself. |
{ 1, 2 } > {} | true | Every non-empty set is a proper superset of an empty set. |
{} > {} | false | Empty set is not a proper superset of itself. |
Union#
Expression | Result | Comment |
---|---|---|
{ 1, 2 } + { 2, 3 } | { 1, 2, 3 } | Sets may be unified using the + operator. |
{ 1, 2.0 } + { 2, 3 } | { 1, 2.0, 3 } | Integers, doubles, and timestamps are merged when unifying sets. |
{ 1, 2 } + { 2, 3 } + { 4 } | { 1, 2, 3, 4 } | Unions may be chained. |
{ 1, 2 } + {} | { 1, 2 } | Unification with empty set has no effect on the original set. |
{ 1, 2 } + { "2", "3" } | { 1, 2, "2", "3" } | Strings and numbers are handled as different values. |
Difference#
Expression | Result | Comment |
---|---|---|
{ 1, 2 } - { 2, 3 } | { 1 } | Set difference may be obtained using the - operator. |
{ 1, 2 } - { 2.0, 3.0 } | { 1 } | Integers, doubles, and timestamps are considered equal if they equal in values. |
{ 1, 2 } - {} | { 1, 2 } | Subtracting an empty set has no effect. |
{ 1, 2 } - { 1 } - { 2 } | {} | Chaining of set subtractions works from left to right. |
{ 1, 2 } - ({ 1, 2 } - { 2 }) | { 2 } | Parenthesizing also works. |
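The union and difference rows above can be reproduced with Python's built-in sets, with one naming difference: ReQL's + corresponds to Python's | for union, while - is the same in both. This is an emulation for checking the tables, not ReQL; reql_union and reql_difference are hypothetical helper names.

```python
def reql_union(a, b):
    """ReQL a + b on sets corresponds to Python a | b."""
    return a | b


def reql_difference(a, b):
    """ReQL a - b on sets corresponds to Python a - b."""
    return a - b


# Numbers are merged by value; strings stay distinct from numbers.
merged = reql_union({1, 2.0}, {2, 3})
mixed = reql_union({1, 2}, {"2", "3"})

# Chained subtraction is evaluated from left to right.
chained = reql_difference(reql_difference({1, 2}, {1}), {2})

# Parentheses change the grouping: { 1, 2 } - ({ 1, 2 } - { 2 })
grouped = reql_difference({1, 2}, reql_difference({1, 2}, {2}))
```

Python also treats 2 and 2.0 as the same set element, which matches the merging of integers and doubles described in the tables.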
Intersection#
Expression | Result | Comment |
---|---|---|
{ 1, 2 } & { 2, 3 } | { 2 } | Set intersection may be obtained using the & operator. |
{ 1, 2 } & { 2.0, 3.0 } | { 2 } | Integers, doubles, and timestamps are considered equal if they equal in values. |
{ 1, 2 } & {"1", "2"} | {} | Strings and numbers are handled as different values. |
{"a", { 1, 2 }} & {"b", { 1, 2 }} | {{1,2}} | Works with nested sets. |
Symmetric difference#
Expression | Result | Comment |
---|---|---|
{ 1, 2 } / { 2, 3 } | { 1, 3 } | Symmetric difference of sets may be obtained using the / operator. |
{ 1, 2 } / { 2.0, 3.0 } | { 1, 3 } | Integers, doubles, and timestamps are considered equal if they equal in values. |
{ 1, 2 } / {"1", "2"} | {1, 2, "1", "2"} | Strings and numbers are handled as different values. |
{"a", { 1, 2 }} / {"b", { 1, 2 }} | {"a", "b"} | Works with nested sets. |
Arrays#
Notation#
Expression | Comment |
---|---|
[] | Empty array. |
[1, 2, 3] | Array containing three integers. |
[1, 2.0, false, "foo", null] | Arrays may contain values of different types. |
[[1,2], [2,3]] | Arrays may be nested. |
Value Containment#
Expression | Result | Comment |
---|---|---|
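2 in [ 1, 2, 3 ] | true | Using the in operator to test whether a value is contained in an array. |
4 not in [ 1, 2, 3 ] | true | The not in operator tests that a value is not contained in an array. |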
2.0 in [ 1, 2, 3 ] | true | There is no difference between integers, doubles, and timestamps when testing containment. |
"2" in [ 1, 2, 3 ] | false | There is a difference between numbers and strings. |
Comparison#
Expression | Result | Comment |
---|---|---|
[ 1, 2 ] == [ 1, 2 ] | true | |
[ 1, 2 ] != [ 1, 2 ] | false | |
[ 1, 2 ] != [ 2, 1 ] | true | Order of the elements must be the same in both arrays for equality. |
Concatenation#
Expression | Result | Comment |
---|---|---|
[ 1, 2 ] + [ 3, 4 ] | [ 1, 2, 3, 4 ] | Arrays may be concatenated using the + operator. |
[ 1 ] + [ 2, 3 ] + [ 4 ] | [ 1, 2, 3, 4 ] | Concatenations may be chained. |
[ 1, 2 ] + [] | [ 1, 2 ] | Concatenating an empty array has no effect on the original array. |
Indexing and Slicing#
Expression | Result | Comment |
---|---|---|
[ "a", "b", "c", "d" ][1] | "b" | Values can be accessed by their index (starting from 0). |
[ "a", "b", "c", "d" ][10] | null | Requesting an index outside the array boundaries results in null. |
[ "a", "b", "c", "d" ][-1] | "d" | Negative indices are interpreted as counting from the end of the array. |
[ "a", "b", "c", "d" ][1:3] | ["b", "c"] | It is possible to get a sub-array between two indices. |
[ "a", "b", "c", "d" ][1:] | ["b", "c", "d"] | If the second index is omitted, all elements until the end of the array are returned. |
Logical Operators#
Negation (NOT)#
Expression | Equivalent |
---|---|
not 'a' == 'b' | 'a' != 'b' |
not 'a' > 'b' | 'a' <= 'b' |
not true | false |
not false | true |
Implicit conversion to boolean (for advanced uses only!):
Expression | Result | Comment |
---|---|---|
not -1 | false | Negative numbers are truthy. |
not 0 | true | Zero numbers are falsy. |
not 1.23 | false | Positive numbers are truthy. |
not "" | true | Empty strings are falsy. |
not "foo" | false | Non-empty strings are truthy. |
not {} | true | Empty sets are falsy. |
not {1,2,3} | false | Non-empty sets are truthy. |
not null | true | |
Disjunction (OR)#
Expression | a | b | c | Result | Comment |
---|---|---|---|---|---|
'a' > 'b' or 'a' > 'c' | 1 | 2 | 3 | false | If both operands are false, false is returned. |
'a' > 'b' or 'a' > 'c' | 2 | 1 | 3 | true | If at least one of the boolean operands is true, true is returned. |
'a' > 'b' or 'a' > 'c' | 2 | 3 | 1 | true | If at least one of the boolean operands is true, true is returned. |
'a' > 'b' or 'a' > 'c' | 3 | 1 | 2 | true | If both operands are true, true is returned. |
Advanced uses: Implicit conversion to boolean.
Expression | Result | Comment |
---|---|---|
"foo" or "bar" | "foo" | If the first operand is truthy, it is returned. |
"" or false | false | If the first operand is falsy, the second operand is returned. |
false or "" | "" | If the first operand is falsy, the second operand is returned. |
Conjunction (AND)#
Expression | a | b | c | Result | Comment |
---|---|---|---|---|---|
'a' > 'b' and 'a' > 'c' | 1 | 2 | 3 | false | If both operands are false, false is returned. |
'a' > 'b' and 'a' > 'c' | 2 | 1 | 3 | false | If at least one of the boolean operands is false, false is returned. |
'a' > 'b' and 'a' > 'c' | 2 | 3 | 1 | false | If at least one of the boolean operands is false, false is returned. |
'a' > 'b' and 'a' > 'c' | 3 | 1 | 2 | true | If both operands are true, true is returned. |
Advanced uses: Implicit conversion to boolean.
Expression | Result | Comment |
---|---|---|
"foo" and "bar" | "bar" | If the first operand is truthy, the second operand is returned. |
"" and false | "" | If the first operand is falsy, it is returned. |
false and "" | false | If the first operand is falsy, it is returned. |
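Python's or and and operators return one of their operands in the same way as the rows above, so the advanced cases can be tried out directly. This is a Python illustration of the described behaviour, not a statement about how ReQL is implemented; the list names are hypothetical.

```python
# "or" returns the first operand if it is truthy, otherwise the second operand.
or_results = [
    "foo" or "bar",  # first operand is truthy
    "" or False,     # first operand is falsy, second is returned
    False or "",     # first operand is falsy, second is returned
]

# "and" returns the first operand if it is falsy, otherwise the second operand.
and_results = [
    "foo" and "bar",  # first operand is truthy, second is returned
    "" and False,     # first operand is falsy, it is returned
    False and "",     # first operand is falsy, it is returned
]
```

Note that the result keeps the type of the returned operand: "" or false yields false, while false or "" yields an empty string.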
Conditional Operator#
Expression | a | b | Result | Comment |
---|---|---|---|---|
if 'a' > 'b' then "foo" else "bar" | 10 | 5 | "foo" | |
if 'a' < 'b' then "foo" else "bar" | 10 | 5 | "bar" | |
if 'a' < 'b' then "foo" | | | error | |
if 'a' < 'b' then "foo" else (if 'a' > 'b' then "bar" else "bah") | 5 | 5 | "bah" | |
Expression | Result | Comment |
---|---|---|
if -1 then "foo" else "bar" | "foo" | Negative numbers are truthy. |
if 0 then "foo" else "bar" | "bar" | Zero numbers are falsy. |
if 1.23 then "foo" else "bar" | "foo" | Positive numbers are truthy. |
if "" then "foo" else "bar" | "bar" | Empty strings are falsy. |
if "bah" then "foo" else "bar" | "foo" | Non-empty strings are truthy. |
if {} then "foo" else "bar" | "bar" | Empty sets are falsy. |
if {1,2,3} then "foo" else "bar" | "foo" | Non-empty sets are truthy. |
if null then "foo" else "bar" | "bar" | |
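The truthiness rules in the table coincide with Python's for these value kinds, so a conditional expression in Python gives the same outcomes. In the sketch below, None stands in for null and set() for the empty set (in Python, {} is an empty dictionary, not a set); the function conditional is a hypothetical helper.

```python
def conditional(value):
    """Emulates: if <value> then "foo" else "bar"."""
    return "foo" if value else "bar"


# The same conditions as in the table, in the same order.
conditions = [-1, 0, 1.23, "", "bah", set(), {1, 2, 3}, None]
outcomes = [conditional(value) for value in conditions]
```

Only zero, the empty string, the empty set, and null take the else branch.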