Filtering and Boosting

Filtering

In filtering, the expression must return a boolean value for each item. If the returned value is true, the item passes the filter; if it is false, the item does not pass and is discarded. The value is computed from the property values of the individual items.

Consider the following table of items in a sample movie-recommendation domain:

| Name: string | Year: int | Director: string | Genres: set | Parental-Advisory: boolean |
|---|---|---|---|---|
| Pulp Fiction | 1994 | Quentin Tarantino | {"Crime Fiction", "Drama", "Thriller"} | true |
| King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure"} | false |
| Fight Club | 1999 | David Fincher | {"Drama", "Existentialism"} | true |
| The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | {"Adventure", "Fantasy", "Action"} | false |
| The Dark Knight | 2008 | Christopher Nolan | {"Superhero", "Drama", "Action", "Adventure", "Thriller", "Crime Fiction"} | false |
| Silence of the Lambs | 1991 | Jonathan Demme | {"Crime Fiction", "Drama", "Thriller", "Horror"} | true |
| Dead Alive | 1992 | Peter Jackson | {"Horror", "Comedy"} | true |

… and 10000 other movies

Example 1

By default, when items are to be recommended to a given user, the recommender selects any items that seem relevant to the user. However, it may be your policy not to recommend items with the Parental-Advisory flag set. In that case, you can use the following simple ReQL filtering expression:

not 'Parental-Advisory'

Then the recommender may only choose from the following movies:

| Name: string | Year: int | Director: string | Genres: set | Parental-Advisory: boolean |
|---|---|---|---|---|
| King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure"} | false |
| The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | {"Adventure", "Fantasy", "Action"} | false |
| The Dark Knight | 2008 | Christopher Nolan | {"Superhero", "Drama", "Action", "Adventure", "Thriller", "Crime Fiction"} | false |

… and 5926 other movies


Example 2

If you want to allow only items without Parental-Advisory that were directed by Peter Jackson (for example, because a user selected such a filter on your site), you can use:

(not 'Parental-Advisory') and ('Director' == "Peter Jackson")

Note

You can access the value of a property by enclosing the property name in single quotes. String constants are enclosed in double quotes.

Only the following items can be recommended:

| Name: string | Year: int | Director: string | Genres: set | Parental-Advisory: boolean |
|---|---|---|---|---|
| King Kong | 2005 | Peter Jackson | {"Action", "Drama", "Adventure"} | false |
| The Lord of the Rings: The Return of the King | 2003 | Peter Jackson | {"Adventure", "Fantasy", "Action"} | false |


Example 3

As another example, suppose that the user has entered the Thriller section of your system's catalog. Then you surely wish to recommend thrillers, even if the user usually likes comedies. You can do this using another ReQL filtering expression:

"Thriller" in 'Genres'

Then only the following items will pass the filter:

| Name: string | Year: int | Director: string | Genres: set | Parental-Advisory: boolean |
|---|---|---|---|---|
| Pulp Fiction | 1994 | Quentin Tarantino | {"Crime Fiction", "Drama", "Thriller"} | true |
| The Dark Knight | 2008 | Christopher Nolan | {"Superhero", "Drama", "Action", "Adventure", "Thriller", "Crime Fiction"} | false |
| Silence of the Lambs | 1991 | Jonathan Demme | {"Crime Fiction", "Drama", "Thriller", "Horror"} | true |

… and 3141 other movies
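The three filters from Examples 1 to 3 can be mimicked in plain Python to check which of the sample movies pass. This is only an illustrative sketch: the list `movies`, the helper `apply_filter`, and the lambda predicates are hypothetical stand-ins for ReQL evaluation, not part of the Recombee API, and only the seven movies listed in the first table are modeled.

```python
# Hypothetical in-memory copy of the seven sample movies (not the Recombee API).
movies = [
    {"Name": "Pulp Fiction", "Year": 1994, "Director": "Quentin Tarantino",
     "Genres": {"Crime Fiction", "Drama", "Thriller"}, "Parental-Advisory": True},
    {"Name": "King Kong", "Year": 2005, "Director": "Peter Jackson",
     "Genres": {"Action", "Drama", "Adventure"}, "Parental-Advisory": False},
    {"Name": "Fight Club", "Year": 1999, "Director": "David Fincher",
     "Genres": {"Drama", "Existentialism"}, "Parental-Advisory": True},
    {"Name": "The Lord of the Rings: The Return of the King", "Year": 2003,
     "Director": "Peter Jackson",
     "Genres": {"Adventure", "Fantasy", "Action"}, "Parental-Advisory": False},
    {"Name": "The Dark Knight", "Year": 2008, "Director": "Christopher Nolan",
     "Genres": {"Superhero", "Drama", "Action", "Adventure", "Thriller",
                "Crime Fiction"}, "Parental-Advisory": False},
    {"Name": "Silence of the Lambs", "Year": 1991, "Director": "Jonathan Demme",
     "Genres": {"Crime Fiction", "Drama", "Thriller", "Horror"},
     "Parental-Advisory": True},
    {"Name": "Dead Alive", "Year": 1992, "Director": "Peter Jackson",
     "Genres": {"Horror", "Comedy"}, "Parental-Advisory": True},
]


def apply_filter(items, predicate):
    """Return the names of the items for which the predicate is true."""
    return [item["Name"] for item in items if predicate(item)]


# Example 1: not 'Parental-Advisory'
example_1 = apply_filter(movies, lambda m: not m["Parental-Advisory"])

# Example 2: (not 'Parental-Advisory') and ('Director' == "Peter Jackson")
example_2 = apply_filter(
    movies,
    lambda m: (not m["Parental-Advisory"]) and (m["Director"] == "Peter Jackson"),
)

# Example 3: "Thriller" in 'Genres'
example_3 = apply_filter(movies, lambda m: "Thriller" in m["Genres"])

for label, names in [("Example 1", example_1),
                     ("Example 2", example_2),
                     ("Example 3", example_3)]:
    print(label, names)
```

The names returned for each example match the rows listed in the corresponding result tables.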


Handling deleted items

Filtering offers an elegant way of handling deleted or obsolete items in the catalog. Items often become unavailable and hence should no longer be recommended. Their interaction data, however, may still be important for the recommender. For example, the recommender may find out that users who liked an item x that is no longer available will probably like another item y that is still available. Therefore, it is undesirable to simply delete x, because that would also delete all the related interactions in cascade.

With filtering, you can handle item deletion using the following scheme:
  • Create a dedicated item property, such as deleted, of type boolean (the implicit value for all items will be null, which is OK).

  • For deleted items, set the value of deleted to true.

  • For recommendations, use the following filter:

    not 'deleted'

  • If an item becomes available again, set deleted to false.

Such a mechanism can easily be extended to control availability across different regions, customer licenses, etc.
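The reason the scheme works without touching existing items is that null is falsy, so not 'deleted' is true for every item whose property was never set. A minimal Python sketch of that behavior follows; the `catalog` list and the `is_recommendable` helper are hypothetical, and Python's `None` stands in for ReQL's null.

```python
# Hypothetical catalog; None stands in for null (property never set).
catalog = [
    {"itemId": "x", "deleted": True},   # soft-deleted, interactions are kept
    {"itemId": "y", "deleted": None},   # property never set, so it is null
    {"itemId": "z", "deleted": False},  # was deleted, is available again
]


def is_recommendable(item):
    """Mirror of the filter: not 'deleted'."""
    return not item["deleted"]


recommendable = [item["itemId"] for item in catalog if is_recommendable(item)]
print(recommendable)  # ['y', 'z']
```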

Boosting

In advanced applications, besides filtering, you may wish to boost the recommendation rates of some items. In contrast to filtering, where items may be completely blocked, boosting lets you tell the recommender to prefer some items over others. By default, it is the task of the recommender itself to select the most relevant items. However, it may be your policy to purposefully bias the recommender toward your business goals.

For example, considering the above table of movies, you may wish to promote newer movies released after 2000, and especially those released after 2005. The following boosting query handles that:

if 'Year' <= 2000 then 1 else (if 'Year' <= 2005 then 1.5 else 2)

As you can see, boosting expressions return numbers rather than the booleans returned by filtering expressions. Specifically, they provide the items with coefficients by which the internal scores determined by the recommender will be multiplied.

The boosting coefficients assigned by the query are shown in the following table:

| Name: string | Year: int | Boosting |
|---|---|---|
| Pulp Fiction | 1994 | 1.0 |
| King Kong | 2005 | 1.5 |
| Fight Club | 1999 | 1.0 |
| The Lord of the Rings: The Return of the King | 2003 | 1.5 |
| The Dark Knight | 2008 | 2.0 |
| Silence of the Lambs | 1991 | 1.0 |
| Dead Alive | 1992 | 1.0 |

… and 10000 other movies
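To see how such coefficients could change a ranking, the sketch below recomputes the table in Python and multiplies a few scores by the resulting coefficients. The `boost` function is a direct transcription of the boosting query; the `scores` values are invented for illustration only, since the recommender's internal scores are not exposed in this document.

```python
def boost(year):
    """if 'Year' <= 2000 then 1 else (if 'Year' <= 2005 then 1.5 else 2)"""
    if year <= 2000:
        return 1.0
    return 1.5 if year <= 2005 else 2.0


years = {
    "Pulp Fiction": 1994,
    "King Kong": 2005,
    "Fight Club": 1999,
    "The Lord of the Rings: The Return of the King": 2003,
    "The Dark Knight": 2008,
    "Silence of the Lambs": 1991,
    "Dead Alive": 1992,
}
coefficients = {name: boost(year) for name, year in years.items()}

# Hypothetical internal scores, made up for this example.
scores = {"Pulp Fiction": 0.9, "King Kong": 0.75, "The Dark Knight": 0.5}

# Each internal score is multiplied by the item's boosting coefficient.
boosted = {name: scores[name] * coefficients[name] for name in scores}
ranking = sorted(boosted, key=boosted.get, reverse=True)

print(coefficients)
print(ranking)  # ['King Kong', 'The Dark Knight', 'Pulp Fiction']
```

Without boosting, the made-up scores would rank Pulp Fiction first; with the coefficients applied, the two newer movies move ahead of it.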

Examples

  • Exclude some items from recommendations using their IDs

    Use filter:

    'itemId' not in {"item-127", "item-756", "item-568"}
    

    The three items will not be recommended.


  • Recommend related items by the same manufacturer

    Suppose that the items have a string property manufacturer and the user is viewing a product detail page.

    If you want to give the user related items using Recommend items to item, but restrict them to the manufacturer of the currently viewed item, use filter:

    'manufacturer' == context_item["manufacturer"]
    

    The context_item function retrieves property values of the item that is currently viewed by the user.


  • Boost items that were published in the last 24 hours

    Suppose that the items have a timestamp property published_date. Then you can use this booster:

    if 'published_date' >= now() - 24 * 60 * 60 then 2 else 1
    

    now() returns the current UTC timestamp in seconds.

    24 * 60 * 60 is the number of seconds in 24 hours.


  • Up-sell

    Suppose that the items have a double property price. Slightly boost items that are more expensive than the currently viewed one with the following booster in Recommend items to item:

    if 'price' > context_item["price"] then 1.2 else 1
    

  • Recommend only items available in user’s city

    • Suppose that the items have a set property cities. It contains cities in which the items are available.

    • Suppose that the users have a string property city giving the city where each user lives.

    To recommend only items available in a user’s city, use filter:

    context_user["city"] in 'cities'
    

    The context_user function retrieves property values of the user for whom you request the recommendations.
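The last three recipes can be sketched as ordinary Python functions to make their logic easy to test offline. Everything named here is hypothetical: `recency_boost`, `upsell_boost`, and `available_in_user_city` are illustrative helpers, plain dicts stand in for context_item and context_user, and the current time is passed in explicitly so that the example is reproducible.

```python
import time

DAY_SECONDS = 24 * 60 * 60  # number of seconds in 24 hours


def recency_boost(published_date, now=None):
    """if 'published_date' >= now() - 24 * 60 * 60 then 2 else 1"""
    now = time.time() if now is None else now
    return 2 if published_date >= now - DAY_SECONDS else 1


def upsell_boost(item, context_item):
    """if 'price' > context_item["price"] then 1.2 else 1"""
    return 1.2 if item["price"] > context_item["price"] else 1


def available_in_user_city(item, context_user):
    """context_user["city"] in 'cities'"""
    return context_user["city"] in item["cities"]


fixed_now = 1_700_000_000  # arbitrary UTC timestamp in seconds
print(recency_boost(fixed_now - 3600, now=fixed_now))             # 2
print(recency_boost(fixed_now - 2 * DAY_SECONDS, now=fixed_now))  # 1
print(upsell_boost({"price": 30.0}, {"price": 20.0}))             # 1.2
print(available_in_user_city({"cities": {"Prague", "Brno"}}, {"city": "Brno"}))  # True
```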

Value Types and Operators

In compliance with the Recombee Recommender API, there are 6 value types, which correspond to the possible domains of item/user properties:

  • int – signed integer (currently 64-bit),

  • double – double-precision floating-point number (IEEE 754 compliant),

  • timestamp – UTC timestamp, similar to double,

  • string – sequence of Unicode characters,

  • boolean – binary data type of two possible values: true or false,

  • set – unordered collection of values.

Except for set, all of the types include the special value null, which corresponds to the fact that null is an allowed, and also the default, value for properties in the API.

Numbers

Notation

| Expression | Equivalent | Comment |
|---|---|---|
| 0123.000 | 123.0 | Leading and trailing zeros are ignored. |
| 1.23e+3 | 1230.0 | Exponential notation may be used. |
| 1e9 | 1000000000 | Using simple exponential notation for huge numbers. |
| 123E-2 | 1.23 | Negative exponents may also be used. Case of the e character does not matter. |

Operations

| Expression | Result | Comment |
|---|---|---|
| 1 + 2 | 3 | Addition. |
| 1 + 2 + 3 + 4 | 10 | Chain of additions. |
| 1 - 2 | -1 | Subtraction. |
| 1 - 2 - 3 - 4 | -8 | Chain of subtractions. |
| -(1 + 2) | -3 | Unary minus. |
| 2 * 3 | 6 | Multiplication. |
| 1 + 2 * 3 - 4 | 3 | Standard operator precedence. |
| (1 + 2) * (3 - (4 + 5)) | -18 | Bracketing. |
| 10 / 5 | 2.0 | Division. |
| 1 / 2 | 0.5 | Division always results in a double, even if the operands are integers! |
| 5 / 0 | NaN | If the divisor is 0, the result is NaN. |
| 9 % 4 | 1 | Modulo division. |
| 3.14 % 2.5 | 0.64 | Modulo division also works for doubles. |
| 5 % 0 | NaN | If the divisor is 0, the result is NaN. |
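When prototyping expressions in Python, note that Python raises an exception on division by zero, whereas the table above specifies NaN. The hypothetical helpers below, `reql_div` and `reql_mod`, are a small sketch that reproduces the tabulated behavior for the cases shown; how ReQL treats negative operands of % is not stated here, so that case is left unmodeled.

```python
import math


def reql_div(a, b):
    """Division as tabulated: always a float, NaN when the divisor is 0."""
    return float("nan") if b == 0 else a / b


def reql_mod(a, b):
    """Modulo as tabulated for non-negative operands: NaN when the divisor is 0."""
    return float("nan") if b == 0 else a % b


print(reql_div(10, 5))              # 2.0
print(reql_div(1, 2))               # 0.5
print(math.isnan(reql_div(5, 0)))   # True
print(reql_mod(9, 4))               # 1
print(math.isnan(reql_mod(5, 0)))   # True
```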

Comparison

| Expression | Result | Comment |
|---|---|---|
| 1 < 2.0 | true | Integers, doubles, and timestamps may be compared using standard comparison operators. |
| 1 < 2 <= 2 == 2 != 1 >= 1 > 0 | true | Comparison operators may be arbitrarily chained. |
| 1 < 2 <= 2 == 3 != 1 >= 1 > 0 | false | Chain of comparisons returns true if and only if all the individual comparisons are true. |
| 2 == 2.0 | true | In comparison, there is no difference between integers, doubles, and timestamps. |
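Python happens to chain comparisons the same way, evaluating each adjacent pair and combining the results with a logical and, so the rows above can be checked directly. The snippet is only a cross-check in Python, not ReQL itself.

```python
# The two chains from the table, written as Python chained comparisons.
chain_true = 1 < 2 <= 2 == 2 != 1 >= 1 > 0
chain_false = 1 < 2 <= 2 == 3 != 1 >= 1 > 0

# A chain is equivalent to the conjunction of its adjacent pairs.
expanded = (1 < 2) and (2 <= 2) and (2 == 2) and (2 != 1) and (1 >= 1) and (1 > 0)

# Integers and doubles compare by value.
mixed_equal = 2 == 2.0

print(chain_true, chain_false, expanded, mixed_equal)  # True False True True
```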

Strings

Notation

| Expression | Comment |
|---|---|
| "foo" | String constants are enclosed in double quotes. |
| "" | Empty string. |
| "she said \"hello\"" | Double quotes must be escaped. |
| "she said 'hello'" | Single quotes needn't be escaped. |

Comparison

| Expression | Result | Comment |
|---|---|---|
| "foo" == "foo" | true | Strings are compared for equality with ==. |
| "Alice" < "Bob" | true | Strings are ordered in lexicographic order. |
| "Alice" < "Bob" < "Carol" < "Dan" | true | Comparisons may be chained arbitrarily. |
| "Alice" < "Bob" <= "Carol" != "Dan" | true | Comparisons in the chain may be of different types. |
| "Alice" < "Bob" >= "Carol" != "Dan" | false | All the comparisons must hold for the chain to return true. |
| "Alice" < 5 | error | Strings are only comparable with strings. |
| "Alice" ~ "A[a-z]+" | true | Strings can be matched with regular expressions (regex). |

Containment

| Expression | Result | Comment |
|---|---|---|
| "ice" in "Alice" | true | The in operator between strings tests whether the first string is contained in the second string. |
| "Ice" in "Alice" | false | The containment test is case sensitive. |
| "ice" not in "Alice" | false | The in operator may be negated for better readability. |
| "" in "abc" | true | The empty string is contained in every string. |
| "abc" in "" | false | No non-empty string is contained in the empty string. |
| 5 in "abc" | error | Both operands must be strings for string containment testing. |

Concatenation

| Expression | Result | Comment |
|---|---|---|
| "foo" + "bar" | "foobar" | Strings can be concatenated using the + operator. |
| "" + "foo" + "" | "foo" | Empty string is neutral element for concatenation. |
| "foo" + 123 | "foo123" | Strings can be concatenated with integers. |
| "foo" + 123.0 | "foo123.0" | Strings can be concatenated with numbers. |

Sets

Notation

| Expression | Comment |
|---|---|
| {} | Empty set. |
| {1, 2, 3} | Set containing three integers. |
| {1, 2.0, false, "foo", null} | Sets may contain values of different types. This is an extension to sets in the API, which may only contain strings. |
| {{1,2}, {2,3}} | Sets may be nested. |

Properties

| Expression | Result | Comment |
|---|---|---|
| { 1, 1, 1, 2 } | { 1, 2 } | Sets only contain unique elements. |
| { 1, 1.0 } | { 1.0 } | Integers, doubles, and timestamps are merged. |
| { {1,2}, {2,1} } | { {1,2} } | Merging also works for nested sets. |

Value Containment

| Expression | Result | Comment |
|---|---|---|
| 2 in { 1, 2, 3 } | true | Using the in operator, you may test whether a value is contained in a given set (the ∈ relation). |
| 4 not in { 1, 2, 3 } | true | The in operator may be negated for better readability (the ∉ relation). |
| 2.0 in { 1, 2, 3 } | true | There is no difference between integers, doubles, and timestamps when testing containment. |
| "2" in { 1, 2, 3 } | false | There is a difference between numbers and strings. |
| { 1, 2 } in { 1, 2, 3 } | false | in stands for ∈, not ⊆! |
| { 1, 2 } in { {1,2}, {3,4} } | true | in stands for ∈. |

Comparison

| Expression | Result | Comment |
|---|---|---|
| { 1, 2 } < { 1, 2, 3 } | true | Using the < operator, you may test whether one set is a proper subset of another set (⊂ operator in set algebra). |
| { 1, 2 } < { 1, 2 } | false | No set is a proper subset of itself. |
| {} < { 1, 2 } | true | The empty set is a proper subset of every non-empty set. |
| {} < {} | false | The empty set is not a proper subset of itself. |
| { 1, 2 } <= { 1, 2, 3 } | true | Using the <= operator, you may test whether one set is a subset of another set (⊆ operator in set algebra). |
| { 1, 2 } <= { 1, 2 } | true | Every set is a subset of itself. |
| { 1, 2 } == { 1, 2 } | true | == tests whether two sets are identical. |
| { 1, 2 } != { 1, 2 } | false | != tests whether two sets are different. |
| { 1, 2, 3 } >= { 1, 2 } | true | The >= operator tests whether one set is a superset of another set (⊇ operator in set algebra). |
| { 1, 2 } >= { 1, 2 } | true | Every set is a superset of itself. |
| { 1, 2, 3 } > { 1, 2 } | true | The > operator tests whether one set is a proper superset of another set (⊃ operator in set algebra). |
| { 1, 2 } > { 1, 2 } | false | A non-empty set is not a proper superset of itself. |
| { 1, 2 } > {} | true | Every non-empty set is a proper superset of the empty set. |
| {} > {} | false | The empty set is not a proper superset of itself. |

Union

| Expression | Result | Comment |
|---|---|---|
| { 1, 2 } + { 2, 3 } | { 1, 2, 3 } | Sets may be unified using the + operator (∪ in set algebra). |
| { 1, 2.0 } + { 2, 3 } | { 1, 2.0, 3 } | Integers, doubles, and timestamps are merged when unifying sets. |
| { 1, 2 } + { 2, 3 } + { 4 } | { 1, 2, 3, 4 } | Unions may be chained. |
| { 1, 2 } + {} | { 1, 2 } | Unification with empty set has no effect on the original set. |
| { 1, 2 } + { "2", "3" } | { 1, 2, "2", "3" } | Strings and numbers are handled as different values. |

Difference

| Expression | Result | Comment |
|---|---|---|
| { 1, 2 } - { 2, 3 } | { 1 } | Set difference may be obtained using the - operator (∖ operator in set algebra). |
| { 1, 2 } - { 2.0, 3.0 } | { 1 } | Integers, doubles, and timestamps are considered equal if their values are equal. |
| { 1, 2 } - {} | { 1, 2 } | Subtracting an empty set has no effect. |
| { 1, 2 } - { 1 } - { 2 } | {} | Chained set subtractions are evaluated from left to right. |
| { 1, 2 } - ({ 1, 2 } - { 2 }) | { 2 } | Parenthesizing also works. |

Intersection

| Expression | Result | Comment |
|---|---|---|
| { 1, 2 } & { 2, 3 } | { 2 } | Set intersection may be obtained using the & operator. |
| { 1, 2 } & { 2.0, 3.0 } | { 2 } | Integers, doubles, and timestamps are considered equal if their values are equal. |
| { 1, 2 } & {"1", "2"} | {} | Strings and numbers are handled as different values. |
| {"a", { 1, 2 }} & {"b", { 1, 2 }} | {{1,2}} | Works with nested sets. |

Symmetric difference

| Expression | Result | Comment |
|---|---|---|
| { 1, 2 } / { 2, 3 } | { 1, 3 } | Symmetric difference of sets may be obtained using the / operator. |
| { 1, 2 } / { 2.0, 3.0 } | { 1, 3 } | Integers, doubles, and timestamps are considered equal if their values are equal. |
| { 1, 2 } / {"1", "2"} | {1, 2, "1", "2"} | Strings and numbers are handled as different values. |
| {"a", { 1, 2 }} / {"b", { 1, 2 }} | {"a", "b"} | Works with nested sets. |
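Most of the set tables can be cross-checked with Python's built-in sets, which behave the same way for the cases shown. Note the operator mapping, which is an observation about Python rather than part of ReQL: ReQL + corresponds to Python |, ReQL / corresponds to Python ^, and -, &, <, <=, >=, >, and in are written identically. Nested sets must be written as frozenset in Python, the empty set is set() rather than {}, and timestamps are not modeled.

```python
nested = frozenset({1, 2})  # Python requires hashable (frozen) inner sets

checks = {
    "union":                {1, 2} | {2, 3},
    "difference":           {1, 2} - {2.0, 3.0},
    "difference_chain":     {1, 2} - {1} - {2},
    "difference_parens":    {1, 2} - ({1, 2} - {2}),
    "intersection":         {1, 2} & {2.0, 3.0},
    "intersection_strings": {1, 2} & {"1", "2"},
    "intersection_nested":  {"a", nested} & {"b", nested},
    "symmetric":            {1, 2} ^ {2, 3},
    "symmetric_nested":     {"a", nested} ^ {"b", nested},
}

comparisons = {
    "proper_subset":        {1, 2} < {1, 2, 3},
    "not_proper_of_itself": {1, 2} < {1, 2},
    "empty_proper_subset":  set() < {1, 2},
    "subset_of_itself":     {1, 2} <= {1, 2},
    "element_not_subset":   nested in {1, 2, 3},
    "element_of_nested":    nested in {frozenset({1, 2}), frozenset({3, 4})},
    "number_vs_string":     "2" in {1, 2, 3},
    "int_equals_double":    2.0 in {1, 2, 3},
}

for name, value in {**checks, **comparisons}.items():
    print(name, value)
```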

Logical Operators

Negation (NOT)

| Expression | Equivalent or result |
|---|---|
| not 'a' == 'b' | 'a' != 'b' |
| not 'a' > 'b' | 'a' <= 'b' |
| not true | false |
| not false | true |

Implicit conversion to boolean (for advanced uses only!):

| Expression | Result | Comment |
|---|---|---|
| not -1 | false | Negative numbers are truthy. |
| not 0 | true | Zero numbers are falsy. |
| not 1.23 | false | Positive numbers are truthy. |
| not "" | true | Empty strings are falsy. |
| not "foo" | false | Non-empty strings are truthy. |
| not {} | true | Empty sets are falsy. |
| not {1,2,3} | false | Non-empty sets are truthy. |
| not null | true | null is falsy. |

Disjunction (OR)

| Expression | a | b | c | Result | Comment |
|---|---|---|---|---|---|
| 'a' > 'b' or 'a' > 'c' | 1 | 2 | 3 | false | If both operands are false, false is returned. |
| 'a' > 'b' or 'a' > 'c' | 2 | 1 | 3 | true | If at least one of the boolean operands is true, the result is true. |
| 'a' > 'b' or 'a' > 'c' | 2 | 3 | 1 | true | If at least one of the boolean operands is true, the result is true. |
| 'a' > 'b' or 'a' > 'c' | 3 | 1 | 2 | true | If both operands are true, the result is true. |

Advanced uses: Implicit conversion to boolean.

| Expression | Result | Comment |
|---|---|---|
| "foo" or "bar" | "foo" | If the first operand is truthy, it is returned. |
| "" or false | false | If the first operand is falsy, the second operand is returned. |
| false or "" | "" | If the first operand is falsy, the second operand is returned. |

Conjunction (AND)

| Expression | a | b | c | Result | Comment |
|---|---|---|---|---|---|
| 'a' > 'b' and 'a' > 'c' | 1 | 2 | 3 | false | If both operands are false, false is returned. |
| 'a' > 'b' and 'a' > 'c' | 2 | 1 | 3 | false | If at least one of the boolean operands is false, the result is false. |
| 'a' > 'b' and 'a' > 'c' | 2 | 3 | 1 | false | If at least one of the boolean operands is false, the result is false. |
| 'a' > 'b' and 'a' > 'c' | 3 | 1 | 2 | true | If both operands are true, the result is true. |

Advanced uses: Implicit conversion to boolean.

| Expression | Result | Comment |
|---|---|---|
| "foo" and "bar" | "bar" | If the first operand is truthy, the second operand is returned. |
| "" and false | "" | If the first operand is falsy, it is returned. |
| false and "" | false | If the first operand is falsy, it is returned. |
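The operand-returning behavior of or and and in the advanced tables matches how Python's own `or` and `and` operators work, so the rows can be reproduced directly. This is a cross-check in Python under the assumption that `False` stands in for false; it is not ReQL itself.

```python
# Each key is the ReQL expression from the tables; each value is the
# result of the equivalent Python expression.
or_results = {
    '"foo" or "bar"': "foo" or "bar",   # first operand is truthy: returned
    '"" or false':    "" or False,      # first operand is falsy: second returned
    'false or ""':    False or "",      # first operand is falsy: second returned
}

and_results = {
    '"foo" and "bar"': "foo" and "bar",  # first operand is truthy: second returned
    '"" and false':    "" and False,     # first operand is falsy: returned
    'false and ""':    False and "",     # first operand is falsy: returned
}

for expression, result in {**or_results, **and_results}.items():
    print(f"{expression:18} -> {result!r}")
```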

Conditional Operator

| Expression | a | b | Result | Comment |
|---|---|---|---|---|
| if 'a' > 'b' then "foo" else "bar" | 10 | 5 | "foo" | then-value is returned if the condition is satisfied. |
| if 'a' < 'b' then "foo" else "bar" | 10 | 5 | "bar" | else-value is returned if the condition is not satisfied. |
| if 'a' < 'b' then "foo" | | | error | else clause must always be present. |
| if 'a' < 'b' then "foo" else (if 'a' > 'b' then "bar" else "bah") | 5 | 5 | "bah" | if-else statements may be nested using parentheses. |

| Expression | Result | Comment |
|---|---|---|
| if -1 then "foo" else "bar" | "foo" | Negative numbers are truthy. |
| if 0 then "foo" else "bar" | "bar" | Zero numbers are falsy. |
| if 1.23 then "foo" else "bar" | "foo" | Positive numbers are truthy. |
| if "" then "foo" else "bar" | "bar" | Empty strings are falsy. |
| if "bah" then "foo" else "bar" | "foo" | Non-empty strings are truthy. |
| if {} then "foo" else "bar" | "bar" | Empty sets are falsy. |
| if {1,2,3} then "foo" else "bar" | "foo" | Non-empty sets are truthy. |
| if null then "foo" else "bar" | "bar" | null is falsy. |
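The truthiness rules listed for the conditional operator coincide with Python's rules for the corresponding values, which makes them easy to verify. In the sketch below, `reql_if` is a hypothetical helper modeling if ... then ... else, `None` stands in for null, and `set()` stands in for the empty set {}. Unlike a real conditional, the helper evaluates both branches before choosing one, which is harmless for the constant values used here.

```python
def reql_if(condition, then_value, else_value):
    """Hypothetical model of: if <condition> then <then_value> else <else_value>."""
    return then_value if condition else else_value


# Conditions in the same order as the rows of the truthiness table.
conditions = [-1, 0, 1.23, "", "bah", set(), {1, 2, 3}, None]
results = [reql_if(c, "foo", "bar") for c in conditions]
print(results)

# Nested conditional with a = 5 and b = 5:
# if 'a' < 'b' then "foo" else (if 'a' > 'b' then "bar" else "bah")
a, b = 5, 5
nested = reql_if(a < b, "foo", reql_if(a > b, "bar", "bah"))
print(nested)  # bah
```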