Thesaurus - Query Pipeline Feature

A query pipeline statement that expresses the thesaurus feature defines how certain terms in the basic part (q) of the query expression are expanded or replaced before the query is executed against the index.

In general:

  • Expanding A to B means: substitute A with (A OR B).

  • Expanding A to B, C means: substitute A with (A OR B OR C).

  • Replacing A with B means: substitute A with B.

  • Replacing A with B, C means: substitute A with (B OR C).
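
The four substitution rules above can be sketched in a few lines of Python. This only illustrates the resulting Boolean expressions and is not how the query pipeline is implemented; the names `expand`, `replace`, and `_group` are hypothetical, and parenthesizing multi-word expressions is a convention assumed from the tables in this article.

```python
def _group(expression: str) -> str:
    """Parenthesize multi-word expressions (convention assumed from this article's tables)."""
    return f"({expression})" if " " in expression else expression


def expand(term: str, others: list[str]) -> str:
    """Expanding A to B, C: substitute A with (A OR B OR C)."""
    return "(" + " OR ".join(_group(e) for e in [term, *others]) + ")"


def replace(term: str, others: list[str]) -> str:
    """Replacing A with B, C: substitute A with (B OR C).

    `term` is the expression being replaced; it does not appear in the result.
    A single target is returned on its own, as in "substitute A with B".
    """
    if len(others) == 1:
        return _group(others[0])
    return "(" + " OR ".join(_group(e) for e in others) + ")"


print(expand("A", ["B"]))        # (A OR B)
print(expand("A", ["B", "C"]))   # (A OR B OR C)
print(replace("A", ["B"]))       # B
print(replace("A", ["B", "C"]))  # (B OR C)
```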

The thesaurus feature comprises four distinct sub-features:

  • alias: expand each expression in a list to all other expressions in that same list.

  • expand: expand each expression in a list to all expressions in another list.

  • replace: replace each expression in a list with all expressions in another list.

  • quote: replace each expression in a list with its corresponding exact phrase match expression.

The following table summarizes how statements using each of the different thesaurus sub-features would process the basic part (q) of the combined query expression, assuming its current value is kitty cat:

Statement definition                                     Processed q expression
-------------------------------------------------------  ----------------------------------------------------------------------------
alias /kitt(y|en)/, "cat", "mouse hunter", "feline"      (kitty OR cat OR (mouse hunter) OR feline) (cat OR (mouse hunter) OR feline)
expand /kitt(y|en)/, "cat" to "mouse hunter", "feline"   (kitty OR (mouse hunter) OR feline) (cat OR (mouse hunter) OR feline)
replace /kitt(y|en)/, "cat" to "mouse hunter", "feline"  ((mouse hunter) OR feline) ((mouse hunter) OR feline)
quote "kitty cat"                                        "kitty cat"
quote /kitt(y|en)/, "cat" to "mouse hunter"              "mouse hunter" "mouse hunter"

Typically, a statement expressing the thesaurus feature should only apply when a certain condition is fulfilled.

In general, you should ensure that this is the case by associating such a statement, and/or the query pipeline it is defined in, with a global condition.

In the Coveo Cloud administration console, you can manage statements expressing the thesaurus feature from the Thesaurus tab (see Managing Query Pipeline Thesaurus).

Bear in mind that:

  • An Expand any rule in the Coveo Cloud administration console corresponds to a statement that uses the alias sub-feature.

  • The Coveo Cloud administration console does not allow you to create statements that use the quote sub-feature. To do so, you must use the API directly.

The following diagram shows the process of a query being sent to the Search API and the order of execution of query pipeline features.

[Diagram: Apply thesaurus statements]

Syntax

Use the following query pipeline language (QPL) syntax to define a statement expressing the thesaurus feature:

alias <terms> | expand <terms> to <otherTerms> | replace <terms> to <otherTerms> | quote <terms> [to <otherTerms>]

<terms>

A comma-separated list of quoted strings and/or regular expressions, where each quoted string must contain one or more basic query terms (e.g., "foo bar", "baz", /^meo+w$/).

When using the alias feature, <terms> must contain at least one quoted string (i.e., it cannot contain only regular expressions).

<otherTerms>

A comma-separated list of quoted strings, where each quoted string must contain one or more basic query terms (e.g., "hello world", "biz").

If you define a named group in a regular expression in the <terms> list, you can reference the corresponding match group in the <otherTerms> list with the syntax _<name>_, where <name> is the name of the group.

In an empty query pipeline named Testing Match Groups, you create a statement expressing the thesaurus feature with the following definition:

expand /(?<username>[^@]+)@example\.com/ to "_username_"

The following table summarizes how the current q expression of different queries going through the Testing Match Groups query pipeline is processed when this statement is applied:

Current q expression  Processed q expression
--------------------  ----------------------------
asmith@example.com    asmith@example.com OR asmith
bjones@example.com    bjones@example.com OR bjones
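
The named-group substitution in this example can be approximated in Python as follows. This is a minimal sketch, not the pipeline's actual matching logic: the function name `expand_with_named_groups` is hypothetical, the whole q expression is assumed to be matched at once, and Python's `re` module spells named groups as `(?P<name>...)` rather than the `(?<name>...)` form used in the statement.

```python
import re

# Python equivalent of /(?<username>[^@]+)@example\.com/ (note the ?P spelling).
PATTERN = re.compile(r"(?P<username>[^@]+)@example\.com")


def expand_with_named_groups(q: str, pattern: re.Pattern, template: str) -> str:
    """Expand q to the template, filling each _<name>_ placeholder with its match group."""
    match = pattern.fullmatch(q)
    if match is None:
        return q  # The statement does not apply; q is left unchanged.
    expansion = template
    for name, value in match.groupdict(default="").items():
        expansion = expansion.replace(f"_{name}_", value)
    return f"{q} OR {expansion}"


print(expand_with_named_groups("asmith@example.com", PATTERN, "_username_"))
print(expand_with_named_groups("bjones@example.com", PATTERN, "_username_"))
```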

Using the alias Sub-Feature

The alias sub-feature allows you to define statements that make all expressions in a set synonymous with one another. Whenever any of these expressions appears in the basic part (q) of the query expression, it expands to all other expressions in the alias set.

An alias set of expressions must contain at least one quoted string (i.e., it cannot contain only regular expressions).

This sub-feature is an optimized use of the expand sub-feature, as many statements would otherwise be required to define a set of bi-directional expansions (i.e., synonyms).

  • The following statement:

      alias "foo", "bar", "qux"
    

    Is equivalent to:

      expand "foo" to "bar"
      expand "foo" to "qux"
      expand "bar" to "foo"
      expand "bar" to "qux"
      expand "qux" to "foo"
      expand "qux" to "bar"
    
  • In an empty query pipeline named Testing Thesaurus Alias, you create a statement expressing the thesaurus feature with the following QPL definition:

      alias "car", /(dodge) \w+/, "automobile", "motor vehicle"
    

    The following table summarizes how the current q expression of different queries going through the Testing Thesaurus Alias query pipeline is processed when this statement is applied:

    Current q expression  Processed q expression
    --------------------  ----------------------------------------------------------------------------------------------
    car                   car OR automobile OR (motor vehicle)
    automobile            car OR automobile OR (motor vehicle)
    motor vehicle         car OR automobile OR (motor vehicle)
    dodge stratus         car OR (dodge stratus) OR automobile OR (motor vehicle)
    dodge caravan car     (car OR (dodge caravan) OR automobile OR (motor vehicle)) (car OR automobile OR (motor vehicle))
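
The equivalence between a single alias statement and a set of one-to-one expand statements, shown in the first example of this section, can be generated mechanically. The sketch below assumes the alias set contains only quoted strings (regular expressions are left out, since they cannot serve as expansion targets); the function name `alias_to_expands` is hypothetical.

```python
from itertools import permutations


def alias_to_expands(terms: list[str]) -> list[str]:
    """Return one expand statement per ordered pair of distinct alias terms."""
    return [f'expand "{source}" to "{target}"' for source, target in permutations(terms, 2)]


for statement in alias_to_expands(["foo", "bar", "qux"]):
    print(statement)
```

An alias set of n quoted strings corresponds to n × (n − 1) such statements, which is why the alias form is more compact.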

Using the expand Sub-Feature

The expand sub-feature allows you to define statements that expand each expression in a left-hand set to all expressions in a right-hand set. Those expansions are uni-directional.

In an empty query pipeline named Testing Thesaurus Expand, you create a statement expressing the thesaurus feature with the following QPL definition:

expand "car", /(dodge) \w+/ to "automobile", "motor vehicle"

The following table summarizes how the current q expression of different queries going through the Testing Thesaurus Expand query pipeline is processed when this statement is applied:

Current q expression  Processed q expression
--------------------  ---------------------------------------------------------------------------------------
car                   car OR automobile OR (motor vehicle)
dodge stratus         (dodge stratus) OR automobile OR (motor vehicle)
dodge caravan car     ((dodge caravan) OR automobile OR (motor vehicle)) (car OR automobile OR (motor vehicle))

Using the replace Sub-Feature

The replace sub-feature allows you to define statements that replace each expression in a left-hand set with all expressions in a right-hand set.

In an empty query pipeline named Testing Thesaurus Replace, you create a statement expressing the thesaurus feature with the following QPL definition:

replace "car", /(dodge) \w+/ to "automobile", "motor vehicle"

The following table summarizes how the current q expression of different queries going through the Testing Thesaurus Replace query pipeline is processed when this statement is applied:

Current q expression  Processed q expression
--------------------  ---------------------------------------------------------------
car                   automobile OR (motor vehicle)
dodge stratus         automobile OR (motor vehicle)
dodge caravan car     (automobile OR (motor vehicle)) (automobile OR (motor vehicle))
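
To see how several matches in one q expression are processed, the following Python sketch simulates the expand and replace statements used in these examples. It is a rough approximation under stated assumptions, not the pipeline's real algorithm: the function name `apply_statement` is hypothetical, the quoted string "car" is modeled as the whole-word pattern `\bcar\b`, matches are found left to right, and every substitution is wrapped in parentheses (the tables omit the outer parentheses when q contains a single match).

```python
import re


def _group(expression: str) -> str:
    """Parenthesize multi-word expressions, as the tables in this article do."""
    return f"({expression})" if " " in expression else expression


def apply_statement(q: str, left: list[str], right: list[str], keep_original: bool) -> str:
    """Substitute each left-hand match in q.

    keep_original=True mimics expand (the match is kept in the OR group);
    keep_original=False mimics replace (the match is dropped).
    """
    combined = re.compile("|".join(f"(?:{pattern})" for pattern in left))

    def substitute(match: re.Match) -> str:
        parts = ([match.group(0)] if keep_original else []) + list(right)
        return "(" + " OR ".join(_group(part) for part in parts) + ")"

    return combined.sub(substitute, q)


LEFT = [r"\bcar\b", r"(dodge) \w+"]   # "car", /(dodge) \w+/
RIGHT = ["automobile", "motor vehicle"]

print(apply_statement("dodge caravan car", LEFT, RIGHT, keep_original=True))
print(apply_statement("dodge caravan car", LEFT, RIGHT, keep_original=False))
```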

You can use the replace sub-feature along with the no-stemming operator (+) to systematically prevent the index from stemming certain keywords (i.e., expanding those keywords to other terms that share the same root).

replace "dodge" to "+dodge"

This can be useful when a product name has the same root as a frequently used keyword.

Using the quote Sub-Feature

The quote sub-feature allows you to define statements that automatically turn all expressions in a set into their corresponding exact phrase match expression, or into the corresponding exact phrase match expression of each expression in a right-hand set.

You could obtain the same results using the replace sub-feature, but the required syntax would be more verbose.

  • The following statements:

      quote "foo bar"
      quote /foo.*/
    

    Are respectively equivalent to:

      replace "foo bar" to "\"foo bar\""
      replace /(?<fooGroup>foo.*)/ to "\"_fooGroup_\""
    
  • In an empty query pipeline named Testing Thesaurus Quote, you create two distinct statements, each expressing the thesaurus feature, with the following QPL definitions:

    Statement 1

      quote /(dodge) \w+/
    

    Statement 2

      quote "car", "automobile" to "motor vehicle"
    

    The following table summarizes how the current q expression of different queries going through the Testing Thesaurus Quote query pipeline is processed when these statements are applied:

    Current q expression         Processed q expression
    ---------------------------  -------------------------------
    dodge stratus                "dodge stratus"
    dodge stratus dodge caravan  "dodge stratus" "dodge caravan"
    car                          "motor vehicle"
    automobile                   "motor vehicle"
    dodge stratus automobile     "dodge stratus" "motor vehicle"
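
The quote-to-replace equivalence shown in the first example of this section can also be written as a small Python helper that emits the longer replace form. Both function names and the default group name `quoted` are hypothetical, and the helpers only build statement text; they do not validate QPL.

```python
def quote_string_to_replace(term: str) -> str:
    """Rewrite `quote "<term>"` as the equivalent replace statement."""
    return f'replace "{term}" to "\\"{term}\\""'


def quote_regex_to_replace(regex: str, group: str = "quoted") -> str:
    """Rewrite `quote /<regex>/` by capturing the whole match in a named group."""
    return f'replace /(?<{group}>{regex})/ to "\\"_{group}_\\""'


print(quote_string_to_replace("foo bar"))
print(quote_regex_to_replace("foo.*", group="fooGroup"))
```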

The Coveo Cloud administration console does not allow you to create statements that use the quote sub-feature. To do so, you must use the API directly.
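
As an illustration of creating a quote statement through the API, the sketch below only builds an HTTP request and does not send it. Everything about the request is an assumption to verify against the Search API reference: the endpoint path, the `organizationId` query parameter, the `feature` and `definition` body fields, and the bearer-token header. The function name `build_statement_request` and all argument values are hypothetical placeholders.

```python
import json
import urllib.request

# Assumed base path for query pipeline administration; confirm it in the API reference.
BASE_URL = "https://platform.cloud.coveo.com/rest/search/v2/admin/pipelines"


def build_statement_request(pipeline_id: str, definition: str,
                            access_token: str, organization_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a thesaurus statement."""
    body = {"feature": "thesaurus", "definition": definition}  # assumed field names
    url = f"{BASE_URL}/{pipeline_id}/statements?organizationId={organization_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_statement_request("my-pipeline-id", 'quote "kitty cat"', "my-token", "my-org-id")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that call is left out so the sketch has no side effects.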