Group by operations

A Group By operation retrieves the distinct values of a field in an un-paginated query result set, along with an estimated number of occurrences for each retrieved value. Group By operations also support advanced options, such as computing aggregate operations on other numeric fields for each retrieved value.

This article describes the members of the structure that defines a single Group By operation. You can specify an array of Group By operations in a query using the groupBy top-level query parameter.

In a graphical search interface, Group By operations are typically used to retrieve data to render facets.

You want to request Group By values based on the @filetype field from the result set that matches the mostly harmless basic query expression (q). You also want to limit the number of retrieved values to 4 (using maximumNumberOfValues), and sort those values in ascending alphabetical order (using sortCriteria).

POST https://platform.cloud.coveo.com/rest/search/v2 HTTP/1.1

Content-Type: application/json
Accept: application/json
Authorization: Bearer **********-****-****-****-************

Payload

{
  "q": "mostly harmless",
  "groupBy": [
    {
      "field": "@filetype",
      "maximumNumberOfValues": 4,
      "sortCriteria": "AlphaAscending"
    }
  ]
}

200 OK response body (excerpt)

{
  ...
  "groupByResults": [
    {
      "field": "filetype",
      "globalComputedFieldResults": [],
      "values": [
        {
          "computedFieldResults": [],
          "lookupValue": "epub",
          "numberOfResults": 3,
          "score": 0,
          "value": "epub",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "mobi",
          "numberOfResults": 1,
          "score": 0,
          "value": "mobi",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "pdf",
          "numberOfResults": 2,
          "score": 0,
          "value": "pdf",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "txt",
          "numberOfResults": 1,
          "score": 0,
          "value": "txt",
          "valueType": "Standard"
        }
      ]
    }
  ],
  ...
}
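
As a rough illustration of how a client might consume this response to render a facet, the following Python sketch turns one groupByResults entry into (value, count) pairs. The facet_values helper is hypothetical, the sample data is a trimmed copy of the response above, and no HTTP call is made.

```python
import json

# Trimmed copy of the response excerpt above (only the members used below).
SAMPLE_RESPONSE = json.loads("""
{
  "groupByResults": [
    {
      "field": "filetype",
      "values": [
        {"value": "epub", "numberOfResults": 3},
        {"value": "mobi", "numberOfResults": 1},
        {"value": "pdf", "numberOfResults": 2},
        {"value": "txt", "numberOfResults": 1}
      ]
    }
  ]
}
""")


def facet_values(response, field):
    """Return (value, numberOfResults) pairs for one Group By field.

    Hypothetical helper: `field` is the field name without the leading "@",
    as it appears in the response body.
    """
    for result in response.get("groupByResults", []):
        if result.get("field") == field:
            return [(v["value"], v["numberOfResults"]) for v in result["values"]]
    return []


print(facet_values(SAMPLE_RESPONSE, "filetype"))
```

Note that the request uses @filetype while the response reports filetype, so the lookup is done on the name without the @ prefix.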

Group By parameters

This section provides reference documentation for the available Group By parameters.

advancedQueryOverride (string)

The query expression that should override the advanced query expression on which the Group By operation is being performed (see the aq query parameter).

Note: If any query override parameter (e.g., queryOverride or advancedQueryOverride) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) are ignored.

Example: @year==2017

allowedValues (array of string)

The field values allowed in the Group By operation results. You can use trailing wildcards (*) to include ranges of values.

See also the completeFacetWithStandardValues Group By operation parameter.

If you do not explicitly specify an array of allowedValues, or if you specify an empty array, all field values are allowed.

Example:

[
  "Anonymous",
  "Bob Jones",
  "Carrie Green",
  "David Allen"
]

allowedValuesPatternType (string)

The type of pattern being used in the allowed field values.

See also the allowedValues Group By operation parameter.

If you do not explicitly specify a pattern type, the legacy pattern type is used by default, which only supports trailing wildcards.

Example: regex

completeFacetWithStandardValues (boolean)

Whether to complete the Group By operation result set with standard values.

If you set this parameter to true and the number of specified allowedValues is lower than the maximumNumberOfValues, the Group By operation also attempts to return standard values until the result set contains the maximumNumberOfValues.

Default: false
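
To make the interaction between allowedValues, maximumNumberOfValues, and completeFacetWithStandardValues more concrete, here is a small Python sketch that builds such a Group By operation and computes how many standard values could be added at most. Both helper names are hypothetical, and the arithmetic only restates the rule described above; the number of values actually returned depends on the index content.

```python
import json


def build_group_by(field, allowed_values, maximum_number_of_values=10,
                   complete_with_standard_values=False):
    """Build a single Group By operation as a plain dictionary (hypothetical helper)."""
    return {
        "field": field,
        "allowedValues": list(allowed_values),
        "maximumNumberOfValues": maximum_number_of_values,
        "completeFacetWithStandardValues": complete_with_standard_values,
    }


def max_standard_values(group_by):
    """Upper bound on the standard values that could complete the result set.

    Assumes a non-empty allowedValues array; with an empty array, all field
    values are allowed and this bound is not meaningful.
    """
    if not group_by["completeFacetWithStandardValues"]:
        return 0
    missing = group_by["maximumNumberOfValues"] - len(group_by["allowedValues"])
    return max(missing, 0)


operation = build_group_by("@author", ["Anonymous", "Bob Jones"],
                           maximum_number_of_values=5,
                           complete_with_standard_values=True)
print(json.dumps({"groupBy": [operation]}, indent=2))
print(max_standard_values(operation))
```

With two allowed values and a maximum of five, at most three standard values can be appended.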

computedFields (array of RestComputedField)

The computed fields to evaluate for each Group By value.

A computed field stores the result of an aggregate operation performed on the values of a specific numerical field for all the query result items that share the same Group By field value.

constantQueryOverride (string)

The query expression that should override the constant query expression on which the Group By operation is being performed (see the cq query parameter).

Note: If any query override parameter (e.g., queryOverride or advancedQueryOverride) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) are ignored.

Example: @filetype==forumpost

disjunctionQueryOverride (string)

The query expression that should override the disjunction query expression on which the Group By operation is being performed (see the dq query parameter).

Note: If any query override parameter (e.g., queryOverride or advancedQueryOverride) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) are ignored.

Example: @date=2016-12-01..2016-12-31

field (string)

The name of the field on which to perform the Group By operation. The operation returns a Group By value for each distinct value of this field found in the query result items.

Note: You must ensure that the Facet option is enabled for this field in your index (see Add or Edit Fields).

Example: @author

filterFacetCount (boolean)

Whether to exclude folded result parents when estimating the result count for each facet value.

Default: true

generateAutomaticRanges (boolean)

Whether the index should automatically create range values.

Tip: If you set this parameter to true, you should ensure that the Use cache for numeric queries option is enabled for the Group By field in your index in order to speed up automatic range evaluation (see Add or Edit Fields).

Notes:

  • Setting generateAutomaticRanges to true only makes sense when the Group By field references a numeric or date field in the index.
  • The index cannot automatically generate range values for a field that is dynamically generated by a query function; automatic range generation fails in that case. Use the rangeValues Group By parameter instead.

Default: false

injectionDepth (integer [int32])

The maximum number of query result items to scan for Group By values.

Note: Specifying a high injectionDepth value can negatively impact query performance.

Default: 1000

maximumNumberOfValues (integer [int32])

The maximum number of values the Group By operation should return.

Default: 10

queryOverride (string)

The query expression that should override the basic query expression on which the Group By operation is being performed (see the q query parameter).

Note: If any query override parameter (e.g., queryOverride or advancedQueryOverride) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) are ignored.

Example: Coveo Cloud V2 Platform
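
The note shared by the four override parameters can be easy to misread, so the following Python sketch models it on the client side. It is a simplified model under the assumption that the note applies exactly as written, not a description of how the index evaluates queries; the effective_expressions helper and the OVERRIDES mapping are hypothetical names.

```python
# Maps each override parameter to the query parameter it replaces.
OVERRIDES = {
    "queryOverride": "q",
    "advancedQueryOverride": "aq",
    "constantQueryOverride": "cq",
    "disjunctionQueryOverride": "dq",
}


def effective_expressions(query, group_by):
    """Return the expression parts a Group By operation would be evaluated on.

    Simplified model of the note above: as soon as one override is set,
    none of the original q, aq, cq, and dq parts are kept.
    """
    overrides = {
        target: group_by[name]
        for name, target in OVERRIDES.items()
        if group_by.get(name)
    }
    if overrides:
        return overrides
    return {part: query[part] for part in OVERRIDES.values() if query.get(part)}


query = {"q": "mostly harmless", "cq": "@source==Books"}
print(effective_expressions(query, {"field": "@author"}))
print(effective_expressions(query, {"field": "@author",
                                    "advancedQueryOverride": "@year==2017"}))
```

In the second call, setting only advancedQueryOverride drops both q and cq, so an operation that should still honor them has to repeat them through queryOverride and constantQueryOverride.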

rangeValues (array of RestRangeValue)

The ranges for which to generate Group By values.

Notes:

  • Specifying rangeValues only makes sense when the Group By field references a numeric or date field in the index.
  • You can set the generateAutomaticRanges Group By parameter to true rather than explicitly specifying rangeValues (unless the Group By field was generated by a query function).

sortCriteria (string)

The criterion to use when sorting the Group By operation results.

Allowed values:

  • score: sort using the score value which is computed from the number of occurrences of a field value, as well as from the position where query result items having this field value appear in the ranked query result set. When using this sort criterion, a field value with 100 occurrences might appear after one with only 10 occurrences, if the occurrences of the latter field value tend to appear higher in the ranked query result set.
  • occurrences: sort by number of occurrences, with field values having the highest number of occurrences appearing first.
  • alphaascending/alphadescending: sort alphabetically on the field values.
  • computedfieldascending/computedfielddescending: sort on the value of the first computed field for each Group By operation result (see the computedFields Group By parameter).
  • chisquare: sort based on the relative frequency of field values in the query result set compared to their frequency in the entire index. This means that a field value that does not appear often in the index, but does appear often in the query result set will tend to appear higher.
  • nosort: do not sort the results of the Group By operation. The field values will appear in a random order.

Default: score
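
A client may want to catch an invalid sortCriteria before sending the request. The Python sketch below is one possible sanity check, not part of the API: check_sort_criteria is a hypothetical helper, the case-insensitive comparison is an assumption (the request examples in this article use mixed case such as AlphaAscending), and requiring a computed field for the computedfield criteria is inferred from the description above.

```python
ALLOWED_SORT_CRITERIA = {
    "score", "occurrences", "alphaascending", "alphadescending",
    "computedfieldascending", "computedfielddescending", "chisquare", "nosort",
}


def check_sort_criteria(group_by):
    """Return the normalized sort criterion of a Group By operation.

    Raises ValueError when the criterion is unknown, or when a computed
    field criterion is used without any computedFields (assumed invalid).
    """
    criterion = group_by.get("sortCriteria", "score").lower()
    if criterion not in ALLOWED_SORT_CRITERIA:
        raise ValueError(f"unknown sortCriteria: {criterion}")
    if criterion.startswith("computedfield") and not group_by.get("computedFields"):
        raise ValueError(f"{criterion} needs at least one computed field")
    return criterion


print(check_sort_criteria({"field": "@filetype", "sortCriteria": "AlphaAscending"}))
print(check_sort_criteria({"field": "@filetype"}))
```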

Group By Operation submodels

This section provides (or links to) reference documentation for the RestComputedField and RestRangeValue Group By Operation submodels.

RestComputedField

Describes a single computed field operation to perform along with a Group By operation (see the computedFields Group By parameter).

See Computed Fields.

RestRangeValue

Describes a single range value for a Group By operation (see the rangeValues Group By parameter).

end (integer [int32])

The value to end the range at. Must be greater (or later) than the start value.

Note: The time zone of date ranges is determined by the timezone parameter of the search request.

Examples:

  • 100
  • 2019/12/31@23:59:59

endInclusive (boolean)

Whether to include the end value in the range.

Default: false

label (string)

The label to associate with the range.

Note: This value is not currently used.

Examples:

  • 0 - 100
  • In 2019

start (integer [int32])

The value to start the range at.

Note: The time zone of date ranges is determined by the timezone parameter of the search request.

Examples:

  • 0
  • 2019/01/01@00:00:00
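
The four RestRangeValue members above can be assembled programmatically. As a minimal sketch, the hypothetical build_ranges helper below turns a list of numeric boundaries into contiguous, non-overlapping ranges: every range excludes its end value except, optionally, the last one. The label format simply mirrors the "0 - 100" example above.

```python
def build_ranges(boundaries, last_end_inclusive=True):
    """Turn boundaries [b0, b1, ..., bn] into n contiguous range values."""
    pairs = list(zip(boundaries, boundaries[1:]))
    # Each end must be greater than its start, as required above.
    if any(start >= end for start, end in pairs):
        raise ValueError("boundaries must be strictly increasing")
    ranges = []
    for index, (start, end) in enumerate(pairs):
        is_last = index == len(pairs) - 1
        ranges.append({
            "start": start,
            "end": end,
            # Only the last range may include its end, so ranges never overlap.
            "endInclusive": is_last and last_end_inclusive,
            "label": f"{start} - {end}",
        })
    return ranges


group_by = {"field": "@wordcount", "rangeValues": build_ranges([0, 100, 1000])}
for range_value in group_by["rangeValues"]:
    print(range_value)
```

Leaving endInclusive at false on all but the last range means a value such as 100 is counted in exactly one range.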

Computed fields

A computed field is an aggregate operation (average, maximum, minimum, or sum) that’s performed on a specific numeric field during a Group By operation. This aggregate operation is computed for each value retrieved by its parent Group By operation, and takes into account the values of its target numeric field for each item sharing the same Group By field value in the un-paginated query result set.

This section describes the members of the structure that defines a single computed field operation. You can specify an array of computed field operations in a Group By operation using the computedFields Group By parameter.

Computed field operations only work on numeric fields. On any other field type, the aggregate operation returns NaN (not a number).

You want to request Group By values based on the @author field from the result set that matches the @source==Books constant query expression (cq). You also want to limit the number of retrieved values to 3 (using maximumNumberOfValues), and compute the average of the @communityrating field for each of those values (using computedFields) in order to sort them in descending order (using sortCriteria).

POST https://platform.cloud.coveo.com/rest/search/v2 HTTP/1.1
 
Content-Type: application/json
Accept: application/json
Authorization: Bearer **********-****-****-****-************

Payload

{
  "cq": "@source==Books",
  "groupBy": [
    {
      "field": "@author",
      "computedFields": [
        {
          "field": "@communityrating",
          "operation": "average"
        }
      ],
      "maximumNumberOfValues": 3,
      "sortCriteria": "ComputedFieldDescending"
    }
  ]
}

200 OK response body (excerpt)

{
  ...
  "groupByResults": [
    {
      "field": "author",
      "globalComputedFieldResults": [
        7.110552764
      ],
      "values": [
        {
          "computedFieldResults": [
            9.850877193
          ],
          "lookupValue": "George Orwell",
          "numberOfResults": 12,
          "score": 0,
          "value": "George Orwell",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [
            9.321762292
          ],
          "lookupValue": "J. R. R. Tolkien",
          "numberOfResults": 30,
          "score": 0,
          "value": "J. R. R. Tolkien",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [
            9.114909274
          ],
          "lookupValue": "John Steinbeck",
          "numberOfResults": 17,
          "score": 0,
          "value": "John Steinbeck",
          "valueType": "Standard"
        }
      ]
    }
  ],
  ...
}
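
The computedFieldResults arrays in this response carry no field names, so a client has to match them back to the request. The Python sketch below does this by position, which assumes that results come back in the same order as the requested computedFields; that ordering is an assumption here, and label_computed_results is a hypothetical helper. The sample data is a trimmed copy of the request and response above.

```python
GROUP_BY = {
    "field": "@author",
    "computedFields": [{"field": "@communityrating", "operation": "average"}],
}

# Trimmed copy of the groupByResults entry above.
GROUP_BY_RESULT = {
    "field": "author",
    "values": [
        {"value": "George Orwell", "computedFieldResults": [9.850877193]},
        {"value": "J. R. R. Tolkien", "computedFieldResults": [9.321762292]},
        {"value": "John Steinbeck", "computedFieldResults": [9.114909274]},
    ],
}


def label_computed_results(group_by, group_by_result):
    """Map each Group By value to its named computed field results.

    Assumption: computedFieldResults follows the order of computedFields.
    """
    names = [f'{c["operation"]}({c["field"]})'
             for c in group_by.get("computedFields", [])]
    return {
        v["value"]: dict(zip(names, v["computedFieldResults"]))
        for v in group_by_result["values"]
    }


for author, results in label_computed_results(GROUP_BY, GROUP_BY_RESULT).items():
    print(author, results)
```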

Computed field parameters

This section provides reference documentation for the available computed field parameters.

field (string)

The name of the numeric field on which to perform the aggregate operation.

Tip: You should ensure that the Use cache for computed fields option is enabled for that field in your index in order to speed up evaluation (see Add or Edit Fields).

Example: @wordcount

operation (string)

The aggregate operation to perform on the field values.

Allowed values:

  • sum: get the sum of all values.
  • average: get the average of all values.
  • minimum: get the smallest value.
  • maximum: get the largest value.
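
To show what these aggregate operations compute for each Group By value, here is a local Python sketch that groups a few items by one field and aggregates another. It only illustrates the semantics described in this section and does not reflect how the index performs the computation; compute_field is a hypothetical helper and the items are made-up sample data.

```python
from collections import defaultdict

# One local equivalent per allowed operation value.
OPERATIONS = {
    "sum": sum,
    "average": lambda values: sum(values) / len(values),
    "minimum": min,
    "maximum": max,
}


def compute_field(items, group_field, numeric_field, operation):
    """Aggregate numeric_field over the items sharing each group_field value."""
    groups = defaultdict(list)
    for item in items:
        groups[item[group_field]].append(item[numeric_field])
    aggregate = OPERATIONS[operation]
    return {value: aggregate(numbers) for value, numbers in groups.items()}


# Made-up items, for illustration only.
ITEMS = [
    {"author": "A", "communityrating": 8.0},
    {"author": "A", "communityrating": 10.0},
    {"author": "B", "communityrating": 7.0},
]

for name in OPERATIONS:
    print(name, compute_field(ITEMS, "author", "communityrating", name))
```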