Group by operations
A Group By operation allows retrieval of the different available values of a field in an un-paginated query result set, along with an estimated number of occurrences for each retrieved value. Group By operations also support advanced options such as computing aggregate operations on other numeric fields for each retrieved value.
This article describes the members of the structure that defines a single Group By operation. You can specify an array of Group By operations in a query using the groupBy top-level query parameter.
In a graphical search interface, Group By operations are typically used to retrieve data to render facets.
You want to request Group By values based on the @filetype field from the result set that matches the mostly harmless basic query expression (q). You also want to limit the number of retrieved values to 4 (using maximumNumberOfValues), and sort those values in ascending alphabetical order (using sortCriteria).
POST https://platform.cloud.coveo.com/rest/search/v2 HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Bearer **********-****-****-****-************
Payload
{
  "q": "mostly harmless",
  "groupBy": [
    {
      "field": "@filetype",
      "maximumNumberOfValues": 4,
      "sortCriteria": "AlphaAscending"
    }
  ]
}
200 OK response body (excerpt)
{
  ...
  "groupByResults": [
    {
      "field": "filetype",
      "globalComputedFieldResults": [],
      "values": [
        {
          "computedFieldResults": [],
          "lookupValue": "epub",
          "numberOfResults": 3,
          "score": 0,
          "value": "epub",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "mobi",
          "numberOfResults": 1,
          "score": 0,
          "value": "mobi",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "pdf",
          "numberOfResults": 2,
          "score": 0,
          "value": "pdf",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [],
          "lookupValue": "txt",
          "numberOfResults": 1,
          "score": 0,
          "value": "txt",
          "valueType": "Standard"
        }
      ]
    }
  ],
  ...
}
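The request and response above can be sketched in Python. This is a minimal sketch that only builds the JSON payload and reads a response shaped like the excerpt above; it does not send anything to the Search API. The helper names build_group_by_payload and facet_counts are hypothetical and are not part of any Coveo client library.

```python
import json


def build_group_by_payload(q, field, maximum_number_of_values, sort_criteria):
    """Build a search request body containing a single Group By operation."""
    return {
        "q": q,
        "groupBy": [
            {
                "field": field,
                "maximumNumberOfValues": maximum_number_of_values,
                "sortCriteria": sort_criteria,
            }
        ],
    }


def facet_counts(response_body):
    """Map each Group By field to a list of (value, numberOfResults) pairs."""
    return {
        result["field"]: [(v["value"], v["numberOfResults"]) for v in result["values"]]
        for result in response_body.get("groupByResults", [])
    }


payload = build_group_by_payload("mostly harmless", "@filetype", 4, "AlphaAscending")
print(json.dumps(payload, indent=2))

# Response excerpt trimmed from the example above (first two values only).
sample_response = {
    "groupByResults": [
        {
            "field": "filetype",
            "values": [
                {"value": "epub", "numberOfResults": 3},
                {"value": "mobi", "numberOfResults": 1},
            ],
        }
    ]
}
print(facet_counts(sample_response))
```

In a search interface, the (value, count) pairs returned by a helper like facet_counts are what a facet component would typically render.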
Group By parameters
This section provides reference documentation for the available Group By parameters.
advancedQueryOverride (string)
The query expression that should override the advanced query expression on which the Group By operation is being performed (see the aq query parameter).
Note: If any query override parameter (e.g., queryOverride, advancedQueryOverride, etc.) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) will be ignored.
Example: @year==2017
allowedValues (array of string)
The field values allowed in the Group By operation results. You can use trailing wildcards (*) to include ranges of values.
See also the completeFacetWithStandardValues Group By operation parameter.
If you do not explicitly specify an array of allowedValues, or if you specify an empty array, all field values are allowed.
Example:
[
"Anonymous",
"Bob Jones",
"Carrie Green",
"David Allen"
]
allowedValuesPatternType (string)
The type of pattern being used in the allowed field values.
See also the allowedValues Group By operation parameter.
If you do not explicitly specify a pattern type, the legacy pattern is used by default, which only supports trailing wildcards.
Example: regex
completeFacetWithStandardValues (boolean)
Whether to complete the Group By operation result set with standard values.
If you set this parameter to true and the number of specified allowedValues is lower than the maximumNumberOfValues, the Group By operation also attempts to return standard values until the result set contains the maximumNumberOfValues.
Default: false
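As an illustration of how allowedValues, trailing wildcards, and completeFacetWithStandardValues can fit together in a Group By operation, here is a minimal sketch. matches_allowed_value is a hypothetical local helper that only approximates legacy trailing-wildcard matching (it assumes a case-sensitive comparison, which may not reflect the index); the actual matching is performed by the index.

```python
def matches_allowed_value(value, allowed_values):
    """Approximate legacy trailing-wildcard matching (local illustration only)."""
    if not allowed_values:
        # An empty or missing array means that all field values are allowed.
        return True
    for pattern in allowed_values:
        if pattern.endswith("*"):
            # Trailing wildcard: compare against the prefix before the "*".
            if value.startswith(pattern[:-1]):
                return True
        elif value == pattern:
            return True
    return False


group_by = {
    "field": "@author",
    "allowedValues": ["Anonymous", "Bob*"],
    "maximumNumberOfValues": 5,
    # Fewer allowed values than maximumNumberOfValues: ask the operation
    # to try to fill the remaining slots with standard values.
    "completeFacetWithStandardValues": True,
}

print(matches_allowed_value("Bob Jones", group_by["allowedValues"]))
print(matches_allowed_value("Carrie Green", group_by["allowedValues"]))
```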
computedFields (array of RestComputedField)
The computed fields to evaluate for each Group By value.
A computed field stores the result of an aggregate operation performed on the values of a specific numerical field for all the query result items that share the same Group By field value.
constantQueryOverride (string)
The query expression that should override the constant query expression on which the Group By operation is being performed (see the cq query parameter).
Note: If any query override parameter (e.g., queryOverride, advancedQueryOverride, etc.) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) will be ignored.
Example: @filetype==forumpost
disjunctionQueryOverride (string)
The query expression that should override the disjunction query expression on which the Group By operation is being performed (see the dq query parameter).
Note: If any query override parameter (e.g., queryOverride, advancedQueryOverride, etc.) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) will be ignored.
Example: @date=2016-12-01..2016-12-31
field (string)
The name of the field on which to perform the Group By operation. The operation returns a Group By value for each distinct value of this field found in the query result items.
Note: You must ensure that the Facet option is enabled for this field in your index (see Add or Edit Fields).
Example: @author
filterFacetCount (boolean)
Whether to exclude folded result parents when estimating the result count for each facet value.
Default: true
generateAutomaticRanges (boolean)
Whether the index should automatically create range values.
Tip: If you set this parameter to true, you should ensure that the Use cache for numeric queries option is enabled for the Group By field in your index in order to speed up automatic range evaluation (see Add or Edit Fields).
Notes:
- Setting generateAutomaticRanges to true only makes sense when the Group By field references a numeric or date field in the index.
- The index cannot automatically generate range values of a field generated by a query function. In such cases, you must use the rangeValues Group By parameter instead.
- Automatic range generation will fail if the referenced field is dynamically generated by a query function.
Default: false
injectionDepth (integer [int32])
The maximum number of query result items to scan for Group By values.
Note: Specifying a high injectionDepth value can negatively impact query performance.
Default: 1000
maximumNumberOfValues (integer [int32])
The maximum number of values the Group By operation should return.
Default: 10
queryOverride (string)
The query expression that should override the basic query expression on which the Group By operation is being performed (see the q query parameter).
Note: If any query override parameter (e.g., queryOverride, advancedQueryOverride, etc.) is set in a Group By operation, all original parts of the query expression (i.e., q, aq, cq, and dq) will be ignored.
Example: Coveo Cloud V2 Platform
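The note on query override parameters can be modeled as follows. This is only a sketch of the rule as stated in the note, run locally on plain dictionaries; effective_expressions and OVERRIDE_TO_ORIGINAL are hypothetical names, and the actual evaluation happens in the index.

```python
# Each override parameter and the query expression part it replaces.
OVERRIDE_TO_ORIGINAL = {
    "queryOverride": "q",
    "advancedQueryOverride": "aq",
    "constantQueryOverride": "cq",
    "disjunctionQueryOverride": "dq",
}


def effective_expressions(query, group_by):
    """Return the q/aq/cq/dq expressions a Group By operation would run on."""
    overrides = {
        original: group_by[name]
        for name, original in OVERRIDE_TO_ORIGINAL.items()
        if name in group_by
    }
    if overrides:
        # Any override discards all original parts, not only the matching one.
        return overrides
    return {part: query[part] for part in ("q", "aq", "cq", "dq") if part in query}


query = {"q": "mostly harmless", "aq": "@year==2017", "cq": "@source==Books"}
group_by = {"field": "@filetype", "queryOverride": "Coveo Cloud V2 Platform"}

# Only the override survives; the original aq and cq are ignored as well.
print(effective_expressions(query, group_by))
```

Under this reading, a Group By operation that should keep part of the original query while overriding another part needs to restate the part it wants to keep in the corresponding override parameter.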
rangeValues (array of RestRangeValue)
The ranges for which to generate Group By values.
Notes:
- Specifying rangeValues only makes sense when the Group By field references a numeric or date field in the index.
- You can set the generateAutomaticRanges Group By parameter to true rather than explicitly specifying rangeValues (unless the Group By field was generated by a query function).
sortCriteria (string)
The criterion to use when sorting the Group By operation results.
Allowed values:
- score: sort using the score value, which is computed from the number of occurrences of a field value as well as from the position where query result items having this field value appear in the ranked query result set. When using this sort criterion, a field value with 100 occurrences might appear after one with only 10 occurrences if the occurrences of the latter field value tend to appear higher in the ranked query result set.
- occurrences: sort by number of occurrences, with field values having the highest number of occurrences appearing first.
- alphaascending/alphadescending: sort alphabetically on the field values.
- computedfieldascending/computedfielddescending: sort on the value of the first computed field for each Group By operation result (see the computedFields Group By parameter).
- chisquare: sort based on the relative frequency of field values in the query result set compared to their frequency in the entire index. This means that a field value that does not appear often in the index but does appear often in the query result set will tend to appear higher.
- nosort: do not sort the results of the Group By operation. The field values will appear in a random order.
Default: score
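To make a few of these criteria concrete, the sketch below re-orders a small list of Group By values locally. Sorting is normally done by the index, so this is only an illustration: sort_values is a hypothetical helper, lowercasing the criterion name is a convenience of this sketch, and Python string ordering only approximates alphabetical sorting. The score, chisquare, and computed field criteria are not emulated.

```python
values = [
    {"value": "pdf", "numberOfResults": 2},
    {"value": "epub", "numberOfResults": 3},
    {"value": "txt", "numberOfResults": 1},
]


def sort_values(values, sort_criteria):
    """Re-order Group By values locally for some of the criteria above."""
    criterion = sort_criteria.lower()
    if criterion == "occurrences":
        # Highest number of occurrences first.
        return sorted(values, key=lambda v: v["numberOfResults"], reverse=True)
    if criterion == "alphaascending":
        return sorted(values, key=lambda v: v["value"])
    if criterion == "alphadescending":
        return sorted(values, key=lambda v: v["value"], reverse=True)
    if criterion == "nosort":
        # Leave the values in whatever order they arrived.
        return list(values)
    raise ValueError(f"criterion not covered by this sketch: {sort_criteria}")


print([v["value"] for v in sort_values(values, "occurrences")])
print([v["value"] for v in sort_values(values, "alphaascending")])
```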
Group By Operation submodels
This section provides (or links to) reference documentation for the RestComputedField and RestRangeValue Group By Operation submodels.
RestComputedField
Describes a single computed field operation to perform along with a Group By operation (see the computedFields Group By parameter).
See Computed Fields.
RestRangeValue
Describes a single range value for a Group By operation (see the rangeValues Group By parameter).
end (integer [int32])
The value to end the range at. Must be greater (or later) than the start value.
Note: The timezone of date ranges is determined by the timezone parameter of the search request.
Examples:
100
2019/12/31@23:59:59
endInclusive (boolean)
Whether to include the end value in the range.
Default: false
label (string)
The label to associate with the range.
Note: Not currently leveraged.
Examples:
0 - 100
In 2019
start (integer [int32])
The value to start the range at.
Note: The timezone of date ranges is determined by the timezone parameter of the search request.
Examples:
0
2019/01/01@00:00:00
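The RestRangeValue members above can be assembled as in the following sketch. make_range is a hypothetical helper, and the "100 - 1000" label is invented for illustration. The check that end follows start relies on plain Python comparison, which works for numbers and happens to work for date strings in the fixed-width format shown in the examples; the index performs its own validation.

```python
def make_range(start, end, end_inclusive=False, label=None):
    """Build one RestRangeValue-shaped dict, checking that end follows start."""
    if not end > start:
        raise ValueError("end must be greater (or later) than start")
    range_value = {"start": start, "end": end, "endInclusive": end_inclusive}
    if label is not None:
        range_value["label"] = label
    return range_value


# Explicit ranges on a numeric field, used instead of generateAutomaticRanges.
group_by = {
    "field": "@wordcount",
    "rangeValues": [
        make_range(0, 100, label="0 - 100"),
        make_range(100, 1000, end_inclusive=True, label="100 - 1000"),
    ],
}

# A date range using the date format from the examples above.
year_2019 = make_range("2019/01/01@00:00:00", "2019/12/31@23:59:59", label="In 2019")
print(group_by["rangeValues"][0])
print(year_2019)
```

Leaving endInclusive at its default of false on the first range and starting the second range at the same value keeps the two ranges from overlapping at 100.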
Computed fields
A computed field is an aggregate operation (average, maximum, minimum, or sum) that’s performed on a specific numeric field during a Group By operation. This aggregate operation is computed for each value retrieved by its parent Group By operation, and takes into account the values of its target numeric field for each item sharing the same Group By field value in the un-paginated query result set.
This section describes the members of the structure that defines a single computed field operation. You can specify an array of computed field operations in a Group By operation using the computedFields Group By parameter.
You can only perform computed field operations on numeric fields. Otherwise, the aggregate operation will return NaN (not a number).
You want to request Group By values based on the @author field from the result set that matches the @source==Books constant query expression (cq). You also want to limit the number of retrieved values to 3 (using maximumNumberOfValues), and compute the average of the @communityrating field for each of those values (using computedFields) in order to sort them in descending order (using sortCriteria).
POST https://platform.cloud.coveo.com/rest/search/v2 HTTP/1.1
Content-Type: application/json
Accept: application/json
Authorization: Bearer **********-****-****-****-************
Payload
{
  "cq": "@source==Books",
  "groupBy": [
    {
      "field": "@author",
      "computedFields": [
        {
          "field": "@communityrating",
          "operation": "average"
        }
      ],
      "maximumNumberOfValues": 3,
      "sortCriteria": "ComputedFieldDescending"
    }
  ]
}
200 OK response body (excerpt)
{
  ...
  "groupByResults": [
    {
      "field": "author",
      "globalComputedFieldResults": [
        7.110552764
      ],
      "values": [
        {
          "computedFieldResults": [
            9.850877193
          ],
          "lookupValue": "George Orwell",
          "numberOfResults": 12,
          "score": 0,
          "value": "George Orwell",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [
            9.321762292
          ],
          "lookupValue": "J. R. R. Tolkien",
          "numberOfResults": 30,
          "score": 0,
          "value": "J. R. R. Tolkien",
          "valueType": "Standard"
        },
        {
          "computedFieldResults": [
            9.114909274
          ],
          "lookupValue": "John Steinbeck",
          "numberOfResults": 17,
          "score": 0,
          "value": "John Steinbeck",
          "valueType": "Standard"
        }
      ]
    }
  ],
  ...
}
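A response like the one above can be read as in the following sketch. It assumes that each computedFieldResults array lists results in the same order as the computedFields array of the request, which is an assumption of this sketch rather than something stated here. computed_results_by_value is a hypothetical helper, and the labels passed to it are arbitrary names chosen by the caller.

```python
def computed_results_by_value(group_by_result, labels):
    """Pair each Group By value with its computed field results, by position."""
    rows = {}
    for value in group_by_result["values"]:
        # zip stops at the shorter sequence, so missing results are skipped.
        rows[value["value"]] = dict(zip(labels, value["computedFieldResults"]))
    return rows


# Excerpt of the response above (first two values only).
group_by_result = {
    "field": "author",
    "values": [
        {"value": "George Orwell", "computedFieldResults": [9.850877193]},
        {"value": "J. R. R. Tolkien", "computedFieldResults": [9.321762292]},
    ],
}

ratings = computed_results_by_value(group_by_result, ["average_communityrating"])
for author, results in ratings.items():
    print(author, results["average_communityrating"])
```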
Computed field parameters
This section provides reference documentation for the available computed field parameters.
field (string)
The name of the numeric field on which to perform the aggregate operation.
Tip: You should ensure that the Use cache for computed fields option is enabled for that field in your index in order to speed up evaluation (see Add or Edit Fields).
Example: @wordcount
operation (string)
The aggregate operation to perform on the field values.
Allowed values:
- sum: get the sum of all values.
- average: get the average of all values.
- minimum: get the smallest value.
- maximum: get the largest value.
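The four operations can be emulated locally to show what a computed field represents. This is a sketch on invented items, not the index's implementation: compute_field and OPERATIONS are hypothetical names, and the item values are made up for the illustration.

```python
from collections import defaultdict

# Local equivalents of the four allowed operations.
OPERATIONS = {
    "sum": sum,
    "average": lambda numbers: sum(numbers) / len(numbers),
    "minimum": min,
    "maximum": max,
}


def compute_field(items, group_field, numeric_field, operation):
    """Aggregate numeric_field per distinct group_field value (local emulation)."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item[group_field]].append(item[numeric_field])
    aggregate = OPERATIONS[operation]
    return {value: aggregate(numbers) for value, numbers in grouped.items()}


# Hypothetical items, invented for this illustration.
items = [
    {"author": "A", "communityrating": 8},
    {"author": "A", "communityrating": 10},
    {"author": "B", "communityrating": 7},
]

print(compute_field(items, "author", "communityrating", "average"))
print(compute_field(items, "author", "communityrating", "maximum"))
```

Each key of the returned dictionary plays the role of a Group By value, and each number plays the role of the corresponding entry in computedFieldResults.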