--- title: XML configuration further customization slug: l15b5020 canonical_url: https://docs.coveo.com/en/l15b5020/ collection: index-content source_format: adoc --- # XML configuration further customization To index database content, you must [provide a list of XML queries](https://docs.coveo.com/en/1885#xml-configuration) defining the content that Coveo should retrieve and how it should index it. Coveo will run these queries against your database and index the returned information in the [fields](https://docs.coveo.com/en/200/) you specified. In addition, you can optionally customize your XML configuration to enable additional functionalities that will make your source and the indexing process more efficient. ## Defining items to delete To index certain database items based on a characteristic that may vary, your XML configuration must include instructions as to which items are no longer relevant and can be deleted from the index. If using the `AccessorForItemsToDelete` accessor, ensure to [enable the refresh capability](#enabling-the-refresh-capability), as item deletion takes place during refresh operations only. **Example** You index `MESSAGE` items with a `message_id` that's less than `100`. The configuration is the following for your MS-SQL database: ```xml everyone SELECT * FROM ( SELECT ROW_NUMBER() OVER (ORDER BY mid) AS mid, -- message.mid, message.sender, message.date, message.message_id, message.subject, Cast(@LastRefresh as nvarchar(4000)) as FunkyDate, message.body, message.folder FROM message ) AS T WHERE mid <= 100 link:https://www.coveo.com/Emails/details.aspx?Id=%[mid] https://www.coveo.com Message_%[mid].txt Message_%[mid] - %[FunkyDate] %[date] %[body] %[sender] everyone ``` However, if, among the items with a `message_id` inferior to `100`, you want to index those of a certain age only, you must add an instruction for Coveo to delete the items that don't satisfy your age condition. Since the age of an item changes constantly, each of the indexed items will eventually become irrelevant and will need to be removed from your index. You want items that are less than a month old only to be searchable. With the following instruction inserted before `` in your XML configuration, every time your Database source is [refreshed](https://docs.coveo.com/en/2039/), items that no longer meet your age criterion are deleted from the index: ```xml SELECT message.mid FROM message WHERE message.date < DATEADD(month, -1, GETDATE()) AND message.mid < 100 ``` ## Enabling the refresh capability A [refresh operation](https://docs.coveo.com/en/2039/) keeps your source up to date by crawling your database and re-indexing modified items. To enable the refresh capability of your Database source, ensure that the database items have a `Date` type field indicating their latest modification date. This date must be updated whenever a record is modified. Moreover, this database field must be dedicated for use by Coveo and shouldn't be used for anything other than its original purpose. Then, in your Database source XML configuration, provide the name of this database field as a [dynamic value](https://docs.coveo.com/en/l15b4451#dynamic-values) under [`ModifiedDate`](https://docs.coveo.com/en/l15b4451#modifieddate). Thanks to this dynamic value, Coveo has the date on which each item has last been modified. Coveo then uses this date to determine whether a database item has changed since the last source update operation. If so, Coveo re-indexes the item so that the content searchable in your Coveo-powered search page matches your actual database content. To enable the refresh capability In the SQL query, the `SELECT` statement must have a `WHERE` clause with a criterion on the last modification date field. **Example** The following example should work with common database engines such as Microsoft SQL Server 2012, PostgreSQL, and MySQL. The `[PARAMETER]` field is sent by the crawler to the query to indicate when the last refresh was performed. * With a MSSQL or SQLServer database, one must select the **SQL Client** driver and, in the XML, replace `[PARAMETER]` with `@LastRefresh`. * With a database of a different type, one must select the **ODBC** driver and, in the XML, replace `[PARAMETER]` with `?`. However, regardless of your database type, the value of the `OrderByFieldName` and `IncrementalRefreshFieldName` parameters must be the name of the latest modification date field in your database. ```xml Select id, title, dateModified, content, author FROM blog WHERE dateModified>=[PARAMETER] order by dateModified OFFSET @startRow ROWS FETCH NEXT (@endRow-@startRow) ROWS ONLY; ``` ## Enable pausing/resuming update operations By default, Database sources don't offer the option to [pause and resume](https://docs.coveo.com/en/3390#pause-a-source-update) their [update operations](https://docs.coveo.com/en/2039/). To enable these actions, you must alter your source configuration. . In the SQL query, add an `ORDER BY` with the same chronological fields as in the `SELECT` statement. . In your source's [XML queries](/1885#xml-configuration), add XML attributes on the `Accessor` element: ** `OrderByFieldName` Specifies the name of the column on which the `ORDER BY` is applied. This attribute must be present to enable the Pause/Resume options. It can be, for example, the name of the latest modification date field in your database. ** `OrderByFieldType` Specifies the .NET data type of that column. This attribute isn't normally required. The source automatically tries to determine the data type by preparing the SQL query - without however executing it - and looking at the schema of the results. However, if a specific DBMS doesn't handle that process correctly, you can manually specify the data type with this attribute. The allowed types are the following: *** short (16-bit signed integer) *** ushort (16-bit unsigned integer) *** int (32-bit signed integer) *** uint (32-bit unsigned integer) *** long (64-bit signed integer) *** ulong (64-bit unsigned integer) *** float (single-precision floating point number) *** double (double-precision floating point number) *** string (String) ** `IncrementalRefreshFieldName` To support both [Refresh](#enabling-the-refresh-capability) and Pause/Resume options, the value of the `IncrementalRefreshFieldName` parameter must be the same as that of the `OrderByFieldName` parameter. See this example for an excerpt of a query for a source with Pause/Resume and Refresh options enabled: > **Note** > > You must replace `` by `@LastRefresh` in an SqlClient scenario or by `?` otherwise. ```xml SELECT Shippers.CompanyName AS ShipperName, Orders.OrderID AS ID, Orders.CustomerID, Orders.OrderDate, Orders.RequiredDate, Orders.ShippedDate, Customers.CompanyName, Employees.LastName, Employees.FirstName FROM Orders, Shippers, Customers, Employees WHERE Orders.ShipVia = Shippers.ShipperID AND Orders.CustomerID = Customers.CustomerID AND Orders.EmployeeID = Employees.EmployeeID AND Orders.OrderDate >= ORDER BY Orders.OrderDate ``` ## Complement information retrieval using subqueries The Database source acquires information about each indexed item through a query performed against a database. For each query, it's possible to associate one or more subqueries to be executed and used to complement information. **Example** You can run a main query, and for each row, run a subquery that crawls more/different information. All the results of a single row from the main query, along with everything from the subquery, are merged into a single item. The source requires a mapping configuration to execute properly. For each mapping type, it's necessary to specify an `Accessor` representing the SQL query to execute. > **Note** > > However, Coveo forbids using dynamic expressions that populate the `Uri` field with metadata extracted through a [subquery](https://docs.coveo.com/en/l15b5020#complement-information-retrieval-using-subqueries). > Coveo forbids using dynamic expressions that populate the [`Uri` field](https://docs.coveo.com/en/l15b4451#uri-required) with metadata extracted through a subquery. ### Specifying subqueries To specify subqueries, set the type of the `Accessor` to query. ```xml ``` Following the `Accessor` definition, add an `AccessorSubQueries` node with all subqueries: > **Note** > > The master key (value following `SELECT`) in the `AccessorSubQuery` node must match exactly the one returned by the server. > The key you include must also have the same casing. > The following error is thrown when the key couldn't be found: > > `Unable to index document : There's a formatting error in a sub query. > Cannot find master key %[key].` ```xml SELECT firstName, lastName FROM employeelist WHERE Email_id = %[sender] ``` ### Subquery attributes * **name** Subquery name referred to in the `Fields` section of the mapping. * **separator** Separator used when concatenating many rows. * **allowDuplicates** (optional) This attribute is mainly used when your subquery returns many rows. If set to `false`, duplicates in the results are ignored in the concatenation of the results. If set to `true`, duplicates are present. * **singleQuoteEscapeSequence** (optional) When the value of the returned field contains single quotes, these single quotes must be escaped. By default when you omit this attribute, the source escapes the single quotes by doubling them (for example, `''`). Usually, this escaping mechanism should work. However, some database types require a different escaping sequence for single quotes. In such cases, use this attribute to specify the single quote escape sequence. **Example** For the MySQL database, the single quote escaping sequence is `\'`. In this case, in the `AccessorSubQuery` tag, include the `singleQuoteEscapeSequence` attribute as follows: ```xml %[sender] %[FirstNameLastName.firstName] %[FirstNameLastName.lastName] ``` The metadata of a subquery can be specified for a field or a custom field. The way to specify is similar to the way it's done when referring a field coming from the main accessor: `%[subQueryName.fieldName]`. In the above example, custom field `firstName` is referring subquery named `FirstNameLastName` and uses the `firstName` metadata. See the following for a complete mapping configuration, used in Coveo unit tests: ```xml everyone SELECT message.mid, message.sender, message.date, message.message_id, message.subject, message.body, message.folder FROM message WHERE DATE like '2001-04-07%' SELECT firstName, lastName FROM employeelist WHERE Email_id = %[sender] link:https://www.coveo.com/Emails/details.aspx?Id=%[mid] https://www.coveo.com Message_%[mid].txt Message_%[mid] %[date] %[body] %[sender] %[FirstNameLastName.firstName] %[FirstNameLastName.lastName] everyone ``` "> ``` ### Subquery master key In a subquery, a master key used in the `WHERE` clause must respect the format `%[fieldName]`, which corresponds to metadata acquired from the main accessor. The master key is used to make the join between the main query and subqueries. ### Specifying subquery metadata for fields The `` section of the mapping configuration is used to specify the metadata to use for indexing. See the following example for a typical `` section of a mapping configuration: <<>> The metadata of a subquery can be specified for a field or a custom field. The way to specify is similar to the way it's done when referring a field coming from the main accessor: `%[subQueryName.fieldName]`. In the above example, custom field `firstName` is referring subquery named `FirstNameLastName` and uses the `firstName` metadata. See the following for a complete mapping configuration, used in Coveo unit tests: <<>>