Supported File Formats

The following table lists the item types and formats that Coveo can index. Items in unsupported formats can be indexed by reference or by using indexing pipeline extensions.

Document type File extension Details
Adobe Acrobat .pdf Version 1.0 to 1.7 inclusively
EPUB files .epub Must be indexed by content (see Handling File Formats in Source JSON)
Google Docs, Google Slides, and Google Sheets files .doc, .ppt, .xls Text documents, slides, and spreadsheets created with the Google Docs Editors suite appear as Microsoft files in the search results. Clicking an item link opens this item in Google's web application.
Image files (text extraction) .bmp, .jpeg, .max, .pcx/.dcx, .pdf, .png, .tiff, .tiff-fx Requires enabling optical character recognition (OCR)
Image files (metadata extraction) .bmp, .emf, .exif, .gif, .icon, .jpeg, .png, .tiff, .wmf Creation of thumbnail
Microsoft Excel .xlam, .xlb, .xlm, .xls, .xlsm, .xlsx, .xltm, .xltx Version 2013, 2010, 2007, 2003, 2000, 97, 95, 5.0
Indexes Excel 2013, 2010 attachments.
Microsoft Outlook files .msg, .oft, .pst Messages, archives, and templates
Microsoft PowerPoint .pot, .potm, .potx, .ppam, .pps, .ppsm, .ppsx, .ppt, .pptm, .pptx Version 2013, 2010, 2007, 2003, XP, 2000, 97
Indexes PowerPoint 2013, 2010 attachments.
Microsoft Word .doc, .docm, .docx, .dot, .dotm, .dotx Version 2013, 2010, 2007, 2003, XP, 2000, 98 (for Mac), 97, 95, 6.0, 6.0 (for Mac)
Indexes Word 2013, 2010 attachments.
MIME documents .email, .eml, .ews, .mime MIME converter available with CES 7.0.5935+ (September 2013 monthly release)
Rich Text Format .rtf
Text documents .ascx, .bat, .cmd, .config, .csv, .dic, .exc, .inf, .ini, .js, .jsl, .log, .nfo, .scp, .sdl, .sln, .txt, .vbdproj, .vbs, .vdp, .vdproj, .vjp, .vjsproj, .vjsprojdata, .wsdl, .wsf, .wtx, .xsd ANSI, ASCII, Unicode
Coveo can index any file format that contains only text, even if its extension isn't listed above.
Web pages .asp, .aspx, .cgi, .col, .dochtml, .dothtml, .fphtml, .hta, .htm, .html, .jsp, .php, .pothtml, .ppthtml, .shtm, .shtml, .xlshtml When parsing .html files, the Coveo converter ignores the sections that contain the following tags:
  • select
  • option
  • script
  • strike
  • style
  • ?xml
  • applet
  • bdo
  • del
  • object
  • s
  • noframes
WordPerfect .wp, .wpd, .wpf Version 5 to 10 inclusively
XML documents .xml
XML style sheets .xsl, .xslt
ZIP archives .zip PKZip (except PKZip 9.0 64-bit)
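
The Web pages row states that sections containing certain tags are ignored when .html files are parsed. The following Python sketch approximates that behavior with the standard library so you can preview which text of a page would likely remain. It is not the Coveo converter: the names IgnoredTagStripper and indexable_text are hypothetical, and the sketch assumes that "ignored" means the text enclosed by a listed tag is dropped.

```python
from html.parser import HTMLParser

# Tag names copied from the Web pages row. "?xml" is a processing
# instruction, which HTMLParser passes to handle_pi() rather than
# handle_data(), so it needs no entry here.
IGNORED_TAGS = {
    "select", "option", "script", "strike", "style",
    "applet", "bdo", "del", "object", "s", "noframes",
}


class IgnoredTagStripper(HTMLParser):
    """Hypothetical helper: keeps text that is outside every ignored tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._open = []    # stack of ignored tags that are currently open
        self._chunks = []  # text fragments that would remain

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Pop through the matching tag so that unclosed children
            # (for example <option> without </option>) do not leak.
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        if not self._open:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left behind by removed sections.
        return " ".join(" ".join(self._chunks).split())


def indexable_text(html):
    """Return the text left after dropping the ignored sections (assumption)."""
    parser = IgnoredTagStripper()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, calling indexable_text on a page whose body holds a paragraph followed by a del element returns only the paragraph text. Actual converter output may differ, for instance in how whitespace or malformed markup is handled.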

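When preparing content for indexing, it can help to check file names against the table before pushing them. The following Python sketch is one way to do that. The SUPPORTED mapping copies only a few rows of the table, and the function name document_type is hypothetical; neither is part of any Coveo API.

```python
from pathlib import PurePath

# A partial copy of the table above; add the remaining rows as needed.
# Some extensions appear in more than one row (for example .pdf and .doc),
# so this sketch simply returns the first matching row.
SUPPORTED = {
    "Adobe Acrobat": {".pdf"},
    "Microsoft Word": {".doc", ".docm", ".docx", ".dot", ".dotm", ".dotx"},
    "WordPerfect": {".wp", ".wpd", ".wpf"},
    "XML documents": {".xml"},
    "XML style sheets": {".xsl", ".xslt"},
    "ZIP archives": {".zip"},
}


def document_type(filename):
    """Return the table row matching the file extension, or None."""
    ext = PurePath(filename).suffix.lower()  # comparison is case-insensitive
    for doc_type, extensions in SUPPORTED.items():
        if ext in extensions:
            return doc_type
    return None
```

A result of None only means the extension is missing from this partial mapping. As noted in the table, text-only files can be indexed even when their extension isn't listed.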