Python Modules Available to Indexing Pipeline Extensions
A large number of modules are available for import in indexing pipeline extension (IPE) scripts. The following Python 3 modules may be especially useful:
Note
This list is maintained manually, and might therefore be slightly out of date (for example, new modules may have been added, or existing modules may have been updated to a newer version since the last edit). Last updated: July 2023. You should validate the complete list of modules and their versions programmatically (see Listing Available Modules Programmatically).
- Modules from the Python 3 Standard Library.
- beautifulsoup4: A Python library for pulling data out of HTML and XML files.
- boto3: Amazon Web Services (AWS) Software Development Kit (SDK) for Python.
- cryptography: A cryptographic library.
- lxml: Feature-rich and easy-to-use library for processing XML and HTML.
- markdown: A text-to-HTML conversion library.
- python-dateutil: Provides powerful extensions to the datetime module.
- requests: HTTP library for Python.
- urllib3: An HTTP client.
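Because the list above is maintained manually, it can help to confirm that a module is actually importable before relying on it. The following is a minimal sketch using only the standard library. The helper name is_importable and the candidates list are illustrative and not part of the indexing pipeline extension API. Note that import names can differ from package names (beautifulsoup4 is imported as bs4, python-dateutil as dateutil). The sketch uses print so that it also runs outside of an extension; inside an extension you would presumably pass the result to log instead.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


# Import names (not package names) of the modules listed above.
candidates = [
    "bs4",           # beautifulsoup4
    "boto3",
    "cryptography",
    "lxml",
    "markdown",
    "dateutil",      # python-dateutil
    "requests",
    "urllib3",
]

# Map each import name to whether it can be imported in this environment.
availability = {name: is_importable(name) for name in candidates}

# Assumption: inside an extension, replace print with log(str(availability)).
print(availability)
```

The check only locates each module and does not import it, so it should stay cheap even for large packages.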
Listing Available Modules Programmatically
To log the current, complete list of Python 3 modules and versions available for import in indexing pipeline extension scripts, run a Python 3 indexing pipeline extension that has the following script:
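import pkg_resources
# working_set holds every distribution installed in the current environment.
modules = pkg_resources.working_set
# Build one "name, version x.y.z" entry per distribution, sorted alphabetically.
modules_list = sorted(["%s, version %s" % (i.project_name, i.version) for i in modules])
# Write the full list to the extension log.
log(str(modules_list))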
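As a possible alternative, the same list can be built with importlib.metadata, which is part of the standard library in Python 3.8 and later. This is a sketch under the assumption that the extension runtime uses Python 3.8 or newer, which this page does not confirm. The function name list_installed_distributions is illustrative. The sketch uses print so that it also runs outside of an extension; inside an extension you would presumably call log instead.

```python
from importlib import metadata


def list_installed_distributions():
    """Return sorted "name, version x.y.z" entries for installed distributions."""
    entries = set()
    for dist in metadata.distributions():
        # Use .get so that a distribution with incomplete metadata is skipped
        # instead of raising an error.
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name:
            entries.add("%s, version %s" % (name, version))
    # A set removes duplicates that can appear when the same distribution
    # is visible through more than one path entry.
    return sorted(entries)


modules_list = list_installed_distributions()

# Assumption: inside an extension, replace print with log(str(modules_list)).
print(modules_list)
```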
Getting the Python Version Programmatically
To log the Python version currently used for indexing pipeline extensions, run or test an indexing pipeline extension that has the following script:
import sys
# version_info is a named tuple: (major, minor, micro, releaselevel, serial).
python_version = sys.version_info
log(str(python_version))
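If a compact version string is easier to read in the logs, or if a script needs to branch on the interpreter version, sys.version_info can be formatted and compared as a tuple. The following is a small sketch. The variable names and the (3, 8) threshold are illustrative choices, not values taken from this page. It uses print so that it also runs outside of an extension.

```python
import sys

# Format the first three fields (major, minor, micro) as "X.Y.Z".
version_text = "%d.%d.%d" % sys.version_info[:3]

# version_info compares like a tuple, so a minimum version check is a
# simple comparison. The (3, 8) threshold here is only an example.
is_at_least_3_8 = sys.version_info >= (3, 8)

# Assumption: inside an extension, replace print with log.
print("Python %s (3.8 or newer: %s)" % (version_text, is_at_least_3_8))
```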
Deprecation of time.sleep
We recommend against using time.sleep in extensions to delay outgoing requests. Doing so introduces idle delays and, because indexing pipeline extensions are a multi-tenant service, it leads to processing time issues. The use of time.sleep in extensions will be completely deprecated in the future.
Suggesting Support for New Modules
If you would like to use a Python module that isn’t currently supported by indexing pipeline extensions, you can suggest its addition by posting an idea on Coveo Connect (Ideas tab).