---
title: Index an inSided Community
slug: mc2a0477
canonical_url: https://docs.coveo.com/en/mc2a0477/
collection: index-content
source_format: adoc
---

# Index an inSided Community

You can [index](https://docs.coveo.com/en/204/) an inSided community using a [Web source](https://docs.coveo.com/en/malf0160/) and some Python [indexing pipeline extensions (IPEs)](https://docs.coveo.com/en/206/) (see [Manage Extensions](https://docs.coveo.com/en/1645/)).

As in a community forum, each post in an inSided community is considered a topic, and each topic is classified in a category. If it's a question, the topic has either the status `Question` or `Solved`. A topic can also be an announcement from a community manager.

Once you have made your inSided community searchable, you can implement result folding so that, in your search page, the answers to a topic appear under the topic search result (see [About Result Folding](https://docs.coveo.com/en/1884/)).

. [Add a Web source](https://docs.coveo.com/en/malf0160#add-a-web-source).

.. Enter a **Domain** to use as the indexing process starting point.

.. Use [exclusion rules](https://docs.coveo.com/en/malf0160#exclusions-and-inclusions) to exclude web page duplicates, unwanted categories, and/or the community member list.

**Example**

* `https://community.example.com/news-and-announcements-*`
* `https://community.example.com/off-topic-*`
* `https://community.example.com/search?*`
* `https://community.example.com/members/*`

.. Similarly, [add query parameters to ignore](https://docs.coveo.com/en/malf0160#query-parameters-to-ignore) to exclude other duplicates.

**Example**

Enter `sort` to ignore URLs representing alternative ways to sort the community content.

.. In the [Web scraping](https://docs.coveo.com/en/malf0160#web-scraping-subtab) subtab, click **Edit with JSON**, and then enter a custom JSON configuration to ignore unwanted web page parts and to index the desired topic metadata.
If you intend to implement [result folding](https://docs.coveo.com/en/1884/), define topic comments as subitems.

**Example**

```json
[
  {
    "name": "myconfig",
    "for": {
      "urls": [".*"]
    },
    "exclude": [
      { "type": "CSS", "path": ".ssi-header" },
      { "type": "CSS", "path": ".qa-main-navigation" },
      { "type": "CSS", "path": ".breadcrumb-container" },
      { "type": "CSS", "path": ".qa-brand-hero" },
      { "type": "CSS", "path": ".Template-brand-stats" },
      { "type": "CSS", "path": ".Template-brand-featured" },
      { "type": "CSS", "path": ".Sidebar" },
      { "type": "CSS", "path": ".Template-footer" },
      { "type": "CSS", "path": ".Template-brand-footer" }
    ],
    "metadata": {
      "status": {
        "type": "CSS",
        "path": ".qa-topic-header > .qa-thread-status::text"
      },
      "sticky": {
        "type": "CSS",
        "path": ".qa-topic-header > .qa-topic-sticky",
        "isBoolean": true
      },
      "category": {
        "type": "CSS",
        "path": ".qa-topic-header > .qa-topic-meta .qa-link-to-forum::text"
      },
      "questiondate": {
        "type": "CSS",
        "path": ".qa-topic-header > .qa-topic-meta time::attr(datetime)"
      },
      "replies": {
        "type": "CSS",
        "path": ".qa-topic-header > .qa-topic-meta .qa-link-to-replies::text"
      },
      "question": {
        "type": "CSS",
        "path": ".qa-topic-header",
        "isBoolean": true
      }
    },
    "subItems": {
      "reply": {
        "type": "CSS",
        "path": "#comments .qa-topic-post-box"
      }
    }
  },
  {
    "for": {
      "types": ["reply"]
    },
    "metadata": {
      "bestanswer": {
        "type": "CSS",
        "path": ".post--bestanswer",
        "isBoolean": true
      },
      "content": {
        "type": "CSS",
        "path": ".qa-topic-post-content::text"
      }
    }
  }
]
```

. [Map the fields](https://docs.coveo.com/en/1640/) you configured in the **Web scraping** section.

. Add fields to use in [indexing pipeline extensions (IPEs)](https://docs.coveo.com/en/206/) (see [Manage fields](https://docs.coveo.com/en/1833/)).

. 
Add indexing pipeline extensions (see [Manage Extensions](https://docs.coveo.com/en/1645/)) to:

** Add CSS for subitem Quick view in result folding, as by default there's no Quick view for subitems (see [About Result Folding](https://docs.coveo.com/en/1884/)).

**Example**

```python
try:
    # Only process subitems (topic replies).
    if (document.uri.find("SubItem:") != -1):
        # Read the body HTML, trimming CR/LF/tab characters and dropping empty lines.
        extracted_html = [x.strip('\r\n\t') for x in document.get_data_stream('body_html').readlines() if x.strip('\r\n\t')]
        new_html = ""
        for line in extracted_html:
            new_html += line
        # Write the cleaned HTML back as the item's body_html data stream.
        html = document.DataStream('body_html')
        html.write(new_html)
        document.add_data_stream(html)
except Exception as e:
    log(str(e))
```

** Populate the fields needed to fold answers under topics.

**Example**

```python
import re

try:
    clickableuri = document.get_meta_data_value('clickableuri')[0]
    # Use the last URL path segment, reduced to letters and digits
    # and capped at 49 characters, as the shared folding value.
    common_field = clickableuri.rsplit('/', 1)[-1]
    common_field = re.sub('[^0-9a-zA-Z]+', '', common_field)[:49]
    if (document.uri.find("SubItem:") == -1):
        # Topic (parent item)
        document.add_meta_data({'foldfoldingfield': common_field})
        document.add_meta_data({'foldparentfield': common_field})
    else:
        # Reply (child item)
        document.add_meta_data({'foldfoldingfield': common_field})
        document.add_meta_data({'foldchildfield': common_field})
except Exception as e:
    log(str(e))
```

** Exclude `.html` pages causing duplicates.

**Example**

```python
import re

try:
    filename = document.get_meta_data_value("filename")[0]
    # Reject items whose file name matches index*.html.
    if (re.search(r"index.*\.html", filename) is not None):
        document.reject()
except Exception as e:
    log(str(e))
```

** Process information collected on a web page.

**Example**

To get the year in which a topic (question) was published:

```python
try:
    # Only topics (not replies) carry the questiondate metadata.
    if (document.uri.find("SubItem:") == -1):
        date = document.get_meta_data_value("questiondate")[0]
        # The first four characters of the datetime value are the year.
        document.add_meta_data({'questionyear': date[0:4]})
except Exception as e:
    log(str(e))
```
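
To check the folding logic outside the indexing pipeline, the key derivation used in the folding extension above can be sketched as plain Python functions. This is a minimal sketch, not part of the Coveo API: `folding_key` and `folding_metadata` are hypothetical helper names, the sample URL and the `?SubItem:reply1` suffix are made up for illustration, and the sketch assumes that a reply carries the same `clickableuri` as its topic (which the extension appears to rely on).

```python
import re


def folding_key(clickable_uri: str) -> str:
    """Derive the shared folding value from a clickable URI (hypothetical helper)."""
    # Take everything after the last slash, as the extension does.
    last_segment = clickable_uri.rsplit('/', 1)[-1]
    # Keep only ASCII letters and digits, then cap the value at 49 characters.
    return re.sub('[^0-9a-zA-Z]+', '', last_segment)[:49]


def folding_metadata(uri: str, clickable_uri: str) -> dict:
    """Return the metadata the extension would add for a topic or a reply."""
    key = folding_key(clickable_uri)
    if 'SubItem:' in uri:
        # Replies are child items.
        return {'foldfoldingfield': key, 'foldchildfield': key}
    # Topics are parent items.
    return {'foldfoldingfield': key, 'foldparentfield': key}


# Illustrative values only; real inSided URLs and subitem URIs may differ.
topic_url = "https://community.example.com/getting-started-12/how-do-i-reset-my-password-345"
print(folding_metadata(topic_url, topic_url))
print(folding_metadata(topic_url + "?SubItem:reply1", topic_url))
```

Because the key comes from the text after the last slash, a `clickableuri` that ends with a trailing slash would yield an empty key under this logic, so it may be worth checking how your community URLs are formed before relying on it.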