# Bots (SEO & Performance tools)

In most cases, bots that visit your website for indexing (such as Google, Bing, etc.) or performance evaluation (like Google PageSpeed Insights, GTmetrix, etc.) do **not** need to see the consent collection UI. In fact, displaying a consent notice to bots can negatively impact your SEO ranking or performance score, as bots cannot interact with consent banners and might index them as part of your site content.

The **Didomi Web SDK** includes built-in options to control how it behaves when a bot visits your website. You can configure it to **bypass consent collection entirely** and act “as if” consent had been given — for example, by loading all third-party vendors. This is recommended since bots are not human visitors and do not provide consent.

#### Default behavior: robots.txt blocks all bots

By default, Didomi serves a `robots.txt` file on **sdk.privacy-center.org** that completely prevents bots from crawling or loading any SDK files:

```
User-agent: *
Disallow: /
```

This configuration tells all compliant crawlers (for example: Googlebot, Bingbot, DuckDuckBot, YandexBot, Baiduspider, Twitterbot, LinkedInBot, Slackbot, FacebookExternalHit, AhrefsBot, SemrushBot, MJ12bot, DotBot, CensysInspect, PetalBot, Archive.org\_bot, CommonCrawlBot) **not to crawl or load any resources from the SDK domain**.

As a result, **the Didomi SDK is not loaded at all for bots that respect this file**.

The configuration described below only applies to **non-compliant or synthetic clients**, such as **headless browsers (Puppeteer, Playwright)** or **monitoring tools** that do not follow `robots.txt`.

#### Allowing bots to load the SDK under a custom domain

If you want the configuration below to apply to **all bots**, including compliant ones, you can set up a **custom domain** for serving the Didomi SDK and configure that domain to allow bots in its own `robots.txt`.\
This can be done through the **Didomi Console** or via our **API**.
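For example (this is a generic sketch, not a Didomi-specific file), the `robots.txt` served at the root of your custom domain could allow all crawlers:

```
User-agent: *
Disallow:
```

An empty `Disallow` directive permits crawling of the entire domain, so compliant bots will load the SDK files and the bot configuration described below will apply to them as well.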

## Bypass consent collection for bots

To indicate that consent should not be collected for bots, set the `user.bots.consentRequired` property to `false` in your SDK configuration:

{% tabs %}
{% tab title="JavaScript" %}

```javascript
<script type="text/javascript">
window.didomiConfig = {
  user: {
    bots: {
      /**
       * Indicate whether consent is required for bots.
       * Defaults to false when configuring the consent notice from
       * the Didomi console. Defaults to true otherwise.
       */
      consentRequired: false,
      
      /**
       * Predefined types of bots to identify
       * Defaults to all types supported by the SDK
       * (Optional)
       */
      types: ['crawlers', 'performance'],
    }
  }
};
</script>
```

{% endtab %}

{% tab title="Custom JSON" %}

```json
{
  "user": {
    "bots": {
      "consentRequired": false,
      "types": ["crawlers", "performance"]
    }
  }
}
```

{% endtab %}
{% endtabs %}

The `user.bots.types` and `user.bots.extraUserAgents` properties give you finer control over which user agents are identified as bots. The Didomi SDK automatically identifies the most common search engine bots and performance tools.

{% hint style="warning" %}
When configuring the consent notice from the Didomi Console, consent is not collected for bots by default (`consentRequired` is set to `false`). This is usually the expected behavior, so the code above is unnecessary in that case.
{% endhint %}

#### Example using extraUserAgents

{% hint style="info" %}
The SDK internally uses the `RegExp` object constructor to build the regular expressions from the strings in the `extraUserAgents` array. Normal string escape rules apply, i.e., special characters must be preceded with the `\` character ([reference](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#flags_in_constructor)).
{% endhint %}
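As an illustration of these escape rules (a standalone sketch, not verbatim SDK code), each `extraUserAgents` entry is passed to the `RegExp` constructor, so characters such as `/`, `(`, `)` and `+` must be double-escaped inside the string:

```javascript
// One of the extraUserAgents strings from the example below,
// exactly as it would appear in the configuration:
const pattern = "Mozilla\\/5.0 \\(compatible; SiteAnalyzerBot\\/5.0; \\+https:\\/\\/www.site-analyzer.com\\/\\)";

// The SDK builds a regular expression from the string:
const regex = new RegExp(pattern);

// A visitor with this User Agent would be identified as a bot:
const userAgent = "Mozilla/5.0 (compatible; SiteAnalyzerBot/5.0; +https://www.site-analyzer.com/)";
console.log(regex.test(userAgent)); // → true
```

If the escaping is wrong (for example, a single `\` instead of `\\`), the `RegExp` constructor may throw or silently build a pattern that never matches, so it is worth testing your expressions this way before deploying them.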

The following is a valid example using the `extraUserAgents` property:

{% tabs %}
{% tab title="JavaScript" %}

```javascript
<script type="text/javascript">
window.didomiConfig = {
  user: {
    bots: {
      /**
       * List of additional regular expressions to match
       * against the User Agent of the visitor.
       * When one of the regular expressions matches, the user
       * is considered to be a bot.
       * This allows you to identify extra bots that the SDK does not
       * know about.
       * Regular expressions must be specified as strings and correctly
       * escaped.
       * (Optional)
       */
      extraUserAgents: [
        "Mozilla\\/5.0 \\(Windows NT 10.0; Win64; x64; rv:88.0\\) Gecko\\/20100101 Firefox\\/88.0 \\(compatible; MonetoringBot\\/2.1\\)",
        "Mozilla\\/5.0 \\(compatible; SiteAnalyzerBot\\/5.0; \\+https:\\/\\/www.site-analyzer.com\\/\\)"
      ],
    }
  }
};
</script>
```

{% endtab %}

{% tab title="Custom JSON" %}

```json
{
  "user": {
    "bots": {
      "extraUserAgents": [
        "Mozilla\\/5.0 \\(Windows NT 10.0; Win64; x64; rv:88.0\\) Gecko\\/20100101 Firefox\\/88.0 \\(compatible; MonetoringBot\\/2.1\\)",
        "Mozilla\\/5.0 \\(compatible; SiteAnalyzerBot\\/5.0; \\+https:\\/\\/www.site-analyzer.com\\/\\)"
      ]
    }
  }
}
```

{% endtab %}
{% endtabs %}

## Bot categories

By default, when you configure the SDK not to collect consent from bots, all bots are affected.\
If you need finer control over which categories of bots require consent, the Didomi SDK groups bots into the following categories and lets you enable only some of them:

| ID            | Category          | Description                                                                                  |
| ------------- | ----------------- | -------------------------------------------------------------------------------------------- |
| `crawlers`    | Crawlers          | Bots that index your websites for Search Engines (Google, Bing, etc.)                        |
| `performance` | Performance Tools | Bots that visit your websites for performance reports (Google PageSpeed Insights, GTmetrix, etc.) |
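For instance, to bypass consent collection for search-engine crawlers only while still showing the notice to performance tools, you would list just the `crawlers` category in `user.bots.types`. The following is a minimal sketch (in a real page, this object is assigned to `window.didomiConfig` before the SDK loads):

```javascript
// Bypass consent collection for crawlers only.
// Performance tools ('performance') are not listed, so they
// go through normal consent collection.
const didomiConfig = {
  user: {
    bots: {
      consentRequired: false,
      types: ['crawlers'],
    },
  },
};

console.log(didomiConfig.user.bots.types); // → [ 'crawlers' ]
```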
