LyonlBot Crawler Policy

Lyonl Search

LyonlBot helps Lyonl Search discover the open web.

LyonlBot helps Lyonl Search discover and refresh publicly available web documents so people can find useful results without ads or hidden agendas.

LyonlBot is the crawler used by Lyonl Search to build and maintain our open web search index. This policy explains what LyonlBot does, how it identifies itself, how website owners can control crawling, and how to report crawler issues.

No Ads. No Agenda. Just Results.

What LyonlBot Does

LyonlBot visits public URLs to help build and maintain the Lyonl Search index. It may discover documents through links, sitemaps, redirects, and previously known URLs. When possible, LyonlBot uses signals such as sitemaps, redirect targets, robots directives, and conditional fetch headers to avoid unnecessary work and keep results fresh.

Discovers public web documents through links, sitemaps, redirects, and previously known URLs.
Refreshes documents when they appear to have changed or when the control plane schedules a refresh.
Uses sitemaps as discovery hints, not as a guarantee of inclusion.
Uses conditional fetches with ETag and Last-Modified validators where servers provide them.
Renders every HTML document with native Chromium JavaScript rendering, including single-page applications (SPAs) and hydration-heavy documents.
Does not intentionally bypass authentication, paywalls, CAPTCHAs, or access controls.
Does not submit forms, register accounts, or perform transactions.
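
The conditional-fetch behavior in the list above can be illustrated with a minimal Python sketch. It is not LyonlBot's implementation; the function names and the idea of storing validators per URL are assumptions made for illustration. Only the `If-None-Match` and `If-Modified-Since` request headers and the 304 status code come from standard HTTP.

```python
from typing import Optional


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> dict:
    """Build revalidation headers from validators saved on a previous fetch.

    Hypothetical helper: assumes the crawler stored the ETag and
    Last-Modified values the server sent last time.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def needs_reprocessing(status_code: int) -> bool:
    """A 304 Not Modified response means the stored copy is still current."""
    return status_code != 304
```

A server that answers these requests with 304 and no body saves bandwidth on both sides, which is why exposing accurate validators tends to reduce crawl load.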

Crawler Identity And User Agents

The user-agent header helps site owners recognize LyonlBot in server logs. Because user-agent strings can be spoofed by unrelated crawlers, do not rely on the user-agent alone for high-confidence verification.

Crawler	Status	Robots.txt token	Full user-agent string	Purpose
LyonlBot	Canonical	`lyonlbot`	`Mozilla/5.0 (compatible; LyonlBot/1.0; +https://lyonl.com/bot.html)`	Main web search crawler for discovering and refreshing public web documents.
LyonlBot Image	Specialized	`lyonlbot-image`	`Mozilla/5.0 (compatible; LyonlBot-Image/1.0; +https://lyonl.com/bot.html)`	Image discovery, rendered document image extraction, and asset rendering support.
LyonlBot News	Specialized	`lyonlbot-news`	`Mozilla/5.0 (compatible; LyonlBot-News/1.0; +https://lyonl.com/bot.html)`	News discovery and refresh crawler for public news documents.

If you see a Lyonl user-agent not listed here, report the IP address, timestamp, requested URL, and full user-agent string through https://lyonl.com/bot.html. The canonical LyonlBot user-agent is `Mozilla/5.0 (compatible; LyonlBot/1.0; +https://lyonl.com/bot.html)`.
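
Site owners who want to triage log entries could compare user-agent strings against the table above. The following Python sketch shows one way to do that; the function name and the three result labels are assumptions for illustration. A match only means the string looks official. It does not prove the request came from Lyonl, because user-agent strings can be spoofed.

```python
# Official user-agent strings from the table above, mapped to robots.txt tokens.
OFFICIAL_USER_AGENTS = {
    "Mozilla/5.0 (compatible; LyonlBot/1.0; +https://lyonl.com/bot.html)": "lyonlbot",
    "Mozilla/5.0 (compatible; LyonlBot-Image/1.0; +https://lyonl.com/bot.html)": "lyonlbot-image",
    "Mozilla/5.0 (compatible; LyonlBot-News/1.0; +https://lyonl.com/bot.html)": "lyonlbot-news",
}


def classify_user_agent(user_agent: str) -> str:
    """Return the robots.txt token, "unlisted", or "other" (hypothetical labels)."""
    ua = user_agent.strip()
    if ua in OFFICIAL_USER_AGENTS:
        return OFFICIAL_USER_AGENTS[ua]
    if "lyonl" in ua.lower():
        # Mentions Lyonl but is not in the table: worth reporting.
        return "unlisted"
    return "other"
```

Entries classified as "unlisted" are the ones this policy asks you to report.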

How To Control LyonlBot With robots.txt

The fastest way to control crawling is with a robots.txt file at the root of your site. LyonlBot reads the rules for its specific user-agent token. Lyonl uses separate crawler tokens for web, image, and news crawling so site owners can control each crawler class precisely.

Allow LyonlBot

User-agent: lyonlbot
Allow: /

Block LyonlBot completely

User-agent: lyonlbot
Disallow: /

Block private or duplicate areas

User-agent: lyonlbot
Disallow: /admin/
Disallow: /private/
Disallow: /search/
Disallow: /*?session=

Slow LyonlBot down

User-agent: lyonlbot
Crawl-delay: 10

Block image and news crawlers

User-agent: lyonlbot-image
Disallow: /

User-agent: lyonlbot-news
Disallow: /
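
Before deploying rules like the ones above, it can help to test them locally. The sketch below uses Python's standard `urllib.robotparser` as a stand-in. It is not LyonlBot's own matcher, and the `example.com` URLs and the `load_rules` helper are illustrative assumptions. Python's parser matches paths by simple prefix, so it may not evaluate wildcard patterns such as `/*?session=` the way LyonlBot does. The sample rules therefore use plain path prefixes only.

```python
from urllib.robotparser import RobotFileParser

# Sample rules with one directive per line, as robots.txt requires.
RULES = """\
User-agent: lyonlbot
Disallow: /admin/
Disallow: /private/
Crawl-delay: 10
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(RULES)
print(parser.can_fetch("lyonlbot", "https://example.com/admin/users"))  # False
print(parser.can_fetch("lyonlbot", "https://example.com/blog/post"))    # True
print(parser.crawl_delay("lyonlbot"))                                   # 10
```

Treat the result as a sanity check on syntax and intent, not as a guarantee of how any particular crawler will interpret the file.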

Crawling vs. Indexing

robots.txt controls whether LyonlBot may fetch a URL. It is not the same as removing a URL from search results. If you want a document removed from search results, use a noindex directive on the document or send an X-Robots-Tag: noindex response header. LyonlBot must be able to fetch the document to see those noindex instructions.

HTML meta noindex

<meta name="robots" content="noindex">

HTTP header noindex

X-Robots-Tag: noindex
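
To confirm that a page actually serves one of the two noindex signals above, you could inspect its response headers and HTML. The following Python sketch shows one way to check, using only the standard library. The `has_noindex` function is a hypothetical helper, not part of any Lyonl tooling. It deliberately ignores crawler-specific forms of the header and meta tag, and it does not claim to mirror how LyonlBot parses these directives.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(headers: dict, html: str) -> bool:
    """Return True if the header or the HTML carries a noindex directive."""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [part.strip().lower() for part in value.split(",")]:
                return True
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

For example, `has_noindex({"X-Robots-Tag": "noindex"}, "")` returns `True`, while a page with neither signal returns `False`.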

Politeness And Server Load

LyonlBot is designed to crawl respectfully and avoid creating unnecessary load.

The crawler runtime enforces per-host and per-site politeness limits and honors the robots.txt crawl-delay directive.
Adaptive retry and backoff behavior responds to timeouts, transport errors, HTTP 429, and server errors.
Conditional requests use ETag and Last-Modified validators when available.
Fetch timeouts and maximum response-size limits bound crawler resource use.
Native Chromium rendering is mandatory for HTML documents and bounded by explicit byte, time, cache, and temp-directory limits.
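
The retry and backoff behavior in the list above can be pictured with a short Python sketch. This is an illustration under stated assumptions, not LyonlBot's actual algorithm: the base delay, the cap, the jitter strategy, and the use of a `Retry-After` value are all hypothetical choices. The only details taken from this policy are that timeouts, transport errors, HTTP 429, and server errors trigger backoff, and that crawl-delay is honored.

```python
import random
from typing import Optional


def should_retry(status_code: Optional[int]) -> bool:
    """None stands for a timeout or transport error (assumed convention)."""
    if status_code is None:
        return True
    return status_code == 429 or 500 <= status_code <= 599


def next_delay(
    attempt: int,
    crawl_delay: float = 0.0,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 300.0,
    jitter: bool = True,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Uses capped exponential backoff, then never waits less than the
    site's crawl-delay or a server-supplied Retry-After value.
    """
    backoff = min(cap, base * (2 ** attempt))
    if jitter:
        # Spread retries out so many workers do not retry in lockstep.
        backoff = random.uniform(backoff / 2, backoff)
    return max(backoff, crawl_delay, retry_after or 0.0)
```

With jitter disabled, `next_delay(3, jitter=False)` returns `8.0`, and `next_delay(0, crawl_delay=10, jitter=False)` returns `10`, because the crawl-delay floor is larger than the first backoff step.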

If LyonlBot appears to be creating too much traffic, report sample log lines through Lyonl's crawler support path. We can review the crawl pattern, update site-specific limits, and help confirm whether the traffic is really from Lyonl.

Verify LyonlBot

User-agent strings can be spoofed. We are working on a public verification method. Until crawler IP ranges are published, report the IP address, UTC timestamp, requested URL, full user-agent string, and relevant access-log lines through Lyonl's crawler support path.

A placeholder IP range document is available at /crawler/ips.json. It does not currently publish crawler IP ranges.

Need Help With LyonlBot?

For crawl issues, removals, abuse reports, or verification questions, use Lyonl's crawler support path.

Reduce crawl rate.
Confirm whether traffic is from Lyonl.
Remove or update a URL in Lyonl Search.
Report private or sensitive content.
Report crawler abuse or impersonation.
Ask about a user-agent not listed here.

Include the domain, affected URL or path, desired action, full user-agent string, source IP, timestamp with timezone, relevant robots.txt rules, and sample access log lines.
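
Most of the fields requested above can be pulled straight from an access log. The Python sketch below assumes the common Apache/Nginx "combined" log format, which your server may or may not use. The `report_fields` function and the keys it returns are hypothetical names chosen for illustration. It converts the timestamp to UTC so the timezone is unambiguous.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed format: Apache/Nginx "combined" access log.
COMBINED_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<user_agent>[^"]*)"'
)


def report_fields(log_line: str, domain: str) -> Optional[dict]:
    """Extract report details from one log line, or None if it does not parse."""
    match = COMBINED_RE.match(log_line)
    if match is None:
        return None
    local_time = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "domain": domain,
        "path": match.group("path"),
        "source_ip": match.group("ip"),
        "timestamp_utc": local_time.astimezone(timezone.utc).isoformat(),
        "user_agent": match.group("user_agent"),
        "log_line": log_line,
    }
```

The desired action and the relevant robots.txt rules still need to be added by hand, since they are not in the log.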

FAQ

What is LyonlBot?

LyonlBot is the crawler used by Lyonl Search to discover and refresh public web documents for our search index.

How do I block LyonlBot?

Add a LyonlBot rule to your robots.txt file.

User-agent: lyonlbot
Disallow: /

Does blocking LyonlBot remove my document from Lyonl Search?

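Not necessarily. robots.txt controls crawling, not indexing. To request removal from search results, use noindex where supported or report the URL through Lyonl's crawler policy support path. Because LyonlBot must fetch a document to see its noindex directive, do not block the same URL in robots.txt while relying on noindex.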

Does LyonlBot respect robots.txt?

Yes. The crawler runtime checks robots.txt before fetching documents and redirects, and it treats robots.txt itself as a control resource rather than crawlable document content.

Why did LyonlBot visit a URL I did not submit?

Crawlers can discover public URLs through links, sitemaps, redirects, and previously known URLs.

Why am I seeing more than one Lyonl user-agent?

Lyonl uses separate crawler identities for web crawling, image discovery, and news discovery. The table above lists the official identities.

Can I slow LyonlBot down?

Yes. LyonlBot supports crawl-delay in robots.txt. You can also report sample log lines through Lyonl's crawler policy support path if traffic still looks too high.

Can user-agent strings be faked?

Yes. A user-agent string alone is not proof that traffic came from Lyonl. Use the verification guidance above or report the request through Lyonl's crawler policy support path.

Does LyonlBot crawl documents behind login forms?

LyonlBot is intended to crawl publicly accessible documents and should not bypass authentication, paywalls, CAPTCHAs, or access controls.