• rosco385@lemm.ee
        11 months ago

        It’d be more naive to have a robots.txt file on your webserver and be surprised when webcrawlers don’t stay away. 😂

    • Zexks@lemmy.world
      11 months ago

      Lol. And they’ll delist you. Unless you’re really important, good luck with that.

      robots.txt

      User-agent: *
      Disallow: /some-page.html

      If you disallow a page in robots.txt, Google won’t crawl it. Even when Google finds links to the page and knows it exists, Googlebot won’t download the page or see its contents. Google will usually not index the URL, but that isn’t guaranteed: it may still include the URL in the search index, along with words from the anchor text of links pointing to it, if it judges the page to be important.
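
For anyone curious what "won't crawl the page" looks like from the crawler's side, here is a rough sketch using Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the `example.com` URLs, and the `is_allowed` helper are made up for illustration, and the rules are the ones quoted above with a `User-agent: *` line assumed. This is not how Googlebot is implemented; it only shows the check a well-behaved crawler is expected to make before fetching.

```python
from urllib.robotparser import RobotFileParser

# Rules from the comment above; the "User-agent: *" line is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some-page.html
"""


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if robots.txt permits user_agent to fetch url.

    "ExampleBot" is a hypothetical crawler name used only for this sketch.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("/some-page.html", "/other-page.html"):
        url = "https://example.com" + page
        print(url, "allowed" if is_allowed(url) else "disallowed")
```

Nothing here stops a crawler that skips the check, which is the point of the first comment: robots.txt is a request, and only crawlers that choose to call something like `can_fetch` will honor it.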