• asudox@lemmy.world
    62 upvotes · 4 months ago

    Block? Nope, robots.txt does not block the bots. It’s just a text file that says: “Hey robot X, please do not crawl my website. Thanks :>”
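
To illustrate the point above, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example URL are assumptions made up for illustration; "GPTBot" is used only as an example user agent. The check runs entirely on the crawler's side, so a bot that skips it is not stopped by anything in the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for this example.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is a voluntary check: the server never sees it, and a crawler
    that does not call it can still request the URL.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler would run this check and skip disallowed URLs.
    print(is_allowed("GPTBot", "https://example.com/page"))
    print(is_allowed("SomeOtherBot", "https://example.com/page"))
```

A crawler that honors the file would skip any URL for which `is_allowed` returns `False`; actually blocking a bot needs enforcement on the server side, which robots.txt does not provide.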

    • Cynicus Rex@lemmy.mlOP
      8 upvotes · 4 months ago

      Unfortunate indeed.

      “Can AI bots ignore my robots.txt file? Well-established companies such as Google and OpenAI typically adhere to robots.txt protocols. But some poorly designed AI bots will ignore your robots.txt.”

      • breadsmasher@lemmy.world
        15 upvotes · 4 months ago

        “typically adhere”

        But they don’t have to follow it.

        “poorly designed AI bots”

        Is it poor design if it’s explicitly a design choice to ignore robots.txt entirely in order to scrape as much data as possible? I’d argue these are AI bots designed to scrape everything regardless of robots.txt. That’s the intention. Asshole design vs. poor design.