med-mastodon.com is one of the many independent Mastodon servers you can use to participate in the fediverse.
Medical community on Mastodon

#robotstxt

Here's #Cloudflare's #robots-txt file:

# Cloudflare Managed Robots.txt to block AI related bots.

User-agent: AI2Bot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: amazon-kendra
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Applebot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: AwarioRssBot
Disallow: /

User-agent: AwarioSmartBot
Disallow: /

User-agent: bigsur.ai
Disallow: /

User-agent: Brightbot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: DigitalOceanGenAICrawler
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: FriendlyCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: iaskspider/2.0
Disallow: /

User-agent: ICC-Crawler
Disallow: /

User-agent: img2dataset
Disallow: /

User-agent: Kangaroo Bot
Disallow: /

User-agent: LinerBot
Disallow: /

User-agent: MachineLearningForPeaceBot
Disallow: /

User-agent: Meltwater
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: meta-externalfetcher
Disallow: /

User-agent: Nicecrawler
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: PanguBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: PiplBot
Disallow: /

User-agent: QualifiedBot
Disallow: /

User-agent: Scoop.it
Disallow: /

User-agent: Seekr
Disallow: /

User-agent: SemrushBot-OCOB
Disallow: /

User-agent: Sidetrade indexer bot
Disallow: /

User-agent: Timpibot
Disallow: /

User-agent: VelenPublicWebCrawler
Disallow: /

User-agent: Webzio-Extended
Disallow: /

User-agent: YouBot
Disallow: /
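
To check how a rule set like this is interpreted, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The three-entry excerpt, the `is_allowed` helper, and the `example.com` URL are illustrative assumptions, not part of Cloudflare's tooling. Note that robots.txt is advisory: it only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Short excerpt of the file above (assumption: three entries are enough
# to illustrate the behavior; the full list works the same way).
SAMPLE_ROBOTS_TXT = """\
# Cloudflare Managed Robots.txt to block AI related bots.

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: would a compliant crawler be allowed to fetch url?"""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "Google-Extended", "Googlebot"):
        verdict = is_allowed(SAMPLE_ROBOTS_TXT, agent, "https://example.com/page")
        print(f"{agent}: {'allowed' if verdict else 'blocked'}")
```

Because the file has no `User-agent: *` entry, any crawler that is not listed by name stays allowed under this parser, which is why the list has to enumerate each bot individually.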

#Google uses content for #AI training even when the creators object. This has now been officially confirmed.

According to Google #Deepmind, the opt-out applies only to certain divisions of the company. Anyone who wants to protect their data must remove the site from #Google Search entirely. #Publishers and #website operators see this as putting them at an economic disadvantage.

golem.de/news/kuenstliche-inte

Golem.de · Artificial intelligence: Google trains AI even when creators do not permit it. By Mike Faust

Hey, does anyone know if there's still a working zip-bomb-style exploit that can be deployed on a static site with JS (or as an asset/resource)? Specifically to target web scrapers and AI bullshit? The second any server goes online now, it's immediately bombarded by stupid numbers of requests.