Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable (techcrunch.com)
581 points by speckx 1 day ago | 514 comments
tl;dr: Anthropic's newly released Fable model, a public version of its Mythos cybersecurity model, is drawing complaints from security researchers who say its guardrails are overly aggressive and keyword-based, blocking even benign requests such as reviewing code or reading blog posts. When a guardrail is triggered, Fable falls back to Claude Opus 4.8, citing flagged "cybersecurity or biology topics." Researchers can apply to Anthropic's Cyber Verification Program for looser restrictions, similar to OpenAI's Trusted Access for Cyber.
HN Discussion:
  • Anthropic walked back the policy after backlash, validating the criticism
  • Silent model downgrading constitutes a deceptive practice that destroys user trust
  • Fable is useless across professional domains and easily replaced by Wikipedia
  • Keyword-based guardrails are absurd, blocking legitimate personal and research tasks
  • Market competition or open-source Chinese models will force Anthropic to reverse course