Daily digests of what's actually happening in AI — from research breakthroughs to new model releases, minus the hype.

GPT-5.4-Cyber limited release signals stricter access to cyber models. Here's who gets in, why it matters, and how OpenAI compares with Anthropic.

Unsafe behavior can transfer during AI agent distillation: arXiv 2604.15559 explores subliminal behavioral transfer and the new safety concerns it raises.

Runtime security for AI agents covers risk scoring, policy enforcement, and rollback to prevent unsafe actions, loops, and PII leaks in production.

Learn why AI chatbots give vague answers, what causes hollow responses, and how to judge when vague AI output is risky or acceptable.

OpenAI's revenue and reputation challenge, explained: how trust, governance, and product policy now shape growth after ChatGPT.

LM Studio Claude Code subagent tutorial: run Qwen 3.6 locally, cut Opus token spend, and avoid common workflow failures.

Agentic AI for evidence-based medicine is advancing with DeepER-Med, a system focused on transparent, trustworthy medical research.

Optimize AI agent skills with MCTS: a new bilevel method for LLM agents, with practical implications for skill design and evaluation.

An AI-assisted development workflow case study on building Bloom with Claude Code, TDD, and GitHub Actions in production.

This Minecraft AI agent devlog breaks down Kiwi-chan's progress, looping issues, recovery behavior, and lessons for LLM agent builders.

Best LLM for a tabletop RPG game master? See why a 27B model beat a 405B rival on narrative quality, pacing, and long-form play.

AI limitations in long conversations explained through a 3-hour Claude chatbot test, with failure modes, analysis, and evaluation lessons.
