This Week in AI · Jun 6 – Jun 12, 2026

Anthropic and Google pushed ahead this week with a long-context model and efficient on-device deployment, while OpenAI continued to build out its enterprise business.

What shifted

Claude Fable 5

Source: https://www.anthropic.com/news/claude-fable-5-mythos-5

Anthropic has rolled out Claude Fable 5, a next-generation model that expands context length to 1M tokens and adds native multimodal input (text and images). The release positions Anthropic as a serious challenger to OpenAI's GPT-4o and Google's Gemini in high-end conversational AI, and it puts pressure on other labs to speed up their own multimodal offerings. Small business owners, marketers, and content creators can now run longer, more complex conversations with a single model without chunking prompts. Experiment with the 1M-token context to build end-to-end workflows that previously required multiple tools, and use the multimodal API to turn product photos into captions or marketing copy. If cost is a concern, compare Fable 5's per-token pricing against GPT-4o's and keep prompts lean so you pay only for the context you actually need.
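To make the pricing comparison concrete, here is a minimal sketch of the arithmetic. The prices are placeholders, not real list prices for any model, and the `Pricing` class and `request_cost` helper are hypothetical names used only for illustration. Swap in the published per-token rates before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Per-million-token prices in USD. The values used below are placeholders."""
    input_per_mtok: float
    output_per_mtok: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: (tokens / 1e6) * price, summed over input and output."""
    input_cost = (input_tokens / 1_000_000) * pricing.input_per_mtok
    output_cost = (output_tokens / 1_000_000) * pricing.output_per_mtok
    return input_cost + output_cost


# Placeholder prices, NOT real list prices: replace with each vendor's published rates.
model_a = Pricing(input_per_mtok=1.00, output_per_mtok=4.00)
model_b = Pricing(input_per_mtok=2.00, output_per_mtok=8.00)

# The same job sent as one very long prompt versus a trimmed prompt.
full_context = request_cost(model_a, input_tokens=800_000, output_tokens=2_000)
trimmed = request_cost(model_a, input_tokens=50_000, output_tokens=2_000)
print(f"full context: ${full_context:.3f}  trimmed: ${trimmed:.3f}")
```

With long contexts, input tokens usually dominate the bill, so trimming the prompt tends to matter more than shortening the answer.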

Gemma 4 QAT models: Optimizing compression for mobile and laptop efficiency

Source: https://blog.google/innovation-and-ai/technology/developers-tools/quantization-aware-training-gemma-4/

Google released quantization-aware training (QAT) versions of Gemma 4, compressing the models so they run efficiently on mobile CPUs and laptops. The move shifts the balance from cloud-centric inference toward edge deployment, with lower latency and reduced bandwidth usage at competitive quality. Small business owners, marketers, and content creators can now host advanced language models on their own devices without relying on expensive cloud credits. Try Gemma 4 QAT on your laptop or phone by downloading the open weights and setting up a lightweight inference pipeline. Running locally keeps sensitive data on the device, cuts recurring cloud fees, and keeps response times fast, which is useful for creators who need instant, private AI assistance.
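A rough way to see why quantization matters for laptops and phones is to estimate how much memory the weights alone need at different precisions. The sketch below assumes a hypothetical 4-billion-parameter model and illustrative bit widths. Neither figure comes from Google's announcement, and `weight_memory_gb` is a made-up helper name.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Ignores activations, the KV cache, and any per-format overhead,
    so real memory use will be higher.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size for illustration; not a stated Gemma 4 parameter count.
params = 4e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(params, bits):.1f} GB")
```

Halving the bits per weight halves the weight footprint, which is often the difference between a model that fits in a laptop's memory and one that does not.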

Claude Fable is relentlessly proactive

Source (Jun 11, 2026): https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-everything

According to Simon Willison's write-up, Claude Fable 5 shows autonomous, proactive problem-solving: it launches browsers and runs exploratory steps without explicit user commands. This positions Anthropic as a leader in "self-directed" model capabilities, potentially reducing the need for manual prompt engineering and opening the door to more sophisticated developer tools. Small business owners who rely on AI for quick code fixes or automation can use Claude Fable 5 to troubleshoot faster, for example by having it locate buggy dependencies or open the relevant documentation on its own. Test it on a sandboxed project first to learn how far its proactive behavior goes, then integrate it into your CI pipeline for automated debugging. Ultra Prompt can help you craft safe prompts that instruct the model to stay within defined directories and avoid external side effects, so you get the speed of automation without losing control.
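If you wrap a model in your own tooling, one way to back up a "stay within this directory" instruction is to check every path the model proposes before acting on it. The sketch below is an assumption about how such a guard might look. `is_within` is a hypothetical helper, not part of any Anthropic or Ultra Prompt API.

```python
from pathlib import Path


def is_within(allowed_root: str, candidate: str) -> bool:
    """Return True if `candidate` resolves to a location inside `allowed_root`.

    Relative candidates are interpreted relative to the root. Absolute
    candidates and `..` segments are resolved first, so they cannot escape.
    """
    root = Path(allowed_root).resolve()
    # Joining with an absolute candidate discards `root`, which is what we want:
    # the absolute path is then checked on its own merits.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


# Hypothetical project directory used only for illustration.
project = "/tmp/project"

print(is_within(project, "src/main.py"))      # inside the project
print(is_within(project, "../other/secret"))  # escapes via ".."
print(is_within(project, "/etc/passwd"))      # absolute path elsewhere
```

A check like this belongs in the code that executes the model's file operations, so an out-of-bounds request is rejected even if the prompt instruction is ignored.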

BBVA puts AI at the core of banking with OpenAI

Source: https://openai.com/index/bbva

BBVA has deployed ChatGPT Enterprise across its entire workforce of 120,000 employees, partnering with OpenAI to embed generative AI into core banking operations. The rollout suggests that enterprise AI providers can now support organization-wide deployments at this scale, including for mission-critical financial services. For small business owners who rely on BBVA's banking services, this could mean faster, more accurate customer support chatbots, quicker loan approvals, and AI-driven insights into spending patterns. Monitor how BBVA's AI rollout affects the tools you use, such as banking dashboards or payment APIs, and experiment with prompt templates that extract actionable insights from transaction data. Staying ahead of these changes lets you adapt your own workflows and avoid disruption when new AI-powered features roll out.
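As a starting point for such a prompt template, the sketch below totals spending by category and drops the totals into a reusable prompt. The transactions are made-up sample data, and the field names and helper functions are hypothetical. Adapt them to whatever export format your bank actually provides.

```python
from collections import defaultdict

# Illustrative transactions: made-up data, not from any real bank export.
transactions = [
    {"date": "2026-06-01", "category": "software", "amount": 49.00},
    {"date": "2026-06-03", "category": "advertising", "amount": 120.00},
    {"date": "2026-06-07", "category": "software", "amount": 15.00},
]


def summarize_by_category(rows):
    """Sum amounts per category and return a dict sorted by category name."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(sorted(totals.items()))


def build_prompt(rows):
    """Fill a prompt template with category totals instead of raw rows."""
    lines = [
        f"- {category}: ${total:.2f}"
        for category, total in summarize_by_category(rows).items()
    ]
    return (
        "You are reviewing a small business's monthly spending.\n"
        "Category totals:\n"
        + "\n".join(lines)
        + "\nList the three most actionable observations, citing the figures above."
    )


print(build_prompt(transactions))
```

Sending aggregated totals instead of raw rows keeps the prompt short and limits how much transaction detail leaves your own systems.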

OpenAI to acquire Ona

Source: https://openai.com/index/openai-to-acquire-ona

OpenAI announced that it will acquire Ona. The acquisition points toward expanded cloud-based agent capabilities and more robust, enterprise-grade agent workflows. Small business owners and marketers who rely on automated content creation or data processing will want to watch how Ona's technology shapes OpenAI's agent offerings. Start mapping your current repetitive workflows, such as invoice processing and customer support scripts, to see where a persistent agent could replace multiple manual steps. Use Ultra Prompt's quick-start guides to prototype end-to-end automated pipelines as these capabilities become available.

ALSO THIS WEEK

stratechery: An Interview with Ben Bajarin About Apple, AI, and Compute. Apple's compute shift matters to everyday AI users because it enables faster, offline inference on widely used devices.
simonwillison: DiffusionGemma. An open, high-speed diffusion-LLM combination that could improve workflow efficiency and cost for everyday AI users.

Ultra Prompt

Written by Sean