Greetings! Another hot AI week is in the books, and the next one could be even bigger. New models are shipping at a pace we have not seen before, and a new category is taking shape in parallel: always-on AI agents that follow you across apps, devices, and messengers. If this trajectory holds, we could be looking at billions of agents running around the internet within the coming months.
Let’s start with agents. OpenClaw showed up less than a month ago and immediately pulled in outsized traction. Products have already been built on top of it, but this still feels like the opening act. Big labs are not going to watch this wave pass by. They will either integrate OpenClaw or ship their own equivalents.
One of the ongoing soap operas is where the creator of OpenClaw lands next. Rumors have pointed in two directions: Meta or OpenAI. On the Meta side, we have seen signs that they are working on integrating OpenClaw directly into the Meta AI app. The pitch is straightforward: chat with your always-on agents through Meta AI, with an option to connect to self-hosted OpenClaw agents too. If that comes together, the obvious endgame is that you could talk to your agents through WhatsApp, the Meta AI app, or even AI glasses. That would be a meaningful shift in how “agent” products actually reach people.
None of that is publicly shipped yet, but the testing trail is getting louder. We also spotted the upcoming, rumored Avocado models being tested inside Meta AI. In parallel, Meta appears to be actively working on integrating Manus AI into Meta AI directly. The model referenced there, Sierra, is positioned as a browser operating agent that can execute complex tasks. It looks like a potential comeback moment for Meta AI, but a lot depends on real-world performance, especially from Avocado.
Manus, for its part, moved fast. Later this week, users noticed Manus AI released its own always-on agent. It appears to be powered by OpenClaw as well, and it pushes users to connect Manus to Telegram so they can talk to a 24/7 agent. Early impressions are that it is capable, but it burns credits quickly and gets expensive fast. Still, it is notable as one of the first attempts by a major lab to ship an OpenClaw-style experience inside its own product. Interestingly, Manus did not launch WhatsApp integration out of the gate, possibly due to technical friction. The Telegram bot also got suspended over the weekend and came back the next day, likely due to a general bug rather than a policy takedown.
The OpenClaw founder to OpenAI storyline remains one of the biggest intrigues in the AI community. Meanwhile, Moonshot AI made its own move. Kimi shipped an always-on agent inside the Kimi app for paid users. From there, users can talk to an OpenClaw agent or connect additional self-hosted OpenClaw agents and operate several at once. It launched over the weekend, and it underlines the broader reality: companies are racing to claim mindshare around always-on agents. Kimi appears to have pulled it off.
On the model side, two major Chinese labs dropped upgrades right before the Lunar New Year holidays: Z.ai with GLM-5 and MiniMax with M2.5.
Both are posting coding performance that is being framed as on par with Claude Opus 4.6. That is a meaningful data point because Opus 4.6 landed only a week ago. If these benchmark narratives hold up, Anthropic may feel pressure to answer quickly with another step up. There are already rumors that a Sonnet upgrade is in the works. Whether it lands as Sonnet 5 or a Sonnet 4.6 style update to match the Opus numbering is something we may learn next week.
From OpenAI, the notable drop was GPT-5.3-Codex-Spark. The headline detail is infrastructure: it is the first OpenAI model powered by Cerebras, and the speed is the point. The open question now is how that speed translates into real coding quality and agent workflows. We are also still expecting the general GPT-5.3 model in the coming weeks. OpenAI often times its releases around the noise from other labs to reclaim attention, and right now they are clearly watching both Google and Anthropic.
Google, meanwhile, pushed a Gemini 3 Deep Think upgrade and showcased a major result on ARC-AGI-2 alongside other benchmarks. In the areas it targets, it looks like a serious outlier, both as research and as a product. We also expect Gemini 3.1 Pro to land soon, so we are getting ready for that. Beyond models, there are additional upgrades brewing across AI Studio, Gemini, and the broader API stack.
In video, ByteDance’s Seedance 2.0 arguably owned the week. AI-generated clips are everywhere, and the quality jump is hard to miss. It is being positioned as the top video model right now, ahead of Sora 2.0 and Veo 3.1, with longer generations, strong character consistency, fewer obvious slop moments, and a level of detail that feels like a new tier. Access is still a bit tricky, but some products are offering early access through paid plans. Broader availability is expected after the holidays in China.
If you want something more hands-on and open-source-friendly, take a look at the Cline CLI 2.0 upgrade that just shipped. It includes free access to GLM-5 and MiniMax M2.5 for a limited time, and if your goal is to test these models through a new command-line coding workflow, this is currently one of the cleanest ways to do it.
Finally, keep an eye on Perplexity. Something significant appears to be forming:
- They are launching a new Gamma mode that is very fast and currently powered by Grok (new scoop coming soon)
- It is still unclear what the final release shape looks like, but the Grok integration is especially interesting with Grok 4.20 expected next week.
- You may have noticed Perplexity’s UI palette shifting toward black and grey, closer to xAI’s Grok look.
That feeds into the broader rumor mill around Perplexity’s strategy lately, including talk that a major announcement could involve an acquisition. If I had to place a bet, it would be on xAI acquiring them. If that happens, it is a big deal. xAI’s current position has looked shaky at times, and Perplexity would be an immediate boost. xAI is also a lab with very large ambitions, so the move would fit the pattern. We will see what next week brings, and we will dig into it with a full summary.
Newsletter Guide
Here's your guide on how to make the most out of this newsletter:
- 🤳 Gear up with your mobile phone and desktop devices.
- 📩 Scan through the newsletter, picking out the apps that pique your interest.
- 🔥 Keep an eye on the "hot" emoji to identify the hottest user experiences.
- 👀 Dive deeper into our TestingCatalog posts to discover how to put these apps to the test.
- 📲 Roll up your sleeves and try them out yourself!
Featured 🤩
Cline

🔥 Cline drops CLI 2.0 coding agent, powered by K2.5 and M2.5 for free – Cline CLI 2.0 brings coding agents into the terminal with interactive and autonomous modes, ACP support, and free access to Kimi K2.5 and MiniMax M2.5.
MemOS

MemOS OpenClaw Plugin to cut agent memory costs by 70% – MemOS releases its OpenClaw Plugin, offering a shared memory layer for OpenClaw teams to reduce token costs and maintain consistent agent context.
MiniMax

ICYMI: MiniMax debuts MiniMax-M2.5 model on web and APIs – MiniMax launched MiniMax-M2.5 for coding, agentic automation and office tasks; it comes in M2.5 and M2.5-Lightning versions via API and MiniMax Agent.
Anthropic

Anthropic brings slash commands and SSH support to Claude Code – Anthropic rolls out new Claude Code slash commands, SSH tunnel support, tool access modes, and teases a possible Sonnet model update.

Anthropic prepares Claude Tasks on mobile for browser automation – Anthropic is testing a Tasks feature in Claude’s mobile apps, bringing Cowork-style automation, repeatable actions, and possible browser tasks soon.
- Anthropic raises $30B Series G at $380B valuation, $14B run-rate
- Claude free plan upgraded with file creation, connectors, skills, and compaction
- Claude Cowork for Windows released with full feature parity
- Claude app gains interactive responses (maps, selectors) and wider voice mode rollout
Google

Google adds 10 customizable infographic styles to NotebookLM – Google NotebookLM is testing a visual style selector for infographics, offering 10 distinct style options for tailored presentations.
- Design System support rolled out for Stitch
- Gemini 3 Deep Think upgraded to gold-medal level in Physics/Chemistry
- 🔥 Gemini 3 Deep Think reaches SOTA 84.6% on ARC-AGI-2
- 🔥 Stitch by Google adds “Ideate” agent mode
- Gemini 3.1 Pro Preview possibly in preparation
- New AI Studio homepage with Omnibar and quick access
- Gemini Enterprise testing new “AI coding” section with code execution
- Premium Content personalisation (paid subscriptions) in development for Gemini
- Stitch gains direct Figma export as editable layers
- Stitch inline edit feature in development
Kimi

Kimi launches Agent Swarm AI for parallel research and analysis – Agent Swarm coordinates 100 sub-agents to execute 1500 tool calls at 4.5x single-agent speeds; it is offered on Kimi's platform as a research preview.
Manus AI

🔥 Manus AI launched 24/7 Agent via Telegram and got suspended – Manus AI launches “Agents” across platforms, enabling users to create personal agents with persistent memory and Telegram integration for easy access.
- Manus Telegram bot is back
- Telegram suspended new Manus always-on agent account
- Manus AI rolling out always-on agent with skills, subagents, memory, dedicated instance, messengers
Meta

Meta AI readies Avocado, Manus Agent and OpenClaw integration – Meta AI is testing Avocado models, MCP integrations, and Manus browser agent support, with scheduled tasks and OpenClaw compatibility launching soon.
Notion

Notion tests Agents 2.0 with scripting tools and Workers – Notion is testing new agent features, including a redesigned settings UI, new automation triggers, and upcoming Agents 2.0 upgrades.
OpenAI

OpenAI debuts Codex-Spark powered by Cerebras infra – OpenAI introduces GPT-5.3-Codex-Spark, a real-time coding model in Codex built for rapid code iteration, now available to ChatGPT Pro users.

OpenAI works on ChatGPT Skills, upgrades Deep Research – OpenAI updated Deep Research in ChatGPT with GPT-5.2 and is working on a new Skills section for ChatGPT to install and edit skills.

OpenAI tests sponsored ads in ChatGPT for free US users – OpenAI is testing sponsored placements in ChatGPT for U.S. users on Free and Go tiers, with privacy rules, user controls, and clear ad labeling.
- Codex merch from the Super Bowl campaign is on the way
- OpenAI accuses DeepSeek of using new obfuscated distillation methods
- Codex for iOS/Android mentioned in repository
- OpenAI Codex Alpha Windows waitlist (Linux also planned)
- GPT-5.3 mention spotted in Codex PR
Perplexity

Perplexity tests Health page with Apple Health integration – Perplexity is testing a Health section that may offer personalised advice, settings-based profiles, and possible Apple Health integration.
Samsung
Telegram

Telegram revamps app with new interface and craftable gifts – Telegram updated its Android, iOS and iPad apps with a redesigned look, bottom bar, media viewer and shortcut, and added gift crafting, group transfer and bot button color options.
xAI
- Grok 4.20 expected next week, significant improvement over 4.1
- Grok Imagine now supports up to 3 reference images
- Grok voice mode can now display internet images
Z AI

Z.AI launched GLM-5, new open-source model on chat and APIs – Z.ai has launched GLM-5, a flagship open-weight LLM designed for complex systems engineering, full project builds, and multi-step tool workflows.
- GLM-5 #1 open model on Text Arena (#11 overall)
- GLM-5 benchmarks released (77.8 SWE-Bench, 50.4 HLE w/tools, 75.9 BrowseComp)
- GLM-5 now live on Z AI
- GLM-5 mention spotted on GitHub (expected February)
Happy testing! 🤖
P.S. Some ref links
- Flowith AI (3k extra credits)
- Ideogram (100 extra credits)
- Runable (1000 extra credits)
- Manus AI (500 extra credits)
- NordVPN (1 month free)
P.P.S.
If you use Telegram, don't forget to join us there too! It is the fastest place to learn about new AI scoops 🗞️
P.P.P.S.
If you want to promote your AI tool in this newsletter, please check our Stan Store for Advertisers.