The AI week that just shook everything: 7 news items you need to know

Summary: Kimi K2.6 rivals proprietary models at roughly 5% of the cost, GPT 5.5 ("Spud") is expected within days, Codex turns into a "super-app" with computer vision on Mac, and Grok 4.5 is rumored to reach 1.5 trillion parameters by late May. Meanwhile, Google is preparing at least three announcements for Google I/O, and a robot recently ran a marathon, outpacing humans over certain segments.

0:04 – "One of the most intense stretches in AI history"

We've just come off one of the most intense stretches in AI news, and the next week looks even crazier. Today alone brought massive new developments: Moonshot AI dropped Kimi K2.6, an advanced open-source coding model already being compared to Opus 4.5 and 4.6. At the same time, GPT 5.5, codename "Spud," is reportedly right around the corner, possibly dropping today or Thursday. On Google's side, new Gemini checkpoints are surfacing. Alibaba has finally released a preview of Qwen 3.6 Max, and Codex is transforming into something close to a super-app.

0:34 – "Kimi K2.6: open source beats proprietary"

Kimi K2.6 is a new open-source coding model from Moonshot AI that reports state-of-the-art results on benchmarks such as SWE-bench and BrowseComp, as well as on advanced math and vision tasks. In some cases it is being compared directly to Opus 4.6, which is remarkable for an open-source model.

The big upgrades:

12-hour+ coding sessions with 4,000+ tool calls
300 parallel agents working together
Multilingual, multi-file development from a single prompt
94% cheaper input and 95% cheaper output compared to Opus 4.6, while still outperforming it on SWE-bench Pro
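To make the discount figures concrete, the arithmetic can be sketched as follows. Only the 94% and 95% discounts come from the list above; the baseline prices and token counts are hypothetical placeholders chosen for illustration, not real pricing for any model.

```python
def compare_costs(input_tokens, output_tokens, base_in_price, base_out_price,
                  in_discount=0.94, out_discount=0.95):
    """Return (baseline_cost, discounted_cost) for one workload.

    Prices are per million tokens, in any currency unit.
    Discounts are the fractions quoted in the list above.
    """
    base_in = input_tokens * base_in_price / 1_000_000
    base_out = output_tokens * base_out_price / 1_000_000
    baseline = base_in + base_out
    discounted = base_in * (1 - in_discount) + base_out * (1 - out_discount)
    return baseline, discounted


# Hypothetical workload: 10M input tokens, 2M output tokens.
# Hypothetical baseline prices: 10.0 and 50.0 per million tokens.
base, cheap = compare_costs(10_000_000, 2_000_000,
                            base_in_price=10.0, base_out_price=50.0)
ratio = cheap / base
print(f"baseline: {base:.2f}, discounted: {cheap:.2f}, ratio: {ratio:.1%}")
```

Whatever the mix of input and output tokens, the ratio stays between 5% and 6% of the baseline, which is where the "about 5% of the cost" headline comes from.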

"The fact that this model is basically on par with or just a little behind these proprietary models is insane."

Real-world example: Quantitative strategies across hundreds of assets

Kimi K2.6 can design and execute complex multi-step workflows end to end, for example building full quantitative trading strategies across hundreds of assets. On the front end the model is exceptional: it can generate polished landing pages with dynamic motion, varied typography, and interactive elements, a level of polish that, according to the presentation, proprietary models do not reach.

It can reportedly run locally as a full one-trillion-parameter VLM on dual M3 Ultra machines using MLX.

2:38 – "GPT 5.5 Spud — the halfway point to GPT 6"

GPT 5.5, codename "Spud," is currently being A/B tested inside ChatGPT. Early demos show strong speed, token efficiency, and reasoning, with faster outputs and better performance on complex tasks.

It stands out particularly in:

Coding
SVG generation
Game creation
3D workflows using tools like GS

The model goes beyond the prompt, adding structure, detail, and better design direction on its own.

"The best way to think about it is that it's a halfway point to GPT 6, combining better reasoning, faster performance, and lower cost into one model."

According to Polymarket bettors, the release is expected today or Thursday at the latest, the two days on which OpenAI typically ships models.

Real-world example: Excel clone in minutes

With GPT 5.5, a complete Excel clone was created that doesn't just look like Excel — it feels like Excel. Full grid behavior, formatting interactions, cell selection. Scarily close to the real thing.

What makes this practically interesting: the model is token-efficient and more readily accessible than Opus 4.7. For coding tasks, this could become the natural choice.

4:12 – "DeepSeek v4 — 1.6 trillion parameters incoming"

According to Zank, a Princeton PhD researcher and AI lab fellow, DeepSeek version 4 may drop as early as this week. The reported specs are massive:

1.6 trillion parameters
Sparse MQA, fused kernels, and hyperconnections
MMLU around 99.4%, just 0.6 percentage points from the maximum
SWE-bench: 83.7%

"Early leaks suggest extreme performance levels, but these numbers are still unverified."

The model is said to compete directly with Opus 4.7 and GPT 5.5. Due to the scale, only heavily quantized versions would realistically run locally — potentially requiring a 512 GB-class machine.
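A rough back-of-the-envelope sketch shows why only heavily quantized versions would fit. It counts weight storage only and assumes every parameter is stored at the same bit width; KV cache, activations, and runtime overhead are ignored, so real requirements would be higher. The function names are illustrative.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def max_bits_for_budget(num_params, budget_gb):
    """Largest average bit width per parameter that fits in budget_gb."""
    return budget_gb * 1e9 * 8 / num_params


PARAMS = 1.6e12  # rumored parameter count from the list above (unverified)

for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(PARAMS, bits):,.0f} GB")

print(f"bits per parameter that fit in 512 GB: "
      f"{max_bits_for_budget(PARAMS, 512):.2f}")
```

Under these assumptions, 4-bit weights alone would need about 800 GB, so a 512 GB machine implies an average of roughly 2.5 bits per parameter before any runtime overhead.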

6:04 – "Codex becomes a super-app — sees, clicks and thinks"

OpenAI has transformed Codex into something far beyond a coding tool. Codex can now:

Interact with apps on your Mac — see, click and type using its own cursor
Run in the background without taking over your system
Handle front-end iteration, app testing, and workflows without APIs
Schedule work, pause and resume with full context in the same thread
Suggest image generations with GPT Image 1.5 — GPT Image 2 may be coming directly inside Codex

"It is basically turning into a full super app for development and automation."

Real-world example: Automation over apps without APIs

Previously, automating apps without APIs required manual scripting or third-party tools. Now Codex can see the screen, understand what's happening, and interact with elements — like a human user, but without coffee breaks.
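The see, understand, interact cycle described above is commonly structured as an observe-decide-act loop. The sketch below is a toy illustration of that pattern only, not Codex's actual implementation: `FakeApp`, `decide`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary, and the hard-coded policy stands in for what would be a call to a vision-capable model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # name of the UI element acted on
    text: str = ""    # text to enter for "type" actions


@dataclass
class FakeApp:
    """Hypothetical stand-in for a GUI app with no API."""
    field_text: str = ""
    submitted: bool = False

    def screenshot(self):
        # A real agent would capture pixels; here the state is a dict.
        return {"field_text": self.field_text, "submitted": self.submitted}

    def apply(self, action):
        if action.kind == "type" and action.target == "name_field":
            self.field_text = action.text
        elif action.kind == "click" and action.target == "submit_button":
            self.submitted = True


def decide(observation, goal_text):
    """Hypothetical policy: fill the field, then submit, then stop."""
    if observation["submitted"]:
        return Action("done")
    if observation["field_text"] != goal_text:
        return Action("type", "name_field", goal_text)
    return Action("click", "submit_button")


def run_agent(app, goal_text, max_steps=10):
    """Observe, decide, act until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = app.screenshot()
        action = decide(observation, goal_text)
        history.append(action.kind)
        if action.kind == "done":
            break
        app.apply(action)
    return history


app = FakeApp()
history = run_agent(app, "Ada Lovelace")
print(history, app.submitted)
```

The step budget (`max_steps`) is a deliberate design choice in this sketch: an agent driving a real interface needs a hard stop in case the screen never reaches the expected state.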

New: Chronicle — Codex builds memory from your work

Chronicle is a new research preview inside Codex that lets the model build memories from your day-to-day work on your computer, then uses those memories to become significantly more helpful and context-aware over time.

"People at OpenAI are already saying that it has noticeably changed how they use Codex in daily workflows."

8:38 – "Grok 4.3: underrated but extremely capable"

Grok 4.3 beta is xAI's latest test model, and it is truly underrated. It has approximately 0.5 trillion parameters, an improved architecture, and training data up to December 2025.

Key upgrades:

Native multimodal with better visual understanding
Agentic tool use and coding
Generates documents, slides, PDFs, spreadsheets
Improved reasoning with fewer hallucinations

Real-world example: CS:GO clone with bazooka

With a single request, Grok 4.3 beta created a complete CS:GO clone, including a functional bazooka. Fully generated code, no template.

Elon Musk's roadmap revealed

According to Musk's own comments:

Grok 4.4: 1 trillion parameters, early May
Grok 4.5: 1.5 trillion parameters, late May
Grok 5: positioned as AGI

"If even partially accurate, that would mean we're two major model releases away from what he's calling AGI."

(Note: we don't know Musk's exact definition of AGI.)

10:11 – "Qwen 3.6 Max — Alibaba's new flagship"

Alibaba has quietly released a preview of Qwen 3.6 Max — the next generation of their flagship model. Focus areas:

Stronger agentic coding capabilities compared to Qwen 3.6 Plus
Better instruction following and improved real-world reasoning
Higher knowledge reliability

The model is designed to be smarter, more consistent in long-horizon tasks, and more capable as an autonomous agent in practical workflows.

"In simple terms: designed to be smarter, more consistent in long-horizon tasks and more capable as an autonomous agent."

11:33 – "Google I/O in 28 days — three things incoming"

With Google I/O roughly 28 days away, rumors are building. Three things stand out:

1. New Gemini checkpoints in AI Studio

Google is testing newer, significantly upgraded models internally. This could be Gemini 3.2 Pro or even Gemini 3.5 Pro, or possibly a lighter Flash variant of Gemini 3.1.

2. Cowork competitor inside Gemini

Google is developing a feature that works just like Cowork: agentic automation for delegating goals, connecting applications, and automating workflows. What makes it especially powerful is deep integration with Google Workspace, with Gmail, Sheets, and Drive all in one place.

3. Expanded AI Studio access for AI subscribers

Google has now expanded access so that AI subscribers get higher coding limits and direct access to Pro models without linking an API key.

13:37 – "Robot marathon — the F1 of AI"

To close: robotics has now reached a point that feels straight out of a sci-fi simulation. A full-fledged robot is now competing in a marathon — actually outperforming humans in certain segments.

What makes it even more surreal is how the effort is engineered. The support operation works like an F1 pit stop: humans step in quickly to service the robot and cool it down between runs, in some cases using dry ice.

"This is basically turning it into the F1 of robots."

Thoughts on how this affects the future

What is striking about this week is the pace. Open-source models like Kimi K2.6 now perform on par with the best proprietary alternatives at a fraction of the cost. This doesn't just democratize powerful AI; it pushes the entire industry forward.

At the same time, we are seeing a clear convergence: tools like Codex and Grok 4.3 are no longer pure code generators; they are becoming permanent work companions that remember, plan, and execute over time. The line between tools and colleagues is blurring.

For those building with AI daily, this means the choice is no longer "which model" — but "which agent architecture." The question is no longer theoretical.

Watch the full presentation on YouTube
