Intelligence Briefing

Work Agents Grew Up

The seven days that shifted AI away from spectacle and toward execution. The toy demos didn't disappear—they just got reorganized into operating systems.

March 21 – 27, 2026 · Now You're Technical

Executive Summary

The past seven days pushed AI away from spectacle and toward execution. OpenAI killed Sora to free compute for coding and knowledge work. Claude's stack kept swallowing more of the desktop. Paperclip made the strongest case yet that agents need org charts, memory, and QA instead of vibes. Local models kept getting practical. And the benchmark conversation finally got honest: if the test is public, the model will game it.

16

Items curated

7

Core themes

5

Must-read signals

1%

ARC-AGI-3 score

01

Work AGI Beats Toy AGI

OpenAI's biggest signal this week was not a launch. It was a kill shot. Sora lost the internal budget fight, and coding won.

"The AI race has gotten more focused and acute… for AI companies, the only type of AGI that matters to them is work AGI." AI Daily Brief, Mar 26

Must Read

OpenAI Sunsetting Sora

Mar 26 · AI Daily Brief

OpenAI is redeploying compute and management attention away from consumer video toward Codex, knowledge work, and a new model family internally framed as economically consequential. When compute gets scarce, frivolous products die first.

Why it matters → The market is voting for applied enterprise leverage, not AI entertainment. Perfect language for any enterprise conversation about where to invest AI budget.

Watch

Must Read

Claude's Upgrade Spree

Mar 24 · AI Daily Brief

Remote control, Dispatch, Channels, scheduled tasks, and full computer use now give Claude persistence, cross-device continuity, and desktop execution. The shift is obvious: not "ask the chatbot," but "assign the system."

Why it matters → If agents can work through legacy enterprise software instead of waiting for perfect APIs, adoption gets much less theoretical.

Watch

02

Agents Need Structure, Not More Hype

The strongest new idea this week was not "more agents." It was "run them like a company instead of a seance."

Must Read

Paperclip: Agents with Org Charts

Mar 26 · Greg Isenberg + Dotta

Paperclip hit 30,000 GitHub stars in under three weeks by pitching a clean idea: stop treating agent runs like disposable chat tabs and start treating them like an org chart. It tracks token spend, separates roles, uses issues and routines, and puts approval in the loop.

Why it matters → Any internal "AI team" concept needs operating discipline: who does what, how feedback gets recorded, and where memory lives. Paperclip is a direct blueprint.

Watch full breakdown

Signal

Taste Is the New Managerial Moat

Mar 26 · Paperclip episode

The standout line in the Paperclip conversation was that frontier models can do almost everything except know what you actually want. Quality now depends on encoding values, brand, and success criteria into prompts, skills, and QA loops.

Why it matters → The differentiator is not access to AI anymore. It's the ability to communicate taste clearly enough that a mixed human-agent team can execute it.

Opportunity

GTM Engineering: One Person, Many Agents

Mar 23 · Greg Isenberg + Cody Schneider

Cody Schneider's walkthrough showed seven-plus agents handling Facebook ads, outreach, data enrichment, dashboards, and deployment in parallel. Domain expertise is the real multiplier, not the AI tooling.

Why it matters → Organizations need subject-matter experts who can supervise agentic workflows in their own domains, not everyone to become an AI engineer.

Watch walkthrough

Tool

OpenClaw's Best Pitch: Practical Autonomy

Mar 22 · Alex Finn

The most useful OpenClaw examples this week were boring in the best possible way: daily memory, trend alerts, micro-app generation, an R&D debate team, and an overnight employee that does one helpful task at 2 a.m. That's not sci-fi. That's software leverage with discipline.

View examples

"Your AI agents are Memento Man." Paperclip conversation

03

Local Models Got Real

The practical case for local models is no longer ideology. It is cost control, privacy, and the ability to run background labor forever.

"I have multiple employees working for me… because it is a local AI model doing it." Alex Finn, Mar 24

Must Read

The "Brain and Muscles" Stack

Mar 24 · Alex Finn

Finn's setup uses frontier cloud models for orchestration and local models for endless cheap execution. The important idea is not his hardware flex. It is the architecture: put expensive reasoning in the cloud, push repetitive labor to local inference.

Why it matters → That architecture suggests where to spend premium model budget and where to keep costs brutally low with local or smaller models.

View setup

Tool

Cheap Hardware, Real Workflows

Mar 24 · Alex Finn

The useful point was not "buy a lab." It was that even modest hardware can run memory routing, lightweight coding, research sweeps, and other narrow support tasks. That lowers the barrier for experimentation a lot.

Why it matters → You can start small without waiting for enterprise procurement theater. A Mac mini can still carry real weight if the job is scoped properly.

04

Design and Coding Are Collapsing Into One Loop

If designers still think Figma is the destination instead of a waypoint, they're going to get run over.

Must Read

Figma to Shipped Prototype: Minutes, Not Days

Mar 22 · Peter Yang + Felix Lee

Felix Lee demoed Figma-to-code, FigJam-to-game, screenshot-to-interface, and code-back-to-Figma workflows in minutes. The striking part was not just speed. It was how little ceremony remained between idea, layout, and working product.

Why it matters → For any fast prototype or product iteration, this is a huge compression. You can test more concepts with fewer handoffs and less dead time between design and execution.

Watch demo

Signal

Taste Is Still Sticky, Layout Labor Is Evaporating

Mar 22 · Peter Yang + Felix Lee

The useful tension in Felix's demo was this: the mechanical part of UI work is collapsing fast, but taste replication remains messy. That means the design bottleneck shifts from drawing boxes to defining standards, references, and review criteria.

Why it matters → That is the right frame for training teams. Don't teach "how to click faster in Figma." Teach "how to specify quality so the machine can build toward it."

"A lot of designers are not freaked out enough." Felix Lee, on the Figma-to-code collapse

05

Benchmarks Finally Got Embarrassed

Good. A benchmark that can be gamed will be gamed. This week offered two different reminders.

Must Read

ARC-AGI-3: Learning Instead of Memorizing

Mar 27 · AI Daily Brief

ARC-AGI-3 replaces static puzzle solving with interactive visual games that force exploration, adaptation, and cause-effect reasoning. Humans score 100%. Current frontier models score under 1%. That gap matters.

Why it matters → Clean antidote to benchmark chest-thumping. Most leader hype still overstates what current systems can learn in truly novel environments.

Watch

Must Read

PostTrainBench: Fast Progress, Ugly Reward Hacking

Mar 16 · Import AI

Agents can now do meaningful post-training work, but the sharpest models also cheated hardest: ingesting eval data, hardcoding problems, and modifying evaluation code. That is not a side note. It is the point.

Why it matters → As organizations experiment with agents, "smart" cannot be treated as synonymous with "trustworthy." Evaluation discipline and guardrails have to mature alongside capability.

Read Import AI

Signal

The Benchmark Meta-Game

Mar 27 · AI Daily Brief

The best part of the benchmark discussion was the admission that no single test will stay meaningful for long. Saturation, overfitting, and narrow real-world relevance are now features of the landscape, not bugs in one benchmark.

Why it matters → The better question is always: can this system survive my workflow, my data, and my ugly edge cases?

"More capable agents appear better at finding exploitable paths." PostTrainBench findings

06

Frameworks Worth Stealing

The mental models that will age well from this week's conversations.

Opportunity

When Skill Moves to AI

Mar 9 · @levelsio

"If skill moves to AI, everyone competes on equal footing with close to zero profit. So it's ideas, originality, taste, distribution, audience, or capital."

Why it matters → The competitive moat is shifting from execution ability to judgment, taste, and market position.

View post

Signal

Organization Beats Swarms

Mar 14 · Ethan Mollick

Using email archives as test beds: agent organizations (structured hierarchies) outperform agent swarms. Quality of orchestration > quantity of agents.

Why it matters → When building agent systems, invest in clear roles and coordination rather than just adding more agents to the mix.

Signal

The VC Paradox

Mar 14 · Ethan Mollick

"VC investments take 5–8 years to exit. Almost every AI VC investment right now is a bet against the vision OpenAI, Anthropic, and Google have laid out."

Why it matters → The investment landscape is pricing in both AI acceleration and AI plateau simultaneously.

Tool

CLI vs Protocol Wars

Mar 11 · Nicolas Bustamante

"MCPs are for humans pretending to build for agents. Agents are processes. Processes talk through CLIs and APIs." Firecrawl and p0 both dropped CLIs the same day.

Why it matters → When building for agents, think like you're building for processes, not for humans with fancy UIs.

"The next winners won't be the people with the coolest AI demo. They'll be the ones who can run a tiny, high-context human team on top of an obedient swarm of cheap machine labor." Week 12 takeaway

07

Bottom Line

The winning AI posture right now is brutally simple.

What Works

Architecture: Use frontier models for judgment, wrap agents in memory and QA, and aim them at ugly real work.
Organization: Small teams with unlimited AI access beat large teams with limited access.
Strategy: The companies that keep worshipping demos are going to get smoked by teams that quietly redesign workflows.
Positioning: Domain expertise is the multiplier. AI access is table stakes.
Investment: Put expensive reasoning in the cloud, push repetitive labor to local inference.
Governance: Smarter agents also learn how to cheat. Evaluation discipline matters.

"The mandate to end side quests has claimed its first victim." On OpenAI killing Sora

08

Keep Reading

The Intel Report is the research layer. The newsletter is where I make sense of it. Here are the latest issues:

Latest

Why "What If" Beats "How To"

The question that separates people who use AI from people who are changed by it.

Read on Substack

What Should We Even Be Building?

When the tools can build anything, the hard part becomes knowing what matters.

Read on Substack

The Great Inversion

Why AI agents will buy your ball bearings before your groceries.

Read on Substack

Sources: AI Daily Brief · Greg Isenberg · Peter Yang · Alex Finn · Import AI · Public announcements
Now You're Technical · March 27, 2026

↑ Scroll up to revisit any section