From chatbot to operating layer: when AI agents start doing the work

The most interesting part of this AI shift is not that models answer faster. It is that they are starting to get memory, tools, and access to real processes. At that point, AI stops being a chat window and starts becoming an operating layer for work, with the same need for security, monitoring and clear ownership.
Listen to the episode
The player above contains the full podcast episode. It is a dense walkthrough of why persistent AI agents are starting to replace the old pattern: ask a question, get an answer, close the tab, start over next time.
The market is moving toward systems that can carry context over time, connect to internal tools, follow up on work in the background and sometimes suggest or execute the next step. That sounds futuristic, but much of what the episode covers is already concrete: MCP, secure tunnels, enterprise consulting rollouts, faster multimodal models and agent platforms that start to behave like project infrastructure.
From question and answer to persistent agent
The classic chatbot is transactional. You ask a question. It answers. Then it forgets the context unless you paste everything in again.
A persistent agent works differently. It has instructions, tools, files, system access and sometimes scheduled tasks. It can read a backlog, compare it with a document, create a draft, wait for an event and continue later. That is less "smart assistant" and more "small work environment".
For small and mid-sized organizations, the useful question is simple: which recurring workflows would actually improve if the AI did not have to start from zero every time? Sales material, quote drafts, customer cases, project status, document review and internal reporting are often better candidates than broad "AI for everything" projects.
Secure access is the real bottleneck
An agent becomes useful when it can see the right context. That is also where the risk starts. Customer data, finance systems, code repositories and internal documents cannot simply be opened to a public AI service without controls.
That is why MCP, the Model Context Protocol, matters so much in the episode. MCP is an open standard for connecting AI applications to external systems such as files, databases, tools, and workflows. The point is not to give AI free access to everything. The point is to create a clearer way to describe what it may read, which tools it may use and how it should ask for help.
Source: Model Context Protocol: What is MCP?
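To make the idea of describing access concrete, here is a minimal sketch in Python. It is not MCP's own API or schema. `AgentPolicy`, the source paths and the tool names are all made up for illustration. The only point it shows is the deny-by-default pattern: the agent may touch what is listed and nothing else.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist: anything not listed is denied."""
    readable_sources: frozenset = field(default_factory=frozenset)
    allowed_tools: frozenset = field(default_factory=frozenset)

    def can_read(self, source: str) -> bool:
        return source in self.readable_sources

    def can_use(self, tool: str) -> bool:
        return tool in self.allowed_tools


# Example policy for a quote-drafting agent (all names are invented).
quote_agent = AgentPolicy(
    readable_sources=frozenset({"crm/contacts", "docs/price-list"}),
    allowed_tools=frozenset({"create_draft"}),
)

print(quote_agent.can_read("docs/price-list"))  # True
print(quote_agent.can_read("finance/ledger"))   # False
print(quote_agent.can_use("send_email"))        # False
```

In a real setup the same boundary would more likely be enforced on the server side, by which tools and resources are exposed at all, rather than by a check inside the agent.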
Anthropic also released MCP tunnels as a research preview on May 19, 2026, allowing Claude to connect to MCP servers in private networks. That is exactly the kind of building block companies need if agents are going to work close to internal systems without exposing the inside of the business to the public internet.
Source: Anthropic Claude Platform release notes
The agent moves into the business process
The episode uses PwC and Anthropic as a clear example of how AI is now being sold into the enterprise. This is no longer only about licenses for a chat app. PwC plans to roll out Claude Code and Claude Cowork, build a joint Center of Excellence with Anthropic and train 30,000 professionals on Claude.
Source: Anthropic on the PwC partnership
That says a lot about the direction of travel. Consulting firms, finance teams and engineering groups do not want another separate tool someone has to remember to open. They want AI close to the flow where the work already happens.
In practice, that means the agent needs to understand several systems at once: Slack or Teams for decisions, Jira for tasks, Notion or Google Docs for strategy, Figma for design and GitHub for implementation. The value is not that AI writes a neat summary. The value is that it can spot the gap between what was decided, what was designed and what is actually being built.
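At its simplest, spotting that gap is a comparison between sets. The sketch below assumes the hard part is already done: that items from each system have been mapped to shared feature identifiers. The identifiers and the `find_gaps` helper are invented for illustration.

```python
# Feature identifiers as they might be extracted from each system (made-up data).
decided = {"sso-login", "export-csv", "dark-mode"}   # e.g. from Slack or Teams
designed = {"sso-login", "export-csv"}               # e.g. from Figma
built = {"sso-login", "bulk-delete"}                 # e.g. from GitHub or Jira


def find_gaps(decided: set, designed: set, built: set) -> dict:
    """Return the mismatches between decision, design and implementation."""
    return {
        "decided_but_not_designed": sorted(decided - designed),
        "designed_but_not_built": sorted(designed - built),
        "built_but_never_decided": sorted(built - decided),
    }


gaps = find_gaps(decided, designed, built)
for kind, items in gaps.items():
    print(kind, items)
# decided_but_not_designed ['dark-mode']
# designed_but_not_built ['export-csv']
# built_but_never_decided ['bulk-delete']
```

The output is a list of questions for a human, not a verdict: someone still has to decide whether "bulk-delete" is scope creep or a decision that was never written down.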
Faster models are not enough if the system cannot hold up
Google describes Gemini 3.5 Flash as a model built for agentic workflows, coding and faster output. In the podcast, it becomes an example of why speed matters: an agent sitting inside a customer flow, reviewing cases or orchestrating several steps cannot feel like a slow report generator.
Source: Google: Gemini 3.5
But speed is only half the issue. When AI becomes infrastructure, it needs to be monitored like infrastructure. If a model slows down, starts returning malformed JSON or drops in quality during an incident, "the AI had a bad day" is not an operational plan.
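One way to turn that into something operational is to validate every reply and fall back to another model when validation fails. The sketch below is an illustration under assumptions: the models are plain Python callables standing in for real API clients, and the required keys `summary` and `next_step` are an invented output contract.

```python
import json
from typing import Callable, Iterable, Optional

REQUIRED_KEYS = {"summary", "next_step"}  # assumed output contract


def parse_reply(raw: str) -> Optional[dict]:
    """Return the parsed reply, or None if it is malformed or incomplete."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data


def call_with_fallback(models: Iterable[Callable[[str], str]], prompt: str) -> dict:
    """Try each model in order and stop at the first valid reply."""
    failures = 0
    for model in models:
        reply = parse_reply(model(prompt))
        if reply is not None:
            reply["failed_attempts"] = failures  # a number worth logging
            return reply
        failures += 1
    raise RuntimeError(f"all {failures} models returned invalid output")


# Stand-ins for real model clients (hypothetical).
def primary(prompt: str) -> str:
    return '{"summary": "truncated'  # malformed JSON


def backup(prompt: str) -> str:
    return '{"summary": "2 open cases", "next_step": "notify owner"}'


result = call_with_fallback([primary, backup], "review cases")
print(result["next_step"], result["failed_attempts"])  # notify owner 1
```

The sketch only covers malformed output. Timeouts and errors raised by the model call itself would need their own handling, and the failure count should feed whatever monitoring the team already uses.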
A serious agent workflow needs three things from the start:
- Clear boundaries: what the agent may read, suggest and execute.
- Human decision points: where someone must approve before anything is sent, booked, paid or published.
- Fallback and monitoring: logs, tests, alternative models and stop rules when quality or availability drops.
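The first two items can be sketched as a single gate that every proposed action passes through. The action names and the `gate` function are hypothetical. What matters is the shape: a short list of things the agent may do alone, a list that needs a named approver, and a default of blocked for everything else.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    BLOCKED = "blocked"


# Hypothetical action names for illustration.
AUTONOMOUS = {"read_ticket", "draft_reply"}
APPROVAL_REQUIRED = {"send_email", "book_meeting", "pay_invoice", "publish_post"}


def gate(action: str, approved_by: Optional[str] = None) -> Decision:
    """Decide whether an action runs, waits for a human, or is refused."""
    if action in AUTONOMOUS:
        return Decision.EXECUTE
    if action in APPROVAL_REQUIRED:
        return Decision.EXECUTE if approved_by else Decision.NEEDS_APPROVAL
    return Decision.BLOCKED  # unknown actions are denied by default


print(gate("draft_reply"))                    # Decision.EXECUTE
print(gate("send_email"))                     # Decision.NEEDS_APPROVAL
print(gate("send_email", approved_by="anna")) # Decision.EXECUTE
print(gate("delete_database"))                # Decision.BLOCKED
```

Recording who approved what, and when, is what turns this gate into an audit trail.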
That is less glamorous than a demo. It is also the difference between a fun prototype and something you can use in daily work.
What you can do now
If you want to start practically, do not choose "build an agent" as the first task. Choose a workflow where context is already scattered and people spend time coordinating it.
Good starting points include:
- a weekly meeting where the same status is always chased across several systems
- a quote process where the same information is copied between email, documents and CRM
- a support flow where the answers exist, but live in different internal sources
- a document review where someone manually compares requirements, contracts, and delivery
That is where an agent can get a narrow job: fetch, compare, suggest and hand over. Not "run the company". Not "make every decision". Just one clear slice of work with the right access, the right limits and visible control.
For Hammer, this is often Tool Forge work: connecting the right tools, creating safe workflows and building the agent so it helps people make better decisions instead of hiding the decisions.
Thoughts on how this affects the future
When AI becomes persistent, the human role changes. We move from writing individual prompts to designing work environments where AI can be useful over time. That requires more systems thinking, not less.
The old chatbot could be a little sloppy without much consequence. An agent that reads internal documents, follows up on projects and suggests next steps cannot be a black box. It needs permissions, memory, audit trails and clear stop points.
That is where the next advantage sits. Not in having the most AI tools, but in knowing exactly where they belong.