AI Enablement Radar week 23: build an AI workbench, not another chat

AI tools moved closer to everyday work this week, not because of one bigger model, but because more tools now connect to apps, databases, coding tools, and implementation partners. For a small team, the question is practical: which recurring task should AI help with next week, which systems does it need to read, and who approves before anything goes out?
Top signals this week
- Codex is moving beyond development teams. OpenAI says more than 5 million people use Codex every week, and non-developers now make up about 20 percent of its users. That matters for small teams: the AI workbench will not only be used for code, but also for reports, campaign material, research, data analysis, and internal apps.
Source: Codex for every role, tool, and workflow, OpenAI
- OpenAI moved into AWS governance. AWS announced that GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock. Organizations already using AWS can manage model access, billing, IAM, VPC isolation, encryption, CloudTrail logs, and guardrails inside a familiar IT governance model.
Source: GPT-5.5, GPT-5.4, and Codex from OpenAI are now generally available on Amazon Bedrock, AWS
- Partner quality is becoming part of the AI decision. Anthropic launched Services Track and Partner Hub for the Claude Partner Network. The requirements focus on certified people, deployed production customers, and public customer stories, not just polished sales copy. For buyers, the reminder is simple: ask who has actually built something that runs.
Source: Introducing the Services Track and Partner Hub of the Claude Partner Network, Anthropic
- Agent work now has APIs and sandboxes. GitHub made the Copilot Agent Tasks REST API available in public preview for Copilot Pro, Copilot Pro Plus, and Copilot Max, and released local and cloud sandboxes for Copilot. An agentic workflow here means AI can start, track, and perform multi-step tasks in a bounded environment, not just answer inside chat.
Sources: Agent tasks REST API, GitHub Changelog and Cloud and local sandboxes for GitHub Copilot, GitHub Changelog
- MCP is becoming a practical integration layer. MCP, the Model Context Protocol, is a standard that lets AI tools use selected apps, databases, and actions through a controlled connection. Google Cloud made the Remote MCP Server for AlloyDB generally available, while Zapier describes how its MCP server can give AI access to more than 9,000 app connections and 30,000 actions, with OAuth, limits, and audit logs.
Sources: AlloyDB Remote MCP Server GA, Google Cloud and Zapier MCP guide
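To make "a controlled connection" concrete, here is a minimal sketch of the idea in plain Python. It is not the MCP SDK or any vendor's API. `ToolGateway`, `lookup_order`, and the action names are hypothetical and exist only for illustration. The point is the shape: the AI side can call only allowlisted actions, and every attempt is logged, including denied ones.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolGateway:
    """Hypothetical gateway: exposes only allowlisted actions and logs every call."""

    allowed: Dict[str, Callable[..., str]]
    audit_log: List[dict] = field(default_factory=list)

    def call(self, action: str, **kwargs) -> str:
        permitted = action in self.allowed
        # Record the attempt first, so denied calls are visible in the log too.
        self.audit_log.append(
            {"action": action, "args": kwargs, "permitted": permitted}
        )
        if not permitted:
            raise PermissionError(f"action not allowlisted: {action}")
        return self.allowed[action](**kwargs)


def lookup_order(order_id: str) -> str:
    # Stand-in for a real read-only lookup in an app or database.
    return f"order {order_id}: shipped"


# Only one action is exposed; anything else is refused and logged.
gateway = ToolGateway(allowed={"lookup_order": lookup_order})
```

Calling `gateway.call("lookup_order", order_id="A1")` returns the lookup result, while an action that is not on the list, such as `delete_order`, raises `PermissionError` and still leaves an entry in `audit_log`.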
What organizations are actually doing with AI
The useful part this week is that the adoption examples look more like real work and less like demos.
Google Cloud highlights small and midsize businesses using Gemini Enterprise for internal search, support engines, market research, contract analysis, and shared knowledge. Finnish company Eficode is named as an example of a team building a technical support engine across complex internal wikis. The agency Huge uses agents for market research and contract analysis, where new business intake can move from several days to minutes.
Source: How Gemini Enterprise is helping SMBs jumpstart their AI transformations, Google Cloud
OpenAI points to a similar pattern from another direction. When Codex is used by analysts, marketers, operators, and designers, the better question is no longer "can it write code?" It is "can it create a first draft the team can actually review?" According to OpenAI, Zapier uses Codex to pull context from Slack, Google Docs, and Coda and turn it into postmortems, incident response plans, and feature tickets. NVIDIA is named for research workflows such as finding ideas and writing machine-learning infrastructure scripts.
Source: Codex for every role, tool, and workflow, OpenAI
Perplexity is aiming more directly at growing businesses. Its Main Street AI Accelerator with the U.S. Small Business Administration includes $25 million in Computer credits, up to 100,000 eligible companies, and connections to more than 400 tools such as QuickBooks, Mailchimp, Shopify, and Stripe. The program is U.S.-specific, but the pattern travels well: AI is moving into the places where small businesses already handle finance, marketing, ecommerce, and payments.
Source: Perplexity Computer for Growing Businesses, Perplexity
The tooling layer: platforms, agents, and workflows
Three technical layers are becoming useful for non-technical teams too.
First: the workbench. Codex, Notion Developer Platform, and Perplexity Computer all point in the same direction from different sides. AI should not only answer in a box. It should read the right context, create a draft, place it where the team already works, and ask for approval when needed. Notion describes external agents, Workers, and sync from API-backed data sources into Notion.
Sources: Notion releases and Perplexity Computer for Growing Businesses, Perplexity
Second: the connection to data. Google Cloud's AlloyDB MCP server is a concrete example. It lets agents read fresh operational data through IAM, restrict access to tables, schemas, or views, use read-only SQL, and log queries and tool calls in Cloud Audit Logs. That is the kind of pattern teams need when AI should be useful without receiving the whole keyring.
Source: AlloyDB Remote MCP Server GA, Google Cloud
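The read-only, restricted-table pattern can be tried locally before touching a real system. The sketch below uses SQLite's authorizer hook from the Python standard library as a stand-in. It is not how AlloyDB or its MCP server is configured, where IAM and database permissions do this job. The table names, `ALLOWED_TABLES`, and `run_query` are assumptions made up for the example.

```python
import sqlite3

# Hypothetical allowlist: the only table the AI connection may read.
ALLOWED_TABLES = {"open_tickets"}


def authorizer(action, arg1, arg2, db_name, source):
    """Allow plain SELECTs that read allowlisted tables; deny everything else."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


# Set up demo data first, then lock the connection down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE open_tickets (id INTEGER, subject TEXT)")
conn.execute("CREATE TABLE salaries (name TEXT, amount INTEGER)")
conn.execute("INSERT INTO open_tickets VALUES (1, 'Login fails')")
conn.execute("INSERT INTO salaries VALUES ('Ann', 100)")
conn.commit()
conn.set_authorizer(authorizer)

query_log = []


def run_query(sql):
    # Log every query before running it, so denied attempts are recorded as well.
    query_log.append(sql)
    return conn.execute(sql).fetchall()
```

With this in place, `run_query("SELECT id, subject FROM open_tickets")` returns rows, while a query against `salaries` or any write statement raises an `sqlite3` error. The design choice worth copying is that the restriction lives in the connection, not in the prompt.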
Third: search and quality. Mistral released Search Toolkit in public preview for ingestion, retrieval, and evaluation in AI search workflows. Evals are tests that measure whether the AI system finds the right source material and gives useful answers on examples that look like real work. For a school, agency, or small organization, a good first step can be ten real questions and a check of whether the AI finds the right documents before it is used live.
Source: Introducing Search Toolkit, Mistral AI
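The "ten real questions" check can be sketched in a few lines. This is not Mistral's Search Toolkit API. `DOCS`, `CASES`, and `keyword_search` are toy stand-ins, and in practice `search` would be whatever retrieval function the team actually uses. The sketch only shows the measurement: for each question, did the expected document appear in the top results?

```python
from typing import Callable, Dict, List, Tuple

# Toy document store; a real one would be the team's own files or wiki pages.
DOCS = {
    "refund-policy": "refunds are issued within 14 days of purchase",
    "opening-hours": "the office is open monday to friday",
}

# Real questions mapped to the document a human says should be found.
CASES = {
    "how many days do refunds take": "refund-policy",
    "is the office open on friday": "opening-hours",
    "where do i park": "parking-guide",
}


def keyword_search(question: str) -> List[str]:
    """Naive word-overlap search, used only to make the example runnable."""
    words = set(question.lower().split())
    scored = [
        (len(words & set(text.split())), doc_id) for doc_id, text in DOCS.items()
    ]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]


def retrieval_hit_rate(
    cases: Dict[str, str], search: Callable[[str], List[str]], top_k: int = 3
) -> Tuple[float, List[str]]:
    """Return the share of questions whose expected document is in the top_k results."""
    misses = [q for q, expected in cases.items() if expected not in search(q)[:top_k]]
    return (len(cases) - len(misses)) / len(cases), misses


rate, misses = retrieval_hit_rate(CASES, keyword_search)
```

The list of misses is usually more useful than the rate itself: it tells the team which documents are missing or hard to find before anyone relies on the answers.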
Governance and risk: what needs to be in place before scaling
The practical security question is not "should we connect AI to anything at all?" It is: how do we connect AI so that it can help while working with small permissions, visible logs, and clear stop points?
The EU AI Act continues to matter even for organizations that do not build foundation models themselves. The Commission's GPAI guidelines explain which providers fall under the rules for general-purpose AI models, when the obligations apply, and that the Commission's enforcement powers start applying on 2 August 2026. For buyers, vendor questions should get more concrete: model cards, training summaries, incident routines, data protection, and which subprocessors are involved.
Source: Guidelines for providers of general-purpose AI models, European Commission
The NIST AI Risk Management Framework is still a useful checklist for organizations that want clearer ownership, measurement, and follow-up. It is voluntary, but practical: describe the use case, the risks, who owns the decision, how quality is measured, and how failures are reported.
Source: AI Risk Management Framework, NIST
The OWASP GenAI Security Project points to another everyday risk: AI apps and agent flows introduce new vulnerabilities when they can read data, use tools, and take actions. Practical controls include environment variables or secret managers instead of passwords in prompts, scoped API keys, least privilege, sandboxes, approval gates, audit logs, and redaction of sensitive information in outputs.
Source: OWASP Top 10 for Large Language Model Applications
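Two of these controls are easy to sketch: reading secrets from the environment instead of pasting them into prompts, and redacting sensitive strings before an output is shown or stored. The patterns below are illustrative assumptions, not an OWASP-provided implementation. The `sk-` key format and the `WORKBENCH_API_KEY` variable name are hypothetical, and real redaction needs patterns matched to the team's own data.

```python
import os
import re

# Simple illustrative patterns; they will not catch every email or key format.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{8,}\b")  # hypothetical key format


def redact(text: str) -> str:
    """Replace email addresses and key-like strings before text is shown or logged."""
    text = EMAIL.sub("[email]", text)
    return API_KEY.sub("[secret]", text)


def load_api_key(name: str = "WORKBENCH_API_KEY") -> str:
    """Read the key from the environment so it never appears in a prompt or file."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"set {name} in the environment or a secret manager")
    return key
```

For example, `redact("Contact ann@example.com, key sk-abc12345XYZ")` returns `"Contact [email], key [secret]"`.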
This week's practical Hammer test
This test takes 30 to 45 minutes and suits a small team that already uses several systems but does not want to build a full automation yet.
1. Pick one recurring task: for example this week's customer questions, an internal support queue, an upcoming newsletter, new quote requests, or student questions before a lesson.
2. Draw the workbench: write down which sources AI may read, which draft AI may create, and who must approve before anything is sent, published, or updated.
3. Set small permissions: start with test data or a narrow view. If tools are connected, use scoped API keys, separate test accounts, read-only access where that is enough, and logging for every action.
4. Run five real examples: ask AI to create a first draft, but let a human mark what was right, what was missing, and what must not be automated yet.
5. Save the decision: end with a short three-part rule: "AI may do this", "AI may not do this", and "a human must approve here".
Copy this prompt:
You are our AI workbench designer. Help us choose one recurring task where AI can help next week.
Task: [describe the task]
Sources AI may read: [list systems, documents, or views]
Draft AI may create: [e.g. reply, summary, quote material, work card]
Things AI may not do by itself: [e.g. send to customer, change price, update records]
Human reviewer: [role/person]
Give us:
1. A simple 5-step workflow.
2. The smallest permission AI needs at each step.
3. Which logs or receipts we should save.
4. Five test cases we can run before scaling.
5. A clear stop rule when AI is uncertain.
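The saved decision from the test can also be written down as a small data structure, which makes the stop rule and the approval point explicit. This is a minimal sketch under stated assumptions: `WorkbenchRule`, `decide`, the action names, and the 0.7 confidence threshold are all hypothetical, and where the confidence score comes from (a model's self-report, an eval result, a human rating) is left open.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WorkbenchRule:
    """One saved decision: what AI may read, what it drafts, and who approves."""

    task: str
    readable_sources: Tuple[str, ...]
    draft_type: str
    human_only_actions: Tuple[str, ...]  # things AI may not do by itself
    reviewer: str
    min_confidence: float = 0.7  # hypothetical stop-rule threshold


def decide(
    rule: WorkbenchRule,
    action: str,
    confidence: float,
    approved_by: Optional[str] = None,
) -> str:
    """Return 'stop', 'needs_approval', or 'proceed' for one proposed action."""
    # Stop rule: when AI is uncertain, hand over instead of continuing.
    if confidence < rule.min_confidence:
        return "stop"
    # Approval gate: human-only actions need sign-off from the named reviewer.
    if action in rule.human_only_actions and approved_by != rule.reviewer:
        return "needs_approval"
    return "proceed"


rule = WorkbenchRule(
    task="new quote requests",
    readable_sources=("crm_view",),
    draft_type="reply",
    human_only_actions=("send_to_customer",),
    reviewer="sales lead",
)
```

Here `decide(rule, "create_draft", 0.9)` returns `"proceed"`, `decide(rule, "send_to_customer", 0.9)` returns `"needs_approval"` until the sales lead signs off, and any action with a confidence below the threshold returns `"stop"`.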
Companies and tools to watch
- OpenAI Codex: moving from code into more roles and material types.
- Amazon Bedrock: making OpenAI models and Codex easier to buy, govern, and log for AWS organizations.
- GitHub Copilot: showing how agent work becomes API-driven and sandboxed.
- Google Cloud AlloyDB MCP: showing a concrete pattern for AI with governed database access.
- Zapier MCP: making app connections understandable for teams without an integration department.
If you want to do this without getting stuck in tool selection, Hammer Automation can help through Tool Forge / Verktygssmide: choose one workflow, connect it with small permissions, build in review, and make the result measurable. Start here: contact Hammer Automation.