How do you give an AI real work to do, and still keep the person in control?
At Brim I design how people hand work to an AI and stay in control of it once it is running. Brim itself is an AI system that helps businesses own their specific intelligence, meaning the agent continuously learns and runs real tasks according to the way their company works.
This case study is about the assignment, which is the unit of work a person and an agent share on Brim, and the way a piece of work gets defined, run, and held accountable between them. A lot of my job was designing for that while keeping the person in a position to trust and oversee what the agent actually does.
Outcome
I designed the way work gets defined, run, and handed back to the person.
People don't easily trust AI, and that becomes very real the moment the work involves their actual tools, because the agent is pulling in their data, holding permissions to their accounts, and acting on their behalf. An assignment had to make that feel safe rather than opaque.
I worked closely with the CEO and the engineering team to define how an assignment gets run, so the person keeps control and oversight the whole way through. What the agent is doing and what it produces are shown clearly and in real time, rather than handed over as a finished result they have to take on faith. The person also gets to make a decision at every checkpoint, which falls after each step in the plan and before the agent carries on.
In more than 20 feedback sessions with our early users and customers, what they responded to most was how much shorter the work became: something that used to take real effort now mostly happened on its own, while they could still see and check it.
How do you trust a workflow you didn't build, that runs a little differently each time?
As AI has become more capable and more accessible, the user describing the work is often not technical and doesn't want to map out steps; they just want to say what they need and see it happen. Underneath, the system plans and runs the work itself, so it can approach the same task a little differently each time.
That left me with a specific problem to solve. When the user isn't the one building the workflow, there is no fixed set of steps for them to rely on, and the work involves their real tools, their data, and permission to act on their behalf, how do you give them enough visibility and control to actually trust it? I started treating the assignment, the shared unit of work between a person and an agent, as the place this had to be answered, because the assignment is where work gets defined, where it runs, and where it comes back to you.
Looking into new ways to visualise workflow automation.
Before designing anything, I wanted to see how this kind of work was handled before AI. The honest answer was that you built it yourself, wiring each trigger and action together by hand in tools like Zapier and Bardeen. The more I sat with those examples, the clearer it became that I had to come up with something quite different, because I was designing for two shifts at once. The first is agentic AI, where the system plans and runs the work itself rather than following a fixed script. The second is a new kind of dynamic experience, where the interface is composed in response to what is needed rather than drawn ahead of time. That meant the patterns I was used to, the wiring diagrams and the static screens, could only take me so far.
I also looked at how today's chat tools handle this. Even when a tool like Claude can run something on a schedule, the result just lands back in the chat, with no clear way to see what it actually did. That is fine for one person experimenting on their own, but it doesn't hold up inside a business, where the same piece of work has to be checked, shared, and answered for. What I kept coming back to was keeping the thread simple and giving every run a side panel that logs it, an audit trail that changes from run to run and that a person, or their manager, can open and verify.
Brim had been in development for around a year by this point, but assignments were still a work in progress, taking shape alongside several other features the team was building. Many of the customer conversations happening around the same time weren't specifically about assignments, though the patterns I kept hearing about how people want to work with AI carried over when it came to designing them.
So I set five rules for how an assignment should behave.
These shaped how every part of an assignment is designed, with the first run especially in mind.
Every assignment has a clear goal.
An assignment is defined by the outcome it is trying to reach, which the person describes as intent, so before anything runs both the person and the agent know what finished is meant to look like.
On the first run, the person approves at every checkpoint.
The first time an assignment runs, the agent pauses at each step in the plan and waits for the person to approve before it carries on, so nothing important happens without them. Over time this can relax as the agent earns the right to run more of the work on its own.
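To make this rule concrete, here is a minimal sketch in Python. It is an illustration of the behaviour described above, not Brim's implementation: the names Step, RunResult, and run_assignment, and the approve callback, are all hypothetical. It assumes the checkpoint sits after each step, and that later runs skip the pause entirely, which is the simplest possible version of the rule relaxing over time.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One step in the agent's plan (hypothetical shape)."""
    name: str
    action: Callable[[], str]  # does the work, returns a plain-language summary


@dataclass
class RunResult:
    completed: List[str] = field(default_factory=list)
    stopped_at: Optional[str] = None  # step where the person declined, if any


def run_assignment(
    steps: List[Step],
    approve: Callable[[str, str], bool],
    first_run: bool = True,
) -> RunResult:
    """Run each step in order; on a first run, pause for approval after each one."""
    result = RunResult()
    for step in steps:
        summary = step.action()
        result.completed.append(f"{step.name}: {summary}")
        # Checkpoint: the person decides before the agent carries on.
        if first_run and not approve(step.name, summary):
            result.stopped_at = step.name
            break
    return result


plan = [
    Step("extract", lambda: "5 attachments extracted"),
    Step("check", lambda: "2 flagged, 3 clean"),
    Step("route", lambda: "approver routing complete"),
]

# On the first run the person stops the run after the check step.
first = run_assignment(plan, approve=lambda name, summary: name != "check")
print(first.stopped_at)  # check
```

In a real product the approve callback would be the person responding in the thread or from a notification; here it is a plain function so the sketch stays runnable.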
Trust comes from seeing the real work, as it happens.
While an assignment runs, the person can watch it work in real time through a live step rail that shows each step the agent is taking, the tools it is reaching into, and the data it is pulling in.
The work stays in a record you can go back to.
Every run keeps a full log of what the agent did, where the person can open any step and see the real files it retrieved and produced. The record stays there after the run finishes, so the person can look back over what happened and adjust the plan for next time.
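One way to picture that record is as a small data structure. The sketch below is my own illustration in Python, assuming a run log is an ordered list of step records; StepRecord, RunLog, and the file names are hypothetical and do not describe how Brim stores runs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class StepRecord:
    """What the agent did in one step (hypothetical fields)."""
    title: str
    tool: str
    files_retrieved: Tuple[str, ...] = ()
    files_produced: Tuple[str, ...] = ()


@dataclass
class RunLog:
    """The record of a single run, kept after the run finishes."""
    assignment: str
    steps: List[StepRecord] = field(default_factory=list)

    def record(self, step: StepRecord) -> None:
        self.steps.append(step)

    def open_step(self, index: int) -> StepRecord:
        """Open any step to see the real files behind it."""
        return self.steps[index]

    def all_files(self) -> List[str]:
        """Every file retrieved or produced, in the order it appeared."""
        files: List[str] = []
        for step in self.steps:
            files.extend(step.files_retrieved)
            files.extend(step.files_produced)
        return files


log = RunLog("Supplier invoice reconciliation")
log.record(StepRecord("Extract attachments", "email",
                      files_retrieved=("invoice-a.pdf", "invoice-b.pdf")))
log.record(StepRecord("Apply invoice checks", "invoice-skill",
                      files_produced=("flags.csv",)))

print(log.open_step(1).files_produced)  # ('flags.csv',)
```

Freezing each StepRecord reflects the intent that a log is something you read back and verify, not something that changes after the fact.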
The work comes to the person.
Because people won't always log in to check, an assignment reaches them where they already are, through notifications, email, and schedules. The agent can also start a run on its own when it notices new activity in a connected tool.
One assignment, from the sentence that starts it to the moment it needs you.
The five principles come together in how a single assignment actually runs. To make that concrete, I'll walk through one, a supplier invoice reconciliation that an agent runs for a home-furniture retailer, from the moment it is defined to the moment the work reaches the person. The run at the top of this page is that same assignment in motion.
01 · Defining the work and connecting its tools
An assignment starts as intent. The person describes the outcome they want, and the agent works out the setup before proposing anything, asking where the work comes from and where it should go. I wanted defining work to feel like briefing a colleague who asks a couple of sensible questions, and for connecting the tools to happen naturally inside that conversation rather than as a hidden setup step.
02 · The run, made visible
Once it runs, the assignment becomes a live thread. The agent streams in what it is doing as it does it, in plain language, with the real tool calls shown underneath each step.
03 · The work comes to the person, when they need it
The finding that people often weren't logging in shaped this part the most. When a run reaches something that needs the person, that moment is sent to where they already are, in the same words wherever it shows up, so a decision the agent can't make on its own never sits waiting unseen.
04 · When the agent needs you, it shows its working
When the agent finishes its part, the items that need a person's decision sit right inside the conversation. Each one expands to show the specific invoice, the rule it broke, and a suggested action, so the person can read the reasoning in context and act without having to dig for the original document.
- Extracted 5 attachments from approved emails
- Applying Invoice Processing Skill to 5 invoices
- CIS check complete, 2 flagged, 3 clean
- Approver routing complete
Done processing. I found 2 invoices that need your attention, one has a CIS/VAT error and one is missing a job number. The other 3 look correct. Review and action each one below.
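The structure behind those expandable items can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the real Invoice Processing Skill: the Invoice and ReviewItem shapes, the invoice IDs, the two checks, and the suggested actions are all hypothetical stand-ins for the rules the agent applies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Invoice:
    invoice_id: str
    job_number: Optional[str]
    cis_vat_ok: bool  # assumed to be the result of an earlier CIS/VAT check


@dataclass
class ReviewItem:
    """One item the person sees: the invoice, the rule it broke, a suggestion."""
    invoice_id: str
    rule_broken: str
    suggested_action: str


def review(invoices: List[Invoice]) -> List[ReviewItem]:
    """Return one review item per invoice that needs a person's decision."""
    items: List[ReviewItem] = []
    for inv in invoices:
        if not inv.cis_vat_ok:
            items.append(ReviewItem(inv.invoice_id, "CIS/VAT error",
                                    "Ask the supplier for a corrected invoice"))
        elif inv.job_number is None:
            items.append(ReviewItem(inv.invoice_id, "Missing job number",
                                    "Add the job number before approving"))
    return items


def summarise(invoices: List[Invoice], items: List[ReviewItem]) -> str:
    flagged = len({item.invoice_id for item in items})
    return f"{flagged} flagged, {len(invoices) - flagged} clean"


batch = [
    Invoice("INV-1", "J-100", True),
    Invoice("INV-2", "J-101", False),  # fails the CIS/VAT check
    Invoice("INV-3", None, True),      # no job number
    Invoice("INV-4", "J-103", True),
    Invoice("INV-5", "J-104", True),
]
items = review(batch)
print(summarise(batch, items))  # 2 flagged, 3 clean
```

Keeping the rule and the suggested action on the item itself is what lets the person read the reasoning in context instead of digging for the original document.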
Taking the prototype back to the people who would use it.
Once the prototype was in shape, I went back to a handful of our early customers and showed it to them against the work they actually wanted to use it for. I wanted to see whether the way assignments behaved made sense when the agent was running against their tools, their data, and their own kind of work. A few things came up again and again.
Those four pulled in the same direction, towards work the person can see, a record they and their managers can trust and share, and an assistant that reaches them rather than waiting to be checked on. They confirmed the direction the assignment was heading and shaped the parts I went back to refine.
None of this is really specific to Brim. Any AI tool that acts on someone's behalf has to solve the same thing, which is how a person stays in control of work they cannot fully predict, and that is going to be true of most of the tools we use over the next few years.
People will lean on AI for real decisions, so it has to earn their trust.
What worked
The thing I keep coming back to is that as AI gets better, people are going to hand more of their decisions over to it, so the work worth doing is making that feel trustworthy. What I noticed across the design is that trust gets earned, and the shape of how it gets earned turned out to be specific. It is earned when the person can see what is running as it happens, when the agent shows its reasoning for what it has done or is about to do, and when the actual files it is reading and producing stay visible rather than getting tucked behind a polished result. Holding the first run to an approval at every checkpoint turned out to be the thing that let people relax into it later, and the through-line, that the agent has to make itself legible to the person, is the same thread I keep pulling on in the design system case study next door.
What I'd do differently
I'd get into user testing earlier and more often, since a lot of what I learned about how people want to watch an AI work came from feedback I could only have gathered by putting real runs in front of them sooner. The other piece we haven't fully worked out is how separate assignments tie together when they are really part of one larger piece of work. Chaining them is something we are still developing, and it isn't entirely clear yet, which is honestly one of the more interesting problems left to solve.