Keeping people in control of AI that does real work, Almira Hanafiah

01 · Overview

How do you give an AI real work to do, and still keep the person in control?

I work as the product designer at an early-stage AI company building a platform for small to mid-sized businesses, where my focus is the moment a person hands work over to an AI and then has to stay in control of it once it is running. Most of my job is designing for that: keeping the person in a position to understand, trust, and oversee what the AI actually does on their behalf.

Role

AI Product Designer

Timeline

February to May 2026

Team

Founder, three engineers, one product designer (me)

Skills

Interaction design, systems thinking, product thinking, user research, prototyping, Figma

Outcome

Design that lets a person see what an AI is doing with their real tools and data, and step in at the moments that matter.

02 · TL;DR

I designed for trust, so the work could feel lighter without feeling out of your hands.

The challenge

People don't easily trust AI, and that becomes very real the moment the work touches their actual tools and data, because the AI is acting on their behalf. The experience had to make that feel safe rather than opaque.

What I did

I worked closely with the founder and the engineering team to design for trust and oversight, so a person can understand what the AI is doing and steer it as it goes, rather than being handed a finished result they have to take on faith.

The outcome

Across more than twenty feedback sessions with early users and customers, what they responded to most was feeling able to see and check the work, so it felt lighter without feeling out of their hands.

03 · Problem

How do you trust a workflow you didn't build?

People don't easily trust AI to do their actual work, especially once it is using their real tools and data.

As AI has become more capable and more accessible, the person describing the work is often not technical and does not want to map out steps. They want to say what they need and have it handled. That is freeing, and it is also unnerving, because you are trusting something you did not build to act for you.

That was the problem I kept coming back to. When a person is not the one wiring the steps, and the work touches their real tools, their data, and permission to act on their behalf, how do you give them enough visibility and control to actually trust it? That question sat under everything I designed.

04 · Research

How this kind of work was handled before AI.

Before designing anything, I looked at how this work was done before AI, and the honest answer was that you built it yourself, wiring each trigger and action together by hand in tools like Zapier and Bardeen. The more time I spent there, the clearer it became that I needed a different approach, because I was no longer designing a fixed set of steps a person assembled themselves. I was designing for an AI that takes the work on, which moved the real design problem away from the wiring and onto trust.

The old way · wire the graph yourself, in a tool like Zapier

Trigger: Xero · New invoice

Find data: NetSuite · Look up the PO

Paths (conditional): Split on VAT vs PO cap

Path A · VAT ≤ cap: Create journal entry, then Post to Slack

Path B · VAT > cap: Hold the invoice, then Notify for approval
The new way · describe the outcome

Reconcile my invoices each morning, then group them by furniture type.

Agent: Got it. Here's how I'll handle it: each morning I'll pull the new invoices, group them by type, and set aside anything that looks off for you to check. I'll run it by you the first time.

A design exploration of the shift. On the left, the same work built by hand in Zapier, where adding one condition already splits it into branching paths. On the right, the person describes the outcome in a sentence and the agent says back, in plain language, how it will handle it. A generic illustration; the Zapier interface is abstracted.

Early FigJam explorations of a node-based workflow builder, an abandoned direction for tasks, from before AI was part of how I designed.

I also looked at how today's chat tools handle this. Even when a tool like Claude can run something on a schedule, the result just lands back in the chat as a single message. That is fine for one person experimenting on their own, but it doesn't hold up inside a business, where the same piece of work has to be checked, shared, and answered for. Keeping that accountable, without making the experience heavier, was a big part of the design problem.

The product had been in development for around a year by this point, but tasks were still a work in progress, taking shape alongside several other features the team was building. A lot of the customer conversations happening around the same time weren't specifically about tasks, though the patterns I kept hearing about how people want to work with AI carried over when it came to designing them.

05 · Principles

A few principles for designing something people can trust.

These held across the whole experience, and they are as much general design wisdom as anything specific to this product.

01

The person should always know what "done" looks like.

Before anything runs, the person and the AI should share the same picture of the outcome, so the work is measured against what the person actually wanted rather than the tool's best guess.

02

People should understand what the AI is doing.

A lot of the people relying on this are not technical, so the experience should say what the AI is doing in plain language, the way you would explain it to a colleague, rather than leaving them to read system output.

03

Keep the person at the decisions that matter.

The person should be able to step in and decide at the points that carry weight, so the AI never quietly does something important without them.

06 · Solution

Prototyping the experience, from the first ask to the moment it needs you.

With the principles in place, most of the work was prototyping the experience and getting the feeling of oversight right. To make the thinking concrete, here is a walk through a generic example, from the moment a person asks for something to the moment the work comes back to them for a decision.

01 · Defining the work

I wanted asking for something to feel like briefing a colleague who asks a couple of sensible questions, rather than filling in a form or wiring up a setup. Where a question has a few likely answers, I offered them as quick choices, so scoping the work takes a tap rather than a paragraph and the whole thing comes together faster.

New task

Reconcile my invoices each morning and group them by furniture type.

Agent: Happy to. First, anything you would want me to keep an eye on?

Watch for duplicate invoices from the same supplier.

Agent: Good flag. One more thing, so I handle the edge cases the way you would want.

How should I handle anything that doesn't reconcile?

1 Hold it for me to check before anything posts

2 Flag it and keep going so the rest is not held up

3 Try to resolve it first, then ask if you are stuck

Something else · Skip

Defining the work as a conversation, with quick choices where they help, so scoping the task takes a tap rather than a back-and-forth. A generic screen, invented for this case study.

02 · Where things are, at a glance

Alongside the conversation, I designed a panel that surfaces where things are at any moment, so a person can read the state at a glance, dip into the detail when they want it, or just keep talking to the agent. It keeps the chat calm while the context stays one look away.

Draft the Q2 report from the brief and our last update.

Agent: On it. I'll read the brief and the last update first, then draft the sections.

Sounds good.

Agent: Done with the first pass. The summary is up top, and there are two things I set aside for you.

Q2 report · In progress

Summary · Activity · Details

Details: Started · Owner · Status · Attachments

Beside the conversation, a panel that surfaces where things are at a glance. A generic illustration, invented for this case study, with the fields abstracted.

03 · When it needs you, it shows its working

The hardest moments to design were the ones where the AI needs a person to weigh in. I wanted those to arrive with the reasoning in context, so a person could understand why and decide quickly, without having to go digging for the background.

Agent · 9:14

5 steps completed

Most of this batch was straightforward and is done. There is one item I would rather you decided on, with what I noticed laid out so the choice is quick.

One item is worth a second look

The figures on this one do not line up the way the rest did, so I have set it aside rather than guess. Choose how to handle it and I will finish the others.

March accounts – draft

Word document · 8 pages

A generic decision screen, invented for this case study: one flagged item, presented for a person to decide on.

One detail in the design is my intentional use of colour. I tend not to use red unless something has genuinely gone wrong, because a moment like this is a nudge, not an error. A warm yellow is striking enough to catch a person's eye and pull them to the decision, without the jolt of red making them think the AI has done something wrong.

07 · Feedback

Speaking to the non-technical teams who would use it.

Once the prototype was in shape, I went back to a handful of our early customers and showed it to them against their personalised use cases. I wanted to see whether the way workflows behaved was clear to a user, especially when the agent was running against their tools, their data, and their schedule. A few noteworthy insights came up.

Synthesis · what kept coming up

Habit · Fitting into how people already work mattered more than adding somewhere new to check.

Clarity · People wanted to understand what they were agreeing to before they relied on it.

Confidence · People felt more sure when the outcome was clear to them visually, not just marked “done.”

Sharing · A back-and-forth chat isn't something a team or a manager can share, review, or answer for.

Synthesising the feedback sessions into the things that came up again and again. (Anonymised.)

These four themes pulled in the same direction: visibility and trust were essential, on top of a clear interface that works for non-technical people. This confirmed the direction workflows were heading and shaped the parts I went back to refine.

Looking ahead

The principle of trust is not specific to this product. Any AI tool that acts on someone's behalf has to solve the same thing, which is how a person stays in control of work they cannot fully predict, and that is going to be true of most of the tools we use over the next few years.

08 · Learnings

People will lean on AI for real decisions, so it has to earn their trust.

What worked

The thing I keep coming back to is that as AI gets better, people are going to hand more of their decisions over to it, so the work worth doing is making that feel trustworthy. What I noticed across the design is that trust gets earned, not assumed, and that good design is what earns it. That is the thread I keep pulling on in everything I design.

What I'd do differently

I'd get into user testing earlier and more often, since a lot of what I learned about how people want to watch an AI work only surfaced once real runs were in front of them, and I could have had that feedback sooner. And there is plenty still open, like how to keep a person oriented when the work gets longer and more involved, which is honestly one of the more interesting problems left to solve.

A next step

Another open problem is how a tool like this earns enough trust over time that people can rely on it comfortably, which is more about the relationship than any single screen.
