Always-on inbox, no working memory
- Priorities reshuffle with every new arrival.
- Newsletters, billing alerts, and real questions share the same lane.
- Important threads slip past three notifications and re-surface late.
A five-phase plan for knowledge workers buried under more than a hundred emails a day. Sort the inbound before you read it, batch your responses, and stop confusing presence in the inbox with progress.
If you receive more than a hundred emails a day and you have stopped expecting to be at zero, you are not lazy and you have not failed at time management. You have outgrown the assumption that an inbox is a list of tasks. The fix is not to read faster — it is to sort the inbound before you touch it, draft replies in batches, and keep yourself in the loop only where your judgement matters. This roadmap walks the five phases that get you there, with the AI doing the parts machines are good at and you doing the parts you are still better at.
You will need:
- {your mail client} with filter or rule support (Gmail, Outlook, Fastmail, Superhuman, etc.) and at least 30 days of inbox history searchable.
- {your AI assistant} account (ChatGPT, Claude, or similar) with API access or a built-in mail integration.
- {your task manager} you already use (Todoist, Things, Notion, a paper list — anything you check at least daily).

Illustrative range, not benchmark — your numbers will vary by role, subscription mix, and how much of your work happens in mail versus elsewhere.
Five phases, sequenced so that each one is shippable on its own. If you stop after phase 2, the deterministic filter layer alone will quietly reduce daily inbound by 30 to 50 percent. Phases 3 through 5 layer in AI categorisation, batched review, and ongoing hygiene. The decision gates between phases are real stop points — if a phase is not working, do not paper over it with the next.
Before any rule, any filter, any AI prompt — you need to know what your inbox actually is. Most people discover their real inbound looks nothing like their assumptions about it, and that two or three categories account for the majority of volume.
Open {your mail client} and pull a representative slice — the last 30 days is usually enough, longer if your role is highly seasonal. Export sender, subject, and arrival timestamp into a spreadsheet, or paste the list directly into a chat with {your AI assistant}. Ask it to cluster the messages into candidate buckets based on sender pattern, subject pattern, and apparent intent. Do not let the AI name the buckets in its own register — give it your four-category target list up front: act (something is required of you), FYI (you should know but no action is needed), noise (automated mail, newsletters, marketing, status pings that do not require eyes), and external (anything from outside your company that needs human-quality attention).
Walk through the clusters one at a time. For each cluster, write down two facts: the share of total inbound it represents (rough percentage is fine) and one or two example senders. By the end of an hour you will have a single-page document that names every meaningful category in your inbox, with its share and its examples. This document is the source of truth for everything that follows. If a filter in phase 2 misclassifies mail, or a phase 3 prompt produces the wrong tag, the cause is almost always that the audit doc is incomplete or muddled.
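Whether you export to a spreadsheet or paste the list into a chat, the share-per-category arithmetic behind the audit doc is simple enough to sanity-check yourself. A minimal sketch, assuming you have (sender, subject) pairs from the export — all senders here are invented:

```python
from collections import Counter

def audit_inbound(messages):
    """Group messages by sender domain and report each domain's share
    of total inbound, largest first."""
    domains = Counter(sender.split("@")[-1] for sender, _subject in messages)
    total = sum(domains.values())
    return [(domain, count / total) for domain, count in domains.most_common()]

# Hypothetical 30-day sample as (sender, subject) pairs.
sample = [
    ("alerts@ci.example.com", "Build #1041 failed"),
    ("alerts@ci.example.com", "Build #1042 passed"),
    ("news@vendor.example.com", "May product update"),
    ("jo@company.example.com", "Question about the roadmap"),
]

for domain, share in audit_inbound(sample):
    print(f"{domain}: {share:.0%}")
```

Grouping by sender domain is only a first cut — the AI clustering adds the intent dimension — but it is usually enough to confirm that two or three sources dominate the volume.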
Resist the urge to start fixing things as you go. The audit is for understanding. Fixing happens in phase 2.
Tools for this phase:
- {your mail client} — for the 30-day export or in-app search.
- {your AI assistant} — for clustering and surface-level pattern analysis.

The cheapest, fastest, most reliable categoriser in any inbox is a deterministic rule. Before you spend a single token on AI categorisation, set up rules in {your mail client} that handle the unambiguous cases — automated alerts, vendor newsletters, calendar notifications, billing receipts. Done well, this kills 30 to 50 percent of your daily inbound without any LLM cost.
Start with the categories from your audit doc that are obviously deterministic. Newsletters almost always come from a predictable sender list and contain "unsubscribe" in the footer. Calendar invites carry a specific MIME header. Billing receipts come from a small set of known domains. CI alerts come from your own infrastructure. For each of these, write a rule in {your mail client} that does exactly one thing: applies a label and skips the inbox. Do not delete on the first pass — labels-and-archive is reversible, deletion is not. You will want a way to audit what the filter is catching for the first two weeks.
Treat the filter set as a small system, not a sprawl. Eight to fifteen rules is normal for a typical knowledge-worker inbox; fifty is a sign you are slicing too thinly or trying to handle ambiguity with rules. Anything that cannot be matched on sender, exact subject pattern, or a single header field is not a deterministic case — it belongs to phase 3.
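The whole filter layer behaves like one small, auditable function: a short ordered list of rules, each matching on exactly one unambiguous signal, with everything else falling through. A sketch of that shape in Python — the rule set and senders are illustrative, not a recommendation:

```python
import re

# Illustrative rule set: every predicate matches on sender, an exact
# subject pattern, or a single field. Nothing ambiguous belongs here.
RULES = [
    ("newsletters", lambda m: "unsubscribe" in m["body"].lower()),
    ("billing",     lambda m: m["sender"].endswith("@billing.example.com")),
    ("ci-alerts",   lambda m: re.match(r"\[CI\]", m["subject"]) is not None),
]

def apply_rules(message):
    """Return the first matching label, or None to let the message fall
    through to the AI layer in phase 3."""
    for label, matches in RULES:
        if matches(message):
            return label  # label and archive; never delete on the first pass
    return None
```

Your actual rules live in the mail client, not in code — but if a rule cannot be expressed as a one-line predicate like these, that is a strong sign it belongs in phase 3 instead.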
After the rules are in place, leave them for a full work week before tuning. The instinct to over-iterate in the first 48 hours destroys more good filter sets than any other failure mode. Watch where the filters miss; do not re-tune what catches correctly.
Tools for this phase:
- {your mail client} native rules — Gmail filters, Outlook rules, Fastmail sieve scripts, etc.

Now the AI handles what the rules cannot: ambiguous mail that needs to be read to be classified. Wire {your AI assistant} to tag every remaining email with two pieces of metadata — an intent label and a suggested action. No drafts yet, no auto-replies — just a labeled, sorted inbox.
Connect {your AI assistant} to {your mail client}. The path matters: most modern mail clients now have first-party AI integrations (Gmail with Gemini, Outlook with Copilot, Superhuman AI) or accept third-party automation through Zapier or Make. Use whichever path your team can support without infrastructure work. The integration should be triggered on new mail that survived the phase 2 filter layer, send the first 1,500 characters of the message to the AI, and return two values: an intent label drawn from a closed list, and a suggested action drawn from a separate closed list.
A practical starting taxonomy: intents are action-required, question-for-you, FYI, scheduling, external-business, and uncertain. Suggested actions are reply-now, reply-in-batch, defer-to-task, archive, and escalate-human. Constrain the prompt to pick exactly one from each list, with a fallback of uncertain + escalate-human when nothing fits. The closed-list constraint is what stops the model from inventing fluent-sounding categories that drift from your audit doc.
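The closed-list constraint only works if you also validate the model's reply on the way back in — a fluent answer that is not on the list must collapse to the fallback, not create a new category. A sketch of that validation step (the model call itself is omitted; the two-line reply format is an assumption about how you would prompt it):

```python
INTENTS = {"action-required", "question-for-you", "FYI", "scheduling",
           "external-business", "uncertain"}
ACTIONS = {"reply-now", "reply-in-batch", "defer-to-task", "archive",
           "escalate-human"}

def parse_labels(model_reply):
    """Validate a model reply of the form 'intent: X' / 'action: Y'
    against the closed lists. Anything off-list or malformed collapses
    to the uncertain / escalate-human fallback instead of inventing a
    fluent-sounding new category."""
    labels = {}
    for line in model_reply.strip().splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            labels[key.strip()] = value.strip()
    intent = labels.get("intent", "uncertain")
    action = labels.get("action", "escalate-human")
    if intent not in INTENTS or action not in ACTIONS:
        return "uncertain", "escalate-human"
    return intent, action
```

The same two sets belong verbatim in the prompt, so the list the model chooses from and the list the validator checks against can never drift apart.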
Run the system for five business days. Each evening, sample five labels at random and check them. Treat any mis-categorisation as either a category problem (return to phase 1 and refine), a prompt problem (tighten the closed list), or a context problem (the email referenced something the AI cannot see — these will always need human handling).
Tools for this phase:
- {your AI assistant} — via API, native mail-client integration, or a Zapier/Make connector.
- {your mail client} — to receive the labels back as a label, flag, or custom field.

The labeled inbox from phase 3 is now ready to be processed in batches. Stop checking mail continuously. Set two review windows — one in the morning, one near end of day — and process the inbox to zero each time. The AI prepares draft replies for routine threads; you approve, edit, or rewrite before sending.
Pick two times. Morning is usually 30 to 45 minutes after you start work — late enough that the morning's inbound has arrived, early enough that you are not already deep in something else. End of day is usually 30 to 60 minutes before you stop. Block these on {your calendar} and protect them. Outside these windows, the mail client is closed. If you find yourself opening it anyway, set a daily browser block or move the client off your primary device for a week to break the habit.
Inside each window, work top-down by intent. Reply-now threads from phase 3 are the only urgent batch — usually a small number. Reply-in-batch threads get an AI-drafted reply that you read, edit if needed, and send. Defer-to-task threads create a task in {your task manager} with a one-line summary and the original email linked — the email itself gets archived immediately. FYI gets scanned and archived. Escalate-human threads stay in the inbox for you to handle without AI help — usually executive correspondence, anything emotionally loaded, anything legally meaningful.
The AI draft step is narrow on purpose. Only generate drafts for reply-in-batch intents where the underlying answer is routine — meeting confirmations, document-link replies, brief status updates, vendor acknowledgements. Anything that requires judgement, position, or persuasion is faster to write yourself than to edit out of a draft. Aim for a drafted-reply edit rate under 30 percent; if you are rewriting more than that, narrow the categories that trigger drafting.
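The review-window routing described above is a pure function of the two phase 3 labels, which is why it is fast enough to run through an inbox in minutes. A sketch — `ROUTINE_DRAFT_INTENTS` is an illustrative assumption about which intents are routine for you, not a fixed list:

```python
# Illustrative gate: only these intents are routine enough to draft for.
ROUTINE_DRAFT_INTENTS = {"scheduling", "FYI"}

def route(message):
    """Map one labeled message to what a review window does with it.
    Returns (destination, wants_ai_draft)."""
    action, intent = message["action"], message["intent"]
    if action == "reply-now":
        return "handle-first", False
    if action == "reply-in-batch":
        # Judgement calls are faster to write than to edit out of a draft.
        return "reply-queue", intent in ROUTINE_DRAFT_INTENTS
    if action == "defer-to-task":
        return "task-manager", False   # task created, email archived
    if action == "archive":
        return "archive", False
    return "inbox", False              # escalate-human stays with you
```

Keeping the draft gate as a separate, explicit set makes the phase 4 tuning advice concrete: if your edit rate climbs, you shrink `ROUTINE_DRAFT_INTENTS` rather than rewriting prompts.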
Tools for this phase:
- {your AI assistant} — for draft generation, using a short brand-voice or personal-voice document.
- {your mail client} — to receive drafts as unsent replies, never auto-sent.
- {your task manager} — for deferred actions extracted from email.
- {your calendar} — for the two protected review windows.

By the end of phase 4 you have a working system. The remaining work is to keep it working. Email subscriptions accumulate, senders change roles, AI providers update their models, and your own work mix shifts every quarter. Without explicit maintenance, the system silently degrades over six to twelve months until you are back where you started.
The weekly hygiene pass is ten minutes, ideally Friday afternoon. Open the labeled folders the phase 2 filters route to. Look for two things: messages that landed in the wrong folder (a real human reply caught by a noise rule) and messages that should have been filtered but were not. Adjust one or two rules. Then look at the AI "uncertain" bucket from phase 3 — anything sitting there suggests either a missing category or genuinely unusual mail.
The monthly rule review is 30 minutes. Pull the count of mail caught by each filter rule over the past month. Rules that catch zero mail are dead — delete them. Rules that catch hundreds and never produce false positives can be promoted from archive to auto-delete with confidence. Read five randomly sampled AI categorisations end-to-end; if the edit rate on AI drafts has climbed above 40 percent for any category, that category needs a tighter prompt or removal from the draft list.
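The monthly triage reduces to two small calculations you can do in a spreadsheet or a few lines of code. A sketch, assuming you can export per-rule catch counts and per-draft edited/sent outcomes from your tooling (the thresholds mirror the ones above):

```python
def review_rules(catch_counts, rules_with_false_positives):
    """Monthly triage: rules that caught nothing are dead; heavy catchers
    with a clean false-positive record are candidates for promotion from
    archive to auto-delete."""
    dead = sorted(r for r, n in catch_counts.items() if n == 0)
    promote = sorted(r for r, n in catch_counts.items()
                     if n >= 100 and r not in rules_with_false_positives)
    return dead, promote

def draft_edit_rate(drafts):
    """Share of AI drafts rewritten before sending. Above roughly 0.4
    for a category, tighten the prompt or stop drafting for it."""
    if not drafts:
        return 0.0
    return sum(1 for d in drafts if d["rewritten"]) / len(drafts)
```

Tracking these two numbers month over month is what turns "the system feels worse" into a specific rule or category to fix.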
The quarterly subscription audit is 45 to 60 minutes. Open {your mail client} and search for the word "unsubscribe" over the past 90 days. Unsubscribe from anything you have not opened in that window. Re-run the phase 1 clustering exercise against the most recent 30 days — if the category mix has shifted by more than 20 percent, your audit doc is stale and the filter set and prompts need to be re-baselined against the new mix. This is also the right moment to test any new AI model your provider has released; behaviour changes meaningfully across versions in ways release notes rarely capture.
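"Shifted by more than 20 percent" needs a concrete measure to be actionable. One reasonable way to define it — this is an interpretation, not the only one — is the total variation distance between the old and new category-share maps:

```python
def category_drift(old_mix, new_mix):
    """Total variation distance between two category-share maps (each
    summing to 1.0): half the sum of absolute per-category differences.
    One way to make 'shifted by more than 20 percent' concrete:
    re-baseline filters and prompts when this exceeds 0.2."""
    categories = set(old_mix) | set(new_mix)
    return 0.5 * sum(abs(old_mix.get(c, 0.0) - new_mix.get(c, 0.0))
                     for c in categories)
```

For example, going from 50/30/20 noise/act/FYI to 30/40/30 gives a drift of exactly 0.2 — right at the re-baseline threshold.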
Tools for this phase:
- {your mail client} — for rule counts, label folder scans, and unsubscribe searches.
- {your AI assistant} — for the periodic re-clustering exercise.

These are specific limits as of 2026-05. Treat them as the failure modes you would otherwise discover at the worst possible moment.
The system needs maintenance, not because AI is fragile but because your role, your network, and the senders who want your attention all change. The cadence below is what holds up over twelve months.
If {your AI assistant} releases a new model and you switch, re-run the phase 3 audit (30 emails) before trusting it on production traffic.