This is the complete guide to Mail Extract — what it does, how to use it, and how it handles your data. Written for anyone, with technical detail where it matters.
What Is Mail Extract
Mail Extract pulls emails from your Gmail account and gives you clean, readable text. No formatting noise, no HTML, no reply chains — just the content. The output is designed to be pasted directly into AI tools like ChatGPT or Claude as context.
You search by sender, keyword, date range, and folder. The app returns the matching messages as plain text, ready to copy. Nothing is saved after your session ends.
Mail Extract is a read-only tool. It cannot send, delete, modify, or archive your emails. It reads, extracts, and forgets.
Signing In
There are two ways to sign in, depending on how you want to authenticate with Gmail.
IMAP — App Password
Enter your Gmail address and a Google App Password. This is not your regular Gmail password — it's a special password you generate in your Google account settings specifically for third-party apps.
- Enter your Gmail address and press Enter
- Enter your app password in the field that appears
- Click Sign in
If you don't have an app password yet, the login page includes a link to Google's setup guide. You'll need two-factor authentication (Google calls it 2-Step Verification) enabled on your Google account first.
Google OAuth
Click Sign in with Google. You'll be redirected to Google's consent screen, where you grant read-only access to your Gmail. No passwords are entered in our app — Google handles the entire authentication.
The OAuth track is currently in testing and limited to approved users. If you're not on the list, use the IMAP option.
Both sign-in methods give you the same app with the same features. The only difference is how you authenticate.
Stay Signed In
On the login page, you'll see a toggle: Stay signed in for 7 days. When this is off (the default), your session expires after 2 hours of inactivity or 8 hours total — whichever comes first.
When you turn it on, your session lasts up to 7 days. You can close the browser, come back later, and pick up where you left off without signing in again.
In both cases, your session is encrypted on disk and automatically deleted when it expires. Logging out destroys it immediately.
Searching for Emails
Once signed in, you'll see the main search interface. There are four filters you can combine: folder, sender address, keyword, and date range. You don't need to use all of them — leaving a filter empty means "match anything" for that field.
Folders
Choose between Inbox and All Mail. Inbox searches only your current inbox. All Mail searches everything Gmail keeps outside Spam and Trash: inbox, sent, archived, and any labels you've created. All Mail is selected by default.
From Filter
Type an email address and press Enter to add it as a chip. You can add up to 20 addresses. The search returns emails from any of the addresses you list — it's an "or" search, not "and."
Each chip is validated as you add it. If an address isn't a valid email format, it shows with a red border. You can remove any chip by clicking the X on it. Pressing backspace in an empty field removes the last chip.
If you leave this filter empty, the search returns emails from all senders.
Contains Filter
Enter a word or phrase to search for in email subjects and body text. The search is not case sensitive. If you've also added sender addresses, both filters must match — the email must be from one of those senders and contain the keyword.
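The way the From and Contains filters combine can be sketched as a small predicate. This is only an illustration of the rules described above, not the app's actual code: the `Email` type and the `matches` function are hypothetical names, and the sketch ignores the two-way capture described later in this guide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Email:
    sender: str
    subject: str
    body: str


def matches(email: Email, senders: List[str], keyword: str) -> bool:
    """Empty filters match anything; senders are OR-ed, then AND-ed with the keyword."""
    wanted = {s.lower() for s in senders}
    sender_ok = not wanted or email.sender.lower() in wanted

    needle = keyword.strip().lower()
    # Case-insensitive match against subject or body.
    keyword_ok = (
        not needle
        or needle in email.subject.lower()
        or needle in email.body.lower()
    )
    return sender_ok and keyword_ok
```

With no senders and no keyword, every message passes. Adding either filter narrows the result, and adding both requires both to hold.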
Date Range
The timeline slider covers the last 24 months. Drag the left and right handles to set your start and end dates. The selected range is displayed above the slider.
Both dates are inclusive — if you set the range to January 2025 through March 2025, you'll get emails from the first day of January through the last day of March.
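If you are curious how an inclusive month range can map onto a mail search, here is a minimal sketch. It assumes the search is expressed with the standard IMAP SINCE and BEFORE keys, which the guide does not confirm. SINCE includes its date and BEFORE excludes its date, so the end date is pushed forward by one day. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple

# Fixed English month names, so the output does not depend on the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def month_range(start_year: int, start_month: int,
                end_year: int, end_month: int) -> Tuple[date, date]:
    """Return the first day of the start month and the last day of the end month."""
    first = date(start_year, start_month, 1)
    if end_month == 12:
        next_month = date(end_year + 1, 1, 1)
    else:
        next_month = date(end_year, end_month + 1, 1)
    return first, next_month - timedelta(days=1)


def imap_date(d: date) -> str:
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"


def imap_criteria(first: date, last: date) -> str:
    """Inclusive range: BEFORE is exclusive, so use the day after `last`."""
    return f"SINCE {imap_date(first)} BEFORE {imap_date(last + timedelta(days=1))}"
```

For January 2025 through March 2025 this yields a search from 1 January up to, but not including, 1 April.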
Max Messages
In the top right of the filters card, you'll see a max counter. This controls how many emails are returned per search. The default is 50. You can set it anywhere from 5 to 200.
The counter is colour-coded as a rough guide:
| Range | Colour | Meaning |
|---|---|---|
| 5 – 50 | Green | Fast, responsive |
| 51 – 100 | Amber | May take a moment |
| 101 – 200 | Red | Slower — larger extractions |
If your search matches more emails than the max, the results will show a notice that more exist. You can increase the counter or narrow your date range to capture them.
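The counter's range and colour bands can be expressed in a few lines. This is a sketch of the table above, with hypothetical function names, and it assumes out-of-range values are clamped rather than rejected.

```python
def clamp_max_messages(requested: int) -> int:
    """Keep the counter inside the documented 5 to 200 range."""
    return max(5, min(200, requested))


def counter_colour(requested: int) -> str:
    """Map the counter value to the colour band from the table."""
    n = clamp_max_messages(requested)
    if n <= 50:
        return "green"
    if n <= 100:
        return "amber"
    return "red"
```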
Results & Export
After a search completes, the results card appears with a preview of the first 8 messages. Each message shows the date, sender, subject, and the first 300 characters of the body. If there are more than 8, you'll see a note like "... and 42 more (all included in copy)."
Below the preview, you'll see a summary: how many emails were found, the total character count, and an estimated token count (characters divided by 4 — a rough approximation for most AI models).
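The summary figures are simple arithmetic. The sketch below assumes the character total is the sum of the extracted message texts and that the division rounds down. The guide specifies neither detail, and `summarize` is a hypothetical name.

```python
from typing import Dict, List


def summarize(messages: List[str]) -> Dict[str, int]:
    """Count emails and characters, and estimate tokens as characters // 4."""
    characters = sum(len(m) for m in messages)
    return {
        "emails": len(messages),
        "characters": characters,
        "estimated_tokens": characters // 4,
    }
```

For example, two messages of 300 and 100 characters give 400 characters and an estimate of 100 tokens.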
Copying
Click Copy All to copy every extracted email to your clipboard. The full text is copied — not just the preview. The export includes a header with your search parameters (account, query, date range) followed by the complete text of every message.
The button briefly changes to "copied!" to confirm it worked.
Status Indicators
The coloured dot next to the status bar tells you what happened:
| Colour | Meaning |
|---|---|
| Green | Emails found successfully |
| Amber | Partial results — some addresses had no matches, or no emails found at all |
| Red | No results for any address, or an error occurred |
Signing Out
Click Sign out in the top right of the account card. Your session is destroyed immediately — the encrypted session file is deleted from disk and your browser cookie is cleared. There is nothing left to recover.
If you close the tab without signing out, the session expires on its own (after 2 hours of inactivity or 8 hours total by default, or after 7 days if you chose to stay signed in) and is cleaned up automatically.
Troubleshooting
| Problem | What to do |
|---|---|
| "Sign-in failed" | Double-check that you're using a Google App Password, not your regular Gmail password. The app password is 16 characters, usually shown in groups of four. |
| "Too many attempts" | Wait for the countdown to finish. The delay increases with each failed attempt but resets after 24 hours. |
| Redirected to login unexpectedly | Your session expired. Sign in again. If this happens frequently, enable "Stay signed in for 7 days." |
| "30 extractions per hour" limit | You've hit the rate limit. The error message shows when the limit resets. Wait for that time, then try again. |
| No results found | Try widening the date range, checking the sender address for typos, or switching from Inbox to All Mail. |
| Search seems slow | Lower the max messages counter. Large extractions (100+) take longer, especially over wide date ranges. |
| OAuth: "Access was denied" | You need to grant Gmail read access on the Google consent screen. The app cannot function without it. |
Limits & Quotas
These limits are enforced server-side to protect both the service and your mail provider's API.
| Limit | Value | Detail |
|---|---|---|
| Extractions per hour | 30 | Per session, fixed window. Resets on the hour. |
| Max messages per search | 200 | Hard cap regardless of UI setting |
| From addresses per search | 20 | Client-side and server-side enforced |
| Contains query length | 200 characters | Longer queries are rejected |
| Request body size | 20 KB | Requests larger than this are dropped |
| IMAP connection timeout | 60 seconds | Per operation — a hung connection won't hold your session |
| Concurrent operations | 3 per session | Additional requests are queued, not rejected |
| Email field max length | 254 characters | Per RFC 5321 |
| Password max length | 128 characters | App passwords are 16 characters; this is a safety cap |
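A fixed-window limit that resets on the hour, like the extraction quota in the table, could look roughly like this. It is a sketch for one session with a hypothetical class name, not the server's implementation.

```python
from datetime import datetime
from typing import Optional


class HourlyLimiter:
    """Fixed-window counter: at most `limit` calls per clock hour."""

    def __init__(self, limit: int = 30) -> None:
        self.limit = limit
        self.window: Optional[datetime] = None
        self.count = 0

    def allow(self, now: datetime) -> bool:
        # The window is the current clock hour, e.g. 10:00 for 10:15.
        window = now.replace(minute=0, second=0, microsecond=0)
        if window != self.window:
            self.window, self.count = window, 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True
```

Because the window is tied to the clock hour, a session that uses its 30 extractions at 10:15 can extract again at 11:00.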
Login Throttling
Failed login attempts trigger a progressive delay keyed to your IP address and the email you entered. The first attempt has no delay. After that: 5 seconds, 10 seconds, 20 seconds, doubling each time. The counter resets after 24 hours without a failed attempt. This is tracked per IP + email combination — a failed attempt for one address doesn't affect another.
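The progressive delay can be sketched as follows. This reads the rule as: no delay before the first attempt, then 5 seconds after one failure, 10 after two, 20 after three, and so on. The class name and the in-memory dictionary are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple

RESET_AFTER = timedelta(hours=24)


class LoginThrottle:
    """Tracks failed logins per (IP, email) pair and computes the next delay."""

    def __init__(self) -> None:
        self._failures: Dict[Tuple[str, str], Tuple[int, datetime]] = {}

    def _count(self, ip: str, email: str, now: datetime) -> int:
        count, last = self._failures.get((ip, email.lower()), (0, now))
        # The counter resets after 24 hours without a failed attempt.
        return 0 if now - last >= RESET_AFTER else count

    def record_failure(self, ip: str, email: str, now: datetime) -> None:
        self._failures[(ip, email.lower())] = (self._count(ip, email, now) + 1, now)

    def delay_seconds(self, ip: str, email: str, now: datetime) -> int:
        count = self._count(ip, email, now)
        return 0 if count == 0 else 5 * 2 ** (count - 1)
```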
How Text Extraction Works
The goal of extraction is to give you only the original message body — no reply chains, no signatures, no formatting artefacts. Here's what happens to each email before it's returned to you.
Two-Way Conversation Capture
When you search by sender address, Mail Extract doesn't just pull emails from that person — it also pulls your replies to them. Behind the scenes, two searches run for each address: one on the From field, one on the To field. The results are merged, deduplicated, and sorted by date.
The effect is that you get the full back-and-forth as clean, individual messages in chronological order. Each message stands on its own — reply chains are stripped, so you're reading what each person actually wrote, not the same quoted thread repeated fifty times. For a conversation with 30 exchanges, you get 30 clean messages instead of one enormous chain where the last reply quotes every message before it.
This is designed for AI context. Language models work best with clean, ordered text — not nested reply chains where the same content is duplicated in every message. Extract once, paste once, and the model has the full conversation.
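The merge step for the two searches can be sketched like this. It assumes each message carries a unique identifier (the `message_id` field here, a hypothetical name) that is used for deduplication. The guide does not say which key the app actually uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Message:
    message_id: str
    sent_at: datetime
    sender: str
    recipient: str
    body: str


def merge_conversation(from_hits: Iterable[Message],
                       to_hits: Iterable[Message]) -> List[Message]:
    """Merge both result sets, drop duplicates, and sort oldest first."""
    unique: Dict[str, Message] = {}
    for msg in [*from_hits, *to_hits]:
        unique.setdefault(msg.message_id, msg)  # keep the first copy seen
    return sorted(unique.values(), key=lambda m: m.sent_at)
```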
HTML to Plain Text
- <style> and <script> blocks are removed entirely
- Block-level tags (</p>, </div>, </li>) are converted to line breaks
- <br> tags become line breaks
- All remaining HTML tags are stripped
- HTML entities (&amp;, &nbsp;, numeric codes) are decoded
- Excessive blank lines are collapsed
If a plain text version of the email exists and is at least as long as the HTML conversion, the plain text version is used instead.
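The conversion steps above can be approximated with Python's standard library. This is a rough sketch using regular expressions, not the app's extractor. The function names are hypothetical, and "excessive" blank lines is taken to mean three or more consecutive newlines.

```python
import html
import re
from typing import Optional


def html_to_text(markup: str) -> str:
    # 1. Drop <style> and <script> blocks, including their contents.
    text = re.sub(r"(?is)<(style|script)\b.*?</\1\s*>", "", markup)
    # 2. Closing block-level tags become line breaks.
    text = re.sub(r"(?i)</(p|div|li)\s*>", "\n", text)
    # 3. <br> tags become line breaks.
    text = re.sub(r"(?i)<br\s*/?>", "\n", text)
    # 4. Strip every remaining tag.
    text = re.sub(r"(?s)<[^>]+>", "", text)
    # 5. Decode named and numeric entities.
    text = html.unescape(text)
    # 6. Collapse runs of blank lines.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def choose_body(plain: Optional[str], markup: str) -> str:
    """Prefer the plain text part when it is at least as long as the conversion."""
    converted = html_to_text(markup)
    if plain is not None and len(plain) >= len(converted):
        return plain
    return converted
```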
Reply Chain Removal
The extractor looks for common reply markers and cuts the message at the first one it finds:
- Lines starting with > (standard quoting)
- -------- Original Message (Outlook-style forwarding)
- Underline or equals separators (____, ====)
- Inline forwarding blocks (From: ... Sent: ...)
- "On [date], [name] wrote:" patterns
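Cutting a message at the first reply marker might look like the sketch below. The regular expressions are assumptions that approximate the markers listed above. The real extractor may recognise more variants or match them differently.

```python
import re

# Hypothetical patterns, one per marker type listed above.
REPLY_MARKERS = [
    re.compile(r"^>"),                                        # quoted line
    re.compile(r"^-{2,}\s*Original Message", re.IGNORECASE),  # Outlook-style
    re.compile(r"^(_{4,}|={4,})"),                            # separators
    re.compile(r"^From:\s.*\bSent:\s"),                       # inline forward
    re.compile(r"^On\s.+\swrote:\s*$"),                       # "On ..., X wrote:"
]


def strip_reply_chain(body: str) -> str:
    """Keep everything before the first line that looks like a reply marker."""
    kept = []
    for line in body.splitlines():
        candidate = line.lstrip()
        if any(pattern.search(candidate) for pattern in REPLY_MARKERS):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()
```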
Signature Removal
Email signatures are stripped at common delimiters: the conventional "-- " line (two dashes and a space, the Usenet signature convention described in RFC 3676), triple underscores, and triple equals signs.
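A minimal sketch of signature stripping, assuming a delimiter only counts when it sits on a line by itself. That rule is an assumption, since the guide only names the delimiters, and `strip_signature` is a hypothetical name.

```python
import re

# "-- " exactly (two dashes and a space), or a line of 3+ underscores or equals signs.
SIGNATURE_DELIMITER = re.compile(r"^(-- |_{3,}\s*|={3,}\s*)$")


def strip_signature(body: str) -> str:
    """Drop the first delimiter line and everything after it."""
    lines = body.splitlines()
    for index, line in enumerate(lines):
        if SIGNATURE_DELIMITER.match(line):
            return "\n".join(lines[:index]).rstrip()
    return body.rstrip()
```

Note that a line containing only "--" with no trailing space is left alone in this sketch, because the delimiter is defined with the space.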
URL Removal
Bare and bracketed URLs are removed from the output. The goal is clean text for language models, not clickable links.
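URL removal can be approximated with one regular expression. The sketch assumes "bracketed" covers angle, square, and round brackets and that only http and https links are targeted. Neither detail is confirmed by the guide.

```python
import re

# An optional opening bracket, the URL itself, and an optional closing bracket.
URL_PATTERN = re.compile(r"[<\[(]?https?://[^\s<>\[\]()]+[>\])]?")


def strip_urls(text: str) -> str:
    """Remove bare and bracketed URLs, then tidy the leftover spacing."""
    without_urls = URL_PATTERN.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", without_urls).strip()
```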
Sessions & Security
This section describes how your credentials are handled from the moment you sign in to the moment you leave. For a full treatment of the server's security posture, see the Server Hardening document.
Split-Key Architecture
Your credentials are never stored in plaintext on the server. When you sign in, the server encrypts your password (or OAuth token) using AES-256-GCM with a randomly generated 256-bit key and a random 12-byte IV. The encrypted data is written to a session file on disk. The encryption key is sent to your browser in a cookie.
The server holds no keys in memory. Your browser holds no encrypted data. Neither half is useful alone.
| If compromised | What the attacker gets |
|---|---|
| Server disk | Encrypted blobs with no decryption keys |
| Browser cookie | A decryption key with no encrypted data |
| Server memory | Nothing — no keys or plaintext are held in memory |
| Both simultaneously | One session's credentials, valid only until the session expires |
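The split between key and ciphertext can be illustrated with the third-party `cryptography` package. This is a sketch under assumptions, not the server's code: the function names are hypothetical, and storing the IV in front of the ciphertext is one common layout that the guide does not specify.

```python
import os
from typing import Tuple

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def seal_credential(secret: bytes) -> Tuple[bytes, bytes]:
    """Encrypt a credential; return (key for the cookie, blob for the disk)."""
    key = AESGCM.generate_key(bit_length=256)  # random 256-bit key
    iv = os.urandom(12)                        # random 12-byte IV
    ciphertext = AESGCM(key).encrypt(iv, secret, None)
    return key, iv + ciphertext


def open_credential(key: bytes, blob: bytes) -> bytes:
    """Recombine the two halves; raises InvalidTag if either one is wrong."""
    iv, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(iv, ciphertext, None)
```

The key returned by `seal_credential` would travel in the browser cookie and the blob would be written to the session file, so decryption is only possible while a request carries the cookie.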
Session Lifetime
| Setting | Idle Timeout | Hard Cap |
|---|---|---|
| Standard | 2 hours | 8 hours |
| Stay signed in | 7 days | 7 days |
Every API call resets the idle timer. The hard cap cannot be extended. A background sweep runs every 15 minutes and deletes any session file that has exceeded its timeout.
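The two timers can be checked with a small function. This sketch uses hypothetical names and treats a session as expired the moment either limit is reached.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Session:
    created_at: datetime
    last_seen: datetime          # updated on every API call
    stay_signed_in: bool = False


def is_expired(session: Session, now: datetime) -> bool:
    if session.stay_signed_in:
        idle_limit, hard_cap = timedelta(days=7), timedelta(days=7)
    else:
        idle_limit, hard_cap = timedelta(hours=2), timedelta(hours=8)
    idle_for = now - session.last_seen
    age = now - session.created_at
    return idle_for >= idle_limit or age >= hard_cap
```

A background sweep like the one described above would call a check of this kind on each session file and delete the ones that return true.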
Cookie Attributes
- HttpOnly — JavaScript on the page cannot read the cookie
- Secure — the cookie is only sent over HTTPS
- SameSite — prevents the cookie from being sent by third-party sites
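For reference, a cookie with these three attributes can be built with Python's standard library. The cookie name `mx_key`, the hex encoding of the key, and the `Strict` value for SameSite are assumptions. The guide does not state which SameSite mode is used.

```python
from http.cookies import SimpleCookie


def session_cookie_header(key: bytes, max_age_seconds: int) -> str:
    """Build the value of a Set-Cookie header carrying the session key."""
    cookie = SimpleCookie()
    cookie["mx_key"] = key.hex()
    morsel = cookie["mx_key"]
    morsel["httponly"] = True        # hidden from page JavaScript
    morsel["secure"] = True          # sent over HTTPS only
    morsel["samesite"] = "Strict"    # not sent on cross-site requests
    morsel["max-age"] = max_age_seconds
    morsel["path"] = "/"
    return morsel.OutputString()
```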
File Permissions
Session files are stored with chmod 600 (owner read/write only). The session directory is chmod 700. The application runs as a dedicated system user with no shell and no sudo access.
Privacy
Mail Extract does not track you. There is no analytics, no user database, no login history, and no record of what you searched for or extracted.
What Is Not Collected
- IP addresses
- Email addresses (beyond showing yours during your session)
- Search queries or filter settings
- Email content or metadata
- Login timestamps
- Usage patterns
What Is Collected
A single anonymous counter that increments by one each time any user runs a search. No identifying information is attached. It resets daily. It tells us how many extractions happened in a day — not who did them or what they searched for.
After You Leave
Your session file is deleted when you sign out. If you close the tab instead, it's deleted when the session expires. The cleanup sweep runs every 15 minutes. Once the file is gone, there is no trace that you were here.
We don't know who uses this app, how often they use it, or what they use it for. That's deliberate.
Part of the Emergence Project. Built by the Design/OS shell team.
March 2026