This is the public-facing privacy policy for the Magpie private beta. It complements (and is more detailed than) the in-bot consent notice you saw when you ran /start. The bot notice is your binding consent record; this document is reference material you can come back to at any time.
1. Who we are
Magpie is a Telegram receipt-OCR bot operated as a private invite-only beta. The beta is hosted from Singapore in the operator's individual capacity (not a registered company). Under the Singapore Personal Data Protection Act (PDPA) 2012, Magpie's operator is the "organisation" and the operator serves as the designated Data Protection Officer (DPO) per §11.
To reach the DPO during the beta:
- Email: [email protected]
- Or send a direct Telegram message to the Magpie bot account
2. What data we collect
From you (the tester):
- Your Telegram chat ID, the display name you give us (via the application form or in-bot), and any messages you send.
- Receipt photos you upload to the bot.
- Free-text edit corrections, chat messages, and command arguments.
- A consent record of the form button:v1.7.0:tap plus the timestamp of the tap, captured when you accept this notice and the beta terms via the in-bot consent button. (Pre-v1.7.0 consent records carry the verbatim YES/AGREE reply text from the prior typed-acceptance flow; both forms are preserved for audit-trail integrity.)
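For readers who want to see what such a record contains, the sketch below splits a button-style consent record into its parts. The three-field layout (source, notice version, action) is inferred from the example above, and the names ConsentRecord and parse_consent_record are hypothetical; this is not Magpie's actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    source: str           # "button" for the in-bot consent button
    notice_version: str   # e.g. "v1.7.0"
    action: str           # e.g. "tap"
    accepted_at: datetime  # timestamp of the tap


def parse_consent_record(raw: str, accepted_at: datetime) -> ConsentRecord:
    """Split a 'button:v1.7.0:tap' style record (assumed layout) into fields."""
    parts = raw.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected consent record: {raw!r}")
    source, notice_version, action = parts
    return ConsentRecord(source, notice_version, action, accepted_at)


record = parse_consent_record(
    "button:v1.7.0:tap", datetime(2026, 6, 8, tzinfo=timezone.utc)
)
```

Older typed-acceptance records (the verbatim YES/AGREE text) would not match this layout and would need to be stored as-is.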
From your receipts (extracted automatically by AI):
- Shop name and (where visible) location, date, line items, taxes, discounts, totals, currency, and the OCR's best-effort category.
Operational logs:
- Brief logs of bot activity (chat IDs, command names, error traces, timing). Logs do not store receipt contents.
3. Why we collect it
Every category above is tied to one of these purposes:
- Read your receipts: the photo + the AI-extracted fields.
- Help you find them later: the markdown notes, the price-history file, the daily / weekly / monthly summaries.
- Enforce the beta cap: the daily-usage counter.
- Audit + dispute resolution: the consent audit trail; the bot's operational logs; the receipt photo retained as the original record.
- Authentication: the Telegram chat ID + the whitelist.
We do not use your data for advertising, profiling, training AI models, or selling to third parties.
4. Who else sees your data
Cloud LLM providers (OCR sub-processor). Each receipt image is sent to one cloud-based LLM provider configured by the host. The current provider for this beta is disclosed by the host on request. Possible providers Magpie supports:
| Provider | Jurisdiction | Tier policy |
|---|---|---|
| Google Gemini | USA (Google LLC) | Paid tier: no training. Free tier: may improve products. |
| Anthropic Claude | USA | Paid API: no training. |
| OpenAI | USA | Paid API: no training (default). |
| Groq | USA | API: no training. |
| xAI Grok | USA | API: per published terms. |
| OpenRouter | USA (broker) | Forwards to underlying provider. |
| DeepSeek | China | Per published terms. |
| Ollama (local) | Singapore (this host) | No external transfer. |
Ask the operator which provider + tier this beta currently uses.
Receipt images go to the LLM provider at OCR time. Magpie also sends data to the same provider on these other paths so the AI can answer or correct:
- /chat or /price: the receipt history relevant to your question (shop names, items, prices, dates) is sent as conversational context.
- ✏️ Edit or Auto-fix (the buttons on a confirmation card): the full current receipt JSON (date, shop, items, tax, service charge, location, totals) is sent for the model to correct.
- /spend with natural-language text (e.g. "bought a coffee at starbucks for 7 dollars"): the spend text is sent for the model to extract the shop, amount, currency, and date. Structured shapes like /spend 5.50 laksa still parse offline when the model is unavailable.
- Digital-receipt PDFs: the extracted text of the PDF is sent to the model for parsing.
Backblaze B2 (backup sub-processor). The vault and image archive are backed up nightly to Backblaze B2 in the United States. Backups are client-side encrypted via rclone crypt before leaving Singapore; Backblaze stores ciphertext and cannot read your data even if compelled.
DuckDuckGo (search sub-processor, optional). When enabled, OCR's shop name is queried against DuckDuckGo's search HTML endpoint to help enrich the receipt with location and currency hints. Off by default in this beta.
Frankfurter (FX rates sub-processor). Magpie fetches daily foreign-exchange rates from frankfurter.app (an ECB-backed open API hosted in the EU) every 6 hours by default so multi-currency receipts can be summarised in your chosen home currency. Outbound traffic: nothing from your receipts — Magpie only consumes the published rates.
Telegram. Every message you send the bot passes through Telegram's infrastructure (servers in multiple jurisdictions). Telegram acts as an independent data controller, separate from Magpie.
CSV import (/import). Bank/card statement CSVs you upload are parsed locally on Magpie's host — no third party sees them. Imported rows are written to your vault as receipt notes with a source: import frontmatter field so you can distinguish them from photo-OCR receipts.
Email forwarding (optional). If you configure email forwarding for digital receipts, your inbound email provider (one of Postmark, Mailgun, or SendGrid — operator's choice) is a sub-processor: the email body passes through their service before hitting Magpie's webhook. Each is a US/EU SaaS with its own GDPR / SCC compliance posture; the operator selects one at deploy time. Email subjects and senders are scrubbed of control characters before being stored.
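As an illustration of what scrubbing control characters can mean in practice, the sketch below drops every character in Unicode category Cc (carriage return, line feed, NUL, escape, and similar) from a header value. The function name scrub_control_chars and the choice of category Cc are assumptions; Magpie's actual rule may differ.

```python
import unicodedata


def scrub_control_chars(value: str) -> str:
    """Drop Unicode control characters (category Cc), e.g. CR, LF, NUL, ESC."""
    return "".join(ch for ch in value if unicodedata.category(ch) != "Cc")


# A subject line carrying an injected header loses its CR/LF separators.
subject = scrub_control_chars("Your receipt\r\nBcc: attacker@example.com")
```

Removing line breaks from subjects and senders prevents a crafted email from smuggling extra lines into stored notes or logs.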
Subscription billing (scaffolding only as of v1.6.2). Magpie's billing infrastructure is scaffolded but not live: no payment data is processed today. When the operator activates Stripe, the only data Stripe sees is your Telegram chat ID (used as the Checkout reference). Magpie itself does not store card numbers; Stripe does.
5. Cross-border transfer
Your data leaves Singapore via:
- The cloud LLM provider (USA or China, depending on which provider from section 4 is configured)
- Backblaze B2 (USA)
PDPA §26 requires comparable protection for cross-border transfers. Magpie relies on:
- The providers' published data protection terms
- Client-side encryption (rclone crypt) for Backblaze
- Paid-tier "no training" contractual exclusions where applicable
If you do not want your data to leave Singapore at all, Magpie can be configured with a local-only Ollama OCR model — ask the operator before signing on.
6. How long we keep your data
- Live data (vault notes, images, sessions): for the duration of the beta + 30 days after the beta closes, then permanently destroyed.
- Backups on Backblaze: 30-day version history. A /deletemyaccount removes live data immediately; the matching backup snapshots age out within 30 days.
- Consent audit trail: copied to a consent archive on deletion and retained for 1 year, per PDPC recommendation for consent records.
- Operational logs: rotated at ~20 MB × 10 backups and not exported to backup.
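To make the log-rotation figure concrete: a rotation policy of roughly 20 MB per file with 10 backups could be expressed with Python's standard logging module as sketched below. That Magpie uses this module, and the helper name build_log_handler, are assumptions for illustration only.

```python
import logging
from logging.handlers import RotatingFileHandler

MAX_BYTES = 20 * 1024 * 1024  # ~20 MB per log file
BACKUP_COUNT = 10             # keep 10 rotated files; the oldest is overwritten


def build_log_handler(path: str) -> RotatingFileHandler:
    """Create a size-capped handler; delay=True opens the file on first write."""
    handler = RotatingFileHandler(
        path, maxBytes=MAX_BYTES, backupCount=BACKUP_COUNT, delay=True
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```

Under these settings the logs occupy at most about 220 MB (the active file plus 10 backups), and older entries are discarded as new ones arrive.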
After the beta ends, the operator will publish a deletion checklist confirming all the above purges have run.
7. Your rights
You have the right to:
- Access: ask the operator what we hold about you. The bot also exposes /export, which gives you a ZIP of your receipts, images, and consent record.
- Correction: use /edit before confirming a receipt. For confirmed receipts, message the operator with the correction.
- Erasure: /deletemyaccount CONFIRM deletes your live data immediately. Backups age out within 30 days, and the consent audit trail is retained for 1 year as described in section 6.
- Withdraw consent: stop using the bot. Combine with /deletemyaccount to also erase past data.
- Complain to PDPC: if you believe your data has been mishandled, you may complain to the Personal Data Protection Commission of Singapore.
8. Data breach response
If the operator detects or is notified of a significant data breach, the operator will:
- Notify affected individuals within 72 hours.
- Notify the PDPC within 72 hours per PDPA §26D where the breach is significant.
- Publish a post-mortem describing what happened and what was done.
9. Security measures
- Multi-tenant isolation (each tester's data is in their own subtree).
- TLS for every external API call.
- Encrypted backups via rclone crypt.
- File permissions on config.yaml and session state set to 0600 (owner-only).
- Wizard CSRF / DNS-rebinding defenses.
- Atomic state transitions to prevent corruption under concurrent use.
- HMAC-signed webhooks for email-forwarding inbound (fail-closed by default).
- Secret-key redaction in operator-side error logs covering all 8 supported LLM provider key shapes.
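The "fail-closed" behaviour of the signed webhook can be sketched as follows: a request is accepted only when a secret is configured, a signature is present, and the two match. The hex-encoded HMAC-SHA256 over the raw request body is an assumption for illustration; each email provider defines its own signing scheme, and verify_webhook is a hypothetical name.

```python
import hashlib
import hmac
from typing import Optional


def verify_webhook(
    secret: Optional[bytes], body: bytes, signature_hex: Optional[str]
) -> bool:
    """Return True only when the signature matches; reject on any missing input."""
    if not secret or not signature_hex:
        return False  # fail closed: no secret or no signature means rejection
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    supplied = signature_hex.strip().lower()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

The design choice worth noting is the first branch: a missing or misconfigured secret blocks inbound email rather than silently letting unsigned requests through.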
10. Changes to this policy
If we materially change this policy, you will be re-prompted to accept the new consent notice on your next interaction with the bot (the consent cache is bound to a notice version hash).
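A minimal sketch of how a consent cache bound to a notice hash can trigger re-prompting is shown below. SHA-256 and the function names notice_hash and needs_reconsent are assumptions; the policy states only that the cache is bound to a notice version hash.

```python
import hashlib
from typing import Optional


def notice_hash(notice_text: str) -> str:
    """Fingerprint the consent notice; any edit to the text changes the hash."""
    return hashlib.sha256(notice_text.encode("utf-8")).hexdigest()


def needs_reconsent(stored_hash: Optional[str], current_notice: str) -> bool:
    """True if the tester never consented or consented to a different notice."""
    return stored_hash != notice_hash(current_notice)
```

Because the stored value depends on the exact notice text, a tester who accepted an earlier notice is prompted again as soon as the text changes.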
Last reviewed: 2026-06-08.