
E2E Encrypted PII-Scrubbing AI Proxy
A monetized, privacy-preserving SaaS platform that enables businesses to use generative AI by automatically scrubbing, encrypting, and anonymizing PII. The platform features a secure, OTP-verified signup and login system, an AI-powered Risk Dashboard for compliance analysis, multi-modal (image/PDF) text extraction via Vision LLM, a multi-provider LLM selection system (OpenAI, Anthropic, Google), and a user-controlled two-phase scrub-and-review workflow with Redis-backed session management.
Defining the core problem and identified pain points that necessitated this technical intervention.
Businesses are blocked from using powerful LLMs (such as GPT, Claude, and Gemini) because sending customer PII (emails, phone numbers, credit card details, names, addresses) to third-party APIs is a major security liability and can violate regulations such as GDPR, CCPA, and HIPAA.
The architectural and implementation strategy developed to resolve the challenge.
A standalone app that scrubs PII using a two-pass detection system (NER via OpenRouter/Mistral and Vision OCR via Groq for images/PDFs). It features a multi-step preview-confirm workflow, multi-provider LLM selection (GPT-4o, GPT-5, Claude 3, Gemini), and end-to-end AES-256-GCM encryption with a random nonce for every stored message.
A tool that integrates directly with ChatGPT via OpenAPI Actions. It intercepts user prompts, scrubs PII, sends anonymized tokens (e.g., `<EMAIL_1>`, `<PHONE_1>`) to ChatGPT, and securely restores the original values in the response using a temporary Redis-backed token map with a TTL, so that detected PII never reaches OpenAI.
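The tokenization round trip described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes simple regexes as a stand-in for the LLM-based NER, a plain dict in place of the Redis-backed token map with TTL, and the hypothetical function names `scrub` and `restore`.

```python
import re

# Illustrative patterns only. The platform described here uses LLM-based NER;
# these regexes are a simplified stand-in for that detection step.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scrub(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with numbered tokens and return the token map."""
    token_map: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        seen: dict[str, str] = {}  # repeated values reuse the same token

        def substitute(match: re.Match) -> str:
            value = match.group(0)
            if value not in seen:
                token = f"<{label}_{len(seen) + 1}>"
                seen[value] = token
                token_map[token] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, token_map


def restore(text: str, token_map: dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text
```

In this sketch only the scrubbed text would be sent to the model, while the token map stays server-side and is applied to the reply with `restore`.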
A compliance portal that aggregates all PII detection events and leverages Groq (Llama 4 Scout) to perform automated risk assessments, providing numeric risk scores, risk levels (Low/Moderate/High/Critical), rationale, and actionable recommendations.
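One way to handle the assessment fields listed above is to validate the model's reply before it is displayed. The sketch below assumes the model returns JSON; the field names (`score`, `level`, `rationale`, `recommendations`) and the `parse_assessment` helper are hypothetical, and only the four risk levels come from the description.

```python
import json
from dataclasses import dataclass

# The four levels named in the description above.
RISK_LEVELS = ("Low", "Moderate", "High", "Critical")


@dataclass(frozen=True)
class RiskAssessment:
    score: float
    level: str
    rationale: str
    recommendations: tuple[str, ...]


def parse_assessment(raw: str) -> RiskAssessment:
    """Validate a JSON reply (assumed shape) and reject unknown risk levels."""
    data = json.loads(raw)
    level = data["level"]
    if level not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {level!r}")
    return RiskAssessment(
        score=float(data["score"]),
        level=level,
        rationale=str(data["rationale"]),
        recommendations=tuple(str(item) for item in data["recommendations"]),
    )
```

Rejecting anything outside the known levels keeps a malformed model reply from being shown as a valid assessment.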
A complete SaaS backend secured by email-based OTP verification with branded HTML templates, JWT session management, a credit-based free trial (4 messages), webhook-less Stripe subscription integration, and per-LLM token usage tracking.
My specific roles, responsibilities, and the technical value I added to the project lifecycle.
Architected and built the entire FastAPI application with a Hybrid Layered Architecture, including the AI-powered Risk Dashboard and two-phase scrub-and-review workflow.
Built the complete auth system with email-based OTP verification (branded HTML templates), a password reset flow, and JWT handling with 12-hour access tokens and 7-day refresh tokens.
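The OTP step of such a flow might look like the sketch below. The code length, validity window, and function names are assumptions, since the description does not state them; email delivery and persistent storage are left out.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

OTP_DIGITS = 6         # assumed length; not stated in the description
OTP_TTL_SECONDS = 300  # assumed validity window


def generate_otp() -> str:
    """Return a zero-padded numeric code from a cryptographically secure source."""
    return f"{secrets.randbelow(10 ** OTP_DIGITS):0{OTP_DIGITS}d}"


def hash_otp(code: str) -> str:
    """Digest of the code, so the stored record does not contain it in clear."""
    return hashlib.sha256(code.encode()).hexdigest()


def make_record(code: str, now: Optional[float] = None) -> dict:
    """Build the record that would be stored alongside the pending signup."""
    issued = time.time() if now is None else now
    return {"digest": hash_otp(code), "expires_at": issued + OTP_TTL_SECONDS}


def verify_otp(record: dict, submitted: str, now: Optional[float] = None) -> bool:
    """Accept the code only before expiry, using a constant-time comparison."""
    current = time.time() if now is None else now
    if current > record["expires_at"]:
        return False
    return hmac.compare_digest(record["digest"], hash_otp(submitted))
```

A real deployment would also limit verification attempts per record, which this sketch omits.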
Engineered the core two-pass PII scrubber using Vision LLM (Groq) for OCR extraction from images/PDFs and Text LLM (OpenRouter/Mistral) for NER-based entity detection.
Built the AI-powered Risk Dashboard, integrating Groq (Llama 4 Scout) with structured output to generate automated compliance assessments from aggregated PII event logs.
Implemented multi-modal text extraction supporting single images, multiple images, and multi-page PDFs using batch processing with Vision LLM and sentence-boundary chunking.
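Sentence-boundary chunking of extracted text can be sketched as below. The splitting rule (whitespace after `.`, `!`, or `?`), the `max_chars` limit, and the function name are assumptions for illustration; the project's actual chunk size and splitter are not given.

```python
import re


def chunk_by_sentence(text: str, max_chars: int = 1000) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars becomes its own chunk.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Keeping sentences whole means an entity such as a name or address is less likely to be cut in half between two detection calls.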
Developed the interactive Preview-Confirm workflow, storing scrub previews in Redis with TTL and allowing users to modify, rename, or remove PII tokens before LLM submission.
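The review step can be pictured as edits applied to a stored preview before anything is submitted. The sketch below covers renaming and removing tokens; the `ScrubPreview` class and its method names are hypothetical, and the Redis storage with TTL is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ScrubPreview:
    text: str  # scrubbed text shown to the user
    token_map: dict[str, str] = field(default_factory=dict)  # token -> original value

    def rename_token(self, old: str, new: str) -> None:
        """Give an entity a different placeholder while keeping its value."""
        if old not in self.token_map:
            raise KeyError(old)
        if new in self.token_map:
            raise ValueError(f"{new} is already in use")
        self.token_map[new] = self.token_map.pop(old)
        self.text = self.text.replace(old, new)

    def remove_token(self, token: str) -> None:
        """Drop a token: the user decided the value does not need scrubbing."""
        value = self.token_map.pop(token)
        self.text = self.text.replace(token, value)
```

For example, a user who sees a company name flagged as a person could call `remove_token` on it so the model receives the real name, while every remaining token stays masked.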
Designed the multi-provider system supporting OpenAI (GPT-4o, GPT-5), Anthropic (Claude 3 Haiku/Sonnet), and Google (Gemini Pro/Flash) with centralized provider management.
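Centralized provider management can be sketched as a registry that maps each model name to the provider serving it. Everything here is illustrative: the `ProviderManager` class, the model identifier strings, and the `echo` stubs that stand in for real vendor SDK calls are assumptions, not the project's code.

```python
from typing import Callable

# (model, prompt) -> reply. Real code would wrap each vendor SDK behind this.
CompletionFn = Callable[[str, str], str]


class ProviderManager:
    """Single lookup point from model name to the provider that serves it."""

    def __init__(self) -> None:
        self._routes: dict[str, tuple[str, CompletionFn]] = {}

    def register(self, provider: str, models: list[str], call: CompletionFn) -> None:
        for model in models:
            self._routes[model] = (provider, call)

    def provider_for(self, model: str) -> str:
        if model not in self._routes:
            raise ValueError(f"unsupported model: {model}")
        return self._routes[model][0]

    def complete(self, model: str, prompt: str) -> str:
        self.provider_for(model)  # raises ValueError for unknown models
        _, call = self._routes[model]
        return call(model, prompt)


def echo(provider: str) -> CompletionFn:
    """Stub that stands in for a real API call."""
    return lambda model, prompt: f"[{provider}:{model}] {prompt}"


# Placeholder model identifiers, chosen for illustration only.
manager = ProviderManager()
manager.register("openai", ["gpt-4o"], echo("openai"))
manager.register("anthropic", ["claude-3-haiku"], echo("anthropic"))
manager.register("google", ["gemini-pro"], echo("google"))
```

With this shape, adding a provider is one `register` call, and the rest of the application only ever passes a model name and an already scrubbed prompt.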
Developed the ChatGPT Plugin architecture with API key authentication, session-based token mapping in Redis with TTL, and encrypted chat storage for full ChatGPT Actions integration.
Implemented the webhook-less Stripe integration with on-demand sync, subscription create/cancel/modify flows, payment intent generation, and automatic user status synchronization.
Developed the metered 'free trial' system (4 credits), automatic credit decrement, plan-based access control (FREE/PRO), and per-LLM token usage tracking.
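The metering logic can be sketched as below. The four trial credits and the FREE/PRO plans come from the description; treating PRO as unmetered, and the names `Account`, `OutOfCredits`, and `charge_message`, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum

FREE_TRIAL_CREDITS = 4  # from the description: four free messages


class Plan(str, Enum):
    FREE = "FREE"
    PRO = "PRO"


class OutOfCredits(Exception):
    """Raised when a FREE account has no trial credits left."""


@dataclass
class Account:
    plan: Plan = Plan.FREE
    credits: int = FREE_TRIAL_CREDITS
    tokens_by_model: dict[str, int] = field(default_factory=dict)


def charge_message(account: Account, model: str, tokens_used: int) -> None:
    """Gate one message by plan, then record its token usage per model."""
    if account.plan is Plan.FREE:
        if account.credits <= 0:
            raise OutOfCredits("free trial exhausted")
        account.credits -= 1
    # Assumption: PRO accounts are not credit-limited, only usage-tracked.
    account.tokens_by_model[model] = account.tokens_by_model.get(model, 0) + tokens_used
```

In a live request path the credit check would run before the LLM call and the usage update after it; the sketch folds both into one function for brevity.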
Implemented full AES-256-GCM encryption for all chat messages and titles in MongoDB with random 12-byte nonces, bcrypt password hashing, and secure key management.
Two-pass NER + Vision OCR scrubbing keeps detected raw PII from reaching OpenAI, Anthropic, or Google servers.
All chat messages and titles are stored in MongoDB encrypted with AES-256-GCM, using a random 12-byte nonce per message.
Platform architecture is designed to support GDPR, CCPA, and HIPAA compliance through automated PII anonymization.
Single platform routes to GPT-4o, GPT-5, Claude 3 Haiku/Sonnet, Gemini Pro/Flash via a centralized provider manager.
AI Risk Dashboard classifies exposure events into Low, Moderate, High, and Critical with scored recommendations.
Combines Mistral NER (text entities) and Groq Vision OCR (images/PDFs) for comprehensive multi-modal PII detection.