
Bilingual Editorial AI Workflows
A multilingual content operations platform that automates source ingestion, AI-assisted article creation, editorial review, and live publishing for a high-volume digital newsroom. The system combines scheduled feed monitoring, bilingual content generation, role-based workflows, and CMS controls into a single backend-driven publishing stack.
The core problem and the pain points that made this technical intervention necessary.
Modern digital publishers need to monitor many trusted sources, detect new stories quickly, convert raw web content into publishable articles, maintain editorial quality, and publish synchronized content in multiple languages. Doing this manually creates operational bottlenecks: reporters spend too much time rewriting source material, editors struggle to keep multilingual versions aligned, breaking updates are hard to maintain consistently, and leadership lacks visibility into content operations.
The architectural and implementation strategy developed to resolve the challenge.
Administrators register trusted sources, map feed/category URLs, and schedule polling jobs. The backend normalizes and deduplicates discovered article URLs globally, prioritizes queued work, and uses Redis-backed Celery workers to separate feed polling from downstream AI processing.
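The canonicalize-then-deduplicate step described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual rules: the list of tracking parameters, the function names `canonicalize` and `enqueue_new`, and the in-memory `seen` set (standing in for whatever shared store the backend uses) are all assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of tracking parameters to drop; the real rules are not given.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "fbclid", "gclid",
}


def canonicalize(url: str) -> str:
    """Return a normalized form of `url` usable as a deduplication key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    port = parts.port
    is_default = (scheme == "http" and port == 80) or (scheme == "https" and port == 443)
    if port and not is_default:
        host = f"{host}:{port}"
    # Treat "/story/" and "/story" as the same resource.
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so ordering does not matter.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((k, v) for k, v in pairs if k.lower() not in TRACKING_PARAMS)
    # The fragment is discarded because it never reaches the server.
    return urlunsplit((scheme, host, path, urlencode(kept), ""))


def enqueue_new(urls, seen):
    """Add unseen canonical URLs to `seen` and return them in discovery order."""
    fresh = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            fresh.append(key)
    return fresh
```

Because the key is computed before the membership check, two feeds that surface the same story with different tracking parameters or a trailing slash collapse into one queue item.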
A LangGraph-powered pipeline transforms raw source URLs into newsroom-ready drafts. After content extraction, the system rewrites articles into a standardized editorial structure, translates them into a second language, enriches them with SEO metadata, and optionally generates hero imagery, with per-stage progress tracked in MongoDB.
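The stage ordering and progress tracking can be illustrated with a small sketch. It deliberately swaps LangGraph for a plain list of functions and MongoDB for a dictionary so that it runs on its own; the stage bodies are stubs, and every name here (`run_pipeline`, the stage functions, the status strings) is hypothetical.

```python
# Stub stages: each takes the pipeline state and returns an extended copy.
# Real stages would call an extractor or an LLM; these only add placeholders.
def extract(state):
    return {**state, "raw_text": f"raw text fetched from {state['url']}"}


def rewrite(state):
    return {**state, "article": {"headline": "Placeholder headline", "body": state["raw_text"]}}


def translate(state):
    return {**state, "translation": {"headline": "Placeholder headline (translated)"}}


def enrich_seo(state):
    return {**state, "seo": {"slug": "placeholder-headline", "keywords": []}}


def generate_image(state):
    return {**state, "hero_image": "placeholder.png"}


STAGES = [("extract", extract), ("rewrite", rewrite), ("translate", translate), ("seo", enrich_seo)]


def run_pipeline(url, progress, with_image=False):
    """Run the stages in order, recording a status per stage in `progress`.

    `progress` is a plain dict standing in for a per-URL progress document.
    """
    stages = list(STAGES)
    if with_image:
        # Image generation is optional, so it is appended only on request.
        stages.append(("image", generate_image))
    progress[url] = {name: "pending" for name, _ in stages}
    state = {"url": url}
    for name, stage in stages:
        progress[url][name] = "running"
        try:
            state = stage(state)
        except Exception:
            # Leave a visible marker so the failed stage can be retried.
            progress[url][name] = "failed"
            raise
        progress[url][name] = "done"
    return state
```

Recording a status before and after each stage means a monitoring view can show which step a given URL is on, and a retry can tell which stage failed.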
Editors and reporters manage bilingual article pairs, linked by a shared group identifier, through workflows for creating, translating, submitting for review, approving, publishing, and unpublishing articles, and for managing placements (top-story, breaking-news, popular-news).
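One way to picture this workflow is as a small state machine applied to both articles in a group at once. The sketch below is an assumption-heavy illustration: the status names, the transition table, the role-to-action mapping, and the `primary`/`secondary` language keys are invented for the example and are not taken from the platform.

```python
# Allowed (current status, action) -> next status. Assumed, not from the source.
TRANSITIONS = {
    ("draft", "submit_for_review"): "in_review",
    ("changes_requested", "submit_for_review"): "in_review",
    ("in_review", "request_changes"): "changes_requested",
    ("in_review", "approve"): "approved",
    ("approved", "publish"): "published",
    ("published", "unpublish"): "unpublished",
    ("unpublished", "publish"): "published",
}

ALL_ACTIONS = {"submit_for_review", "request_changes", "approve", "publish", "unpublish"}

# Hypothetical mapping of roles to the actions they may perform.
ROLE_ACTIONS = {
    "admin": ALL_ACTIONS,
    "editor": ALL_ACTIONS,
    "reporter": {"submit_for_review"},
    "guest": set(),
}


def apply_action(group, role, action):
    """Return a copy of `group` with every sibling article moved by `action`.

    `group` looks like {"group_id": ..., "status": {language: status}}.
    Raises PermissionError if the role may not perform the action, and
    ValueError if any sibling cannot make the transition.
    """
    if action not in ROLE_ACTIONS.get(role, set()):
        raise PermissionError(f"{role} may not {action}")
    new_status = {}
    for lang, status in group["status"].items():
        target = TRANSITIONS.get((status, action))
        if target is None:
            raise ValueError(f"cannot {action} from {status}")
        new_status[lang] = target
    # Build a new dict so a failed transition never leaves a half-updated pair.
    return {**group, "status": new_status}
```

Computing the new status for every language before returning keeps the pair aligned: either both siblings move or neither does.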
Live updates in one language are automatically translated and mirrored to the sibling article. Version snapshots capture paired bilingual state for every critical event, providing a safety net for high-speed publishing.
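A rough sketch of mirroring plus snapshotting follows. It assumes a snapshot is taken of the paired state just before each mutation so that a rollback restores both languages together; whether the real system snapshots before or after the event is not stated. The `ArticleGroup` class, the function names, and the injected `translate` callable are all hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ArticleGroup:
    group_id: str
    # language -> article dict, e.g. {"title": ..., "live_updates": [...]}
    articles: dict
    versions: list = field(default_factory=list)


def snapshot(group, event):
    """Store a deep copy of both articles and return the version index."""
    group.versions.append({"event": event, "articles": copy.deepcopy(group.articles)})
    return len(group.versions) - 1


def add_live_update(group, lang, text, translate):
    """Append `text` to the `lang` article and a translation to each sibling.

    `translate(text, target_lang)` is supplied by the caller; in production
    it would presumably call the translation step of the AI pipeline.
    """
    version = snapshot(group, "live_update")  # capture the pre-update pair
    group.articles[lang]["live_updates"].append(text)
    for other in group.articles:
        if other != lang:
            group.articles[other]["live_updates"].append(translate(text, other))
    return version


def rollback(group, version):
    """Restore both articles to the state captured in `version`."""
    group.articles = copy.deepcopy(group.versions[version]["articles"])
```

Deep-copying on both snapshot and rollback keeps stored versions immutable in practice: later edits to the live articles cannot alter an earlier snapshot.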
The platform also provides JWT authentication, OTP/email verification, role-and-permission-based access control, media library management, translatable navigation menus, configurable page layouts, moderated comments, system health endpoints, and usage analytics.
My specific roles, responsibilities, and the technical value I added to the project lifecycle.
Architected a distributed ingestion pipeline that polls trusted sources on schedules, canonicalizes URLs, globally deduplicates queue items, and dispatches AI processing through separate Celery worker lanes.
Engineered a multi-step LLM workflow for extraction, rewriting, translation, SEO enrichment, and image generation using structured outputs for predictable and validation-friendly CMS data.
Implemented bilingual article-group modeling across separate language collections, enabling synchronized draft creation, mirrored publication states, and cross-language content management.
Designed robust editorial state transitions for draft, review, approval, publication, unpublication, and request-changes flows with permission-based access rules for admins, editors, reporters, and guests.
Built versioning and rollback services that snapshot paired article states for every critical publishing event, including live-update mutations, reducing operational risk during breaking-news scenarios.
Developed administrative tooling for ingestion monitoring, retry controls, force-stop behavior, media asset compression/storage, translatable menus, layout switching, and analytics logging.
First-draft article preparation time cut by 90%, from hours of manual rewriting to minutes of AI-assisted generation.
From source URL to a fully translated, SEO-enriched bilingual article pair, published in under 5 minutes end to end.
Scheduled ingestion engine continuously monitors 50+ trusted source feeds for new stories without manual checks.
Global URL canonicalization and deduplication prevent the same story from entering the editorial queue twice.
Role-based permission system covers Admins, Editors, Reporters, and Guests with distinct workflow access levels.
Immutable version snapshots enable safe rollback to any prior publication state during breaking-news corrections.