Project Overview and Challenge
Project Overview
The Challenge
- Content Volume: 1000+ pages of rich content including articles, components, and dynamic zones
- Target Languages: Multiple locales (German, Turkish, French, Spanish, Italian, etc.)
- Content Complexity: Nested components, dynamic zones, rich text blocks, relations, media
- Quality Requirements: Professional-grade translations maintaining tone, formatting, and SEO elements
Why not use Strapi’s built-in AI translator? It’s not automated, offers limited support for bulk translation, and still requires manual work to set up relations, publish pages, and handle images. Once you manage more than 10 languages with a small team, doing this by hand stops being realistic.
Solution Architecture and Data Flow
The Solution
A custom translation extension for Strapi CMS that:
- Processes translations as background jobs with real-time progress tracking
- Handles complex nested content structures (components, dynamic zones, blocks)
- Preserves HTML, Markdown, URLs, placeholders, and special formatting
- Supports job cancellation, retry logic, and error recovery
- Provides a polished admin UI with model selection and translation settings
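The retry logic mentioned in the list above can be pictured as a small backoff policy. This is a minimal sketch, not the extension's actual code: `RetryPolicy`, `isRetryableStatus`, and `nextRetryDelayMs` are hypothetical names, and the delays are illustrative values.

```typescript
// Hypothetical retry policy; none of these names come from the actual extension.
interface RetryPolicy {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound for any single delay
}

// HTTP statuses that are usually worth retrying: rate limiting and server errors.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

// Exponential backoff: base * 2^(attempt - 1), capped at maxDelayMs.
// `attempt` is 1 for the first failed try. Returns null once retries are exhausted.
function nextRetryDelayMs(attempt: number, policy: RetryPolicy): number | null {
  if (attempt >= policy.maxAttempts) return null;
  const delay = policy.baseDelayMs * Math.pow(2, attempt - 1);
  return Math.min(delay, policy.maxDelayMs);
}

const policy: RetryPolicy = { maxAttempts: 4, baseDelayMs: 1000, maxDelayMs: 5000 };
const schedule = [1, 2, 3, 4].map((attempt) => nextRetryDelayMs(attempt, policy));
console.log(schedule); // [1000, 2000, 4000, null]
```

A job that receives `null` here would be marked as failed, so an editor can retry it manually instead of the system looping forever.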
Data Flow
┌─────────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Admin UI │────▶│ REST API │────▶│ Job Manager │
│ (React Modal) │ │ /translate │ │ (Background) │
└─────────────────┘ └──────────────┘ └────────┬────────┘
│
┌────────────────────────┘
▼
┌─────────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Content │────▶│ Translator │────▶│ OpenAI API │
│ Extractor │ │ (Batching) │ │ (GPT Models) │
└─────────────────┘ └──────────────┘ └─────────────────┘
│
▼
┌─────────────────┐ ┌──────────────┐ ┌─────────────────┐
│ Content │────▶│ Strapi DB │────▶│ Published │
│ Rebuilder │ │ Update │ │ Translations │
└─────────────────┘ └──────────────┘ └─────────────────┘
Key Features of the Translation System
Key Features
1. Background Job System
Translations are processed as background jobs managed by a dedicated job manager. This enables long-running operations, real-time progress tracking, cancellation, and retry behavior without blocking the Strapi admin UI.
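To make the job lifecycle concrete, here is a minimal in-memory sketch. It assumes a simple status model (`queued`, `running`, `completed`, `failed`, `cancelled`) and a hypothetical `JobManager` class. The real job manager is not shown in this article and would also need persistence and a worker loop.

```typescript
// Hypothetical job model; the real extension's types are not shown in this article.
type JobStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

interface TranslationJob {
  id: string;
  status: JobStatus;
  totalFields: number;
  doneFields: number;
}

class JobManager {
  private jobs = new Map<string, TranslationJob>();
  private nextId = 1;

  create(totalFields: number): TranslationJob {
    const job: TranslationJob = { id: `job-${this.nextId++}`, status: "queued", totalFields, doneFields: 0 };
    this.jobs.set(job.id, job);
    return job;
  }

  // Record finished fields; ignored once the job is cancelled or otherwise finished.
  reportProgress(id: string, fieldsDone: number): void {
    const job = this.mustGet(id);
    if (job.status === "queued") job.status = "running";
    if (job.status !== "running") return;
    job.doneFields = Math.min(job.totalFields, job.doneFields + fieldsDone);
    if (job.doneFields === job.totalFields) job.status = "completed";
  }

  // Cancellation only affects jobs that have not finished yet.
  cancel(id: string): boolean {
    const job = this.mustGet(id);
    if (job.status !== "queued" && job.status !== "running") return false;
    job.status = "cancelled";
    return true;
  }

  // Percentage for a progress bar in the admin UI.
  progressPercent(id: string): number {
    const job = this.mustGet(id);
    return job.totalFields === 0 ? 100 : Math.round((job.doneFields / job.totalFields) * 100);
  }

  private mustGet(id: string): TranslationJob {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`Unknown job: ${id}`);
    return job;
  }
}

const manager = new JobManager();
const job = manager.create(50);
manager.reportProgress(job.id, 20);
console.log(manager.progressPercent(job.id)); // 40
manager.cancel(job.id);
manager.reportProgress(job.id, 30); // ignored: the job is cancelled
console.log(job.status, manager.progressPercent(job.id)); // cancelled 40
```

Because progress reports are ignored after cancellation, a worker that is mid-batch when the editor cancels cannot flip the job back to running.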
2. Smart Content Extraction
A content extractor walks Strapi entries, components, and dynamic zones to locate translatable fields while preserving non-translatable structures like IDs, relations, and media references.
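The walk described above can be sketched as a recursive traversal that records the path of every translatable string. This is an illustration under simplifying assumptions: entries are treated as plain JSON, and the set of keys to skip is passed in by hand, whereas a real extractor would presumably derive it from the Strapi content-type schema. `extractFields` and `applyTranslations` are hypothetical names.

```typescript
// Hypothetical shapes: real Strapi entries carry more metadata than this.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type Path = (string | number)[];

interface ExtractedField {
  path: Path; // where the string lives inside the entry
  text: string;
}

// Collect every non-empty string, skipping keys that hold IDs, relations, or media.
function extractFields(node: Json, skipKeys: Set<string>, path: Path = []): ExtractedField[] {
  if (typeof node === "string") {
    return node.trim() === "" ? [] : [{ path, text: node }];
  }
  const found: ExtractedField[] = [];
  if (Array.isArray(node)) {
    node.forEach((item, index) => {
      found.push(...extractFields(item, skipKeys, [...path, index]));
    });
  } else if (node !== null && typeof node === "object") {
    for (const key of Object.keys(node)) {
      if (skipKeys.has(key)) continue;
      found.push(...extractFields(node[key], skipKeys, [...path, key]));
    }
  }
  return found; // numbers, booleans, and null contribute nothing
}

// Write translated strings back into a deep copy, leaving everything else untouched.
function applyTranslations(entry: Json, translated: ExtractedField[]): Json {
  const copy: Json = JSON.parse(JSON.stringify(entry));
  for (const { path, text } of translated) {
    if (path.length === 0) continue; // a bare string entry has nothing to rebuild
    let cursor: any = copy;
    for (const key of path.slice(0, -1)) cursor = cursor[key];
    cursor[path[path.length - 1]] = text;
  }
  return copy;
}

const entry: Json = {
  id: 7,
  title: "Hello",
  seo: { metaTitle: "Welcome", id: 3 },
  blocks: [{ __component: "shared.rich-text", body: "<p>Hi</p>" }],
  cover: { url: "/uploads/a.png" },
};
const skip = new Set(["id", "__component", "cover"]);
const fields = extractFields(entry, skip);
console.log(fields.map((f) => f.path.join("."))); // ["title", "seo.metaTitle", "blocks.0.body"]
```

Keeping the path alongside each string is what lets the rebuilder put translations back in exactly the right place without touching IDs, relations, or media references.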
3. Multi-Model Support
The translator supports multiple OpenAI GPT models so teams can balance cost, speed, and quality depending on the project and target locale.
4. Intelligent Batching
Fields are grouped into batches to keep token usage efficient while staying within rate limits. This batching is key to reaching 1000+ pages within a 24-hour window.
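A rough sketch of this grouping, assuming two limits per batch: a maximum number of fields and a character budget used as a cheap stand-in for tokens. The function name `batchFields` and both limits are assumptions for illustration; the article does not show the real batching code.

```typescript
interface Field {
  id: string;
  text: string;
}

// Group fields so each batch has at most `maxFields` items and roughly `maxChars` characters.
// A single field longer than `maxChars` still gets a batch of its own.
function batchFields(fields: Field[], maxFields: number, maxChars: number): Field[][] {
  const batches: Field[][] = [];
  let current: Field[] = [];
  let currentChars = 0;
  for (const field of fields) {
    const wouldOverflow =
      current.length >= maxFields ||
      (current.length > 0 && currentChars + field.text.length > maxChars);
    if (wouldOverflow) {
      batches.push(current);
      current = [];
      currentChars = 0;
    }
    current.push(field);
    currentChars += field.text.length;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// One page with 50 short fields and a batch size of 20 yields batches of 20, 20, and 10.
const pageFields: Field[] = [];
for (let i = 0; i < 50; i++) pageFields.push({ id: `f${i}`, text: "Short text" });
console.log(batchFields(pageFields, 20, 10000).map((b) => b.length)); // [20, 20, 10]
```

Counting characters is only an approximation of token usage; a tokenizer-based budget would be more precise at the cost of an extra dependency.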
5. Translation Behavior Settings
Admins can configure how literally or loosely content should be translated, whether to preserve brand terms, and how to handle placeholders, HTML, and Markdown.
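One common way to keep placeholders, HTML, and URLs intact is to swap them for opaque tokens before translation and restore them afterwards. The sketch below illustrates that technique; it is not necessarily how this extension does it. The token format `⟦n⟧`, the `{{name}}` placeholder syntax, and the function names are all assumptions.

```typescript
interface Masked {
  text: string; // source text with protected spans replaced by tokens
  originals: string[]; // protected spans, indexed by token number
}

// HTML tags, http(s) URLs, and {{placeholders}} are treated as protected spans.
const PROTECTED = /<[^>]+>|https?:\/\/[^\s<>"')]+|\{\{[^}]+\}\}/g;

function mask(source: string): Masked {
  const originals: string[] = [];
  const text = source.replace(PROTECTED, (match) => {
    originals.push(match);
    return `⟦${originals.length - 1}⟧`;
  });
  return { text, originals };
}

// Restore protected spans; unknown tokens are left as-is so problems stay visible.
function unmask(translated: string, originals: string[]): string {
  return translated.replace(/⟦(\d+)⟧/g, (token, index) => {
    const original = originals[Number(index)];
    return original === undefined ? token : original;
  });
}

const masked = mask("<p>Hello {{name}}, visit https://example.com/docs today.</p>");
console.log(masked.text); // ⟦0⟧Hello ⟦1⟧, visit ⟦2⟧ today.⟦3⟧

// Stand-in for the model's output: only the words around the tokens change.
const german = masked.text.replace("Hello", "Hallo").replace("visit", "besuche").replace("today", "heute");
console.log(unmask(german, masked.originals));
// <p>Hallo {{name}}, besuche https://example.com/docs heute.</p>
```

A model can still drop or reorder tokens, so it is worth checking that every token from the masked source appears exactly once in the translation before unmasking.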
6. Prompt Configuration
Prompts sent to GPT models are configurable, making it possible to tune tone of voice, formality, and locale-specific preferences per project.
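As an illustration of what configurable can mean here, the sketch below assembles a system prompt from a settings object. The `PromptSettings` fields and the prompt wording are hypothetical; the actual prompts used by the extension are not shown in this article.

```typescript
// Hypothetical settings object; field names are illustrative only.
interface PromptSettings {
  sourceLocale: string;
  targetLocale: string;
  tone: "formal" | "informal" | "neutral";
  style: "literal" | "balanced" | "free";
  glossary: string[]; // brand terms to keep untranslated
}

function buildSystemPrompt(settings: PromptSettings): string {
  const lines = [
    `You are a professional translator. Translate from ${settings.sourceLocale} to ${settings.targetLocale}.`,
    `Tone: ${settings.tone}. Translation style: ${settings.style}.`,
    "Keep HTML tags, Markdown syntax, URLs, and placeholders exactly as they appear.",
    "Return one translation per input item, in the same order.",
  ];
  if (settings.glossary.length > 0) {
    lines.push(`Do not translate these terms: ${settings.glossary.join(", ")}.`);
  }
  return lines.join("\n");
}

const prompt = buildSystemPrompt({
  sourceLocale: "en",
  targetLocale: "de",
  tone: "formal",
  style: "balanced",
  glossary: ["Strapi", "DISEEC"],
});
console.log(prompt);
```

Keeping the prompt in one function makes it easy to store settings per project and to review exactly what is sent to the model.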
7. Relation Handling
The system respects and rebuilds relations between entries after translation so localized content remains correctly linked across locales.
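Conceptually, rebuilding relations means replacing each related entry's ID with the ID of its counterpart in the target locale. The sketch below assumes a hypothetical lookup from source ID to per-locale ID. How that lookup is built depends on the Strapi version and is not covered here.

```typescript
// Hypothetical lookup: source entry ID -> ID of its counterpart in each locale.
type LocalizationIndex = Map<number, Map<string, number>>;

interface RemapResult {
  resolved: number[]; // IDs to link on the translated entry
  unresolved: number[]; // source IDs with no counterpart in the target locale yet
}

function remapRelations(sourceIds: number[], locale: string, index: LocalizationIndex): RemapResult {
  const result: RemapResult = { resolved: [], unresolved: [] };
  for (const sourceId of sourceIds) {
    const perLocale = index.get(sourceId);
    const localizedId = perLocale ? perLocale.get(locale) : undefined;
    if (localizedId === undefined) result.unresolved.push(sourceId);
    else result.resolved.push(localizedId);
  }
  return result;
}

const index: LocalizationIndex = new Map();
index.set(1, new Map<string, number>([["de", 101], ["fr", 201]]));
index.set(2, new Map<string, number>([["fr", 202]]));

console.log(remapRelations([1, 2, 3], "de", index));
// { resolved: [101], unresolved: [2, 3] }
```

Tracking unresolved IDs separately allows a second pass once the related entries have been translated, instead of silently linking to the wrong locale.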
8. User Experience Features
A React-based modal in the Strapi admin UI exposes model selection, target locales, translation options, and real-time progress, so editors can safely initiate and monitor large translation jobs.
Throughput and 1000 Pages Estimation
1000 Pages Estimation
Assuming an average of 50 translatable fields per page, batches of 20 fields, about 5 seconds per API call, and 5 target languages processed one after another:
1000 pages × 50 fields = 50,000 fields to translate
50,000 fields ÷ 20 batch size = 2,500 API calls
2,500 calls × 5 seconds average = 12,500 seconds = ~3.5 hours per language
5 languages × 3.5 hours = ~17.5 hours total
+ Overhead (extraction, saving, relations) = ~20–24 hours
Conclusion and System Outcomes
Conclusion
This translation system shows how to build a production-grade, AI-powered content translation pipeline for Strapi CMS. The combination of background processing, intelligent batching, comprehensive error handling, and a polished UX creates a tool capable of translating large content libraries efficiently and reliably.
Key metrics achieved:
- ✅ 1000+ pages translatable in 24 hours
- ✅ Support for a broad range of languages
- ✅ Real-time progress tracking
- ✅ No data loss observed in tested runs with proper error handling
- ✅ Configurable quality vs. speed tradeoffs
- ✅ Professional-grade translation quality
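The 24-hour figure in the first metric rests on the arithmetic from the 1000 Pages Estimation section. A small sketch for rerunning it with other assumptions follows. `estimateHours` is a hypothetical helper that assumes languages are processed sequentially and leaves out the overhead for extraction, saving, and relations.

```typescript
interface EstimateInput {
  pages: number;
  fieldsPerPage: number;
  batchSize: number;
  secondsPerCall: number;
  languages: number;
}

interface Estimate {
  fields: number;
  callsPerLanguage: number;
  hoursPerLanguage: number;
  totalHours: number; // API time only, without overhead
}

function estimateHours(input: EstimateInput): Estimate {
  const fields = input.pages * input.fieldsPerPage;
  const callsPerLanguage = Math.ceil(fields / input.batchSize);
  const hoursPerLanguage = (callsPerLanguage * input.secondsPerCall) / 3600;
  return { fields, callsPerLanguage, hoursPerLanguage, totalHours: hoursPerLanguage * input.languages };
}

const estimate = estimateHours({ pages: 1000, fieldsPerPage: 50, batchSize: 20, secondsPerCall: 5, languages: 5 });
console.log(estimate.fields, estimate.callsPerLanguage); // 50000 2500
console.log(estimate.hoursPerLanguage.toFixed(2), estimate.totalHours.toFixed(2)); // 3.47 17.36
```

The unrounded total is slightly below the 17.5 hours quoted in the estimate because that figure rounds each language up to 3.5 hours first.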
Closing Thoughts & What’s Next
"Manual translation works, until your content grows faster than your team. That’s the moment everything starts to crack." - Emre Yılmaz, Senior Content Manager at DISEEC
Once content reaches a certain size, effort stops scaling linearly.
What works at ten pages quietly breaks at a hundred. What feels manageable in one language becomes fragile across ten. Not because people stop caring—but because manual processes don’t survive growth.
The most expensive failures are rarely obvious. They show up as hesitation to edit content, fear of publishing, or workflows no one fully trusts anymore. By the time these problems are visible, they’ve usually been around for a while.
That realization is what led us here.
This translation system didn’t begin as a product or a feature—it began as a response to real constraints in a production environment. And it quickly became clear that this problem isn’t unique to one team or one project.
So we’re opening it up.
We’re preparing to open source the entire system—not a demo, not a simplified example, but the actual infrastructure that runs this pipeline in production. The job system, the content handling logic, the batching strategies, the safeguards—everything that makes it work at scale.
We’re currently finalizing documentation and cleaning the last rough edges before publishing the repository.
If you want to know when it goes live, get early access, or follow how this evolves in the open, subscribe.
I also share practical lessons from building and running systems like this—CMS scaling, AI in production, and the tradeoffs that don’t show up in tutorials.
No hype. No fluff. Just things that work.
If that sounds useful, you know what to do.



