Translate Document — 20 Languages, Preserve Formatting, Free

Translate text, markdown, HTML, CSV, JSON, or PDF documents into 20 languages while preserving formatting. Powered by Claude Haiku 4.5.

Translate Document is for translating files where structure matters — a markdown blog post (keep the headings, lists, code blocks intact), a CSV (translate cell values, not headers, preserve commas), an HTML email (keep tags, translate text nodes), or a PDF report (extract text in browser, translate, output plain text). Google Translate handles plain text well but mangles markdown and inline code; DeepL is similar.

Pickrack's Translate Document uses Claude Haiku 4.5 with explicit format-preservation instructions. The tool splits long documents into ~3500-character chunks at paragraph or sentence boundaries, translates each chunk with a consistent system prompt (locked tone, locked terminology), then concatenates the results. Markdown structure (# headings, **bold**, `` code blocks ``), HTML tags, JSON keys, and CSV delimiters are preserved.
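The splitting step can be sketched roughly as follows. This is an illustration under assumptions, not Pickrack's actual code: the function names and regular expressions are invented here, and the word-level and hard-cut fallbacks are added so that no chunk can exceed the limit. Pieces keep their trailing whitespace, so joining the chunks reproduces the input exactly.

```python
import re

# Tried in order: paragraph (blank line), sentence end, word. Patterns are assumptions.
_LEVELS = [
    re.compile(r".*?(?:\n\s*\n|\Z)", re.S),
    re.compile(r".*?(?:[.!?]+\s+|\Z)", re.S),
    re.compile(r"\S+\s*|\s+"),
]


def _pieces(text: str, max_chars: int, level: int = 0):
    """Yield pieces no longer than max_chars, splitting as coarsely as possible."""
    if len(text) <= max_chars:
        yield text
    elif level == len(_LEVELS):
        # A single "word" longer than the limit: cut it at fixed offsets.
        for i in range(0, len(text), max_chars):
            yield text[i:i + max_chars]
    else:
        for part in _LEVELS[level].findall(text):
            if part:
                yield from _pieces(part, max_chars, level + 1)


def split_into_chunks(text: str, max_chars: int = 3500) -> list[str]:
    """Greedily pack pieces into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for piece in _pieces(text, max_chars):
        if current and len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


demo = split_into_chunks("One.\n\nTwo.", max_chars=6)
```

Greedy packing keeps the number of chunks low while still breaking only at the coarsest boundary that fits.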

The 20 target languages are Vietnamese, English, Spanish, French, German, Chinese (simplified and traditional), Japanese, Korean, Portuguese, Italian, Russian, Arabic, Hindi, Thai, Indonesian, Dutch, Polish, Turkish, and Swedish. The tool is free, with a daily quota of 5 documents per IP (document translation is heavier than text-only translation because it uses more tokens).

Key features

Markdown structure preserved — Headings stay as headings, lists as lists, code blocks untouched, links keep their URLs. Only visible text is translated.
HTML tags preserved — Translates text content between tags; tags themselves stay exactly as they are. Useful for translating email templates or static HTML pages.
PDF input via browser extraction — Drop a PDF; pdf.js extracts text in your browser. Only the extracted text is sent for translation — the PDF binary never uploads.
20 languages — Major world languages including all of CJK (zh, zh-tw, ja, ko), Southeast Asian (vi, th, id), European (en, es, fr, de, pt, it, ru, nl, pl, sv, tr), Middle Eastern (ar), and South Asian (hi).
Smart chunking — Documents are split at paragraph boundaries (preferred), sentence boundaries (fallback), or word boundaries (last resort). Each chunk is translated with consistent context for stable terminology.
Optional formatting strip — Toggle 'Preserve markdown/HTML' off to get continuous prose. Useful when the source has formatting you don't want in the output.
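For the HTML feature above, "translate text nodes, keep tags" can be illustrated with Python's standard html.parser. The tool itself relies on instructions to Claude, so treat this as a separate, deterministic sketch of the same idea; `TextNodeTranslator`, `translate_html`, and the `translate` callback are hypothetical names.

```python
from html import escape
from html.parser import HTMLParser
from typing import Callable


class TextNodeTranslator(HTMLParser):
    """Re-emit markup and pass only visible text through `translate`."""

    SKIP = {"script", "style"}  # contents are code, not prose

    def __init__(self, translate: Callable[[str], str]):
        super().__init__(convert_charrefs=True)
        self.translate = translate
        self.out: list[str] = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # raw tag, attributes untouched
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")  # note: end tags come back lower-cased
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            self.out.append(data)  # CSS and JS pass through as-is
        elif not data.strip():
            self.out.append(escape(data, quote=False))
        else:
            # Entities arrive decoded, so re-escape after translating.
            self.out.append(escape(self.translate(data), quote=False))

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def translate_html(source: str, translate: Callable[[str], str]) -> str:
    parser = TextNodeTranslator(translate)
    parser.feed(source)
    parser.close()  # flushes any trailing text node
    return "".join(parser.out)


# Stand-in translator for demonstration: upper-case instead of a real model call.
page = ('<p class="x">Hello <a href="https://example.com/hello">world</a> &amp; more</p>'
        '<style>p { color: red; }</style>')
translated_page = translate_html(page, str.upper)
```

Translating each text node separately protects the markup but gives the translator less context per call, which is one reason a prompt-based approach over larger chunks can read more naturally.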

How to use

Step 1: Upload or paste your source — Drop a .txt, .md, .html, .csv, .json, or .pdf file (up to 20 MB), or paste text directly into the source textarea. Files are read in your browser; PDFs are parsed with pdf.js.
Step 2: Pick a target language — Click a popular language pill (Vietnamese, English, Spanish, French, German, Chinese, Japanese, Korean) or open the dropdown for all 20.
Step 3: Choose format preservation — Default 'on' — keeps markdown/HTML structure. Turn off if you want plain prose output.
Step 4: Click Translate — Long documents are chunked and translated in passes. Progress isn't streamed in v1 (you'll see a spinner) but the chunk count is reported at the end.
Step 5: Copy or download — Copy to clipboard, or download as .txt. The output preserves the format you started with (minus stripped tags if you toggled off preservation).
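The translate step in the list above (chunked passes, one shared set of instructions, chunk count reported at the end) might look like the following sketch. The prompt wording, the function names, and the `call_model(system, text)` callback are all assumptions made for illustration; the real prompt and API wiring are not described in this page.

```python
from typing import Callable, Sequence


def build_system_prompt(target_language: str, preserve_format: bool) -> str:
    """Hypothetical prompt wording, built once so every chunk gets identical instructions."""
    rules = [
        f"Translate the user's text into {target_language}.",
        "Keep tone and terminology consistent.",
    ]
    if preserve_format:
        rules.append("Keep markdown and HTML markup exactly as written; translate only visible text.")
    else:
        rules.append("Drop markdown and HTML markup and return continuous prose.")
    return " ".join(rules)


def translate_document(
    chunks: Sequence[str],
    call_model: Callable[[str, str], str],
    target_language: str,
    preserve_format: bool = True,
) -> tuple[str, int]:
    """Translate chunks in order and return (joined text, number of chunks)."""
    system = build_system_prompt(target_language, preserve_format)
    translated = [call_model(system, chunk) for chunk in chunks]
    # Assumes paragraph-aligned chunks, so a blank line is a safe separator.
    return "\n\n".join(translated), len(translated)


# Fake model for demonstration: records the prompt and reverses the text.
seen_prompts: list[str] = []

def fake_model(system: str, text: str) -> str:
    seen_prompts.append(system)
    return text[::-1]

output, chunk_count = translate_document(["ab", "cd"], fake_model, "Vietnamese")
```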

When to use

Translate a markdown blog post — keep all the headings, lists, code samples, links exactly; translate only the prose
Translate a CSV — translate the cell values while keeping commas, quotes, and column structure intact
Translate an HTML email template — keep <table>, <a href=...>, <style> tags untouched, translate visible text only
Translate a JSON i18n file — translate the values, keep the keys exactly (useful prep for i18n migration)
Translate a PDF research summary — extract text from a 10-page report in your browser, translate to your language
Translate user-facing docs — README.md from English to Vietnamese while keeping all the markdown markup
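For the JSON i18n case in the list above, "translate the values, keep the keys" amounts to a recursive walk that touches only string values. This is a minimal sketch with a hypothetical `translate_json_values` helper and a stand-in `translate` callback, not the tool's own code.

```python
import json
from typing import Any, Callable


def translate_json_values(node: Any, translate: Callable[[str], str]) -> Any:
    """Return a copy of `node` with string values translated and keys unchanged."""
    if isinstance(node, dict):
        return {key: translate_json_values(value, translate) for key, value in node.items()}
    if isinstance(node, list):
        return [translate_json_values(item, translate) for item in node]
    if isinstance(node, str):
        return translate(node)
    return node  # numbers, booleans, and null pass through


# Stand-in translator for demonstration: upper-case instead of a real model call.
source = '{"nav": {"home": "Home", "items": ["Save", "Cancel"]}, "maxItems": 5}'
translated = translate_json_values(json.loads(source), str.upper)
print(json.dumps(translated, ensure_ascii=False, indent=2))
```

Values that contain interpolation placeholders such as `{count}` would need extra protection that this sketch does not attempt, so check those entries by hand.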

Frequently asked questions

Does it preserve every markdown feature?

It preserves all the common ones: headings, lists (ordered/unordered), bold/italic, links, code blocks (inline and fenced), tables, blockquotes. Edge cases like complex MDX components or HTML-in-markdown are best-effort — review the output. Claude is good at this but not perfect.
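One way to make "review the output" less manual is to compare a structural fingerprint of the source and the translation. The sketch below is an assumption-laden illustration rather than a feature of the tool: the regular expressions cover only headings, fenced blocks, inline code, and link URLs, and the function names are invented here.

```python
import re

TICKS = "`" * 3  # built programmatically so this example can sit inside a fence
FENCE = re.compile(rf"^{TICKS}.*?^{TICKS}", re.S | re.M)
INLINE_CODE = re.compile(r"`[^`\n]+`")
LINK_URL = re.compile(r"\]\(([^)\s]+)")
HEADING = re.compile(r"^(#{1,6})\s", re.M)


def structure_fingerprint(markdown: str) -> dict:
    """Collect the parts of a markdown document that translation should not change."""
    fences = FENCE.findall(markdown)
    prose = FENCE.sub("", markdown)  # ignore '#' comments and backticks inside code
    return {
        "heading_levels": [len(h) for h in HEADING.findall(prose)],
        "fenced_blocks": fences,
        "inline_code": INLINE_CODE.findall(prose),
        "link_urls": LINK_URL.findall(prose),
    }


def structure_diff(source: str, translated: str) -> list[str]:
    """Return the names of the structural features that differ."""
    before = structure_fingerprint(source)
    after = structure_fingerprint(translated)
    return [key for key in before if before[key] != after[key]]


code = TICKS + "python\n# not a heading\nprint('hi')\n" + TICKS + "\n"
src = "# Title\n\nUse `pip install x` and see [docs](https://example.com/a).\n\n" + code
good = "# Titre\n\nUtilisez `pip install x` et voyez [la doc](https://example.com/a).\n\n" + code
bad = good.replace("`pip install x`", "`pip installer x`").replace("# Titre", "## Titre")
```

An empty result from `structure_diff` does not prove the translation is good; it only shows that the checked markup survived.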

Why chunking instead of one big translation?

Claude Haiku has a 200K-token context window, so a 50,000-character document technically fits in a single call. But quality degrades on very long inputs because attention is spread thin. Chunking to about 3,500 characters per pass keeps Claude focused on each chunk and produces more consistent terminology and tone. The tradeoff is more API calls (cost) and slightly more time.

Will the same word always translate the same way across chunks?

Usually yes. Claude's training gives it strong terminology consistency. For technical documents with specialized vocabulary, the first chunk often establishes the term and subsequent chunks follow. For translations where consistency is critical (legal, medical), review the output.

Is the PDF uploaded?

No. PDF text extraction happens in your browser using pdf.js. Only the extracted plain text reaches our server. The PDF binary stays in your browser memory and is discarded when you close the tab. Verify in DevTools → Network.

Why the 5/day quota?

Document translation runs multiple Claude calls (one per chunk). A 50,000-char document is ~14 chunks → 14 API calls. At ~$0.01-0.05 per chunk depending on output length, that's roughly $0.14-0.70 per document. A limit of 5 documents/day per IP keeps the tool free without unbounded cost.
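The arithmetic behind that estimate can be written out as a short sketch. The chunk size and per-chunk prices are the approximate figures quoted above; `estimate_cost` is a hypothetical helper, and rounding to the nearest whole chunk is an assumption chosen to match the "~14 chunks" figure.

```python
def estimate_cost(
    chars: int,
    chunk_size: int = 3500,
    low_per_chunk: float = 0.01,
    high_per_chunk: float = 0.05,
) -> tuple[int, float, float]:
    """Return (approximate chunk count, low cost in USD, high cost in USD)."""
    chunks = max(1, round(chars / chunk_size))  # 50,000 / 3,500 is about 14.3
    return chunks, chunks * low_per_chunk, chunks * high_per_chunk


chunks, low, high = estimate_cost(50_000)
print(f"{chunks} chunks, ${low:.2f} to ${high:.2f}")  # 14 chunks, $0.14 to $0.70
```

At 5 documents per day, the worst case under these assumptions is about 5 × $0.70 = $3.50 per IP per day.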

Can I translate Word (.docx) or PowerPoint (.pptx) files directly?

Not in v1. Workaround: convert the .docx to .pdf or .md first (Pickrack has Word to PDF, and you can paste DOCX text by copying it from Word). Native .docx parsing in this tool is on the roadmap but lower-priority because the conversion-then-translate workflow already works.

What if my document is bigger than 50,000 chars?

It'll be truncated to 50,000 chars with a notice. For longer documents, split into multiple translations (chapter 1, chapter 2, ...) and concatenate the outputs. The chunking system handles boundaries cleanly so no sentence is split across translations.

How does this compare to Google Translate or DeepL for the same document?

For plain text: Google/DeepL are faster (sub-second), Claude is slower (10-30s for long documents) but often more natural. For markdown/HTML/structured input: Pickrack with format preservation usually wins — Google often breaks markdown links and code blocks. For maximum quality on natural prose, professional human translators still beat all three.

Related tools

AI Translator

Translate text into 20 languages while preserving formatting (markdown, bullets, paragraphs). Claude Haiku 4.5.

AI Summarizer

Summarize long text into 2-3 sentences, 4-6 sentences, or detailed bullets. Powered by Claude Haiku 4.5.

Chat with PDF

Ask questions about a PDF and get answers from Claude. PDF parsed in your browser — only extracted text reaches our server.