ES Translator
A lazy yet bulletproof machine translation tool for Elasticsearch.
What is es-translator?
es-translator reads documents from Elasticsearch, translates them using machine translation, and writes the translations back. It's designed for bulk translation of large document collections.
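The read, translate, write-back cycle described above can be pictured with a small sketch. This is not es-translator's code: the in-memory list stands in for an Elasticsearch index, `fake_translate` is a hypothetical placeholder for a real engine, and the field names `content` and `content_translated` are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]


def fake_translate(text: str, source: str, target: str) -> str:
    """Hypothetical stand-in for a real engine such as Argos or Apertium."""
    return f"[{source}->{target}] {text}"


def translate_index(
    docs: List[Document],
    translate: Callable[[str, str, str], str],
    source: str,
    target: str,
    source_field: str = "content",  # assumed field name
    target_field: str = "content_translated",  # assumed field name
) -> int:
    """Read each document, translate one field, and write the result back in place."""
    written = 0
    for doc in docs:
        text = doc.get(source_field, "")
        if not text:
            continue  # nothing to translate in this document
        doc[target_field] = translate(text, source, target)
        written += 1
    return written


# An in-memory list plays the role of the Elasticsearch index.
index = [
    {"id": "1", "content": "Bonjour le monde"},
    {"id": "2", "content": ""},
]
count = translate_index(index, fake_translate, source="fr", target="en")
print(count, index[0]["content_translated"])
```

Passing the translation function as a parameter mirrors the idea of interchangeable engines: the loop does not need to know which engine produced the text.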
Features
- Two translation engines
  - Argos: Neural machine translation with ~30 languages
  - Apertium: Rule-based translation with 40+ language pairs
- Scalable processing
  - Parallel workers for multi-core systems
  - Distributed mode with Celery/Redis for multi-server deployments
- Elasticsearch integration
  - Direct read/write with scroll API
  - Query string filtering
  - Incremental translation (skip already-translated docs)
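To illustrate how incremental translation and parallel workers fit together, here is a minimal sketch under stated assumptions. It does not reflect es-translator's internals: the `translations` field, the helper names, and the uppercase "translation" are all hypothetical, and a thread pool stands in for the tool's worker processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List

Document = Dict[str, Any]


def needs_translation(doc: Document, target: str) -> bool:
    """A document is pending if it has no translation for the target language yet."""
    return target not in doc.get("translations", {})


def translate_one(doc: Document, target: str) -> Document:
    """Hypothetical translation step: uppercasing stands in for a real engine."""
    doc.setdefault("translations", {})[target] = str(doc["content"]).upper()
    return doc


def translate_pending(docs: List[Document], target: str, workers: int = 2) -> int:
    """Translate only the pending documents, spreading them across a worker pool."""
    pending = [doc for doc in docs if needs_translation(doc, target)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any worker exception is raised here.
        list(pool.map(lambda doc: translate_one(doc, target), pending))
    return len(pending)


docs = [
    {"id": "1", "content": "bonjour", "translations": {"en": "hello"}},
    {"id": "2", "content": "merci"},
]
first_run = translate_pending(docs, "en")   # only document 2 is pending
second_run = translate_pending(docs, "en")  # nothing left to translate
print(first_run, second_run)
```

Because the filter runs before any work is dispatched, rerunning the job after an interruption only touches documents that still lack a translation.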
Quick Start
Install with pip, then run:

    pip install es-translator
    es-translator \
      --url "http://localhost:9200" \
      --index my-index \
      --source-language fr \
      --target-language en

Or run with Docker:

    docker run -it --network host icij/es-translator \
      es-translator \
      --url "http://localhost:9200" \
      --index my-index \
      --source-language fr \
      --target-language en
Documentation
- Complete guide to using es-translator: commands, options, and examples.
- All CLI options and environment variables.
- Using es-translator with ICIJ's Datashare platform.
- How es-translator works internally.
- Python API documentation for programmatic usage.
- How to contribute to es-translator.
License
MIT License - See LICENSE for details.