Goslate: Free Google Translate API

goslate provides a free Python API to the Google translation service by querying the Google translation website.

It is:

  • Free: gets translations through the public Google web site without a fee

  • Fast: batches, caches and fetches concurrently

  • Simple: a single-file module, just Goslate().translate('Hi!', 'zh')

Simple Usage

The basic usage is simple:

>>> import goslate
>>> gs = goslate.Goslate()
>>> print(gs.translate('hello world', 'de'))
hallo welt

Installation

goslate supports both Python 2 and Python 3. You can install it via:

$ pip install goslate

or just download the latest goslate.py and use it directly.

The futures package is optional, but installing it is recommended for best performance in large text translation tasks.

Proxy Support

Proxy support can be added as follows:

import urllib2  # Python 2 only; on Python 3 use urllib.request instead
import goslate

# Route HTTP requests through the proxy
proxy_handler = urllib2.ProxyHandler({"http" : "http://proxy-domain.name:8080"})
# build_opener() adds the default HTTP/HTTPS handlers itself
proxy_opener = urllib2.build_opener(proxy_handler)

gs_with_proxy = goslate.Goslate(opener=proxy_opener)
translation = gs_with_proxy.translate("hello world", "de")
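
The snippet above targets Python 2's urllib2. On Python 3, an equivalent opener can be sketched with urllib.request. The helper name build_proxy_opener is hypothetical, the proxy URL is the same placeholder as above, and the goslate call is left as a comment so the block runs without goslate installed:

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # build_opener() installs the default HTTP/HTTPS handlers itself,
    # so only the proxy handler needs to be passed in.
    return urllib.request.build_opener(proxy_handler)


proxy_opener = build_proxy_opener("http://proxy-domain.name:8080")

# Assumed usage, mirroring the Python 2 example:
#   gs_with_proxy = goslate.Goslate(opener=proxy_opener)
```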

Romanization

Romanization or latinization (or romanisation, latinisation), in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so.

For example, pinyin is the default romanization method for the Chinese language.

You can get the translation in romanized writing as follows:

>>> import goslate
>>> roman_gs = goslate.Goslate(writing=goslate.WRITING_ROMAN)
>>> print(roman_gs.translate('China', 'zh'))
Zhōngguó

You can also get the translation in both the native writing system and the roman writing system:

>>> import goslate
>>> gs = goslate.Goslate(writing=goslate.WRITING_NATIVE_AND_ROMAN)
>>> gs.translate('China', 'zh')
('中国', 'Zhōngguó')

As you can see, the result is a tuple in this case: (Translation-in-Native-Writing, Translation-in-Roman-Writing)
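
As a small illustration of working with that tuple, the sketch below unpacks a (native, roman) pair and renders it as one label. The helper name format_bilingual is hypothetical, and the sample tuple is copied from the output above rather than fetched live:

```python
def format_bilingual(result):
    """Render a (native, roman) pair as 'native (roman)'."""
    native, roman = result
    return "{} ({})".format(native, roman)


# Sample output from the example above, not a live query.
sample = ('中国', 'Zhōngguó')
label = format_bilingual(sample)
```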

Language Detection

Sometimes all you need is to find out which language a text is written in:

>>> import goslate
>>> gs = goslate.Goslate()
>>> language_id = gs.detect('hallo welt')
>>> language_id
'de'
>>> gs.get_languages()[language_id]
'German'
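
One possible use of detection is sorting mixed input by language before translating it. The sketch below takes the detector as a parameter. group_by_language and fake_detect are hypothetical names, and fake_detect is only a stand-in so the block runs offline; with goslate installed, gs.detect would presumably be passed instead:

```python
from collections import defaultdict


def group_by_language(texts, detect):
    """Group texts by the language code that detect(text) returns."""
    groups = defaultdict(list)
    for text in texts:
        groups[detect(text)].append(text)
    return dict(groups)


# Stand-in detector so the sketch runs without network access.
def fake_detect(text):
    return 'de' if 'welt' in text else 'en'


groups = group_by_language(['hallo welt', 'hello world'], fake_detect)
```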

Concurrent Querying

It is not necessary to roll your own multi-threaded solution to speed up massive translation. Goslate has already done it for you. It uses concurrent.futures for concurrent querying. The maximum number of workers is 120 by default.

The number of workers can be changed as follows:

>>> import goslate
>>> import concurrent.futures
>>> executor = concurrent.futures.ThreadPoolExecutor(max_workers=200)
>>> gs = goslate.Goslate(executor=executor)
>>> it = gs.translate(['text1', 'text2', 'text3'], 'de')
>>> list(it)
['translation1', 'translation2', 'translation3']

On Python 2.7 it is advisable to install the concurrent.futures backport library to enable concurrent querying (Python 3 includes it by default).

The input can be a list, a tuple or any iterator, even a file object, which iterates line by line:
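
For readers unfamiliar with concurrent.futures, the general pattern looks roughly like the sketch below. It is a generic illustration, not goslate's internal code. translate_all and fake_translate are hypothetical names, and fake_translate stands in for a network call so the block runs offline:

```python
import concurrent.futures


def translate_all(texts, translate_one, max_workers=8):
    """Apply translate_one to every text concurrently, keeping input order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(executor.map(translate_one, texts))


# Stand-in for a real translation request.
def fake_translate(text):
    return text.upper()


results = translate_all(['text1', 'text2', 'text3'], fake_translate)
```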

>>> translated_lines = gs.translate(open('readme.txt'), 'de')
>>> translation = '\n'.join(translated_lines)

Do not worry that short texts will increase the query time. Internally, goslate joins small texts into one big text to reduce unnecessary query round trips.
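
The idea of joining small texts can be sketched as a greedy packing step. This is only an illustration of the technique under assumed details (newline as the separator, a caller-supplied character limit) and not goslate's actual implementation. The name pack_texts is hypothetical:

```python
def pack_texts(texts, max_chars):
    """Greedily group consecutive texts so each joined group fits in max_chars.

    Texts are joined with a newline, so the separator counts toward the limit.
    A single text longer than max_chars is placed in a group of its own.
    """
    groups, current, current_len = [], [], 0
    for text in texts:
        extra = len(text) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            # The current group is full: close it and start a new one.
            groups.append(current)
            current, current_len = [], 0
            extra = len(text)
        current.append(text)
        current_len += extra
    if current:
        groups.append(current)
    return groups


groups = pack_texts(['hi', 'there', 'world', 'ok'], max_chars=10)
```

Each group could then be sent as one request ('\n'.join(group)) and the response split back on newlines.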

Batch Translation

Google translation does not support very long text. goslate bypasses this limitation by splitting the long text internally before sending it to Google, then joining the multiple results into one translation text for the end user.

>>> import goslate
>>> with open('the game of thrones.txt', 'r') as f:
...     novel_text = f.read()
...
>>> gs = goslate.Goslate()
>>> gs.translate(novel_text, 'de')
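
The splitting step can be pictured with the sketch below, which cuts a long string into chunks no longer than a limit and prefers to break after whitespace. It is an assumption-laden illustration rather than goslate's real splitting logic. The name split_text and the character limit are hypothetical:

```python
def split_text(text, max_chars):
    """Split text into chunks of at most max_chars, preferring whitespace breaks.

    Joining the chunks with '' reproduces the original text exactly.
    """
    chunks = []
    while len(text) > max_chars:
        window = text[:max_chars]
        # Index just past the last space or newline in the window, or 0 if none.
        cut = max(window.rfind(' '), window.rfind('\n')) + 1
        if cut == 0:
            cut = max_chars  # no whitespace found: fall back to a hard cut
        chunks.append(text[:cut])
        text = text[cut:]
    if text:
        chunks.append(text)
    return chunks


chunks = split_text('the quick brown fox jumps', max_chars=10)
```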

Performance Consideration

Internally, Goslate uses batching and concurrent fetching aggressively to maximize translation speed.

All you need to do is reduce the number of API calls by making use of batch translation and concurrent querying.

For example, say you want to translate 3 big text files. Instead of manually translating them one by one, line by line:

import goslate

big_files = ['a.txt', 'b.txt', 'c.txt']
gs = goslate.Goslate()

translation = []
for big_file in big_files:
    with open(big_file, 'r') as f:
        translated_lines = []
        for line in f:
            translated_line = gs.translate(line, 'de')
            translated_lines.append(translated_line)

        translation.append('\n'.join(translated_lines))

It is better to leave the work entirely to Goslate. The following code is not only simpler but also much faster (over 100x):

import goslate

big_files = ['a.txt', 'b.txt', 'c.txt']
gs = goslate.Goslate()

translation_iter = gs.translate((open(big_file, 'r').read() for big_file in big_files), 'de')
translation = list(translation_iter)

Internally, goslate first adjusts the texts so that they are neither too big to fit the Google query API nor so small that they increase the total number of HTTP queries. Then it queries concurrently to speed things up even further.

Lookup Details in Dictionary

If you want a detailed dictionary explanation for a single word/phrase, you can use lookup_dictionary():

>>> import goslate
>>> gs = goslate.Goslate()
>>> gs.lookup_dictionary('sun', 'de')
[[['Sonne', 'sun', 0]],
 [['noun',
   ['Sonne'],
   [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
   'sun',
   1],
  ['verb',
   ['der Sonne aussetzen'],
   [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
   'sun',
   2]],
 'en',
 0.9447732,
 [['en'], [0.9447732]]]

There are 2 limitations for this API:

  • The result is a complex list structure which you have to parse for your own usage

  • The input must be a single word/phrase; batch translation and concurrent querying are not supported
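
As a starting point for parsing, the sketch below reduces the sample result shown above to a mapping from part of speech to translations. The layout is inferred only from that one sample, so the index positions are assumptions and may not hold for other words or future service changes. The name summarize_lookup is hypothetical:

```python
def summarize_lookup(result):
    """Map each part of speech to its list of translations.

    Assumes result[1] is a list of entries whose first two items are the
    part of speech and the list of translations, as in the sample output.
    """
    summary = {}
    for entry in result[1]:
        part_of_speech, translations = entry[0], entry[1]
        summary[part_of_speech] = list(translations)
    return summary


# Sample data copied from the lookup_dictionary('sun', 'de') output above.
sample = [[['Sonne', 'sun', 0]],
          [['noun',
            ['Sonne'],
            [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
            'sun',
            1],
           ['verb',
            ['der Sonne aussetzen'],
            [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
            'sun',
            2]],
          'en',
          0.9447732,
          [['en'], [0.9447732]]]

summary = summarize_lookup(sample)
```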

Query Error

If you get an HTTP 5xx error, it is probably because Google has banned your client IP address from translation querying.

You can verify this by accessing the Google translation service in a browser manually.

You can try the following to overcome this issue:

  • Query through an HTTP/SOCKS5 proxy; see Proxy Support

  • Use another Google domain for translation: gs = Goslate(service_urls=['http://translate.google.de'])

  • Wait 3 seconds before issuing another query
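
The wait-and-retry advice can be wrapped in a small helper such as the sketch below (Python 3). It assumes the failure surfaces as urllib.error.HTTPError, which is not confirmed here. call_with_retry and its parameters are hypothetical names, and the sleep function is injectable so the helper can be exercised without real delays:

```python
import time
import urllib.error


def call_with_retry(func, retries=3, delay=3.0, sleep=time.sleep):
    """Call func(); on an HTTP 5xx error wait `delay` seconds and try again.

    Makes at most retries + 1 attempts. Non-5xx errors are re-raised at once.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except urllib.error.HTTPError as error:
            # Give up on non-5xx errors and after the last attempt.
            if error.code < 500 or attempt == retries:
                raise
            sleep(delay)


# Assumed usage with goslate installed:
#   text = call_with_retry(lambda: gs.translate('hello world', 'de'))
```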

API References

Please check the API reference.

Command Line Interface

goslate.py is also a command line tool which you can use directly:

  • Translate stdin input into Chinese in GBK encoding

    $ echo "hello world" | goslate.py -t zh-CN -o gbk

  • Translate 2 text files into Chinese, output to UTF-8 file

    $ goslate.py -t zh-CN -o utf-8 source/1.txt "source 2.txt" > output.txt

Use --help for detailed usage:

$ goslate.py -h

How to Contribute

What’s New

1.5.2

  • [fix bug] removes newlines from descriptions to avoid installation failure

1.5.0

  • Add new API Goslate.lookup_dictionary() to get detailed information for a single word/phrase, thanks to Adam for the suggestion

  • Improve documentation with more user scenarios and performance considerations

1.4.0

  • [fix bug] update to adapt to the latest Google translation service changes

1.3.2

  • [fix bug] fix compatibility issue with the latest Google translation service JSON format changes

  • [fix bug] unit test failure

1.3.0

  • [new feature] Translation in roman writing system (romanization), thanks to Javier del Alamo for the contribution.

  • [new feature] Customizable service URL. You can provide multiple Google translation service URLs for better concurrency performance

  • [new option] roman writing translation option for CLI

  • [fix bug] Google translation may change normal space to no-break space

  • [fix bug] Google web API changed for getting supported language list
