goslate

Goslate: Free Google Translate API

goslate provides a free Python API to the Google translation service by querying the Google translation website.

It is:

Free: get translations through the public Google website without a fee
Fast: batches, caches and fetches concurrently
Simple: single-file module, just Goslate().translate('Hi!', 'zh')

Simple Usage

The basic usage is simple:

>>> import goslate
>>> gs = goslate.Goslate()
>>> print(gs.translate('hello world', 'de'))
hallo welt

Installation

goslate supports both Python 2 and Python 3. You can install it via:

$ pip install goslate

or just download the latest goslate.py directly and use it.

The futures package is optional, but installing it is recommended for best performance in large text translation tasks.

Proxy Support

Proxy support can be added as follows:

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3
import goslate

# Passing the ProxyHandler itself to build_opener installs the proxy settings
proxy_handler = urllib2.ProxyHandler({"http" : "http://proxy-domain.name:8080"})
proxy_opener = urllib2.build_opener(proxy_handler)

gs_with_proxy = goslate.Goslate(opener=proxy_opener)
translation = gs_with_proxy.translate("hello world", "de")
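
A proxy opener can be checked offline before it is handed to goslate. The sketch below assumes Python 3 (urllib.request); build_proxy_opener is a hypothetical helper name, not part of goslate, and the proxy URL is the placeholder used in this section. It registers the proxy for both http and https and then inspects the opener, without sending any request.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes http and https requests through proxy_url.

    Hypothetical helper for illustration; not part of goslate.
    """
    proxy_handler = urllib.request.ProxyHandler({
        "http": proxy_url,
        "https": proxy_url,
    })
    # build_opener installs the given ProxyHandler in place of the default one.
    return urllib.request.build_opener(proxy_handler)


proxy_opener = build_proxy_opener("http://proxy-domain.name:8080")

# Inspect the opener to confirm the proxy settings are in place.
installed = [h for h in proxy_opener.handlers
             if isinstance(h, urllib.request.ProxyHandler)]
print(installed[0].proxies)
```

The resulting opener can then be passed to goslate.Goslate(opener=proxy_opener).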

Romanization

Romanization or latinization (or romanisation, latinisation), in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so.

For example, pinyin is the default romanization method for the Chinese language.

You can get the translation in romanized writing as follows:

>>> import goslate
>>> roman_gs = goslate.Goslate(writing=goslate.WRITING_ROMAN)
>>> print(roman_gs.translate('China', 'zh'))
Zhōngguó

You can also get the translation in both the native writing system and the Roman writing system:

>>> import goslate
>>> gs = goslate.Goslate(writing=goslate.WRITING_NATIVE_AND_ROMAN)
>>> gs.translate('China', 'zh')
('中国', 'Zhōngguó')

As you can see, the result is a tuple in this case: (Translation-in-Native-Writing, Translation-in-Roman-Writing)
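
Because the result is a plain tuple, it can be unpacked directly. The short sketch below uses a literal copy of the tuple from the example, so it runs without network access; format_bilingual is a hypothetical helper, not part of goslate.

```python
def format_bilingual(result):
    """Format a (native, roman) pair as 'native (roman)'."""
    native, roman = result
    return "{} ({})".format(native, roman)


# Literal copy of the tuple returned in the example.
sample_result = ('中国', 'Zhōngguó')
print(format_bilingual(sample_result))
```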

Language Detection

Sometimes all you need is to find out which language a text is in:

>>> import goslate
>>> gs = goslate.Goslate()
>>> language_id = gs.detect('hallo welt')
>>> language_id
'de'
>>> gs.get_languages()[language_id]
'German'

Concurrent Querying

You do not need to roll your own multi-threaded solution to speed up massive translation jobs. Goslate already does it for you: it uses concurrent.futures for concurrent querying. The maximum number of workers is 120 by default.

The number of workers can be changed as follows:

>>> import goslate
>>> import concurrent.futures
>>> executor = concurrent.futures.ThreadPoolExecutor(max_workers=200)
>>> gs = goslate.Goslate(executor=executor)
>>> it = gs.translate(['text1', 'text2', 'text3'], 'de')
>>> list(it)
['translation1', 'translation2', 'translation3']

On Python 2.7, installing the concurrent.futures backport library (the futures package) is advised to enable concurrent querying; Python 3 includes it by default.

The input can be a list, a tuple or any iterator, even a file object, which iterates line by line:
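
The general pattern behind concurrent querying can be sketched with the standard library alone. This is an illustration under stated assumptions, not goslate's own code: fake_translate is a stand-in for one network query, translate_all is a hypothetical helper, and the worker count is arbitrary.

```python
import concurrent.futures


def fake_translate(text):
    """Stand-in for one network query; a real call would block on I/O."""
    return text.upper()


def translate_all(texts, executor):
    """Run one query per text on the executor, keeping input order."""
    # Executor.map yields results in the order of the inputs,
    # even when the underlying calls finish out of order.
    return list(executor.map(fake_translate, texts))


with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = translate_all(['text1', 'text2', 'text3'], executor)

print(results)
```

Threads suit this workload because each query spends most of its time waiting on the network rather than computing.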

>>> translated_lines = gs.translate(open('readme.txt'), 'de')
>>> translation = '\n'.join(translated_lines)

Do not worry that short texts will increase the query time. Internally, goslate joins small texts into one big text to reduce unnecessary query round trips.
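
One way such joining could work is a greedy packer that groups consecutive texts under a size limit. The sketch below is an assumption-laden illustration, not goslate's actual implementation: pack_texts, the newline separator and the max_len value are all hypothetical.

```python
def pack_texts(texts, max_len, sep='\n'):
    """Greedily join consecutive texts into batches no longer than max_len.

    A text that is itself longer than max_len is placed in its own batch.
    """
    batches = []
    current = []
    current_len = 0
    for text in texts:
        # Extra length this text would add, including a separator if needed.
        extra = len(text) + (len(sep) if current else 0)
        if current and current_len + extra > max_len:
            batches.append(sep.join(current))
            current = [text]
            current_len = len(text)
        else:
            current.append(text)
            current_len += extra
    if current:
        batches.append(sep.join(current))
    return batches


lines = ['hello', 'world', 'good', 'morning', 'everyone']
batches = pack_texts(lines, max_len=12)
print(batches)
```

Five short texts end up in three batches here, so three queries would be sent instead of five; splitting each translated batch on the separator recovers the individual results.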

Batch Translation

Google translation does not support very long text. goslate bypasses this limitation by splitting the long text internally before sending it to Google, then joining the multiple results into one translation for the end user.

>>> import goslate
>>> with open('the game of thrones.txt', 'r') as f:
...     novel_text = f.read()
...
>>> gs = goslate.Goslate()
>>> translation = gs.translate(novel_text, 'de')
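
The splitting step can be pictured as cutting the text at whitespace so that no chunk exceeds a size limit. This is a minimal sketch, not goslate's actual algorithm: split_text and the max_len value are hypothetical, and a real splitter would likely prefer sentence boundaries.

```python
def split_text(text, max_len):
    """Split text into chunks of at most max_len characters.

    Prefers to break just after the last whitespace inside the window,
    and falls back to a hard cut when there is none.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_len, len(text))
        if end < len(text):
            # Look for the last space or newline in text[start:end].
            cut = max(text.rfind(' ', start, end), text.rfind('\n', start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left chunk
        chunks.append(text[start:end])
        start = end
    return chunks


long_text = 'the quick brown fox jumps over the lazy dog'
chunks = split_text(long_text, max_len=16)
print(chunks)
```

Concatenating the chunks gives back the original text, which is what allows the translated pieces to be joined into one result.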

Performance Consideration

Internally, Goslate uses batching and concurrent fetching aggressively to maximize translation speed.

All you need to do is reduce the number of API calls by using batch translation and concurrent querying.

For example, say you want to translate 3 big text files. Instead of manually translating them one by one, line by line:

import goslate

big_files = ['a.txt', 'b.txt', 'c.txt']
gs = goslate.Goslate()

translation = []
for big_file in big_files:
    with open(big_file, 'r') as f:
        translated_lines = []
        for line in f:
            # One query per line: many small round trips
            translated_line = gs.translate(line.rstrip('\n'), 'de')
            translated_lines.append(translated_line)

        translation.append('\n'.join(translated_lines))

It is better to hand them over to Goslate entirely. The following code is not only simpler but also much faster (more than 100x):

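import goslate

big_files = ['a.txt', 'b.txt', 'c.txt']
gs = goslate.Goslate()

def read_text(path):
    # Read one whole file and close it again
    with open(path, 'r') as f:
        return f.read()

# Pass all texts in one call; 'de' is the target language
translation_iter = gs.translate((read_text(big_file) for big_file in big_files), 'de')
translation = list(translation_iter)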

Internally, goslate first adjusts the texts so that they are neither too big to fit the Google query API nor so small that they increase the total number of HTTP queries. Then it uses concurrent queries to speed things up even further.
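
The effect of fewer, larger queries can be made concrete by counting round trips. The sketch below uses a stand-in service, CountingTranslator, which is hypothetical and performs no network access; it also assumes the joined text fits within a single query.

```python
class CountingTranslator:
    """Stand-in service that counts how many queries it receives."""

    def __init__(self):
        self.queries = 0

    def translate(self, text):
        self.queries += 1
        return text.upper()  # placeholder for a real translation


lines = ['line {}'.format(i) for i in range(100)]

# One query per line: 100 round trips.
per_line = CountingTranslator()
slow = [per_line.translate(line) for line in lines]

# One query for the joined text, split again afterwards: 1 round trip.
batched = CountingTranslator()
fast = batched.translate('\n'.join(lines)).split('\n')

print(per_line.queries, batched.queries)
```

Both approaches produce the same list of results; only the number of queries differs, and each query carries its own network latency.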

Lookup Details in Dictionary

If you want a detailed dictionary explanation for a single word or phrase, you can use lookup_dictionary():

>>> import goslate
>>> gs = goslate.Goslate()
>>> gs.lookup_dictionary('sun', 'de')
[[['Sonne', 'sun', 0]],
 [['noun',
   ['Sonne'],
   [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
   'sun',
   1],
  ['verb',
   ['der Sonne aussetzen'],
   [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
   'sun',
   2]],
 'en',
 0.9447732,
 [['en'], [0.9447732]]]

This API has 2 limitations:

The result is a complex list structure which you have to parse yourself
The input must be a single word/phrase; batch translation and concurrent querying are not supported
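
As a starting point for parsing, the sketch below extracts the translations per part of speech from a literal copy of the sample result, so it runs offline. The meaning of each field is inferred from that sample only and is an assumption; the layout is not documented here and may change. translations_by_part_of_speech is a hypothetical helper, not part of goslate.

```python
# Literal copy of the sample result for 'sun' -> 'de'.
sample = [[['Sonne', 'sun', 0]],
          [['noun',
            ['Sonne'],
            [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
            'sun',
            1],
           ['verb',
            ['der Sonne aussetzen'],
            [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
            'sun',
            2]],
          'en',
          0.9447732,
          [['en'], [0.9447732]]]


def translations_by_part_of_speech(result):
    """Map each part of speech to its list of translations.

    Assumes the layout of the sample: result[1] is a list of entries
    whose first item is the part of speech and whose second item is
    the list of translations.
    """
    return {entry[0]: list(entry[1]) for entry in result[1]}


summary = translations_by_part_of_speech(sample)
print(summary)
```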

Query Error

If you get an HTTP 5xx error, it is probably because Google has banned your client IP address from translation querying.

You can verify this by accessing the Google translation service manually in a browser.

You can try the following to work around this issue:

query through an HTTP/SOCKS5 proxy, see Proxy Support
use another Google domain for translation: gs = Goslate(service_urls=['http://translate.google.de'])
wait for 3 seconds before issuing another query
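
The waiting advice can be wrapped in a small retry helper. This is a generic sketch rather than goslate's own retry logic: call_with_retry and its parameters are hypothetical, and flaky_query is a stand-in that simulates two failures so the example runs without network access. It catches every Exception for brevity; real code should catch only the error type it expects.

```python
import time


def call_with_retry(func, attempts=3, wait_seconds=3, sleep=time.sleep):
    """Call func(), waiting wait_seconds between failed attempts.

    Re-raises the last exception once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            sleep(wait_seconds)


# Demonstration with a stand-in that fails twice before succeeding.
calls = {'count': 0}


def flaky_query():
    calls['count'] += 1
    if calls['count'] < 3:
        raise IOError('simulated HTTP 503')
    return 'hallo welt'


# Record the requested waits instead of actually sleeping.
waits = []
result = call_with_retry(flaky_query, sleep=waits.append)
print(result, waits)
```

With goslate, func could be a small wrapper such as lambda: gs.translate('hello world', 'de').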

API References

Please check the API reference.

Command Line Interface

goslate.py is also a command-line tool that you can use directly.

Translate stdin input into Chinese in GBK encoding:

$ echo "hello world" | goslate.py -t zh-CN -o gbk

Translate 2 text files into Chinese and write the output to a UTF-8 file:

$ goslate.py -t zh-CN -o utf-8 source/1.txt "source 2.txt" > output.txt

Use --help (or -h) for detailed usage:

$ goslate.py -h

How to Contribute

What’s New

1.5.4

handle deprecated threading.currentThread() properly
add retry_wait_duration param for fine control of the retry behavior in case of connection errors

1.5.2

[fix bug] removes newlines from descriptions to avoid installation failure

1.5.0

Add new API Goslate.lookup_dictionary() to get detailed information for a single word/phrase, thanks to Adam’s suggestion
Improve documentation with more user scenarios and performance considerations

1.4.0

[fix bug] update to adapt to the latest Google translation service changes

1.3.2

[fix bug] fix compatibility issue with the latest Google translation service JSON format changes
[fix bug] unit test failure

1.3.0

[new feature] Translation in Roman writing system (romanization), thanks to Javier del Alamo’s contribution.
[new feature] Customizable service URL. You can provide multiple Google translation service URLs for better concurrency performance
[new option] roman writing translation option for CLI
[fix bug] Google translation may change normal space to no-break space
[fix bug] Google web API changed for getting supported language list

Release history

1.5.4: Jun 13, 2022 (this version)
1.5.3: Jun 13, 2022
1.5.2: Nov 16, 2021
1.5.1: Jan 4, 2016
1.5.0: Aug 17, 2015
1.4.0: Apr 9, 2015
1.3.2: Jan 30, 2015
1.3.1: Jan 30, 2015
1.3.0: May 20, 2014
1.2.0: Apr 4, 2014
1.1.3: Aug 3, 2013
1.1.2: Aug 3, 2013
1.1.1: May 27, 2013
1.1.0: May 26, 2013
1.0.0: May 22, 2013
