Typed Python client for Infopédia — the Portuguese language dictionary (www.infopedia.pt/dicionarios/lingua-portuguesa)
pyinfopedia
Typed Python client for Infopédia — the European-Portuguese dictionary by Porto Editora.
Each word page is parsed into a typed Entry: headword, IPA pronunciation(s), syllabification, etymology, grammatical categories with numbered senses, set phrases, inflected forms, and the sidebar related-word lists (synonyms, rhymes, neighbours…).
Built for Portuguese NLP/lexicon work — it correctly separates heterophonic homographs (same spelling, different pronunciation per reading), which is what makes it useful for grapheme-to-phoneme and disambiguation tasks.
Install
pip install pyinfopedia               # or: uv pip install pyinfopedia
pip install "pyinfopedia[stealth]"    # + curl_cffi for Cloudflare bypass (quoted so zsh does not expand the brackets)
Depends on unblock_requests for transport.
Quick start
import pyinfopedia
entry = pyinfopedia.get_word("casa")
print(entry.pronunciation) # ˈkazɐ
print(entry.categories[0].pos) # nome feminino
print(entry.categories[0].senses[0].definition)
for r in pyinfopedia.search("cas"):  # prefix autocomplete
    print(r.word, r.url)
Heterophonic homographs
Infopédia lists one entry block per pronunciation; pyinfopedia keeps them separate, tying each grammatical category (and its senses) to the reading it belongs to:
entry = pyinfopedia.get_word("sede")
for cat in entry.categories:
    print(cat.pronunciation, cat.pos, "->", cat.senses[0].definition)
# ˈsɛdɨ nome feminino -> lugar onde alguém se pode sentar ou fixar (seat / HQ)
# ˈsedɨ nome feminino -> sensação causada pela necessidade de beber (thirst)
The two readings carry disjoint senses. Other heterophonic homographs behave
the same way: corte (cut ˈkɔɾtɨ / court ˈkoɾtɨ), molho (sauce ˈmoʎu / bundle
ˈmɔʎu), forma (mould ˈfoɾmɐ / shape ˈfɔɾmɐ). See examples/heterophones.py.
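For grapheme-to-phoneme work, a natural next step is to group senses by reading. The sketch below relies only on the attributes used in the example above (`categories`, `pronunciation`, `pos`, `senses`, `definition`). `group_by_reading` is a hypothetical helper, not part of pyinfopedia, and the `SimpleNamespace` objects are hand-built stand-ins so the snippet runs offline.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_reading(entry):
    """Map each pronunciation to a list of (pos, definition) pairs.

    Hypothetical helper: it only touches the attributes shown in the
    README example, not any other pyinfopedia API.
    """
    readings = defaultdict(list)
    for cat in entry.categories:
        for sense in cat.senses:
            readings[cat.pronunciation].append((cat.pos, sense.definition))
    return dict(readings)


# Stand-in for pyinfopedia.get_word("sede"), built by hand for illustration.
fake_sede = SimpleNamespace(
    categories=[
        SimpleNamespace(
            pronunciation="ˈsɛdɨ",
            pos="nome feminino",
            senses=[SimpleNamespace(
                definition="lugar onde alguém se pode sentar ou fixar")],
        ),
        SimpleNamespace(
            pronunciation="ˈsedɨ",
            pos="nome feminino",
            senses=[SimpleNamespace(
                definition="sensação causada pela necessidade de beber")],
        ),
    ]
)

readings = group_by_reading(fake_sede)
for ipa, senses in readings.items():
    print(ipa, senses)
```

With a real entry, `group_by_reading(pyinfopedia.get_word("sede"))` should work the same way, provided the attribute names match those shown above.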
Transport / Cloudflare
All HTTP goes through Transport, a wrapper over unblock_requests.CloudflareSession.
Pick a mode when the default is blocked:
from pyinfopedia import Infopedia
client = Infopedia(mode="curl_cffi")  # impersonate a browser
# or route requests through a FlareSolverr instance (use your own URL)
client = Infopedia(mode="flaresolverr",
                   flaresolverr_url="http://192.168.1.116:8191")
Modes: requests · curl_cffi · flaresolverr · wayback.
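If it is not known in advance which mode will get through, one option is to try them in order. The following is a minimal sketch, not a pyinfopedia feature: `first_working_mode` and `fake_fetch` are hypothetical names, and because this README does not say which exceptions a blocked request raises, the sketch catches `Exception` broadly.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_working_mode(
    modes: Iterable[str],
    fetch: Callable[[str], T],
) -> Tuple[str, T]:
    """Call fetch(mode) for each mode in order and return the first success.

    `fetch` is supplied by the caller. If every mode fails, a RuntimeError
    is raised, chained to the last underlying error.
    """
    last_error: Optional[Exception] = None
    for mode in modes:
        try:
            return mode, fetch(mode)
        except Exception as exc:  # assumption: a blocked request raises
            last_error = exc
    raise RuntimeError("no transport mode succeeded") from last_error


# Offline stand-in: pretend plain requests is blocked and curl_cffi works.
def fake_fetch(mode: str) -> str:
    if mode == "requests":
        raise ConnectionError("blocked (simulated)")
    return f"fetched via {mode}"


mode, result = first_working_mode(
    ["requests", "curl_cffi", "wayback"], fake_fetch
)
print(mode, result)
```

In real use, `fetch` would build `Infopedia(mode=mode)` and perform a lookup with it. The client's lookup method is not shown in this README, so that part is left to the caller.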
Verbs
from pyinfopedia import get_verb
conj = get_verb("jogar")
print(conj.first_person_singular()) # jogo
print(conj.present_indicative())
Datasets
pyinfopedia.dataset exports JSONL/CSV for a word list — see
examples/build_dataset.py.
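To give a feel for what a JSONL export involves, here is a rough standard-library sketch. It is not the `pyinfopedia.dataset` implementation, and the row schema (`word`, `ipa`, `pos`, `definition`) is an assumption chosen for illustration. `entry_to_rows` and `write_jsonl` are hypothetical helpers, and the entry is a hand-built stand-in with a placeholder definition.

```python
import io
import json
from types import SimpleNamespace


def entry_to_rows(word, entry):
    """Yield one flat dict per (reading, part of speech, sense)."""
    for cat in entry.categories:
        for sense in cat.senses:
            yield {
                "word": word,
                "ipa": cat.pronunciation,
                "pos": cat.pos,
                "definition": sense.definition,
            }


def write_jsonl(rows, fp):
    """Write rows as JSON Lines and return the number of rows written.

    ensure_ascii=False keeps IPA and accented characters readable.
    """
    count = 0
    for row in rows:
        fp.write(json.dumps(row, ensure_ascii=False) + "\n")
        count += 1
    return count


# Hand-built stand-in for a parsed entry; the definition is a placeholder.
fake_casa = SimpleNamespace(
    categories=[
        SimpleNamespace(
            pronunciation="ˈkazɐ",
            pos="nome feminino",
            senses=[SimpleNamespace(definition="(placeholder definition)")],
        )
    ]
)

buffer = io.StringIO()
n_rows = write_jsonl(entry_to_rows("casa", fake_casa), buffer)
print(buffer.getvalue(), end="")
```

One row per sense keeps each pronunciation attached to the senses it belongs to, which is the property that matters for disambiguation data.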
Development
pytest -m "not live" # offline parser/model tests (HTML fixtures)
PYINFOPEDIA_FLARESOLVERR=http://host:8191 pytest -m live # hit the live site
Apache-2.0 · JarbasAi <jarbasai@mailfence.com>. Data belongs to Porto Editora /
Infopédia; this is an unofficial client — respect their terms and rate limits.
File details
Details for the file pyinfopedia-0.0.1a2.tar.gz.
File metadata
- Download URL: pyinfopedia-0.0.1a2.tar.gz
- Upload date:
- Size: 21.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e83a7e97f7577651e6001860b769c1a52ed3fccfe2ece6258d05c4959b0199b4 |
| MD5 | 843297b9af311103003f8592702bcf06 |
| BLAKE2b-256 | 5422c2b610ec11e8ec939fd4a37ab353612a21df12c2c3795ba54ec3798acd64 |
File details
Details for the file pyinfopedia-0.0.1a2-py3-none-any.whl.
File metadata
- Download URL: pyinfopedia-0.0.1a2-py3-none-any.whl
- Upload date:
- Size: 21.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a735b78fc73e36f1f39fdc408e7c001b9258b7f86a13b5d5197582335116c2a0 |
| MD5 | 79d7afb0d68714085c1ad6077132a511 |
| BLAKE2b-256 | bcfaabbdcfb7730e7240ff1aadb060ed4eefa1933eac937528166193446b84c1 |