Parse raw Indian address strings into structured fields using a fine-tuned Qwen3 LoRA adapter
indian-address-parser
Parse raw, unstructured Indian address strings into 13 structured fields using a Qwen3-0.6B model fine-tuned with LoRA. The package ships only inference code; the model weights are downloaded automatically from Hugging Face on first use.
Input: "FLAT NO.32, UTTARA TOWERS, MG ROAD GUWAHATI , Kamrup Unclassified AS 781029"
Output: {"houseNumber": "FLAT NO.32", "houseName": "UTTARA TOWERS", "poi": null,
"street": "MG ROAD", "subsubLocality": null, "subLocality": null, "locality": null,
"village": null, "subDistrict": null, "district": "Kamrup", "city": "GUWAHATI",
"state": "AS", "pincode": "781029"}
Install
pip install indian-address-parser
Usage
Python
from indian_address_parser import AddressParser
parser = AddressParser() # downloads model weights from HF on first use
result = parser.parse("FLAT NO.32, UTTARA TOWERS, MG ROAD GUWAHATI , Kamrup Unclassified AS 781029")
print(result)
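# Batch: pass a list of address strings
addresses = [
    "FLAT NO.32, UTTARA TOWERS, MG ROAD GUWAHATI , Kamrup Unclassified AS 781029",
]
results = parser.parse_batch(addresses)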
CLI
# Single address
indian-address-parser "FLAT NO.32, UTTARA TOWERS, MG ROAD GUWAHATI , Kamrup Unclassified AS 781029"
# Batch from stdin
cat addresses.txt | indian-address-parser --stdin
# Batch from a file, JSONL output
indian-address-parser --file addresses.txt --out results.jsonl
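Once results.jsonl exists, it can be loaded back into Python for further processing. The sketch below assumes the file holds one JSON object per line, as the JSONL format implies; read_jsonl is a hypothetical helper, not part of this package, and the demo writes a small temporary file in place of real CLI output.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Load a JSONL file into a list of dicts, skipping blank lines."""
    # Hypothetical helper; assumes one JSON object per non-empty line.
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo: a temporary file stands in for the CLI's results.jsonl.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.jsonl")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('{"city": "GUWAHATI", "pincode": "781029"}\n\n')
        handle.write('{"city": null, "pincode": null}\n')
    records = read_jsonl(path)

print(len(records))
```

JSON null values become None in Python, so downstream code can test fields with `is None`.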
Fields
houseNumber, houseName, poi, street, subsubLocality, subLocality,
locality, village, subDistrict, district, city, state, pincode
Any field not present in the address is null. If the model output can't be parsed as
JSON, all fields are null and a _parse_error key holds the raw model output.
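A caller will usually want to separate the populated fields from the nulls and check for a parse failure before using a result. The following is a minimal sketch, assuming each result is a plain dict keyed by the 13 field names with None for missing values; present_fields is a hypothetical helper, not part of this package, and the sample dicts are hand-built rather than real model output.

```python
# The 13 documented field names.
FIELDS = (
    "houseNumber", "houseName", "poi", "street", "subsubLocality",
    "subLocality", "locality", "village", "subDistrict", "district",
    "city", "state", "pincode",
)


def present_fields(result):
    """Return (fields, error): the non-null fields and the raw output on failure."""
    # Hypothetical helper; error is None when no _parse_error key is present.
    error = result.get("_parse_error")
    fields = {name: result[name] for name in FIELDS if result.get(name) is not None}
    return fields, error


# Hand-built examples: one successful result and one failed parse.
ok = {name: None for name in FIELDS}
ok.update({"street": "MG ROAD", "city": "GUWAHATI", "pincode": "781029"})

failed = {name: None for name in FIELDS}
failed["_parse_error"] = "not json"

print(present_fields(ok))
print(present_fields(failed))
```

Checking the error value first lets a batch job log the raw model output for failed rows instead of silently storing thirteen nulls.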
Model details, evaluation metrics, and known limitations
See the model card for training data, LoRA config, per-field evaluation results (100% JSON parse rate, 82.4% mean field accuracy on held-out test data), and known limitations, such as field-boundary ambiguity among locality, subLocality, subsubLocality, and village.
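To score the parser on your own labeled addresses, one common approach is exact-match accuracy per field, averaged without weighting across fields. The sketch below assumes that definition; the model card's own metric may be computed differently, and field_accuracy is a hypothetical helper, not part of this package.

```python
def field_accuracy(predictions, references, fields):
    """Exact-match accuracy per field, plus the unweighted mean over fields."""
    # Assumed metric definition; predictions and references are parallel lists of dicts.
    per_field = {}
    for name in fields:
        hits = sum(
            1 for pred, ref in zip(predictions, references)
            if pred.get(name) == ref.get(name)
        )
        per_field[name] = hits / len(references)
    mean = sum(per_field.values()) / len(per_field)
    return per_field, mean


# Hand-built toy data: the second prediction gets the city wrong.
preds = [
    {"city": "GUWAHATI", "pincode": "781029"},
    {"city": "DELHI", "pincode": None},
]
refs = [
    {"city": "GUWAHATI", "pincode": "781029"},
    {"city": "NEW DELHI", "pincode": None},
]

per_field, mean = field_accuracy(preds, refs, ("city", "pincode"))
print(per_field, mean)
```

Under this definition a null prediction counts as correct when the reference is also null, which matters for sparsely populated fields.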
Apple Silicon (MLX) users
This package uses transformers+peft, which works on CUDA, MPS, and CPU but is not the
fastest path on Apple Silicon. For MLX-native inference, see the mlx/ subfolder of the
Hugging Face repo instead.
License
Apache 2.0 (matching the base model, Qwen/Qwen3-0.6B).
Download files
File details
Details for the file indian_address_parser-0.1.3.tar.gz.
File metadata
- Download URL: indian_address_parser-0.1.3.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d09c3ea63af4352865c1c6ec5e519b5792eba955d8bb032e9600e176ef8469a3 |
| MD5 | debafa0a94e73f4209ce0a83e2da194b |
| BLAKE2b-256 | 7c7bc274c2fc192b5fa7838b4bce608488b6a5f2e80be771c574497be729dba3 |
File details
Details for the file indian_address_parser-0.1.3-py3-none-any.whl.
File metadata
- Download URL: indian_address_parser-0.1.3-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9dd94ff62ed3cfe63a9fbadce8c7980d044d076bc7feaaf281564b3d4de5742b |
| MD5 | 0675df4d4a547175d8b5f4d3b8dbcc26 |
| BLAKE2b-256 | 5efa859db396796817d14b3cf33c051e264d0de0e2272e304ae78dedc26c85e6 |