Extract people from text

person-extractor

Work in Progress: Identify People's Names in Text

usage

Initialize a PersonExtractor with the path to a CSV file of names in which each column holds the spellings for one language. You can create such a CSV with Wikinames.
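For illustration only, here is a minimal sketch of what a small names CSV could look like and how to write one by hand with Python 3's standard csv module. The header row of language codes (en, uk) is an assumption based on the spellings keys shown in the output further down; the layout that PersonExtractor or Wikinames actually expects may differ. The helper names write_names_csv and read_names_csv are hypothetical and not part of the package.

    import csv
    import os
    import tempfile

    # ASSUMED layout: a header row of language codes, then one row per name.
    # The real format expected by PersonExtractor may differ.
    ROWS = [
        ["en", "uk"],
        ["Davidyuk", "Давидюк"],
    ]


    def write_names_csv(path, rows):
        """Write rows to a UTF-8 CSV file, one column per language."""
        with open(path, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)


    def read_names_csv(path):
        """Read the file back as a list of {language: spelling} mappings."""
        with open(path, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))


    # Write the sample to a temporary directory and read it back as a check.
    names_path = os.path.join(tempfile.mkdtemp(), "names.csv")
    write_names_csv(names_path, ROWS)
    names = read_names_csv(names_path)
    print(names[0]["en"], names[0]["uk"])

Once a names file exists, it is passed to the extractor as follows: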

    from person_extractor import PersonExtractor

    text = "Але дістатися на роботу працівникам цих бізнесів, якщо у них немає власного автомобіля або грошей на таксі чи корпоративну розвозку, стане справжньою проблемою, прогнозує політолог Микола Давидюк."

    extractor = PersonExtractor(data="names.csv")

    people = extractor.extract(text)

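extract returns a list of objects, one per matched name, each holding the start and end offsets of the match, the matched text, and its spellings keyed by language code: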

    [
        {
            'start': 336,
            'end': 343,
            'text': 'Давидюк',
            'spellings': {
                'en': 'Davidyuk',
                'uk': 'Давидюк'
            }
        }
    ]
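As a rough sketch of how this result might be consumed, the snippet below collects each person's spelling in a chosen language. It relies only on the keys shown in the sample output above and uses a copy of that output as its data. The helper spellings_in is hypothetical and not part of the package, and falling back to the matched text when a language is missing is an assumption made for this example.

    # Sample result copied from the output shown above.
    people = [
        {
            "start": 336,
            "end": 343,
            "text": "Давидюк",
            "spellings": {"en": "Davidyuk", "uk": "Давидюк"},
        }
    ]


    def spellings_in(people, language):
        """Return each person's spelling in `language`.

        Falls back to the text as matched in the source when no spelling
        is listed for that language (an assumption for this sketch).
        """
        return [
            person["spellings"].get(language, person["text"])
            for person in people
        ]


    print(spellings_in(people, "en"))  # ['Davidyuk']

Keeping the fallback inside the helper means callers always get one string per person, even for languages missing from the names CSV.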

test

To test the package, run:

    python -m unittest person_extractor.test

contact

Post an issue at https://github.com/Mak4Lab/person-extractor/issues or email the package authors at daniel@mak4lab.com and victoria@mak4lab.com


Source Distribution

person-extractor-4.0.0.tar.gz (360.5 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/44.1.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/2.7.17

Hashes for person-extractor-4.0.0.tar.gz

    Algorithm    Hash digest
    SHA256       b52d0aaa8c2d47ddfa6430a72de07d3b54440c18d2f922f8294d5331322fb6e0
    MD5          c74f91e6467bf5ae1096b1c8a2993d50
    BLAKE2b-256  27ebc9227836d8a4e9b1e943b9d514e5965f46200a4f1d9b5f3c10a3b5138feb
