Extract date and location from a list of strings
Project description
Date Location Extractor
date_location_extractor retrieves the dates and locations found in a list of strings. The input can be either a JSON file containing a list or a list passed directly.
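As a minimal sketch of preparing the file-based input, the snippet below writes a list of strings to disk as a JSON array. The exact file layout the package expects is an assumption here (a plain top-level array of strings), and `write_string_list` is a hypothetical helper, not part of the package.

```python
import json
from pathlib import Path


def write_string_list(path, strings):
    """Write `strings` to `path` as a JSON array.

    Assumption: the extractor reads a plain top-level JSON array of strings.
    """
    if not all(isinstance(s, str) for s in strings):
        raise TypeError("every item must be a string")
    Path(path).write_text(
        json.dumps(strings, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )


if __name__ == "__main__":
    # Sample strings and file name reused from this README's usage examples.
    write_string_list("list_to_parse.json", ["13 May 2009", "12/15/2010"])
```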
Install & Setup
Grab the package using pip (this will take a few minutes):
pip install date-location-extractor
Date Location Extractor uses the following dependencies:
- datetime
- dateutil
- geotext
- datefinder
- ast
- os
Basic Usage
Import the module, pass it a JSON file or a list of strings, and presto.
from date_location_extractor import DateLocationExtractor
date_location_extractor = DateLocationExtractor()
print(date_location_extractor.get_date_location_from_json_file("list_to_parse.json", use_simple_parser=True))
Setting use_simple_parser=True skips datefinder and uses the simple dateutil parser instead.
The result is a list of dictionaries, e.g.:
[{"address": "San Juan Costa Rica", "date_iso": "2009-11-27", "ranking": 1.0, "normalized_address": {"City": "San Juan", "Country": "CR"}}]
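Since the result is a plain list of dictionaries, it can be post-processed with ordinary Python. The sketch below picks the highest-ranked entry and parses its ISO date. The sample data is copied from the example above, while `best_match` and its `min_ranking` threshold are hypothetical names introduced for illustration.

```python
from datetime import date

# Sample result copied from the example output above.
results = [
    {
        "address": "San Juan Costa Rica",
        "date_iso": "2009-11-27",
        "ranking": 1.0,
        "normalized_address": {"City": "San Juan", "Country": "CR"},
    }
]


def best_match(entries, min_ranking=0.5):
    """Return the highest-ranked entry at or above `min_ranking`, or None."""
    candidates = [e for e in entries if e.get("ranking", 0.0) >= min_ranking]
    return max(candidates, key=lambda e: e["ranking"], default=None)


def describe(entry):
    """Format one entry as 'YYYY-MM-DD, City, Country'."""
    when = date.fromisoformat(entry["date_iso"])
    place = entry.get("normalized_address", {})
    return f"{when.isoformat()}, {place.get('City')}, {place.get('Country')}"


if __name__ == "__main__":
    top = best_match(results)
    if top is not None:
        print(describe(top))
```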
Without loading a file:
from date_location_extractor import DateLocationExtractor
date_location_extractor = DateLocationExtractor()
print(date_location_extractor.get_date_location_from_list(["13 May 2009", "12/15/2010"]))
print(date_location_extractor.get_date_location_from_list_with_parser(["13 May 2009", "12/15/2010"]))
The ranking algorithm has the following weights set:
- RANKING_WEIGHT_HAS_DATE = 0.3
- RANKING_WEIGHT_HAS_DAY = 0.2
- RANKING_WEIGHT_HAS_COUNTRY = 0.3
- RANKING_WEIGHT_HAS_CITY = 0.2
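How the package combines these weights is not stated here. One plausible reading, consistent with the example result above (date, day, country, and city all present, ranking 1.0), is that the weight of each detected feature is added up. The sketch below illustrates that assumption only; `sketch_ranking` is a hypothetical function, not the package's implementation.

```python
# Weights as listed above.
RANKING_WEIGHT_HAS_DATE = 0.3
RANKING_WEIGHT_HAS_DAY = 0.2
RANKING_WEIGHT_HAS_COUNTRY = 0.3
RANKING_WEIGHT_HAS_CITY = 0.2


def sketch_ranking(has_date, has_day, has_country, has_city):
    """Assumed scheme: sum the weights of the features that were found."""
    score = 0.0
    if has_date:
        score += RANKING_WEIGHT_HAS_DATE
    if has_day:
        score += RANKING_WEIGHT_HAS_DAY
    if has_country:
        score += RANKING_WEIGHT_HAS_COUNTRY
    if has_city:
        score += RANKING_WEIGHT_HAS_CITY
    # Round to hide floating-point noise from adding decimal fractions.
    return round(score, 2)


if __name__ == "__main__":
    # All four features present, as in the San Juan example.
    print(sketch_ranking(True, True, True, True))
    # Date and country only.
    print(sketch_ranking(True, False, True, False))
```

Under this reading the four weights sum to 1.0, so the ranking would stay between 0 and 1.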
Source Distribution
Hashes for date_location_extractor-0.1.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | ba1013f2f57dd8c0e2c37fd3e25344e133ad65c420559dfb0824959c2d03666c
MD5 | 5d80e5e978cce2f4fcfe5644574f4406
BLAKE2b-256 | 6d3df1ebfbb5b37f5f084a07981e8f2e05b7751668e2307215a4543cdbab1bd8