Crawling all GEO metadata.
geo-spider
Crawls all GEO metadata. Features:
- crawl platforms
- crawl samples
- crawl series
- incremental crawling
- missed crawling
installation
pip install geo-spider
output file format
geo-spider saves its output in JSON Lines format (one JSON value per line). See the JSON Lines site for details.
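As a rough illustration of consuming the output, the sketch below reads a JSON Lines file record by record using only the standard library. The helper name `read_jsonlines` is hypothetical and not part of geo-spider. The fields of each record are not documented here, so the code makes no assumption about them.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def read_jsonlines(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file.

    Hypothetical helper for illustration; it is not part of geo-spider.
    """
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines such as a trailing newline
                yield json.loads(line)
```

For example, `for record in read_jsonlines("platforms.jl"): print(record)` would print each crawled record, where `platforms.jl` is the output name used in the commands in this README.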
logs
By default, geo-spider writes logs to geo-spider.log in the current directory at WARNING level. You can customize this with the -d and -l options:
- -d: enable debug mode
- -l: specify a custom log file
geo-spider -d -l new-geo-spider.log <sub-command>
platforms
platforms de novo crawling
geo-spider platforms -o platforms.jl
platforms incremental crawling
If you have a crawled platforms jsonlines file:
geo-spider platforms -cf platforms.jl -o new-platforms.jl
If you have multiple platforms jsonlines files:
geo-spider platforms -cd platforms -o new-platforms.jl
platforms missed crawling
Specify -cf or -cd as in incremental crawling, and add the -m option.
geo-spider platforms -cf platforms.jl -m missed -o new-platforms.jl
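The three platforms modes above differ only in which flags are passed, which can be sketched as a small Python helper that builds the argument list. This is a minimal sketch, not part of geo-spider: the function name `platforms_command` is hypothetical, only the flags shown in this README (`-cf`, `-cd`, `-m`, `-o`) are used, and the value given to `-m` is passed through unchanged because its meaning is not described here.

```python
from pathlib import Path
from typing import List, Optional


def platforms_command(
    output: str,
    crawled: Optional[str] = None,
    missed: Optional[str] = None,
) -> List[str]:
    """Build a `geo-spider platforms` argument list (hypothetical helper)."""
    cmd = ["geo-spider", "platforms"]
    if crawled is not None:
        # -cd takes a directory of jsonlines files, -cf a single file.
        flag = "-cd" if Path(crawled).is_dir() else "-cf"
        cmd += [flag, crawled]
    if missed is not None:
        # The README pairs -m with -cf or -cd, so this sketch requires one of them.
        if crawled is None:
            raise ValueError("missed crawling needs a crawled file or directory")
        cmd += ["-m", missed]
    cmd += ["-o", output]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once geo-spider is installed.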
samples
samples de novo crawling
geo-spider samples -o samples.jl
samples incremental crawling
geo-spider samples -pcf platforms.jl -cf samples.jl -o new-samples.jl
samples missed crawling
geo-spider samples -pcf platforms.jl -cf samples.jl -m missed -o new-samples.jl
series
series de novo crawling
geo-spider series -o series.jl
series incremental crawling
geo-spider series -pcf platforms.jl -scf samples.jl -cf series.jl -o new-series.jl
series missed crawling
geo-spider series -pcf platforms.jl -scf samples.jl -cf series.jl -m missed -o new-series.jl
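The incremental commands above build on each other's files: samples takes the platforms file via `-pcf`, and series takes both the platforms file and the samples file (`-scf`). One way to script the three steps is sketched below. The function names are hypothetical, the `new-` output prefix copies the naming used in this README and assumes plain file names rather than paths, and running the steps in this order is an assumption, not documented behavior.

```python
import subprocess
from typing import Callable, List


def incremental_commands(platforms: str, samples: str, series: str) -> List[List[str]]:
    """Return the three incremental-crawl commands from this README, in order."""
    return [
        ["geo-spider", "platforms", "-cf", platforms, "-o", "new-" + platforms],
        ["geo-spider", "samples", "-pcf", platforms, "-cf", samples,
         "-o", "new-" + samples],
        ["geo-spider", "series", "-pcf", platforms, "-scf", samples, "-cf", series,
         "-o", "new-" + series],
    ]


def run_all(commands: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in turn; `runner` can be swapped out for a dry run."""
    for cmd in commands:
        # With subprocess.run, check=True raises CalledProcessError on a non-zero exit.
        runner(cmd, check=True)
```

Calling `run_all(incremental_commands("platforms.jl", "samples.jl", "series.jl"))` would invoke the three commands one after another, stopping at the first failure.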