The fastest web crawler and indexer.

These details have not been verified by PyPI

Project links

Source Code

Project description

spider-py

The spider project ported to Python.

Getting Started

pip install spider_rs

import asyncio

from spider_rs import Website

async def main():
    website = Website("https://choosealicense.com")
    website.crawl()
    print(website.get_links())

asyncio.run(main())

View the examples to learn more.
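The list printed by `get_links()` can be post-processed with the standard library. The sketch below is a minimal illustration, assuming `get_links()` returns absolute URL strings. `group_links_by_section` and `crawl_and_group` are hypothetical helper names, not part of spider_rs, and the only spider_rs calls used are the `Website`, `crawl`, and `get_links` calls shown above.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_links_by_section(links):
    """Group URLs by their first path segment ("" stands for the site root)."""
    groups = defaultdict(list)
    # De-duplicate and sort so the output is stable between runs.
    for link in sorted(set(links)):
        segments = [part for part in urlsplit(link).path.split("/") if part]
        section = segments[0] if segments else ""
        groups[section].append(link)
    return dict(groups)


async def crawl_and_group(url):
    """Hypothetical wrapper mirroring the Getting Started example."""
    # Imported lazily so the helper above works without spider_rs installed.
    from spider_rs import Website

    website = Website(url)
    website.crawl()
    # Assumption: get_links() returns a list of absolute URL strings.
    return group_links_by_section(website.get_links())
```

Running `asyncio.run(crawl_and_group("https://choosealicense.com"))` should return a dictionary that maps each top-level path segment to the crawled links under it. The spider_rs import is deferred so the grouping helper can be exercised on its own.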

Development

Install maturin (`pipx install maturin`) and Python, then build the package and install it into your current virtual environment:

maturin develop

Benchmarks

View the benchmarks to see a breakdown between libs and platforms.

Test URL: https://espn.com

| Library | Pages | Time |
| --- | --- | --- |
| `spider(rust): crawl` | `150,387` | `1m` |
| `spider(nodejs): crawl` | `150,387` | `153s` |
| `spider(python): crawl` | `150,387` | `186s` |
| `scrapy(python): crawl` | `49,598` | `1h` |
| `crawlee(nodejs): crawl` | `18,779` | `30m` |

The benchmarks above were run on an Apple M1 Mac. On Linux ARM machines, spider performs about 2-10x faster.
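Because the crawlers did not visit the same number of pages, throughput is easier to compare than raw time. The following sketch derives pages per second from the rows above. It assumes `1m`, `30m`, and `1h` mean 60, 1,800, and 3,600 seconds, and since those figures look rounded, the results are approximate. `to_seconds` and `pages_per_second` are illustrative helpers.

```python
import re

# Rows copied from the benchmark table: (label, pages crawled, reported time).
BENCHMARKS = [
    ("spider(rust): crawl", 150_387, "1m"),
    ("spider(nodejs): crawl", 150_387, "153s"),
    ("spider(python): crawl", 150_387, "186s"),
    ("scrapy(python): crawl", 49_598, "1h"),
    ("crawlee(nodejs): crawl", 18_779, "30m"),
]

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(duration):
    """Convert a string such as '153s', '30m', or '1h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def pages_per_second(pages, duration):
    """Approximate throughput for one benchmark row."""
    return pages / to_seconds(duration)


for label, pages, duration in BENCHMARKS:
    print(f"{label:<24} {pages_per_second(pages, duration):8.1f} pages/s")
```

With these inputs the Rust crawl works out to roughly 2,500 pages/s and scrapy to roughly 14 pages/s.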

Issues

Please submit a GitHub issue for any problems you find.

Project details

Release history

0.0.53

Sep 24, 2024

0.0.52

Sep 2, 2024

0.0.51

Sep 1, 2024

0.0.50 (this version)

Aug 29, 2024

0.0.49

Aug 27, 2024

0.0.47

Aug 26, 2024

0.0.46

Aug 23, 2024

0.0.45

Aug 14, 2024

0.0.44

Aug 14, 2024

0.0.43

Jul 19, 2024

0.0.42

Jul 19, 2024

0.0.41

Jul 4, 2024

0.0.40

Jul 4, 2024

0.0.39

Jun 19, 2024

0.0.37

Jun 10, 2024

0.0.36

May 31, 2024

0.0.35

May 29, 2024

0.0.34

Apr 16, 2024

0.0.33

Mar 29, 2024

0.0.32

Mar 27, 2024

0.0.31

Mar 22, 2024

0.0.30

Mar 20, 2024

0.0.28

Mar 20, 2024

0.0.27

Mar 18, 2024

0.0.26

Mar 15, 2024

0.0.25

Mar 9, 2024

0.0.24

Feb 26, 2024

0.0.23

Feb 25, 2024

0.0.22

Feb 24, 2024

0.0.21

Feb 20, 2024

0.0.20

Feb 19, 2024

0.0.19

Feb 19, 2024

0.0.18

Jan 31, 2024

0.0.17

Jan 20, 2024

0.0.16

Jan 13, 2024

0.0.15

Dec 31, 2023

0.0.14

Dec 31, 2023

0.0.13

Dec 27, 2023

0.0.12

Dec 27, 2023

0.0.11

Dec 27, 2023

0.0.10

Dec 26, 2023

0.0.9

Dec 15, 2023

0.0.8

Dec 9, 2023

0.0.7

Dec 9, 2023

0.0.6

Dec 9, 2023

0.0.5

Dec 9, 2023

0.0.4

Dec 8, 2023

0.0.2

Dec 8, 2023

0.0.1

Dec 8, 2023

0.0.0

Dec 8, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spider_rs-0.0.50.tar.gz (42.8 kB)

Uploaded Aug 29, 2024 Source

Built Distribution

spider_rs-0.0.50-cp311-cp311-macosx_11_0_arm64.whl (9.3 MB)

Uploaded Aug 29, 2024 CPython 3.11 macOS 11.0+ ARM64

Hashes for spider_rs-0.0.50.tar.gz

Algorithm	Hash digest
SHA256	`270bec5bac23a3d22a675d011da1362434b6a9ca883fd9493e1e2cf7802ae52e`
MD5	`311bb9cddd70cb7d297d470d70822855`
BLAKE2b-256	`b18904cc2dbdcf4eb353aa807e17e3384fb034dcc1ce508bc4867b548527aca2`

Hashes for spider_rs-0.0.50-cp311-cp311-macosx_11_0_arm64.whl

Algorithm	Hash digest
SHA256	`be7225b9a1ca5f76cf5aebc38e7578c4cc04a9597cd744caae41707a87c14692`
MD5	`9f66706e413740b50d94489ddd65870f`
BLAKE2b-256	`6f640e446b43a56d3dd815cbfaa4f4d70cd6b7b310416d10e6a18991114931b0`
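
A downloaded file can be checked against the SHA256 digests listed above before it is installed. This is a minimal sketch using Python's `hashlib`; the helper names are assumptions chosen for illustration, and the digests are copied from the tables above.

```python
import hashlib
import hmac

# SHA256 digests copied from the hash tables for version 0.0.50.
EXPECTED_SHA256 = {
    "spider_rs-0.0.50.tar.gz":
        "270bec5bac23a3d22a675d011da1362434b6a9ca883fd9493e1e2cf7802ae52e",
    "spider_rs-0.0.50-cp311-cp311-macosx_11_0_arm64.whl":
        "be7225b9a1ca5f76cf5aebc38e7578c4cc04a9597cd744caae41707a87c14692",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Return True if the file at `path` has the digest listed for `filename`."""
    expected = EXPECTED_SHA256[filename]
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_published_hash("spider_rs-0.0.50.tar.gz", "spider_rs-0.0.50.tar.gz")` would be expected to return `True` for an intact copy of the source distribution saved in the current directory.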