A simple Python crawler

## Description

A simple crawler written in Python.

## Installation and Use

```sh
$ pip install pycrawler
$ ./crawler.py -d5 http://gotchacode.com  # -d5 means crawl to a depth of 5
```
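
To illustrate what a depth limit such as `-d5` means, here is a minimal sketch of a depth-limited, breadth-first crawl using only the standard library. It is not the code shipped in `pycrawler`: the names `LinkParser`, `fetch_html`, and `crawl` are hypothetical, and staying on the starting host while still recording external links is an assumption made for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(root, depth, fetch=fetch_html):
    """Breadth-first crawl from root, fetching pages fewer than `depth` links away.

    Returns (found, followed): every URL seen, and the URLs actually fetched.
    """
    host = urlparse(root).netloc
    found = [root]
    seen = {root}
    followed = []
    queue = deque([(root, 0)])
    while queue:
        url, level = queue.popleft()
        # Assumption: stop at the depth limit and never fetch other hosts.
        if level >= depth or urlparse(url).netloc != host:
            continue
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that cannot be fetched
        followed.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                found.append(link)
                queue.append((link, level + 1))
    return found, followed
```

Under these assumptions, `crawl("http://gotchacode.com", 5)` would return the list of discovered links and the list of pages that were fetched. The `fetch` parameter exists only so the sketch can be exercised without network access.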

## Results

The output looks similar to this:

```sh
Crawler started for http://gotchacode.com, will crawl upto depth 5
===============================================================
http://gotchacode.com/
http://gotchacode.com/atom.xml
http://gotchacode.com/blog/archives
http://gotchacode.com/blog/2014/05/01/the-pragmatic-programmer-checklist/
http://gotchacode.com/blog/2014/05/01/the-pragmatic-programmer-checklist/#disqus_thread
https://gist.github.com/vinitkumar/55ef44f759b7e5620d59
http://gotchacode.com/blog/2014/02/26/simple-state-machine-framework-in-c-number/
http://gotchacode.com/blog/2014/02/26/simple-state-machine-framework-in-c-number/#disqus_thread
https://github.com/mubeenh/SSM
https://www.nuget.org/packages/SSM/
http://gotchacode.com/blog/2014/02/15/setup-a-local-gitignore-without-messing-up-project/
http://gotchacode.com/blog/2014/02/15/setup-a-local-gitignore-without-messing-up-project/#disqus_thread
http://gotchacode.com/blog/2014/02/14/how-to-use-facebook-page-albums-as-image-source-in-django/
http://gotchacode.com/blog/2014/02/14/how-to-use-facebook-page-albums-as-image-source-in-django/#disqus_thread
http://imgur.com/dmrxcXh
https://github.com/changer/cmsplugin-fbgallery
http://gotchacode.com/blog/2014/02/13/migrating-to-octopress/
http://gotchacode.com/blog/2014/02/13/migrating-to-octopress/#disqus_thread
http://gotchacode.com/2014/01/music-movies-and-life.html
http://gotchacode.com/2014/01/music-movies-and-life.html#disqus_thread
http://gotchacode.com/2014/01/happy-new-year-2014.html
http://gotchacode.com/2014/01/happy-new-year-2014.html#disqus_thread
http://gotchacode.com/2013/12/using-html5-localstorage-into-your-app.html
http://gotchacode.com/2013/12/using-html5-localstorage-into-your-app.html#disqus_thread
http://gotchacode.com/2013/12/how-to-focus-and-learn-one-thing.html
http://gotchacode.com/2013/12/how-to-focus-and-learn-one-thing.html#disqus_thread
http://gotchacode.com/2013/11/what-is-information-overload-and-what.html
http://gotchacode.com/2013/11/what-is-information-overload-and-what.html#disqus_thread
http://gotchacode.com/blog/page/2/
https://github.com/vinitkumar
https://plus.google.com/+VinitKumarme?rel=author
http://octopress.org
http://gotchacode.com/blog/categories/movie/
http://gotchacode.com/blog/categories/music/
http://gotchacode.com/blog/categories/life/
http://gotchacode.com/blog/categories/html5/
http://gotchacode.com/blog/categories/localstorage/
http://gotchacode.com/blog/categories/development/
http://gotchacode.com/blog/categories/web/
http://gotchacode.com/blog/categories/stress/
http://gotchacode.com/blog/categories/tricks/
http://gotchacode.com/blog/categories/happiness/
http://gotchacode.com/blog/categories/information-overload/
http://gotchacode.com/blog/categories/social-networking/
http://gotchacode.com/2013/11/happiness-driven-development.html
http://gotchacode.com/2013/07/seven-tips-to-get-better-at-writing-code.html
http://gotchacode.com/2013/07/json2xml-lightweight-python-module-to.html
http://gotchacode.com/blog/2013/07/02/demo-post/
http://gotchacode.com/blog/categories/articles/
http://gotchacode.com/2013/06/talks-and-videos-that-could-make-you.html
http://gotchacode.com/blog/categories/developers/
http://gotchacode.com/blog/categories/tips/
http://gotchacode.com/2013/06/javascript-is-adult-now-see-what-it-can.html
http://gotchacode.com/blog/categories/javascript/
http://gotchacode.com/blog/categories/posts/
http://gotchacode.com/2013/03/modern-development-workflow-for-team.html
http://gotchacode.com/blog/categories/git/
http://gotchacode.com/blog/categories/github/
http://gotchacode.com/2013/02/guide-to-start-with-application.html
http://gotchacode.com/2013/02/how-to-setup-mac-for-web-development.html
http://gotchacode.com/blog/categories/mac/
http://gotchacode.com/blog/categories/osx/
http://gotchacode.com/blog/categories/setup/
http://gotchacode.com/blog/categories/sublime-text/
http://gotchacode.com/blog/categories/vim/
http://gotchacode.com/blog/categories/z/
http://gotchacode.com/blog/categories/zsh/
http://gotchacode.com/2013/02/configure-sublime-text2-for-javascript.html
http://gotchacode.com/blog/categories/linux/
http://gotchacode.com/blog/categories/windows/
http://gotchacode.com/2013/01/happy-new-year-readers.html
http://gotchacode.com/2012/12/how-to-subscribe-any-tag-on.html
http://gotchacode.com/blog/categories/email/
http://gotchacode.com/blog/categories/stackoverflow/
http://gotchacode.com/blog/categories/subscribe/
http://gotchacode.com/2012/12/how-to-setup-ideal-front-end.html
http://gotchacode.com/blog/categories/front-end/
http://gotchacode.com/blog/categories/python/
http://gotchacode.com/blog/categories/ubuntu/
http://gotchacode.com/blog/categories/ruby/
http://gotchacode.com/2012/12/what-are-some-good-javascript-libraries.html
http://gotchacode.com/2012/12/google-communities-great-start.html
http://gotchacode.com/blog/categories/google/
http://gotchacode.com/blog/categories/community/
http://gotchacode.com/blog/categories/google-plus/
http://gotchacode.com/2012/12/tips-for-optimizing-chrome-extension.html
http://gotchacode.com/blog/categories/extensions/
http://gotchacode.com/blog/categories/optimization/
http://gotchacode.com/blog/categories/browsers/
http://gotchacode.com/blog/categories/chrome/
http://gotchacode.com/2012/12/unix-philosophy.html
http://gotchacode.com/2010/05/ubuntu-1004-out.html
http://twitter.com/share
http://disqus.com/?ref_noscript


Crawler Statistics
==================
No of links Found: 177
No of follwed: 6
Time Stats : Found all links after 13.05s
```

## Issues

If you find a bug, please create an issue.





Source distribution: pycrawler-0.1.3.tar.gz (4.3 kB)
