A simple Python crawler
## Description
A simple crawler written in Python.
## Installation and Use
```sh
$ pip install pycrawler
$ ./crawler.py -d5 http://gotchacode.com  # -d5 crawls to a depth of 5
```
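To illustrate what a depth limit such as `-d5` can mean, here is a minimal sketch of a depth-limited, breadth-first crawl. It is not pycrawler's actual implementation: the `crawl` function, the `get_links` callback, and the in-memory `SITE` map are hypothetical names used only for illustration, and the sketch assumes that pages at the maximum depth are recorded but their links are not followed.

```python
from collections import deque


def crawl(start, get_links, max_depth):
    """Breadth-first crawl from `start`, following links for at most `max_depth` hops.

    `get_links` is any callable that returns the links found on a page.
    Returns the set of URLs found and the number of pages whose links were followed.
    """
    seen = {start}
    queue = deque([(start, 0)])
    followed = 0
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # the URL is recorded, but its links are not followed
        followed += 1
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen, followed


# Hypothetical in-memory "site" standing in for real HTTP requests.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": ["/"],
    "/c": ["/d"],
    "/d": [],
}


def fake_links(url):
    return SITE.get(url, [])


found, followed = crawl("/", fake_links, max_depth=2)
print(sorted(found), followed)
```

With `max_depth=2`, the page `/c` is discovered but never expanded, so `/d` does not appear in the result. The `seen` set also keeps the crawl from looping when pages link back to each other.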
## Results
The output should look similar to this:
```sh
Crawler started for http://gotchacode.com, will crawl upto depth 5
===============================================================
http://gotchacode.com/
http://gotchacode.com/atom.xml
http://gotchacode.com/blog/archives
http://gotchacode.com/blog/2014/05/01/the-pragmatic-programmer-checklist/
http://gotchacode.com/blog/2014/05/01/the-pragmatic-programmer-checklist/#disqus_thread
https://gist.github.com/vinitkumar/55ef44f759b7e5620d59
http://gotchacode.com/blog/2014/02/26/simple-state-machine-framework-in-c-number/
http://gotchacode.com/blog/2014/02/26/simple-state-machine-framework-in-c-number/#disqus_thread
https://github.com/mubeenh/SSM
https://www.nuget.org/packages/SSM/
http://gotchacode.com/blog/2014/02/15/setup-a-local-gitignore-without-messing-up-project/
http://gotchacode.com/blog/2014/02/15/setup-a-local-gitignore-without-messing-up-project/#disqus_thread
http://gotchacode.com/blog/2014/02/14/how-to-use-facebook-page-albums-as-image-source-in-django/
http://gotchacode.com/blog/2014/02/14/how-to-use-facebook-page-albums-as-image-source-in-django/#disqus_thread
http://imgur.com/dmrxcXh
https://github.com/changer/cmsplugin-fbgallery
http://gotchacode.com/blog/2014/02/13/migrating-to-octopress/
http://gotchacode.com/blog/2014/02/13/migrating-to-octopress/#disqus_thread
http://gotchacode.com/2014/01/music-movies-and-life.html
http://gotchacode.com/2014/01/music-movies-and-life.html#disqus_thread
http://gotchacode.com/2014/01/happy-new-year-2014.html
http://gotchacode.com/2014/01/happy-new-year-2014.html#disqus_thread
http://gotchacode.com/2013/12/using-html5-localstorage-into-your-app.html
http://gotchacode.com/2013/12/using-html5-localstorage-into-your-app.html#disqus_thread
http://gotchacode.com/2013/12/how-to-focus-and-learn-one-thing.html
http://gotchacode.com/2013/12/how-to-focus-and-learn-one-thing.html#disqus_thread
http://gotchacode.com/2013/11/what-is-information-overload-and-what.html
http://gotchacode.com/2013/11/what-is-information-overload-and-what.html#disqus_thread
http://gotchacode.com/blog/page/2/
https://github.com/vinitkumar
https://plus.google.com/+VinitKumarme?rel=author
http://octopress.org
http://gotchacode.com/blog/categories/movie/
http://gotchacode.com/blog/categories/music/
http://gotchacode.com/blog/categories/life/
http://gotchacode.com/blog/categories/html5/
http://gotchacode.com/blog/categories/localstorage/
http://gotchacode.com/blog/categories/development/
http://gotchacode.com/blog/categories/web/
http://gotchacode.com/blog/categories/stress/
http://gotchacode.com/blog/categories/tricks/
http://gotchacode.com/blog/categories/happiness/
http://gotchacode.com/blog/categories/information-overload/
http://gotchacode.com/blog/categories/social-networking/
http://gotchacode.com/2013/11/happiness-driven-development.html
http://gotchacode.com/2013/07/seven-tips-to-get-better-at-writing-code.html
http://gotchacode.com/2013/07/json2xml-lightweight-python-module-to.html
http://gotchacode.com/blog/2013/07/02/demo-post/
http://gotchacode.com/blog/categories/articles/
http://gotchacode.com/2013/06/talks-and-videos-that-could-make-you.html
http://gotchacode.com/blog/categories/developers/
http://gotchacode.com/blog/categories/tips/
http://gotchacode.com/2013/06/javascript-is-adult-now-see-what-it-can.html
http://gotchacode.com/blog/categories/javascript/
http://gotchacode.com/blog/categories/posts/
http://gotchacode.com/2013/03/modern-development-workflow-for-team.html
http://gotchacode.com/blog/categories/git/
http://gotchacode.com/blog/categories/github/
http://gotchacode.com/2013/02/guide-to-start-with-application.html
http://gotchacode.com/2013/02/how-to-setup-mac-for-web-development.html
http://gotchacode.com/blog/categories/mac/
http://gotchacode.com/blog/categories/osx/
http://gotchacode.com/blog/categories/setup/
http://gotchacode.com/blog/categories/sublime-text/
http://gotchacode.com/blog/categories/vim/
http://gotchacode.com/blog/categories/z/
http://gotchacode.com/blog/categories/zsh/
http://gotchacode.com/2013/02/configure-sublime-text2-for-javascript.html
http://gotchacode.com/blog/categories/linux/
http://gotchacode.com/blog/categories/windows/
http://gotchacode.com/2013/01/happy-new-year-readers.html
http://gotchacode.com/2012/12/how-to-subscribe-any-tag-on.html
http://gotchacode.com/blog/categories/email/
http://gotchacode.com/blog/categories/stackoverflow/
http://gotchacode.com/blog/categories/subscribe/
http://gotchacode.com/2012/12/how-to-setup-ideal-front-end.html
http://gotchacode.com/blog/categories/front-end/
http://gotchacode.com/blog/categories/python/
http://gotchacode.com/blog/categories/ubuntu/
http://gotchacode.com/blog/categories/ruby/
http://gotchacode.com/2012/12/what-are-some-good-javascript-libraries.html
http://gotchacode.com/2012/12/google-communities-great-start.html
http://gotchacode.com/blog/categories/google/
http://gotchacode.com/blog/categories/community/
http://gotchacode.com/blog/categories/google-plus/
http://gotchacode.com/2012/12/tips-for-optimizing-chrome-extension.html
http://gotchacode.com/blog/categories/extensions/
http://gotchacode.com/blog/categories/optimization/
http://gotchacode.com/blog/categories/browsers/
http://gotchacode.com/blog/categories/chrome/
http://gotchacode.com/2012/12/unix-philosophy.html
http://gotchacode.com/2010/05/ubuntu-1004-out.html
http://twitter.com/share
http://disqus.com/?ref_noscript
Crawler Statistics
==================
No of links Found: 177
No of follwed: 6
Time Stats : Found all links after 13.05s
```
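The listing above contains both page URLs and their `#disqus_thread` variants. As a rough sketch of how links like these can be collected from a page and reduced to unique pages, the example below uses only the Python standard library. It is an assumption-laden illustration, not pycrawler's own code: `LinkParser`, `extract_links`, `unique_pages`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def unique_pages(links):
    """Strip #fragments and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(urldefrag(link).url for link in links))


# Hypothetical sample page; the URLs mirror the style of the output above.
SAMPLE = (
    '<a href="/blog/archives">Archives</a>'
    '<a href="/blog/page/2/">Older posts</a>'
    '<a href="/blog/page/2/#disqus_thread">Comments</a>'
    '<a href="https://github.com/vinitkumar">GitHub</a>'
)

links = extract_links(SAMPLE, "http://gotchacode.com/")
pages = unique_pages(links)
print(len(links), len(pages))
```

Stripping fragments before counting is one way to avoid treating a page and its comment anchor as two separate links.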
## Issues
If you find a bug, please open an issue.