
Yet another set of scraping tools for FanFiction.Net


Collaborative Filtering in FanFiction Networks


"ffscraper" Yet another set of scraping tools for FanFiction.Net

Alexander L. Hayes (@batflyer)

Installation

pip install ffscraper

Requires: bs4, requests
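After installing, it can help to confirm that the two required modules are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and is not part of ffscraper.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "bs4" and "requests" are the import names listed under "Requires" above.
    missing = missing_modules(["bs4", "requests"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If either name is reported as missing, installing it into the same environment with pip should resolve it; note that the `bs4` module is distributed on PyPI as `beautifulsoup4`.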

Background

FanFiction.Net was established in 1998 and is among the world's largest collections of user-submitted fanfiction (works of fiction written by fans of existing stories, such as movies, books, or TV shows). This large body of easily available user content has attracted researchers who analyze the content and creative differences between original works and their fanfiction derivatives [1]. More recently, Yin et al. [2] created an anonymized dataset of the metadata from fanfiction sources.

This repository's purpose is twofold: creating robust open-source tools for scraping content, and using that content to build open-source systems that the FanFiction.Net community can use.

References

  • [1] Milli, S. and Bamman, D. "Beyond Canonical Texts: A Computational Analysis of Fanfiction." Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
  • [2] Yin, K., Aragon, C., Evans, S. and Davis, K. "Where No One Has Gone Before: A Meta-Dataset of the World's Largest Fanfiction Repository." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. ACM, 2017.

Attribution

  • This was originally part of a final project for Professor Vibhav Gogate's Spring 2018 Advanced Machine Learning class at the University of Texas at Dallas. That version of the code, TeX, and .pdf is tagged as v0.1.0.
  • monochrome is a Jekyll theme by @dyutibarma, used under the terms of the MIT License.


