Yet another set of scraping tools for FanFiction.Net

Collaborative Filtering in FanFiction Networks

"ffscraper" Yet another set of scraping tools for FanFiction.Net

Alexander L. Hayes (@batflyer)

Installation

pip install ffscraper

Requires: bs4, requests

Background

FanFiction.Net was established in 1998 and is among the world's largest collections of user-submitted fanfiction (works of fiction written by fans of existing stories, such as movies, books, or TV shows). This large body of easily available user content has drawn attention from people analyzing the content and creative differences between original works and their fanfiction derivatives [1]. More recently, Yin et al. [2] created an anonymized dataset of the metadata from fanfiction sources.

This repository's purpose is twofold: creating robust open-source tools for scraping content, and using that content to build open-source systems which can be used by the FanFiction.Net community.

References

  • [1] Milli, S. and Bamman, D. "Beyond Canonical Texts: A Computational Analysis of Fanfiction." Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
  • [2] Yin, K., Aragon, C., Evans, S. and Davis, K. "Where No One Has Gone Before: A Meta-Dataset of the World's Largest Fanfiction Repository." Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. ACM, 2017.

Attribution

  • This was originally part of a final project for Professor Vibhav Gogate's Spring 2018 Advanced Machine Learning class at the University of Texas at Dallas. The code, TeX, and .pdf from that version are tagged as v0.1.0.
  • monochrome is a Jekyll theme by @dyutibarma, used under the terms of the MIT License.

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

ffscraper-0.2.0-py2.py3-none-any.whl (28.5 kB)

Uploaded for Python 2 and Python 3

File details

Details for the file ffscraper-0.2.0-py2.py3-none-any.whl.

File hashes

Hashes for ffscraper-0.2.0-py2.py3-none-any.whl:

  • SHA256: 638880dae8e0bc1a07696e4ef318c301f188c8f87e4204e59915fc43891a738c
  • MD5: 71a658f76ee9ac414e773ddb3ce72d9e
  • BLAKE2b-256: a743fd63c9f6ab9c0d9ed98b0aecd3a6670a37dc2831e9d30f40f810303cb384
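
One way to check a downloaded wheel against the SHA256 digest listed above is to hash the file locally and compare the result. The sketch below is only an illustration and is not part of ffscraper: the helper names sha256_of_file and matches_expected, and the local file path, are assumptions made for this example. It uses only Python's standard hashlib module.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the hash listing above.
EXPECTED_SHA256 = "638880dae8e0bc1a07696e4ef318c301f188c8f87e4204e59915fc43891a738c"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected_hex` (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    # Hypothetical location: assumes the wheel was saved to the current directory.
    wheel = Path("ffscraper-0.2.0-py2.py3-none-any.whl")
    if wheel.exists():
        print("OK" if matches_expected(wheel) else "MISMATCH")
    else:
        print("wheel not found; download it first")
```

Reading in chunks keeps memory use constant regardless of file size. pip also offers a hash-checking mode for requirements files (the --require-hashes option), which may be more convenient than a manual check.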

