Metadata scraper for FFN and AO3

Fanficapi

An unofficial API (more like a story and author metadata scraper) for fanfiction.net and archiveofourown.org, written in Python.

Fanficapi is a simple, easy-to-use Python package for scraping story and author metadata from fanfiction.net (FFN) and archiveofourown.org (AO3).

Features

  • Get story metadata from FFN and AO3 from a story link
  • Get author metadata from FFN and AO3 from an author link
  • Simple keyword search to get the story link from FFN or AO3

Note: This was just my November project, made to learn HTML scraping with Python. I know the code is sh*t and I only meant to keep it as a private repo, but I recently noticed that all my other scrapers stopped working because of FFN's Cloudflare protection. This one still works because it is based on undetected-chromedriver. So, if you feel it might be useful to you, here it is!

Installation

Installation using pip:

pip install fanficapi or pip3 install fanficapi

Manual installation by cloning the GitHub repository:

git clone https://github.com/lonely-code-cube/fanficapi
cd fanficapi
python3 setup.py install

Note: The GitHub repository usually has the latest updates and features, so it may also contain more bugs than the pip release.

Usage

To get AO3 story metadata:

import fanficapi

ao3 = fanficapi.AO3()
print(ao3.getStoryMeta("https://archiveofourown.org/works/5105735/chapters/11745368"))

The getStoryMeta() function returns a dictionary that looks like:

{'title': 'When In Doubt',
'author': 'JesWithOneEss',
'rating': 'Teen And Up Audiences',
'archiveWarnings': 'Creator Chose Not To Use Archive Warnings',
'category': 'F/M',
'fandom': 'Harry Potter - J. K. Rowling',
'relationship': 'Hermione Granger/Ron Weasley',
'characters': ['Hermione Granger', 'Ron Weasley'],
'tags': ['romione', 'Ron and Hermione - Freeform', 'Angst', 'Missing Moments', 'Deathly Hallows', 'book canon', 'Harry Potter - Freeform', 'book 7', 'rhr'],
'published_date': '2015-10-30',
'word_count': '13921',
'chapter_count': '4',
'comments': '8',
'kudos': '76',
'hits': '4217'}
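
Every count in the dictionary above is a string, so sorting or filtering by word count or kudos needs a conversion step first. The following is a minimal sketch of one way to do that. The helper `convert_ao3_meta` is made up for this example and is not part of fanficapi, and it assumes the keys and value formats match the sample output above.

```python
from datetime import date

# Keys that hold counts in the sample AO3 output shown above (assumption:
# real results use the same key names).
AO3_COUNT_KEYS = ("word_count", "chapter_count", "comments", "kudos", "hits")


def convert_ao3_meta(meta):
    """Return a copy of an AO3 metadata dict with typed counts and date.

    Hypothetical helper, not part of fanficapi. Values that do not parse
    are left unchanged rather than raising.
    """
    converted = dict(meta)
    for key in AO3_COUNT_KEYS:
        value = converted.get(key)
        if isinstance(value, str):
            try:
                converted[key] = int(value.replace(",", ""))
            except ValueError:
                pass  # keep the original string if it is not a plain number
    published = converted.get("published_date")
    if isinstance(published, str):
        try:
            converted["published_date"] = date.fromisoformat(published)
        except ValueError:
            pass  # keep the original string if it is not YYYY-MM-DD
    return converted


# A few fields copied from the sample output above.
sample = {
    "title": "When In Doubt",
    "published_date": "2015-10-30",
    "word_count": "13921",
    "chapter_count": "4",
    "kudos": "76",
}
typed = convert_ao3_meta(sample)
print(typed["word_count"] + typed["kudos"])
```

With the real package, the same helper could be applied to whatever `ao3.getStoryMeta(...)` returns, as long as the result has the shape shown above.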

To get FFN story metadata:

import fanficapi

ffn = fanficapi.FFN(headless=False, delay=5)
print(ffn.getStoryMeta("https://www.fanfiction.net/s/7562379/1/Australia"))

Headless mode is not recommended, because it increases the chance of being detected by Cloudflare. Whether it works anyway depends on when you run it and on your luck.

getStoryMeta() returns a dictionary that looks like:

{'story_name': 'Australia',
'author_name': 'MsBinns',
'Rated': 'Fiction M',
'Language': 'English',
'Genre': 'Romance/Angst',
'Character': 'Ron W., Hermione G.',
'Chapters': '45',
'Words': '340,509',
'Reviews': '2,555',
'Favs': '2,026',
'Follows': '1,456',
'Updated': 'Aug 31, 2014',
'Published': 'Nov 19, 2011',
'Status': 'Complete',
'id': '7562379'}
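
The FFN values above are display strings: counts carry thousands separators and dates look like 'Aug 31, 2014'. Here is a rough sketch of turning them into numbers and dates. The helper `convert_ffn_meta` is hypothetical and not part of fanficapi. It assumes the key names and formats of the sample output above, and the `%b` month format assumes an English locale.

```python
from datetime import datetime

# Keys taken from the sample FFN output above (assumption: real results
# use the same key names).
FFN_COUNT_KEYS = ("Chapters", "Words", "Reviews", "Favs", "Follows", "id")
FFN_DATE_KEYS = ("Updated", "Published")


def convert_ffn_meta(meta):
    """Return a copy of an FFN metadata dict with typed counts and dates.

    Hypothetical helper, not part of fanficapi. Values that do not parse
    are left unchanged rather than raising.
    """
    converted = dict(meta)
    for key in FFN_COUNT_KEYS:
        value = converted.get(key)
        if isinstance(value, str):
            try:
                # '340,509' -> 340509
                converted[key] = int(value.replace(",", ""))
            except ValueError:
                pass
    for key in FFN_DATE_KEYS:
        value = converted.get(key)
        if isinstance(value, str):
            try:
                # 'Aug 31, 2014' -> date(2014, 8, 31)
                converted[key] = datetime.strptime(value, "%b %d, %Y").date()
            except ValueError:
                pass  # other date formats are kept as the original string
    return converted


# A few fields copied from the sample output above.
ffn_sample = {
    "story_name": "Australia",
    "Words": "340,509",
    "Reviews": "2,555",
    "Updated": "Aug 31, 2014",
    "Published": "Nov 19, 2011",
    "id": "7562379",
}
ffn_typed = convert_ffn_meta(ffn_sample)
print(ffn_typed["Words"], ffn_typed["Updated"].year)
```

Keys that are missing from a result are simply skipped, so the helper should tolerate stories that lack, say, an 'Updated' entry.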

  • fanficapi.AO3() takes one optional argument, AO3(textMode: bool), which is False by default
  • fanficapi.FFN() takes 4 optional arguments: FFN(textMode: bool, headless: bool, executable_path: str, delay: int)
  • By default undetected-chromedriver runs in headless mode; set headless to False if scraping does not work
  • If you don't have chromedriver on your PATH, download it from the chromedriver website or use the one in the repo, then set executable_path = "/path/of/chromedriver"
  • If scraping doesn't work even after disabling headless mode, try increasing the delay (default is 5)
  • Text mode prints status messages about what is happening, which helps if you don't plan on printing the result but still want to follow progress
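
The advice above (turn off headless mode first, then raise the delay) can be sketched as a small fallback loop. This is only an illustration: `fetch_with_fallback` and `FakeFFN` are made-up names, not part of fanficapi. The sketch also assumes that a failed scrape shows up either as an exception or as an empty result, which the package description does not confirm. `FakeFFN` is a stand-in so the example runs without Chrome.

```python
def fetch_with_fallback(make_scraper, url, option_sets):
    """Try each set of constructor options in turn until one returns metadata.

    Hypothetical helper, not part of fanficapi. `make_scraper` is any
    callable that accepts the options as keyword arguments and returns an
    object with a getStoryMeta(url) method.
    """
    last_error = None
    for options in option_sets:
        try:
            scraper = make_scraper(**options)
            meta = scraper.getStoryMeta(url)
        except Exception as error:  # assumption: failures may raise
            last_error = error
            continue
        if meta:  # assumption: an empty or None result also means failure
            return meta
    raise RuntimeError(f"all {len(option_sets)} attempts failed for {url}") from last_error


class FakeFFN:
    """Stand-in for fanficapi.FFN that only succeeds when it is
    non-headless and the delay is at least 10."""

    def __init__(self, headless=True, delay=5):
        self.headless = headless
        self.delay = delay

    def getStoryMeta(self, url):
        if self.headless or self.delay < 10:
            return None
        return {"id": url.split("/")[4]}


# Options ordered from least to most conservative.
attempts = [
    {"headless": True, "delay": 5},
    {"headless": False, "delay": 5},
    {"headless": False, "delay": 10},
]
result = fetch_with_fallback(
    FakeFFN, "https://www.fanfiction.net/s/7562379/1/Australia", attempts
)
print(result)
```

With the real package, passing `fanficapi.FFN` in place of `FakeFFN` should work the same way, since `headless` and `delay` are the keyword names listed above. Each attempt opens a fresh browser, so expect every fallback step to be slow.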

License

GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007

I am not responsible for any loss caused by the use of this software. This is free software; use it at your own risk.
