
Fur Affinity API


Python module to implement API-like functionality for the FurAffinity.net website.

Requirements

Python 3.8+ is necessary to run this module.

Usage

The API consists of a main class FAAPI, two submission classes Submission and SubmissionPartial, a journal class Journal, and a user class User.

Once FAAPI is initialized, its methods can be used to crawl FA and return machine-readable objects.

import faapi
import json

cookies = [
    {"name": "a", "value": "38565475-3421-3f21-7f63-3d341339737"},
    {"name": "b", "value": "356f5962-5a60-0922-1c11-65003b703038"},
]

api = faapi.FAAPI(cookies)
sub, sub_file = api.get_submission(12345678, get_file=True)

print(sub.id, sub.title, sub.author, f"{len(sub_file)/1024:.2f}KiB")

with open(f"{sub.id}.json", "w") as f:
    f.write(json.dumps(dict(sub)))

with open(sub.file_url.split("/")[-1], "wb") as f:
    f.write(sub_file)

gallery, _ = api.gallery("user_name", 1)
with open("user_name-gallery.json", "w") as f:
    f.write(json.dumps(list(map(dict, gallery))))

robots.txt

At init, the FAAPI object downloads the robots.txt file from FA to determine the Crawl-delay value set therein. If not set, a value of 1 second is used.

To respect this value, the default behaviour of the FAAPI object is to wait before performing a get request if the last request was made more recently than the crawl delay allows.

See under FAAPI for more details on this behaviour.

Furthermore, any get operation that points to a disallowed path from robots.txt will raise an exception. This check should not be circumvented and the developer of this module does not take responsibility for violations of the TOS of Fur Affinity.
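
A minimal sketch of the two behaviours; the submission IDs are placeholders, and the exact exception class raised by raise_for_delay is not documented here:

import faapi

api = faapi.FAAPI()

# crawl_delay is parsed from robots.txt at init (1 second if not set)
print(f"crawl delay: {api.crawl_delay}s")

# Default behaviour: consecutive get calls wait out the crawl delay
api.get_submission(12345678)  # placeholder ID
api.get_submission(12345679)  # blocks until crawl_delay has elapsed

# With raise_for_delay, a too-early call raises instead of waiting
api.raise_for_delay = True
try:
    api.get_submission(12345680)
except Exception as err:  # specific exception class not documented here
    print(f"request made before crawl delay elapsed: {err}")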

Cookies

To access protected pages, cookies from an active session are needed. These cookies must be given to the FAAPI object as a list of dictionaries, each containing a name and a value field. The cookies list should look like the following random example:

cookies = [
    {"name": "a", "value": "38565475-3421-3f21-7f63-3d341339737"},
    {"name": "b", "value": "356f5962-5a60-0922-1c11-65003b703038"},
]

To access session cookies, consult the manual of the browser used to log in.

Note: it is important not to log out of the session the cookies belong to, otherwise they will no longer work.

Note: as of 2020-02-21 only cookies a and b are needed.
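
One way to keep session cookies out of the source code is to load them from a file; a sketch assuming a hypothetical cookies.json file in the format shown above:

import json
import faapi

# cookies.json is a hypothetical file holding the list shown above, e.g.
# [{"name": "a", "value": "..."}, {"name": "b", "value": "..."}]
with open("cookies.json") as f:
    cookies = json.load(f)

api = faapi.FAAPI(cookies)

# cookies can also be replaced on an existing object
api.load_cookies(cookies)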

User Agent

FAAPI attaches a User-Agent header to every request. The user agent string is generated at startup in the following format: faapi/{package version} Python/{python version} {system name}/{system release}.
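
For reference, an equivalent string can be assembled with the standard library; faapi.__version__ is an assumption here, not a documented attribute:

from platform import python_version, release, system

import faapi

# Rebuild the documented format:
# faapi/{package version} Python/{python version} {system name}/{system release}
user_agent = f"faapi/{faapi.__version__} Python/{python_version()} {system()}/{release()}"
print(user_agent)  # e.g. faapi/2.16.0 Python/3.8.10 Linux/5.8.0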

Objects

FAAPI

This is the main object that handles all the calls to scrape pages and get submissions.

It holds 6 different fields:

  • cookies: List[dict] = [] cookies passed at init
  • session: CloudflareScraper cfscrape session used for get requests
  • robots: Dict[str, List[str]] robots.txt values
  • crawl_delay: float crawl delay from robots.txt, else 1
  • last_get: float time of last get (not UNIX time, uses time.perf_counter for more precision)
  • raise_for_delay: bool = False if set to True, raises an exception if a get call is made before enough time has passed

Init

__init__(cookies: List[dict] = None)

The class init has a single optional argument cookies necessary to read logged-in-only pages. The cookies can be omitted and the API will still be able to access public pages.

Note: Cookies must be in the format mentioned above in #Cookies.

Methods & Properties

  • load_cookies(cookies: List[Dict[str, Any]])
    Load new cookies in the object and remake the CloudflareScraper session.
  • connection_status -> bool
    Returns the status of the connection.
  • get(path: str, **params) -> requests.Response
    Returns a requests.Response object containing the result of the GET operation on the given URL, with the optional **params added to the query. The provided path is treated as relative to 'https://www.furaffinity.net/'.
  • get_parse(path: str, **params) -> Optional[bs4.BeautifulSoup]
    Similar to get() but returns the parsed HTML from the normal get operation. If the GET request encountered an error, an HTTPError exception is raised. If the response is not ok, then None is returned.
  • get_submission(submission_id: int, get_file: bool = False) -> Tuple[Submission, Optional[bytes]]
    Given a submission ID, it returns a Submission object containing the various metadata of the submission itself and a bytes object with the submission file if get_file is passed as True.
  • get_submission_file(submission: Submission) -> Optional[bytes]
    Given a submission object, it downloads its file and returns it as a bytes object.
  • get_user(user: str) -> User
    Given a username, it returns a User object containing information regarding the user.
  • gallery(user: str, page: int = 1) -> Tuple[List[SubmissionPartial], int]
    Returns the list of submissions found on a specific gallery page and the number of the next page. The returned page number is set to 0 if it is the last page (see the pagination sketch after this list).
  • scraps(user: str, page: int = 1) -> Tuple[List[SubmissionPartial], int]
    Returns the list of submissions found on a specific scraps page and the number of the next page. The returned page number is set to 0 if it is the last page.
  • favorites(user: str, page: str = "") -> Tuple[List[SubmissionPartial], str]
    Downloads a user's favorites page. Because of how favorites pages work on FA, the page argument (and the one returned) are strings. If the favorites page is the last, then an empty string is returned as the next page. An empty page value as argument is equivalent to page 1 (see the pagination sketch after this list).
    Note: favorites page "numbers" do not follow any scheme and are only generated server-side.
  • journals(user: str, page: int = 1) -> Tuple[List[Journal], int]
    Returns the list of journals found on a specific journals page and the number of the next page. The returned page number is set to 0 if it is the last page.
  • search(q: str = "", page: int = 0, **params) -> Tuple[List[SubmissionPartial], int, int, int, int]
    Parses FA search given the query (and optional other params) and returns the submissions found and the next page together with basic search statistics: the number of the first submission in the page (0-indexed), the number of the last submission in the page (0-indexed), and the total number of submissions found in the search. For example, if the last three returned integers are 0, 47 and 437, then the page contains submissions 1 through 48 of a search that has found a total of 437 submissions.
    Note: as of October 2020 the "/search" path is disallowed by Fur Affinity's robots.txt.
  • watchlist_to(user: str) -> List[User]
    Given a username, returns a list of User objects for each user that is watching the given user.
  • watchlist_by(user: str) -> List[User]
    Given a username, returns a list of User objects for each user that is watched by the given user.
  • user_exists(user: str) -> int
    Checks if the passed user exists - i.e. if there is a page under that name - and returns an int result.
    • 0 okay
    • 1 account disabled
    • 2 system error
    • 3 unknown error
    • 4 request error
  • submission_exists(submission_id: int) -> int
    Checks if the passed submission exists - i.e. if there is a page with that ID - and returns an int result.
    • 0 okay
    • 1 account disabled
    • 2 system error
    • 3 unknown error
    • 4 request error
  • journal_exists(journal_id: int) -> int
    Checks if the passed journal exists - i.e. if there is a page under that ID - and returns an int result.
    • 0 okay
    • 1 account disabled
    • 2 system error
    • 3 unknown error
    • 4 request error
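
Since gallery, scraps, and journals mark the last page with 0 while favorites marks it with an empty string, pagination loops differ slightly. A sketch of both patterns, with "user_name" as a placeholder:

import faapi

api = faapi.FAAPI()

# Integer pagination: gallery, scraps, journals return 0 as the last next page
submissions = []
page = 1
while page:
    partials, page = api.gallery("user_name", page)
    submissions.extend(partials)

# String pagination: favorites returns "" as the last next page,
# and "" as argument is equivalent to the first page
favorites = []
next_page = ""
while True:
    partials, next_page = api.favorites("user_name", next_page)
    favorites.extend(partials)
    if not next_page:
        break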

Journal

This object contains information gathered when parsing a journals page or a specific journal page. It contains the following fields:

  • id: int journal id
  • title: str journal title
  • date: str upload date in YYYY-MM-DD format
  • author: str journal author
  • content: str journal content
  • mentions: List[str] the users mentioned in the content (if they were mentioned as links, e.g. :iconusername:, @username, etc.)
  • user_icon_url: str the url to the user icon
  • journal_item: Union[bs4.element.Tag, bs4.BeautifulSoup] the journal tag/page used to parse the object fields

Journal objects can be directly cast to a dict object or iterated through.
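
For example, the dict cast makes JSON serialisation straightforward; a sketch, with "user_name" as a placeholder:

import json
import faapi

api = faapi.FAAPI()

# take the first journal from a user's journals page
journals, _ = api.journals("user_name", 1)
journal = journals[0]

print(dict(journal)["title"])

# iteration is assumed to yield (field, value) pairs, matching the dict cast
for field, value in journal:
    print(field, value)

with open(f"{journal.id}.json", "w") as f:
    f.write(json.dumps(dict(journal)))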

Init

__init__(journal_item: Union[bs4.element.Tag, bs4.BeautifulSoup] = None)

Journal takes one optional parameter: a journal section tag from a journals page or a parsed journal page. Parsing is then performed based on the class of the passed object.

Methods

  • parse(journal_item: Union[bs4.element.Tag, bs4.BeautifulSoup] = None)
    Parses the stored journal tag/page for information. If journal_item is passed, it overwrites the existing journal_item value.

SubmissionPartial

This lightweight submission object is used to contain the information gathered when parsing gallery, scraps, favorites and search pages. It contains only the following fields:

  • id: int submission id
  • title: str submission title
  • author: str submission author
  • rating: str submission rating [general, mature, adult]
  • type: str submission type [text, image, etc...]
  • thumbnail_url: str the url to the submission thumbnail
  • submission_figure: bs4.element.Tag the figure tag used to parse the object fields

SubmissionPartial objects can be directly cast to a dict object or iterated through.

Init

__init__(submission_figure: bs4.element.Tag)

SubmissionPartial init needs a figure tag taken from a parsed page.

Methods

  • parse(submission_figure: bs4.element.Tag)
    Parses the stored submission figure tag for information. If submission_figure is passed, it overwrites the existing submission_figure value.
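
A SubmissionPartial carries only the fields above; the full metadata can be fetched with FAAPI.get_submission using the partial's id. A sketch, with "user_name" as a placeholder:

import faapi

api = faapi.FAAPI()

partials, _ = api.gallery("user_name", 1)
for partial in partials:
    # fetch the full Submission without downloading the file
    sub, _ = api.get_submission(partial.id)
    print(sub.id, sub.title, sub.tags)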

Submission

The main class that parses and holds submission metadata.

  • id: int submission id
  • title: str submission title
  • author: str submission author
  • date: str upload date in YYYY-MM-DD format
  • tags: List[str] tags list
  • category: str category *
  • species: str species *
  • gender: str gender *
  • rating: str rating *
  • description: str the description as an HTML formatted string
  • mentions: List[str] the users mentioned in the description (if they were mentioned as links, e.g. :iconusername:, @username, etc.)
  • folder: str the submission folder (gallery or scraps)
  • file_url: str the url to the submission file
  • thumbnail_url: str the url to the submission thumbnail
  • user_icon_url: str the url to the user icon
  • submission_page: bs4.BeautifulSoup the submission page used to parse the object fields

* these are extracted exactly as they appear on the submission page

Submission objects can be directly cast to a dict object or iterated through.

Init

__init__(submission_page: bs4.BeautifulSoup = None)

The object can be initialised with an optional bs4.BeautifulSoup object containing the parsed HTML of a submission page.

If no submission_page is passed then the object fields will remain at their default - empty - value.

Methods

  • parse(submission_page: bs4.BeautifulSoup = None)
    Parses the stored submission page for metadata. If submission_page is passed, it overwrites the existing submission_page value.
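
Because both init and parse() accept a pre-parsed page, a Submission can also be built from HTML saved earlier, without any network request. A sketch, assuming a locally saved submission page and that Submission is importable from the package root:

import bs4
import faapi

# parse a previously saved submission page (placeholder file name)
with open("12345678.html") as f:
    page = bs4.BeautifulSoup(f.read(), "html.parser")

sub = faapi.Submission(page)  # fields are parsed on init
print(sub.id, sub.title, sub.author)

sub.parse()  # re-parses the stored submission_page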

User

A small class that holds a user's full information.

  • name: str display name with capital letters and extra characters such as "_"
  • status: str user status (~, !, etc.)
  • profile: str profile text in HTML format
  • user_icon_url: str the url to the user icon
  • user_page: bs4.BeautifulSoup the user page used to parse the object fields

Init

__init__(user_page: bs4.BeautifulSoup = None)

The object can be initialised with an optional bs4.BeautifulSoup object containing the parsed HTML of a user page.

If no user_page is passed then the object fields will remain at their default - empty - value.

Methods

  • parse(user_page: bs4.BeautifulSoup = None)
    Parses the stored user page for metadata. If user_page is passed, it overwrites the existing user_page value.
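
A sketch tying User to the user_exists status codes documented above, with "user_name" as a placeholder:

import faapi

api = faapi.FAAPI()

# status codes as documented under user_exists
STATUS = {0: "okay", 1: "account disabled", 2: "system error",
          3: "unknown error", 4: "request error"}

code = api.user_exists("user_name")
if code == 0:
    user = api.get_user("user_name")
    print(user.status + user.name)  # e.g. "~user_name"
else:
    print(f"cannot fetch user: {STATUS[code]}")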

Contributing

All contributions and suggestions are welcome!

The only requirement is that any merge request must be sent to the GitLab project, as the one on GitHub is only a mirror: GitLab/FALocalRepo

If you have suggestions for fixes or improvements, you can open an issue with your idea, see #Issues for details.

Issues

If any problem is encountered during usage of the program, an issue can be opened on the project's pages on GitLab (preferred) or GitHub (mirror repository).

Issues can also be used to suggest improvements and features.

When opening an issue for a problem, please copy the error message and describe the operation in progress when the error occurred.
