An asynchronous file archival program

Ripandtear

Ripandtear is intended to make it easy to save and update content uploaded by content creators online. What makes Ripandtear unique is that it stores all information about a content creator in a single .rat file (don't worry, a .rat is just a .json file with a different extension). The usernames they use across different websites, the URLs that have already been downloaded, and even file names with their MD5 hashes (used to remove duplicates) are all stored in the .rat, which makes it convenient to add new information from the command line. You can store all the information about a content creator, archive their content, remove duplicates and sort the files all with one command!

By using the .rat file you eliminate the need to re-download the same content and avoid creating duplicate files. All previously downloaded URLs are tracked in the .rat file. If a URL has already been downloaded, it is skipped to save you time/data/bandwidth and to make fewer requests to servers. This is also convenient if you share a folder that has a .rat file with someone else. They can pick up where you left off without having to download the same content all over again!
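The skip logic described above can be pictured with a short sketch. This is not Ripandtear's actual code: the key name `urls_downloaded` and both function names are hypothetical, chosen only to illustrate how a JSON file of known URLs lets a downloader skip work it has already done.

```python
import json
from pathlib import Path


def load_downloaded_urls(rat_path: Path) -> set[str]:
    """Return the URLs recorded as already downloaded (empty set if no file)."""
    if not rat_path.exists():
        return set()
    data = json.loads(rat_path.read_text(encoding="utf-8"))
    # "urls_downloaded" is a hypothetical key name, used for illustration only.
    return set(data.get("urls_downloaded", []))


def filter_new_urls(candidates: list[str], already_downloaded: set[str]) -> list[str]:
    """Keep only URLs that have not been downloaded before, preserving order."""
    return [url for url in candidates if url not in already_downloaded]
```

Because the record lives in a plain file next to the content, anyone who receives the folder also receives the list of URLs to skip.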

Installation

Requires Python 3.10

(Linux/Mac)

pip install ripandtear

(Windows)

py -m pip install ripandtear

Note for Windows users - If you are installing Python for the first time we recommend you download it from the official Python website. Make sure you select "Add Python <version number> to PATH" during the installation.

Additional Downloads

After installing ripandtear, run the following from your command line:

playwright install

Playwright is a web development tool that opens a browser in the background. Ripandtear uses it on certain websites to interact with JavaScript in order to find and download links.

Updating

If you already have ripandtear installed, you can update it with the following command:

(Linux/Mac)

pip install --upgrade ripandtear

(Windows)

py -m pip install --upgrade ripandtear

Windows Errors

libmagic missing

If you are using Windows and getting an error about libmagic being missing, try using the following command to fix:

py -m pip install python-magic-bin

More details can be found here

UnicodeEncodeError

Certain emojis cause ripandtear to crash on Windows. It might be a configuration problem with the Windows console you are using. This seems like it could be a possible fix.

Current Supported Sites

Bunkr.su

Coomer.party

Cyberdrop.me (The website has many problems so you may have to use the -se flag often. It still might not work even doing that)

Gfycat.com (only direct links and image pages. Not profiles)

Gofile.io (can take passwords if separated from the url by a ~. https://gofile.io/d/Qjakte~yiAjdwl)

Imgur.com (since the ban on NSFW content, albums and galleries no longer work; only direct media URLs such as i.imgur.com/asdf123.png are supported)

Jpg.fish|fishing|li|church|pet

Pixl.li

Reddit.com (supports downloading uploads from user pages, subreddits and individual threads. No comments. Just pictures, submitted text and videos)

Redgifs.com

Tiktits.com

Store Names

Tell Ripandtear to save names under the categories below using the listed flags. To print the stored names to the screen, put a 'p' in front of the flag. If syncing is available, put an 's' in front of the flag to download all new content from the site. You cannot print and sync in the same flag (for example, -psr does not work).

chaturbate - -cb, -pcb (to print)

fansly - -f, -pf (to print)

generic - -g, -pg (to print)

instagram - -i, -pi (to print)

myfreecams - -mfc, -pmfc (to print)

onlyfans - -o, -po (to print)

patreon - -P, -pP (to print)

pornhub - -p, -pp (to print)

reddit - -r, -pr (to print), -sr (to sync)

redgifs - -R, -pR (to print), -sR (to sync)

tiktits - -tt, -ptt (to print), -stt (to sync)

tiktok - -T, -pT (to print)

tumblr - -tum, -ptum (to print)

twitch - -twitch, -ptwitch (to print)

twitter - -t, -pt (to print)

youtube - -y, -py (to print)

Store Links

Store links to the websites below with the following flags

coomer.party - -c, -pc (to print), -sc (to sync)

manyvids - -mv, -pmv (to print)

simpcity - -s, -ps (to print)

Save Urls

Save urls to download and already downloaded urls with the following flags

add url to download - -u, -pu (to print), -eu (to erase stored urls)

add already downloaded url - -U, -pU (to print)

Save Tags

Add tags to the .rat file

add tag - -tags, -ptags (to print)

Examples

Download a link

Ripandtear has extractors to download content from many different content providers (see Supported Sites). The simplest use case is to download content from a supported url

ripandtear -d 'https://www.supportedsite.com/<content-id>|https://www.differentsite.io/<content-id>.jpg'

ripandtear - the name of the program

-d - -d stands for download. If Ripandtear recognizes the url it will download all the content it can find into the current directory. You can download multiple links at once by separating them with a pipe (|) or by adding multiple -d flags to the same command

Creating a new user

Run Ripandtear with the following flags:

ripandtear -mk 'Username' -r 'random_reddit_username' -R 'redgifs_username' -sr -sR -H -S

With this one command you have created a new directory, recorded a reddit and redgifs username in a .rat file, downloaded all the content from each website, found the hashes of the files to remove duplicates and sorted the files into distinct directories based on their type.

Here is the result of the command:

Username (<- the directory)
├── Username.rat
├── pics
│   └── reddit-2022-01-01-example_pic.png
└── vids
    └── redgifs-2023-04-12-example_vid.mp4

Inside the directory you created you now have a .rat file with the same name as the directory (the .rat must have the same name as the directory. Don't worry, it is created automatically). The .rat now contains the usernames you set for reddit and redgifs. Ripandtear has also downloaded all the content from reddit and redgifs, hashed the files to remove duplicates, sorted the files into folders according to their content type and recorded all the information in the .rat

Let's walk through the previous command to explain what each flag does:

ripandtear - the name of the program

-mk Username - Creates a directory with the name of 'Username', then moves into that directory before executing the following commands. There is no problem if the directory already exists

-r 'random_reddit_username' - the -r flag adds the reddit username to the .rat file.

-R 'redgifs_username' - the -R flag adds the redgifs username to the .rat file.

-sr - -sr stands for "Sync Reddit". Running this command will look up all the Reddit names saved in the .rat file, then download all content that has not been downloaded yet from the user's Reddit profile

-sR - -sR stands for "Sync Redgifs". Running this command will look up all the Redgifs names saved in the .rat file, then download all content that has not been downloaded yet from the user's Redgifs profile

-H - -H stands for "Hash files". This command hashes all the files in the current directory and removes duplicates. If it is run in a directory with a .rat file, it looks in the .rat file to see if newly downloaded content matches previously downloaded content. If there is a match, Ripandtear deletes the file with the shorter filename and keeps the file with the longer filename. If two files have the same hash and filenames of identical length but different names, the older file is deleted and the newer file is kept.

-S - -S stands for "Sort files". It sorts the files in the current directory into pics, vids, audio or text directories depending on the file type. If a file does not fit in any of those folders it is kept in the current directory
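The duplicate rule described for -H can be sketched as follows. This is a minimal illustration, not Ripandtear's implementation: the function names are hypothetical, "older" and "newer" are assumed to mean file modification time, and the sketch only compares files on disk rather than consulting a .rat file.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Hash a file in chunks so large videos are not read into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pick_keeper(a: Path, b: Path) -> Path:
    """Of two files with identical hashes, return the one to keep."""
    if len(a.name) != len(b.name):
        # Different lengths: the longer filename wins.
        return a if len(a.name) > len(b.name) else b
    # Same length: keep the newer file (assumption: newer by modification time).
    return a if a.stat().st_mtime >= b.stat().st_mtime else b


def remove_duplicates(directory: Path) -> list[Path]:
    """Delete duplicate files in `directory` and return the paths removed."""
    keepers: dict[str, Path] = {}
    removed: list[Path] = []
    for path in sorted(p for p in directory.iterdir() if p.is_file()):
        file_hash = md5_of(path)
        current = keepers.get(file_hash)
        if current is None:
            keepers[file_hash] = path
            continue
        keeper = pick_keeper(current, path)
        loser = path if keeper == current else current
        loser.unlink()
        removed.append(loser)
        keepers[file_hash] = keeper
    return removed
```

Preferring the longer filename tends to keep the copy whose name carries the most metadata (site, date, post title).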

Adding another username

Suppose you find out that the same content creator uses several different reddit names and you want to add the new name you found. Doing that is extremely easy. All you have to do is run the following command:

ripandtear -r 'new_example_reddit_username|plus_another_username' -r 'even_another_name'

Ripandtear looks in the current directory for a .rat file. If a .rat doesn't exist it creates one (named after the current directory) and adds the names. If it finds a .rat it adds the new username(s) to the existing .rat file. You can add multiple names at once by separating them with a pipe (|) and/or by repeating the flag. Don't worry about accidentally adding the same name multiple times. Ripandtear makes sure that each name stored in the .rat is unique and that there are no duplicates. This approach applies to all username categories. See Store Names for the usernames you can store and the flags used to set them.
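The pipe-splitting and uniqueness behaviour can be sketched like this. The function name `merge_names` is hypothetical and the sketch only illustrates the idea; it is not taken from Ripandtear's source.

```python
def merge_names(existing: list[str], flag_values: list[str]) -> list[str]:
    """Merge names from one or more flag values into the stored list.

    Each flag value may hold several names separated by a pipe (|).
    Names already present are skipped, so the result has no duplicates.
    """
    merged = list(existing)
    for value in flag_values:
        for name in value.split("|"):
            name = name.strip()
            if name and name not in merged:
                merged.append(name)
    return merged
```

For the command above, `flag_values` would correspond to the two `-r` arguments: `["new_example_reddit_username|plus_another_username", "even_another_name"]`.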

Adding a website url to download later

Ripandtear has the ability to download content from many different hosting sites. See the Supported Sites section for compatible websites. If you want to add a url to the .rat file to be downloaded later you can do it with the following command.

ripandtear -u 'https://www.examplesite.com/<content-id>.jpg'

-u - -u adds a url to be downloaded later. It stores raw urls, so any url can be stored. Saved urls can be downloaded with Ripandtear (if Ripandtear supports the link) with -su (sync urls). Ripandtear looks at all the stored urls in the .rat. If a url matches an extractor Ripandtear has, it will attempt to download all content. If the url doesn't match, it is kept in the .rat for later, either to help you archive links you want to download manually, or for when Ripandtear adds an extractor that can download the content. Feel free to add whatever urls you want!

Syncing errors

Sometimes when downloading, things go wrong. A server could be down, you could be making so many requests that you were blocked, or maybe the content is temporarily unavailable. If an error occurs while attempting a download, Ripandtear records the url and the information relating to it and saves it for later. If you want to reattempt failed downloads you can run the command:

ripandtear -se

-se - -se stands for "Sync Errors". Ripandtear will look for any saved errors and attempt them again. If the url works, the error is removed. If there is another error, Ripandtear keeps the url saved to attempt later. After 5 attempts Ripandtear assumes the content cannot be downloaded and moves the bad url into the downloaded urls category within the .rat to prevent more attempts in the future. The errors saved in the .rat can be cleared with -ee (erase errors) and printed with -pe (print errors)
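The retry bookkeeping described for -se can be sketched as a small state update. The names below are hypothetical and the data layout (a dict of attempt counts plus a set of downloaded urls) is an assumption, as is counting every failed attempt toward the limit of 5 and recording a successful url as downloaded.

```python
MAX_ATTEMPTS = 5  # from the text: after 5 attempts the url is given up on


def record_attempt(
    errors: dict[str, int],
    downloaded: set[str],
    url: str,
    succeeded: bool,
) -> None:
    """Update the saved errors after one download attempt for `url`."""
    if succeeded:
        # The url worked: forget the error and remember the url as downloaded
        # (assumption, in line with downloaded urls being tracked).
        errors.pop(url, None)
        downloaded.add(url)
        return
    errors[url] = errors.get(url, 0) + 1
    if errors[url] >= MAX_ATTEMPTS:
        # Give up: move the url to the downloaded set so it is skipped in future.
        del errors[url]
        downloaded.add(url)
```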

Sync All

If you want to update a user and get all of their new content you can run the following command

ripandtear -sa -HS

-sa - -sa stands for "Sync All". It syncs all supported sites (see Store Names) and stored urls (-su). It does not sync errors, though (-se)

-HS - This is a combination of -H and -S. The order of -H and -S does not matter when combining them

Add Already Downloaded Urls

If you are coming from another downloader, or have a list of already downloaded urls, you can add them to the .rat with the following command so Ripandtear will skip downloading the url again

ripandtear -U 'https://url.com/content.jpg'

-U - -U adds the url to the already downloaded urls section of the .rat file. This is the information Ripandtear checks before proceeding with a download. If a url to be downloaded is found within this category, it is skipped. If you are a more advanced Linux user you could run a command like this to move all the urls from a text file into the .rat

cat location/to/urls.txt | xargs -I {} ripandtear -U {}

Just remember that Ripandtear works off the current directory it is in. Make sure you are located where you want the .rat file to be saved/updated
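If you would rather not use xargs (for example on Windows), the same import can be sketched in Python. This is only an illustration that shells out to the `ripandtear -U` command shown above; the function names and the one-url-per-line file layout are assumptions.

```python
import subprocess
from pathlib import Path


def build_commands(url_file: Path) -> list[list[str]]:
    """Build one `ripandtear -U <url>` command per non-empty line in the file."""
    commands = []
    for line in url_file.read_text(encoding="utf-8").splitlines():
        url = line.strip()
        if url:
            commands.append(["ripandtear", "-U", url])
    return commands


def import_urls(url_file: Path, rat_directory: Path) -> None:
    """Run each command from inside the directory that holds the .rat file."""
    for command in build_commands(url_file):
        # cwd matters: Ripandtear works off the directory it is run in.
        subprocess.run(command, cwd=rat_directory, check=True)
```

Calling `import_urls(Path("location/to/urls.txt"), Path("Username"))` would run the commands inside the Username directory.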

Logging to the screen

ripandtear -l 2 -SH

By default Ripandtear tries to show as little information as possible to keep the downloading experience minimal. Sometimes you might want more information to make sure the program is working, or in case you need to post information online to help with troubleshooting. You can do this by adding the -l 2 flag to the command you are executing. There are five levels (1-5). 2 is the recommended level for checking that Ripandtear is working and seeing what is happening. 1 is for debugging and will print a LOT of information that won't be useful if you are doing multiple downloads at once, but it exists for extreme situations where you need help troubleshooting a single url

FAQ

Ripandtear seems to be hanging/has long pauses while downloading. Has it frozen?

Ripandtear tries to keep the information printed to the screen as minimal as possible. While Ripandtear is trying to find all the downloadable links, it won't print anything to the screen. If Ripandtear is finding a lot of content, this is probably the reason it appears to "hang" or seem frozen. In most cases it is just doing a lot of work in the background, so just let it run and it should be fine. If you are really worried that something might be wrong, run it with the -l 2 flag. This will print logging information to the screen so you can see what is going on behind the scenes. Some websites have specific quirks, so if you notice more problems on specific sites check out the Website Quirks section for potential causes on that site

WTF?! I ran Ripandtear, it paused and then just shut down without downloading or printing anything

If you are syncing a user's profile, Ripandtear only downloads content that has not been downloaded yet. If the user has not uploaded anything new since the last time you did a sync, then Ripandtear won't download anything. Nothing being printed to the screen just means it didn't find any new content. The other situation where this might happen is when the link you entered is not supported or there was an error while downloading. You can run the same command with -l 2 to log information to the screen and gather more information for troubleshooting.

Website Quirks

Every website is a unique and beautiful website. Because of this they might have their own little oddities and special behavior. Here are some potential symptoms you might notice and what is going on behind the scenes

Bunkr

Bunkr file format:

<content_name>.<extension>

To be kinder on Bunkr's servers, Ripandtear's Bunkr extractor incorporates random delays. Finding links will be slower compared to other sites and Ripandtear limits itself to download only 2 files at once. All of this is done to prevent 429 errors (Too Many Requests).
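The general technique (a cap on simultaneous downloads plus a random pause before each request) can be sketched with asyncio. This is not the Bunkr extractor itself: the function names, the `fetch` callback and the delay range are all assumptions made for illustration.

```python
import asyncio
import random
from typing import Awaitable, Callable

MAX_CONCURRENT = 2  # from the text: only 2 files are downloaded at once


async def polite_fetch(
    url: str,
    semaphore: asyncio.Semaphore,
    fetch: Callable[[str], Awaitable[str]],
    max_delay: float,
) -> str:
    """Wait for a free slot, pause for a random moment, then fetch the url."""
    async with semaphore:
        # Random delay so requests do not hit the server in a regular burst.
        await asyncio.sleep(random.uniform(0.0, max_delay))
        return await fetch(url)


async def fetch_all(
    urls: list[str],
    fetch: Callable[[str], Awaitable[str]],
    max_delay: float = 1.0,
) -> list[str]:
    """Fetch every url with at most MAX_CONCURRENT requests in flight."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = (polite_fetch(url, semaphore, fetch, max_delay) for url in urls)
    return await asyncio.gather(*tasks)
```

The semaphore is what trades speed for politeness: extra urls simply wait their turn instead of triggering 429 responses.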

Coomer

Coomer file format:

coomer-<year>-<month>-<day>-<post-id>-<count>-<content-title>.<extension>

Currently limited to 2 simultaneous downloads to prevent 429 errors (Too Many Requests)

Cyberdrop

Cyberdrop file format:

<content_name>.<extension>

Gfycat

Only downloads direct video links and video pages. Not user profiles

Gfycat file format:

gfycat-<year>-<month>-<day>-<gfycat-id>-<reddit post title: if sent from reddit>.<extension>

Gofile

Gofile file format:

<content_name>.<extension>

You can add gofile urls with passwords via the -d or -u flags if you separate the url and password with a ~

Example: https://gofile.io/d/Qjakte~password

Imgur

Imgur file format:

imgur-<year>-<month>-<day>-<image hash>-<reddit post title: if sent from reddit>.<extension>

Jpg.fish|church

Jpg file format:

<content_id>.<extension>

Pixl

Pixl file format:

<content_id>.<extension>

Reddit

Reddit file format:

reddit-<year>-<month>-<day>-<post id>-<order: if applicable>-<file name>-<post title>.<extension>

The Reddit extractor also uses yt-dlp to download Reddit videos

Reddit subreddit file format:

<website file downloaded from>-<year>-<month>-<day>-<post id>-<post rank (at time of download)>-<file name>-<post title>.<extension>

Subreddits can be downloaded with the following url format(s):

https://www.reddit.com/r/gonewild/

https://www.reddit.com/r/gonewild/top/?sort=top&t=month

https://www.reddit.com/r/gonewild/controversial/?sort=controversial&t=day

https://www.reddit.com/r/gonewild/top/?sort=top&t=month&limit=50

Redgifs

Redgifs file format:

redgifs-<year>-<month>-<day>-<redgifs_id>-<reddit post title: if passed from reddit>.<extension>

Tiktits

Tiktits file format:

tiktits-<upload_date>-<url/file_name>.<extension>

Download files

Source Distribution

ripandtear-0.9.45.tar.gz (58.4 kB)

Built Distribution

ripandtear-0.9.45-py3-none-any.whl (64.1 kB)
