A photo-scraping and photo-storage tool used for generating training sets
instadownloader
instadownloader is a Python library built on top of instaloader for scraping and storing Instagram photos to generate training datasets. It currently lets a user download "All Photos", the "Most Recent" photo, or the Nth-most recent photos, from both public and private profiles. For private profiles, login credentials must be provided.
The script saves the downloaded photos as .jpg files, along with one pickle file containing the raw .jpg files and another pickle file containing the photos converted to numpy arrays. Everything is stored in a folder named {username}_{today's date}, where today's date is in the format "MM_DD_YY".
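The folder naming convention above can be reproduced with the standard library, which is handy for locating a download afterwards. This is only a sketch: output_folder_name is a hypothetical helper written for illustration and is not part of instadownloader, and it assumes the date is zero-padded as "MM_DD_YY" suggests.

```python
from datetime import date


def output_folder_name(username: str, today: date) -> str:
    """Build '{username}_{MM_DD_YY}'.

    Hypothetical helper for illustration; not part of instadownloader.
    """
    # %m, %d and %y are the zero-padded month, day and two-digit year.
    return f"{username}_{today.strftime('%m_%d_%y')}"


# Example with a fixed date so the result is reproducible.
print(output_folder_name("some_user", date(2021, 4, 9)))  # some_user_04_09_21
```

Passing date.today() instead of a fixed date gives the folder name for a download made today.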
For more detailed documentation, see the src folder.
Installation
Install the package with pip:
pip install instadownloader
Usage
Here is a simple example. A tkinter GUI prompts the user to choose whether the profile is public or private, and to provide the account name, a password if the account is private, and the desired photo range.
Example:
import warnings

import instadownloader

if __name__ == '__main__':
    # filter warnings
    warnings.filterwarnings("ignore")

    # get the credentials from the tkinter GUI
    selector = instadownloader.TkinterSelector()
    profile_type = selector.profile_selector()
    username, password, photo_range = selector.photo_entry()

    # convert the photo range to an integer if downloading a range of photos
    try:
        photo_range = int(photo_range)
    except (TypeError, ValueError):
        print('Not an integer-based photo range')
        print('')

    # create the downloader (named so that it does not shadow the module)
    downloader = instadownloader.InstaDownloader(username, password, photo_range, profile_type)

    # start the session
    loader, profile = downloader.start_session()

    # download the images
    try:
        if photo_range == 'All':
            images = downloader.get_all_images()
        elif photo_range == 'Most Recent':
            images = downloader.get_most_recent_image()
        elif isinstance(photo_range, int):
            images = downloader.get_image()
    except Exception as e:
        print(e)

    # empty the directory
    downloader.empty_directory()

    # save pickle file
    downloader.save_pickle_file()
This example and others can be found in the examples folder.
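The example distinguishes three kinds of photo range: the strings 'All' and 'Most Recent', and an integer. The sketch below shows one way to validate that input before creating the downloader. parse_photo_range is a hypothetical helper, not part of instadownloader, and the rule that the integer must be at least 1 is an assumption.

```python
from typing import Union


def parse_photo_range(raw: str) -> Union[str, int]:
    """Return 'All', 'Most Recent', or a positive integer.

    Hypothetical helper for illustration; not part of instadownloader.
    Raises ValueError for anything else.
    """
    text = raw.strip()
    if text in ("All", "Most Recent"):
        return text
    try:
        value = int(text)
    except ValueError:
        raise ValueError(f"Unsupported photo range: {raw!r}") from None
    # Assumption: a range below 1 is not meaningful.
    if value < 1:
        raise ValueError(f"Photo range must be at least 1, got {value}")
    return value


print(parse_photo_range("Most Recent"))  # Most Recent
print(parse_photo_range(" 3 "))          # 3
```

Raising an error early avoids starting a session with a range that none of the three download branches would handle.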
Contributing
We are open to pull requests and look forward to expanding this library further. Please open an issue to discuss any changes or improvements.
To install instadownloader, along with the tools you need to develop and run tests, run the following in your virtualenv:
$ pip install -e .[dev]
License