Make ZIM file from TED Talks

Project description

ted2zim

Get the best :bulb: TED videos offline :arrow_down:

An offliner to create ZIM :package: files from TED talks

TED (Technology, Entertainment, Design) is a global set of conferences under the slogan "ideas worth spreading". They address a wide range of topics within the research and practice of science and culture, often through storytelling. The speakers are given a maximum of 18 minutes to present their ideas in the most innovative and engaging ways they can. One can easily find all the TED videos here.

This project aims to provide a sustainable way to make TED accessible offline by creating ZIM files that present these videos much as they appear online.

Getting started :rocket:

Install the dependencies

Make sure that you have python3, unzip, ffmpeg, wget and curl installed on your system before running the scraper (otherwise you'll get a warning to install them).
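If you would like to check for these tools yourself before running the scraper, a small preflight script is one option. The sketch below is not part of ted2zim and does not reproduce its own warning. It only looks up each tool named above on PATH with the standard library's shutil.which. The names REQUIRED_TOOLS and find_missing are made up for this example.

```python
import shutil

# Tools listed in the paragraph above.
REQUIRED_TOOLS = ["python3", "unzip", "ffmpeg", "wget", "curl"]


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

A tool being present on PATH says nothing about its version, so treat a clean result as a first check only.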

Set up the package

One can easily install the PyPI version, but let's set up the source version. First, clone this repository and install the package as shown below.

pip3 install -r requirements.txt
python3 setup.py install

That's it. You can now run ted2zim from your terminal:

ted2zim --topics [TOPICS] --name [NAME]

For the full list of arguments, see this file or run the following:

ted2zim --help

Example usage

ted2zim --topics="augmented reality" --max-videos-per-topic=10 --debug --name="augmented_reality" --format=mp4 --title="Augmented Reality" --description="TED videos in AR category" --creator="TED" --publisher="openzim" --output="output" --keep --low-quality
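If you drive the scraper from a Python script rather than typing the command, building the argument list explicitly avoids shell quoting problems with topics that contain spaces. The following is a minimal sketch that uses only flags shown in the example above. The helper build_command and its defaults are hypothetical and not part of ted2zim.

```python
import shlex


def build_command(topic, name, max_videos=10, video_format="mp4", output="output"):
    """Assemble a ted2zim invocation as an argument list (hypothetical helper)."""
    return [
        "ted2zim",
        f"--topics={topic}",
        f"--max-videos-per-topic={max_videos}",
        f"--name={name}",
        f"--format={video_format}",
        f"--output={output}",
    ]


command = build_command("augmented reality", "augmented_reality")

# Show the shell-quoted form for copy and paste.
print(shlex.join(command))

# To actually run it, pass the list to subprocess without a shell:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing a list to subprocess.run keeps each argument intact, so no extra quoting is needed around the topic.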

This project can also be run with Docker using the provided Dockerfile. See steps here.

Features :robot:

You can create ZIMs for multiple topics (topic names should match those given here), choose between different video formats (webm/mp4) and different compression rates, and even use an S3-based cache.

Want more flexibility? There's a multitool

ted2zim-multi is an extra command that lets you do much more with the scraper. It falls back to ted2zim if normal commands are passed. It supports creating multiple ZIMs with a single command for both playlists and topics, and can read metadata from a specified JSON file. It supports the following extra arguments:

  • --indiv-zims - Creates one ZIM per topic or one ZIM per playlist.
  • --{name|description|zim-file|title}-format - Sets a custom format for the equivalent ted2zim arguments. Use {identity} as a placeholder in these values to insert the playlist ID or topic name (with spaces replaced by -), or {slug} to insert the topic/playlist slug.
  • --metadata-from - Path to a JSON file containing the metadata.

The metadata file should have the following format:

{
    "<playlist-id/topic-name-with-underscores>": {
        "name": "sample_name_{identity}",
        "description": "Sample description",
        "title": "Custom title",
        "zim-file": "sample.zim",
        "tags": "tag",
        "creator": "Yourself",
        "build-dir": "/custom_build_dir"
    }
}
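A metadata file in this shape can also be generated from a script. The sketch below is only an illustration. It assumes that top-level keys are topic names with spaces replaced by underscores (as the placeholder key above suggests) and it restricts fields to the ones shown in the sample. Whether ted2zim-multi accepts other fields is not stated here. The names KNOWN_FIELDS and build_metadata are hypothetical.

```python
import json

# Fields that appear in the sample above; the real tool may accept others.
KNOWN_FIELDS = {
    "name", "description", "title", "zim-file", "tags", "creator", "build-dir",
}


def build_metadata(entries):
    """Return a dict shaped like the sample metadata file (hypothetical helper).

    `entries` maps a playlist ID or topic name to its per-ZIM fields.
    """
    metadata = {}
    for identity, fields in entries.items():
        unknown = set(fields) - KNOWN_FIELDS
        if unknown:
            raise ValueError(f"unexpected fields for {identity!r}: {sorted(unknown)}")
        # Assumption: topic names are keyed with underscores instead of spaces.
        metadata[identity.replace(" ", "_")] = dict(fields)
    return metadata


metadata = build_metadata({
    "augmented reality": {
        "name": "sample_name_{identity}",
        "title": "Custom title",
        "zim-file": "sample.zim",
    },
})

# Serialise with the same indentation as the sample; write this to a file
# and pass its path to --metadata-from.
print(json.dumps(metadata, indent=4))
```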

See ted2zim-multi --help for details.
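To make the {identity} and {slug} placeholders from the option list above concrete, here is a rough sketch of how such a template could be expanded. It mirrors only the behaviour described in this README (spaces in the identity replaced by -) and is not ted2zim-multi's actual code. The function expand_format is hypothetical, and the slug is passed in because this README does not say how it is derived.

```python
def expand_format(template, identity, slug=None):
    """Fill {identity} and, when given, {slug} in a format template.

    Hypothetical helper: spaces in the identity become "-", as described
    for the --*-format options.
    """
    values = {"identity": identity.replace(" ", "-")}
    if slug is not None:
        values["slug"] = slug
    # Raises KeyError if the template uses {slug} but no slug was supplied.
    return template.format(**values)


print(expand_format("ted_{identity}", "augmented reality"))
# prints: ted_augmented-reality
```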

License :book:

GPLv3 or later, see LICENSE for more details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ted2zim-2.0.5.tar.gz (3.5 MB)

Uploaded Source

Built Distribution

ted2zim-2.0.5-py3-none-any.whl (3.6 MB)

Uploaded Python 3

File details

Details for the file ted2zim-2.0.5.tar.gz.

File metadata

  • Download URL: ted2zim-2.0.5.tar.gz
  • Upload date:
  • Size: 3.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for ted2zim-2.0.5.tar.gz
Algorithm Hash digest
SHA256 0178528d3298a15b8a0af83a87a78d71325fb341d86c7ff5714a6dcefc734bd0
MD5 4aef2f867d31e89d38e9b9372e2a7821
BLAKE2b-256 84febb578717e18ea97b6559edcb4a40d219ee5f2aff35cb903ea6d461978812

See more details on using hashes here.

File details

Details for the file ted2zim-2.0.5-py3-none-any.whl.

File metadata

  • Download URL: ted2zim-2.0.5-py3-none-any.whl
  • Upload date:
  • Size: 3.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for ted2zim-2.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 10dab464297bf24c2352f8fb65082b3b8f2f5152aefe1d5c90e4d9fcf29bfe9d
MD5 85c17d8a905519d87b9c88efd9c8eb90
BLAKE2b-256 be1f9aca6bf64c929dfadb713c604d14213f6ae6481be0998dcf5c5885c7e480

See more details on using hashes here.
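The SHA256 digests listed above can be checked against a downloaded file with Python's standard hashlib module. This is a minimal sketch. The function names are made up for this example, and the file path in the usage note is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches("ted2zim-2.0.5.tar.gz", "0178528d3298a15b8a0af83a87a78d71325fb341d86c7ff5714a6dcefc734bd0") should return True for an intact copy of the source distribution saved under that name.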
