
Make ZIM file from TED Talks


ted2zim

Get the best :bulb: TED videos offline :arrow_down:

An offliner to create ZIM :package: files from TED talks


TED (Technology, Entertainment, Design) is a global set of conferences under the slogan "ideas worth spreading". They address a wide range of topics within the research and practice of science and culture, often through storytelling. The speakers are given a maximum of 18 minutes to present their ideas in the most innovative and engaging ways they can. One can easily find all the TED videos here.

This project aims to provide a sustainable way to make TED accessible offline, by creating ZIM files that present these videos in a manner similar to the online experience.

ted2zim adheres to openZIM's Contribution Guidelines.

ted2zim has implemented openZIM's Python bootstrap, conventions and policies v1.0.2.

Getting started :rocket:

Install the dependencies

Make sure that you have python3, unzip, ffmpeg, wget and curl installed on your system before running the scraper (otherwise you'll get a warning to install them).
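If you would rather check for these tools up front, the following is a minimal sketch that uses only Python's standard library. The function name `missing_tools` is hypothetical and is not part of ted2zim; the tool list is taken from the paragraph above.

```python
import shutil

# Tools listed as prerequisites in this README.
REQUIRED_TOOLS = ["python3", "unzip", "ffmpeg", "wget", "curl"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result means every listed executable was found on PATH; it does not verify versions.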

Set up the package

One can easily install the PyPI version, but let's set up the source version.

First, clone this repository.

If you do not already have it on your system, install hatch to build the software and manage virtual environments (you might be interested in our detailed Developer Setup as well).

pip3 install hatch

Start a hatch shell: this will install software including dependencies in an isolated virtual environment.

hatch shell

That's it. You can now run ted2zim from your terminal:

ted2zim --topics [TOPICS] --name [NAME]

For the full list of arguments, see this file or run the following:

ted2zim --help

Example usage

Sample of creating a ZIM for the augmented reality topic, with a custom name, mp4 format and so on:

ted2zim --topics="augmented reality" --debug --name="augmented_reality" --format=mp4 --title="Augmented Reality" --description="TED videos in AR category" --creator="TED" --publisher="openzim" --output="output" --keep --low-quality

Sample of creating a ZIM for specific URLs, with a custom name, mp4 format and so on:

ted2zim --links https://www.ted.com/talks/gautam_shah_can_the_metaverse_bring_us_closer_to_wildlife,https://www.ted.com/talks/micaela_mantegna_how_to_stop_the_metaverse_from_becoming_the_internet_s_bad_sequel --debug --name="sample_links" --format=mp4 --title="Sample Links" --description="TED talks from two different URLs" --creator="TED" --publisher="openzim" --output="output" --keep --low-quality
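When the list of talk URLs comes from a script, it can be easier to assemble the command programmatically. The sketch below only builds the argument list; it uses flags that appear in the examples above, and the helper name `build_ted2zim_command` is hypothetical rather than part of ted2zim.

```python
import shlex


def build_ted2zim_command(links, name, title, description, output="output"):
    """Assemble a ted2zim argument list for a set of talk URLs."""
    return [
        "ted2zim",
        "--links", ",".join(links),  # URLs are passed as one comma-separated value
        f"--name={name}",
        "--format=mp4",
        f"--title={title}",
        f"--description={description}",
        f"--output={output}",
    ]


cmd = build_ted2zim_command(
    links=[
        "https://www.ted.com/talks/gautam_shah_can_the_metaverse_bring_us_closer_to_wildlife",
        "https://www.ted.com/talks/micaela_mantegna_how_to_stop_the_metaverse_from_becoming_the_internet_s_bad_sequel",
    ],
    name="sample_links",
    title="Sample Links",
    description="TED talks from two different URLs",
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)` from an environment where ted2zim is installed.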

This project can also be run with Docker, using either the provided Dockerfile or the pre-built images. See steps here.

Features :robot:

You can create ZIMs for multiple topics (topic names should match those given here), choose between different video formats (webm/mp4) and different compression rates, and even use an S3-based cache.

Want more flexibility? There's a multitool

ted2zim-multi is an extra command that allows you to do much more with the scraper. It falls back to ted2zim if normal commands are passed. It supports creating multiple ZIMs with a single command for both playlists and topics, and can even read metadata from a specified JSON file. It supports the following extra arguments:

  • --indiv-zims - Allows you to create one ZIM per topic or one ZIM per playlist.
  • --{name|description|zim-file|title}-format - Allows you to set a custom format for the equivalent ted2zim arguments. You can add {identity} as a placeholder in these values to get the playlist ID / topic name in its place (spaces replaced by -). You can also add {slug} to get the topic/playlist slug.
  • --metadata-from - Path to a JSON file containing the metadata.

The metadata JSON file should be of the following format:

{
    "<playlist-id/topic-name-with-underscores>": {
        "name": "sample_name_{identity}",
        "description": "Sample description",
        "title": "Custom title",
        "zim-file": "sample.zim",
        "tags": "tag",
        "creator": "Yourself",
        "build-dir": "/custom_build_dir"
    }
}
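A file of this shape can be generated rather than written by hand. The following is a minimal sketch under stated assumptions: it assumes topic keys use underscores in place of spaces (following the `<playlist-id/topic-name-with-underscores>` hint), it fills in only a subset of the fields shown above, and the helper names are hypothetical rather than part of ted2zim.

```python
import json
import tempfile
from pathlib import Path


def metadata_key(topic):
    # Assumption: spaces in a topic name become underscores in the JSON key.
    return topic.replace(" ", "_")


def build_metadata(topics):
    """Build one metadata entry per topic, using fields shown in this README."""
    return {
        metadata_key(topic): {
            "name": "ted_{identity}",
            "title": f"TED: {topic.title()}",
            "description": f"TED talks about {topic}",
        }
        for topic in topics
    }


def write_metadata(topics, path):
    """Write the metadata for `topics` to `path` as JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(build_metadata(topics), indent=4), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = write_metadata(["augmented reality"], Path(tmp) / "metadata.json")
        print(out.read_text(encoding="utf-8"))
```

The resulting file could then be passed to ted2zim-multi through --metadata-from.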

See ted2zim-multi --help for details.
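To make the {identity} and {slug} placeholders described earlier more concrete, here is a small illustration of how such a format string might expand. It is a sketch of the documented behavior (spaces in the identity replaced by -), not ted2zim-multi's actual code; the function name `expand_placeholders` and the slug value are hypothetical.

```python
def expand_placeholders(template, identity, slug):
    """Fill {identity} and {slug} in a format string."""
    # Per the option description, spaces in the identity become hyphens.
    return template.format(identity=identity.replace(" ", "-"), slug=slug)


# The slug passed here is an illustrative value supplied by the caller.
print(expand_placeholders("ted_{identity}", "augmented reality", "example-slug"))
```

With these inputs, the name format "ted_{identity}" expands to "ted_augmented-reality".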

License :book:

GPLv3 or later, see LICENSE for more details.
