Python parser for Apache/nginx-style HTML directory listing.

import htmllistparse

# some_url is the URL of the directory listing page to fetch
cwd, listing = htmllistparse.fetch_listing(some_url, timeout=30)

# Alternatively, fetch the page and build a BeautifulSoup object yourself, then use
# cwd, listing = htmllistparse.parse(soup)

Here cwd is the current directory and listing is a list of FileEntry named tuples with these fields:

  • name: File name, str. Has a trailing / if it is a directory.

  • modified: Last modification time, time.struct_time or None. Timezone is not known.

  • size: File size, int or None. May be an estimate when the listing shows a rounded value with a unit prefix such as “K” or “M”.

  • description: File description, file type, or anything else found in the listing. str containing HTML, or None.
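Because modified, size and description may each be None, code that consumes a listing has to guard against missing values. The sketch below illustrates one way to do that. It defines a stand-in FileEntry with the four fields described above so it runs without network access; the helper names summarize and format_modified are hypothetical and not part of htmllistparse.

```python
import time
from collections import namedtuple

# Stand-in with the four fields described above (an assumption for this
# sketch); with the real library the tuples come from fetch_listing() or parse().
FileEntry = namedtuple("FileEntry", "name modified size description")


def summarize(listing):
    """Split a listing into directory names and file names, and total the known sizes."""
    dirs = [e.name for e in listing if e.name.endswith("/")]
    files = [e for e in listing if not e.name.endswith("/")]
    # size may be None, so only add up the entries that report one
    total = sum(e.size for e in files if e.size is not None)
    return dirs, [e.name for e in files], total


def format_modified(entry):
    """Render the modification time, or '-' when the listing omitted it."""
    if entry.modified is None:
        return "-"
    # The timezone is not known, so the value is printed as-is
    return time.strftime("%Y-%m-%d %H:%M", entry.modified)


sample = [
    FileEntry("docs/", None, None, None),
    FileEntry("a.txt", time.strptime("2020-01-02 03:04", "%Y-%m-%d %H:%M"), 2048, None),
    FileEntry("b.bin", None, None, "binary"),
]
dirs, files, total = summarize(sample)
```

With a real listing, passing the second value returned by htmllistparse.fetch_listing to summarize should work the same way, since only the documented field names are used.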

Supports:

  • Vanilla Apache/nginx/lighttpd/darkhttpd autoindex

  • Most <pre>-style indexes

  • Many other <table>-style indexes

  • <ul>-style indexes

ReHTTPFS

Reinvented HTTP Filesystem.

  • Mounts most HTTP file listings with FUSE.

  • Gets directory tree and file stats with less overhead.

  • Supports Range requests.

  • Supports Keep-Alive.
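To show why Range support matters for a mounted filesystem: a read of size bytes at a given offset can be served by requesting only that byte span instead of the whole file. The following is a minimal sketch of that mapping using standard HTTP Range semantics. The function names are hypothetical and this is not taken from the rehttpfs.py source.

```python
import re


def range_header(offset, size):
    """Build the Range header for reading `size` bytes starting at `offset`."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    # HTTP byte ranges are inclusive on both ends
    return {"Range": "bytes=%d-%d" % (offset, offset + size - 1)}


_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value):
    """Parse a Content-Range response header into (first, last, total)."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(match.group(1)), int(match.group(2))
    # The total length is '*' when the server does not know it
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total
```

A server that honors the request answers with status 206 and a Content-Range header. The total it reports is one way to learn a file's size without downloading the file.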

usage: rehttpfs.py [-h] [-o OPTIONS] [-t TIMEOUT] [-u USER_AGENT] [-v] [-d]
                   url mountpoint

Mount HTML directory listings.

positional arguments:
  url                   URL to mount
  mountpoint            filesystem mount point

optional arguments:
  -h, --help            show this help message and exit
  -o OPTIONS            comma separated FUSE options
  -t TIMEOUT, --timeout TIMEOUT
                        HTTP request timeout
  -u USER_AGENT, --user-agent USER_AGENT
                        HTTP User-Agent
  -v, --verbose         enable debug logging
  -d, --daemon          run in background
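For readers who want to see how these options map to values, here is a sketch of an argparse parser that accepts the same command line. It is reconstructed from the help text above, not copied from rehttpfs.py. The float type for the timeout and the absence of defaults are assumptions, and the URL and mount point are placeholders.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text (a reconstruction, not the original)."""
    parser = argparse.ArgumentParser(
        prog="rehttpfs.py", description="Mount HTML directory listings.")
    parser.add_argument("-o", dest="options", metavar="OPTIONS",
                        help="comma separated FUSE options")
    # type=float is an assumption; the help text does not state the unit or type
    parser.add_argument("-t", "--timeout", type=float, help="HTTP request timeout")
    parser.add_argument("-u", "--user-agent", help="HTTP User-Agent")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable debug logging")
    parser.add_argument("-d", "--daemon", action="store_true",
                        help="run in background")
    parser.add_argument("url", help="URL to mount")
    parser.add_argument("mountpoint", help="filesystem mount point")
    return parser


# Placeholder arguments for illustration only
args = build_parser().parse_args(
    ["-t", "10", "-o", "ro,allow_other", "-v", "http://example.com/pub/", "/mnt/http"])
# The -o value is a single comma separated string, so split it into options
fuse_options = args.options.split(",") if args.options else []
```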

Source Distribution

htmllistparse-0.5.2.tar.gz (9.3 kB view details)

File metadata

  • Download URL: htmllistparse-0.5.2.tar.gz
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.4.2 requests/2.21.0 setuptools/41.2.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.5

File hashes

Hashes for htmllistparse-0.5.2.tar.gz
Algorithm    Hash digest
SHA256       ed864e1778257bbbc1ed0a865855ab846da4bf2b54733df2244978ae76cb474b
MD5          46823583cc09ef4c8ab89612b0ae7d97
BLAKE2b-256  425340c7f5f9220cdbb4156dd7eca035d50bd964b9b05c65a73d8493819dd342
