urlfix: Check and Fix Outdated URLs

urlfix aims to find all outdated URLs in a given file and fix them.

Features List

  • Command-line and programmer-friendly modes.

  • Replace outdated URLs/links in a single file

  • Replace outdated URLs/links in a directory

  • Replace outdated URLs/links in place, i.e. in the same file or in the same files in a directory.

  • Replace outdated links in files in nested directories

  • Replace outdated links in files in sub-nested directories

Supported file formats

urlfix fixes URLs given a file of the following types:

  • Markdown (.md)

  • Plain Text files (.txt)

  • R Markdown (.rmd)

  • reStructuredText (.rst)

  • PDF (.pdf)

  • Word (.docx)

  • ODF (.odf)

Installation

The simplest way to install the latest release is as follows:

pip install urlfix

To install the development version:

Open the Terminal/CMD/Git bash/shell and enter

pip install git+https://github.com/Nelson-Gon/urlfix.git

# or for the less stable dev version
pip install git+https://github.com/Nelson-Gon/urlfix.git@dev

Otherwise:

# clone the repo
git clone git@github.com:Nelson-Gon/urlfix.git
cd urlfix
python3 setup.py install

Sample usage

Script Mode

To use urlfix at the command line, run:

python -m urlfix --mode "f" --verbose 1 --inplace 1 --inpath myfile.md

If not replacing within the same file, then:

python -m urlfix --mode "f" --verbose 1 --inplace 0 --inpath myfile.md --output-file myoutputfile.md
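
Once a run like the one above has written to a separate output file, the changes can be reviewed before switching to in-place replacement. The following is a minimal sketch that uses only the standard library; preview_changes is a hypothetical helper and is not part of urlfix.

```python
import difflib
from pathlib import Path


def preview_changes(original_path, replaced_path):
    """Return a unified diff between the original file and the output file."""
    original = Path(original_path).read_text(encoding="utf-8").splitlines(keepends=True)
    replaced = Path(replaced_path).read_text(encoding="utf-8").splitlines(keepends=True)
    diff = difflib.unified_diff(
        original,
        replaced,
        fromfile=str(original_path),
        tofile=str(replaced_path),
    )
    # An empty string means the two files are identical.
    return "".join(diff)


# Example (file names taken from the commands above):
# print(preview_changes("myfile.md", "myoutputfile.md"))
```

Lines starting with - show the old URLs and lines starting with + show their replacements.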

To get help:

python -m urlfix -h 

#usage: main.py [-h] -m MODE -in INPUT_FILE [-o OUTPUT_FILE] -v {False,false,0,True,true,1} -i {False,false,0,True,true,1}
#
#optional arguments:
#  -h, --help            show this help message and exit
#  -m MODE, --mode MODE  Mode to use. One of f for file or d for directory
#  -in INPUT_FILE, --input-file INPUT_FILE
#                        Input file for which link updates are required.
#  -o OUTPUT_FILE, --output-file OUTPUT_FILE
#                        Output file to write to. Optional, only necessary if not replacing inplace
#  -v {False,false,0,True,true,1}, --verbose {False,false,0,True,true,1}
#                        String to control verbosity. Defaults to True.
#  -i {False,false,0,True,true,1}, --inplace {False,false,0,True,true,1}
#                        Should links be replaced inplace? This should be safe but to be sure, test with an output file first.
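
The verbose and inplace options accept strings rather than real booleans. The sketch below shows one way such options could be declared and converted with argparse. It is reconstructed only from the help text above and is not urlfix's actual parser; build_parser and to_bool are hypothetical names, and the option names in an installed version may differ.

```python
import argparse

# The six strings the help text lists for --verbose and --inplace.
BOOL_CHOICES = ["False", "false", "0", "True", "true", "1"]


def to_bool(value):
    """Convert one of BOOL_CHOICES to a Python bool."""
    return value in ("True", "true", "1")


def build_parser():
    """Build a parser with the options named in the help text."""
    parser = argparse.ArgumentParser(prog="urlfix")
    parser.add_argument("-m", "--mode", required=True,
                        help="f for file or d for directory")
    parser.add_argument("-in", "--input-file", required=True,
                        help="input file for which link updates are required")
    parser.add_argument("-o", "--output-file",
                        help="only needed when not replacing inplace")
    parser.add_argument("-v", "--verbose", required=True, choices=BOOL_CHOICES)
    parser.add_argument("-i", "--inplace", required=True, choices=BOOL_CHOICES)
    return parser


# Example:
# args = build_parser().parse_args(["-m", "f", "-in", "myfile.md", "-v", "1", "-i", "0"])
# to_bool(args.inplace) is then False, so an output file would be expected.
```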

Programmer-Friendly Mode

from urlfix.urlfix import URLFix
from urlfix.dirurlfix import DirURLFix

Create an object of class URLFix

urlfix_object = URLFix("testfiles/testurls.txt", output_file="replacement.txt")

Replacing URLs

After creating our object, we can replace outdated URLs as follows:

urlfix_object.replace_urls(verbose=1)

Apart from verbose, the above uses default arguments and will not replace a file in place. This is a safety mechanism that protects the original file from damage.

Since we set verbose to 1 (True), the call prints messages about the URLs it processes.

To replace silently, simply set verbose to False (which is the default).

urlfix_object.replace_urls()

If there are URLs known to be valid, pass these to the correct_urls argument to save some time.

urlfix_object.replace_urls(correct_urls=[urls_here]) # Use a Sequence eg tuple, list, etc
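
To illustrate why a list of known-valid URLs can save time, here is a minimal sketch of the general find-and-replace idea. It is an assumption about the approach, not urlfix's implementation: replace_outdated, URL_PATTERN and the resolve callable are hypothetical, and a real resolve would typically make a network request (for example, following redirects) for every URL it is given.

```python
import re

# Deliberately simple pattern: a URL ends at whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def replace_outdated(text, resolve, correct_urls=()):
    """Return text with each URL swapped for whatever resolve(url) returns.

    resolve is a caller-supplied callable mapping a URL to its current address.
    URLs listed in correct_urls are trusted and never passed to resolve.
    """
    known_good = set(correct_urls)

    def swap(match):
        url = match.group(0)
        if url in known_good:
            return url  # skip the (potentially slow) lookup
        return resolve(url)

    return URL_PATTERN.sub(swap, text)
```

Every URL in correct_urls is one lookup that never happens, which is where the time saving would come from in a design like this.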

Replacing several files in a directory

To replace several files in a directory, we can use DirURLFix as follows.

  • Instantiate an object of class DirURLFix

replace_in_dir = DirURLFix("path_to_dir")

  • Call replace_urls

replace_in_dir.replace_urls()

Recursively replacing links in nested directories

To replace outdated links in several files located in several directories, we set recursive to True. Currently, replacing links in directories nested within nested directories is not (yet) supported.

recursive_object = DirURLFix("path_to_root_directory", recursive=True)

We can then proceed as above

recursive_object.replace_urls() # provide other arguments as you may wish. 
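
To make the nesting limit described in this section concrete, the sketch below lists the files that would be candidates when only the root directory and its immediate subdirectories are searched. It is an illustration under that assumption, not urlfix's traversal code; candidate_files and SUPPORTED are hypothetical names, with the extensions copied from the supported file formats list.

```python
from pathlib import Path

# Extensions copied from the supported file formats list.
SUPPORTED = {".md", ".txt", ".rmd", ".rst", ".pdf", ".docx", ".odf"}


def candidate_files(root, recursive=False):
    """List supported files in root and, if recursive, in its immediate subdirectories."""
    root = Path(root)
    # "*" matches entries directly in root; "*/*" matches entries one level down.
    patterns = ["*"] + (["*/*"] if recursive else [])
    found = []
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file() and path.suffix.lower() in SUPPORTED:
                found.append(path)
    return sorted(found)


# Example:
# for path in candidate_files("path_to_root_directory", recursive=True):
#     print(path)
```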

To report issues or suggest improvements, please open an issue on the project's GitHub repository.

If you would like to cite this work, please use:

Nelson Gonzabato (2021) urlfix: Check and Fix Outdated URLs https://github.com/Nelson-Gon/urlfix

Thank you very much.

“Before software can be reusable it first has to be usable.” – Ralph Johnson
