Tools for parsing, extracting, reconciling, and unshortening URLs

Project description

A newslynx-opinionated collection of utilities for dealing with URLs.

Install

git clone http://github.com/newslynx/newslynx-urls.git
pip install -e newslynx-urls

Test

Requires nose:

nosetests

Usage

This module contains various functions that are used throughout newslynx-core, but the main ones are unshorten_url, is_article_url, and prepare_url:

from newslynx_url import (
  unshorten_url, is_article_url, prepare_url
)

# Resolve a shortened link to its final destination.
print(unshorten_url('bit.ly/1j3SrUC'))
# http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism/

# Check whether a URL looks like an article.
print(is_article_url(
  'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism'
))
# True

# pattern can be passed as a regex string...
print(is_article_url(
  'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism',
  pattern=r'.*towcenter\.org/blog/.*'
))
# True

# ...or as a compiled regex.
import re
pattern = re.compile(r'.*towcenter\.org/blog/.*')
print(is_article_url(
  'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism',
  pattern=pattern
))
# True

# Strip the query string and trailing slash from a URL.
print(prepare_url(
  'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism/?q=lfjad&f=lkfdjsal'
))
# http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism
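
To make the examples above more concrete, here is a minimal standard-library sketch of the two behaviours they suggest: dropping the query string and trailing slash, and matching a URL against a pattern given either as a string or as a compiled regex. This is not the package's implementation. The helper names strip_query_and_slash and matches_pattern are hypothetical, and the sketch assumes Python 3.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def strip_query_and_slash(url):
    """Hypothetical helper: drop the query string, fragment, and trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip('/')
    # Rebuild the URL with empty query and fragment components.
    return urlunsplit((parts.scheme, parts.netloc, path, '', ''))


def matches_pattern(url, pattern):
    """Hypothetical helper: accept a regex string or a compiled pattern."""
    if isinstance(pattern, str):
        pattern = re.compile(pattern)
    return pattern.match(url) is not None


url = ('http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller'
       '-to-study-the-impact-of-journalism/?q=lfjad&f=lkfdjsal')
print(strip_query_and_slash(url))
print(matches_pattern(url, r'.*towcenter\.org/blog/.*'))
```

The real prepare_url and is_article_url may apply further rules (for example, redirect handling or domain-specific heuristics) that this sketch does not attempt to reproduce.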

Download files

Source Distribution

newslynx-url-0.0.4.tar.gz (7.9 kB)

Uploaded Source

Built Distribution

newslynx-url-0.0.4.macosx-10.9-intel.exe (78.8 kB)

Uploaded Source

File details

Details for the file newslynx-url-0.0.4.tar.gz.

File hashes

Hashes for newslynx-url-0.0.4.tar.gz
Algorithm Hash digest
SHA256 af65dfcf66a764d304362450b7296eea3405b910165dfa8968cae3a00285030f
MD5 265ed73ececa5e69cf60fe822f38cafc
BLAKE2b-256 27298f1e9260df48f3978040d16f6f4b292303075b6995457af1b3e7cf8687a7
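
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard hashlib module. The helper names sha256_of_file and verify_download are hypothetical, and the local path assumes the archive was saved in the current directory under its original file name.

```python
import hashlib
import os

# SHA256 digest listed above for newslynx-url-0.0.4.tar.gz.
EXPECTED_SHA256 = 'af65dfcf66a764d304362450b7296eea3405b910165dfa8968cae3a00285030f'


def sha256_of_file(path, chunk_size=65536):
    """Hypothetical helper: hash a file in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Hypothetical helper: True if the file's digest matches the expected one."""
    return sha256_of_file(path) == expected_sha256.lower()


if __name__ == '__main__':
    archive = 'newslynx-url-0.0.4.tar.gz'  # assumed local path
    if os.path.exists(archive):
        print(verify_download(archive, EXPECTED_SHA256))
```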

File details

Details for the file newslynx-url-0.0.4.macosx-10.9-intel.exe.

File hashes

Hashes for newslynx-url-0.0.4.macosx-10.9-intel.exe
Algorithm Hash digest
SHA256 e61a657c14405cf2374ccfa766d5b9ebff30ec0f712b6112971d959be00a5898
MD5 f14c2929d82b74a559cfdb0a045c3bc8
BLAKE2b-256 bd2519ecddc290975711b226e8abddfa60c1d014ca7900730f1befa3f744733e
