Tools for parsing, extracting, reconciling, and unshortening URLs.

newslynx-url
========
A newslynx-opinionated collection of utilities for dealing with urls.


## Install
```
git clone http://github.com/newslynx/newslynx-urls.git
pip install -e newslynx-urls
```

## Test
Requires `nose`:
```
nosetests
```

## Usage

This module contains various utilities that are used throughout `newslynx-core`,
but the main functions are `unshorten_url`, `is_article_url`, and `prepare_url`:

```python
import re

from newslynx_url import (
    unshorten_url, is_article_url, prepare_url
)

# Expand a shortened link to its final destination.
print(unshorten_url('bit.ly/1j3SrUC'))
# http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism/

# Check whether a URL looks like an article.
print(is_article_url(
    'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism'
))
# True

# `pattern` accepts a regex string...
print(is_article_url(
    'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism',
    pattern=r'.*towcenter\.org/blog/.*'
))
# True

# ...or a compiled regex.
pattern = re.compile(r'.*towcenter\.org/blog/.*')
print(is_article_url(
    'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism',
    pattern=pattern
))
# True

# Normalize a URL (here the query string and trailing slash are dropped).
print(prepare_url(
    'http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism/?q=lfjad&f=lkfdjsal'
))
# http://towcenter.org/blog/tow-fellows-brian-abelson-and-michael-keller-to-study-the-impact-of-journalism
```
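
To make the behavior in the examples above more concrete, here is a minimal standard-library sketch (Python 3) of two of the ideas they illustrate: matching a URL against a pattern given either as a string or as a compiled regex, and dropping the query string and trailing slash. This is not the code in `newslynx_url`. The names `matches_pattern` and `strip_query_and_slash` are hypothetical, and the real `is_article_url` and `prepare_url` may apply further rules.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def matches_pattern(url, pattern):
    """Hypothetical helper: test a URL against a regex string or compiled regex."""
    # Compile only when given a plain string; reuse compiled patterns as-is.
    regex = re.compile(pattern) if isinstance(pattern, str) else pattern
    return regex.match(url) is not None


def strip_query_and_slash(url):
    """Hypothetical helper: drop the query string, fragment, and trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip('/')
    # Rebuild the URL with an empty query and an empty fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, '', ''))


if __name__ == '__main__':
    url = ('http://towcenter.org/blog/tow-fellows-brian-abelson-and-'
           'michael-keller-to-study-the-impact-of-journalism/?q=lfjad&f=lkfdjsal')
    print(matches_pattern(url, r'.*towcenter\.org/blog/.*'))
    print(strip_query_and_slash(url))
```

Accepting both a string and a compiled regex lets callers that check many URLs compile the pattern once and reuse it.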
