
CSS selectors for parsing HTML on the command line


Slice and dice HTML on the command line using CSS selectors.

Quick start

Let’s say you want to grab all the links on http://example.com/foo/bar:

$ curl http://example.com/foo/bar | que "a->href"

Let’s say that gave you 3 lines that looked like this:

/some/url?val=1
/some/url2?val=2
/some/url3?val=3

Ugh, that’s not very helpful, so let’s modify our argument a bit:

$ curl http://example.com/foo/bar | que "a->http://example.com{href}"

Now, that will print:

http://example.com/some/url?val=1
http://example.com/some/url2?val=2
http://example.com/some/url3?val=3

Selecting

The argument is divided into two parts separated by ->. The first part is a traditional CSS selector (any standard CSS selector reference covers the syntax), and the second part is the attributes you want to print to the screen for each match:

$ css.selector->attribute,selector

The attribute part uses Python’s string formatting syntax, so you can embed the attributes you want within a larger string, as in the Quick start example a->http://example.com{href}.
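To make the two-part argument concrete, here is a minimal Python sketch of how such an argument could be split and how the attribute template could be expanded with str.format. This is not que’s actual implementation: split_argument and render are hypothetical helpers, and the sketch only handles a single bare attribute name or a {name} template, not a comma-separated list.

```python
def split_argument(argument):
    """Split 'selector->template' into its two parts (hypothetical helper)."""
    selector, _, template = argument.partition("->")
    return selector.strip(), template.strip()


def render(template, attrs):
    """Expand the attribute part for one matched element (hypothetical helper)."""
    # Assumption: a bare attribute name such as "href" means "{href}".
    if "{" not in template:
        template = "{" + template + "}"
    # Python's string formatting fills in the element's attributes.
    return template.format(**attrs)


selector, template = split_argument("a->http://example.com{href}")

# Pretend these are the href attributes of the matched <a> elements.
matched = [{"href": "/some/url?val=1"}, {"href": "/some/url2?val=2"}]
lines = [render(template, attrs) for attrs in matched]

for line in lines:
    print(line)
```

With the values from the Quick start, the template simply prepends http://example.com to each href.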

Examples

Find all the “Download” links on a page, using the non-standard :contains CSS selector that que supports:

$ curl http://example.com | que "a:contains(Download)->href"
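For readers unfamiliar with :contains, the following Python sketch shows roughly what a:contains(Download)->href asks for, using only the standard library’s html.parser. It is an illustration under assumptions, not how que works internally: the ContainsLinks class is hypothetical, and the match is assumed to be a case-sensitive substring test on the link text.

```python
from html.parser import HTMLParser


class ContainsLinks(HTMLParser):
    """Collect href values of <a> elements whose text contains a substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.hrefs = []
        self._in_a = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Start buffering the text of this link.
            self._in_a = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_a:
            # Keep the link only if its text contains the substring.
            if self._href is not None and self.needle in "".join(self._text):
                self.hrefs.append(self._href)
            self._in_a = False


page = '<a href="/dl">Download now</a> <a href="/about">About</a>'
parser = ContainsLinks("Download")
parser.feed(page)
print(parser.hrefs)
```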

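Select all the links whose data attribute is exactly “foo” or starts with “foo-” (in standard CSS, |= matches a hyphen-separated prefix, while ^= matches any prefix):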

$ curl http://example.com | que "a[data|=foo]->href"
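The |= operator is easy to confuse with ^=. As a rough illustration of the standard CSS semantics (not of que’s internals), the two tests can be written as plain Python predicates; matches_dash and matches_prefix are hypothetical names used only for this sketch.

```python
def matches_dash(value, prefix):
    """[attr|=prefix]: value is exactly prefix, or starts with prefix + '-'."""
    return value == prefix or value.startswith(prefix + "-")


def matches_prefix(value, prefix):
    """[attr^=prefix]: value starts with prefix (an empty prefix matches nothing)."""
    return bool(prefix) and value.startswith(prefix)


for value in ["foo", "foo-bar", "foobar", "bar"]:
    print(value, matches_dash(value, "foo"), matches_prefix(value, "foo"))
```

Only “foobar” separates the two operators: it satisfies the ^= test but not the |= test.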

Installation

You can use pip to install the stable release:

$ pip install que

or the latest and greatest (which might be different from what’s on PyPI):

$ pip install git+https://github.com/jaymon/que#egg=que

Notes

  • If you need a much more fully featured HTML command line parser, try hq.


