Datasette plugin that adds a custom SQL function for executing matches using the Rust regular expression engine

datasette-rure

Install this plugin in the same environment as Datasette to enable the regexp() SQL function.

$ pip install datasette-rure

The plugin is built on top of the rure-python library by David Blewett.

regexp() to test regular expressions

You can test if a value matches a regular expression like this:

select regexp('hi.*there', 'hi there')
-- returns 1
select regexp('not.*there', 'hi there')
-- returns 0

You can also use SQLite's REGEXP operator, which takes the value on the left and the pattern on the right:

select 'hi there' REGEXP 'hi.*there'
-- returns 1

This means you can select rows based on regular expression matches. For example, to select every article whose title begins with an E or an F:

select * from articles where title REGEXP '^[EF]'
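
To make the relationship between the regexp() function and the REGEXP operator concrete, here is a minimal sketch using only Python's standard library. It is not the plugin's implementation: it assumes Python's re module as a stand-in for the Rust engine, and the articles table contents and the NULL handling are hypothetical. The point it illustrates is that registering a two-argument SQL function named regexp is what makes SQLite's REGEXP operator available.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Return 1 if pattern matches anywhere in value, else 0.

    Stand-in only: uses Python's re module, not the Rust engine.
    Returning None for NULL inputs is an assumption of this sketch.
    """
    if pattern is None or value is None:
        return None
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
# SQLite evaluates `value REGEXP pattern` by calling regexp(pattern, value),
# so registering this function enables the operator form as well.
conn.create_function("regexp", 2, regexp)

# Hypothetical sample data for the articles example.
conn.execute("create table articles (title text)")
conn.executemany(
    "insert into articles values (?)",
    [("Elephants",), ("Foxes",), ("Giraffes",)],
)

function_form = conn.execute(
    "select regexp('hi.*there', 'hi there')"
).fetchone()[0]
operator_form = conn.execute(
    "select 'hi there' REGEXP 'hi.*there'"
).fetchone()[0]
no_match = conn.execute(
    "select regexp('not.*there', 'hi there')"
).fetchone()[0]
titles = [
    row[0]
    for row in conn.execute(
        "select title from articles where title REGEXP '^[EF]' order by title"
    )
]
print(function_form, operator_form, no_match, titles)
```

Note the argument order: the function form takes the pattern first, while the operator form puts the pattern on the right.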

Try this out: REGEXP interactive demo

regexp_match() to extract groups

You can extract the text captured by a group in a pattern using regexp_match().

select regexp_match('.*( and .*)', title) as n from articles where n is not null
-- Returns the ' and X' component of any matching titles, e.g.
--     and Recognition
--     and Transitions Their Place
-- etc

When called with two arguments, this returns the first capture group. You can call it with three arguments to indicate which capture group you would like to extract:

select regexp_match('.*(and)(.*)', title, 2) as n from articles where n is not null

The function will return null for invalid inputs, e.g. a pattern without capture groups.
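
The two- and three-argument behaviour described above can be sketched with the standard library. This is an illustration under assumptions, not the plugin's code: Python's re module stands in for the Rust engine, the sample title is made up, and the exact set of inputs treated as invalid is a guess based on the description.

```python
import re
import sqlite3


def regexp_match(pattern, value, index=1):
    """Return the text of capture group `index`, or None.

    Stand-in only: uses Python's re module, not the Rust engine.
    """
    try:
        match = re.search(pattern, value)
        return match.group(index) if match else None
    except (re.error, IndexError, TypeError):
        # Invalid pattern, missing capture group, or NULL input.
        return None


conn = sqlite3.connect(":memory:")
# Register the same name twice so both call styles work.
conn.create_function("regexp_match", 2, regexp_match)
conn.create_function("regexp_match", 3, regexp_match)

first_group = conn.execute(
    "select regexp_match('.*( and .*)', 'Salt and Pepper')"
).fetchone()[0]
second_group = conn.execute(
    "select regexp_match('.*(and)(.*)', 'Salt and Pepper', 2)"
).fetchone()[0]
no_groups = conn.execute(
    "select regexp_match('Salt', 'Salt and Pepper')"
).fetchone()[0]
print(repr(first_group), repr(second_group), no_groups)
```

Here the pattern without capture groups yields NULL, which is why the earlier queries filter with `where n is not null`.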

Try this out: regexp_match() interactive demo

regexp_matches() to extract multiple matches at once

The regexp_matches() function extracts every match of a pattern from a single string. The result is returned as a JSON array, which can then be further processed using SQLite's JSON functions.

The first argument is a regular expression with named capture groups. The second argument is the string to be matched.

select regexp_matches(
    'hello (?P<name>\w+) the (?P<species>\w+)',
    'hello bob the dog, hello maggie the cat, hello tarquin the otter'
)

This will return a JSON array of objects, one per match, each holding the named captures from the regular expression:

[
    {"name": "bob", "species": "dog"},
    {"name": "maggie", "species": "cat"},
    {"name": "tarquin", "species": "otter"}
]
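
As a rough sketch of how such a function could produce that array, the following uses Python's re and json modules as stand-ins. It assumes nothing about the plugin's internals beyond the behaviour described above, and it decodes the result in Python rather than with SQLite's JSON functions.

```python
import json
import re
import sqlite3


def regexp_matches(pattern, value):
    """Return a JSON array with one object of named captures per match.

    Stand-in only: uses Python's re module, not the Rust engine.
    """
    return json.dumps([m.groupdict() for m in re.finditer(pattern, value)])


conn = sqlite3.connect(":memory:")
conn.create_function("regexp_matches", 2, regexp_matches)

row = conn.execute(
    "select regexp_matches(?, ?)",
    (
        r"hello (?P<name>\w+) the (?P<species>\w+)",
        "hello bob the dog, hello maggie the cat, hello tarquin the otter",
    ),
).fetchone()

# The SQL function returns text; decode it to inspect the objects.
matches = json.loads(row[0])
for match in matches:
    print(match["name"], match["species"])
```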

Try this out: regexp_matches() interactive demo

Files for datasette-rure, version 0.3: datasette_rure-0.3-py3-none-any.whl (7.5 kB), wheel, Python 3.
