export/interface with firefox history/site metadata
ffexport
This backs up firefox history and parses the resulting history (sqlite) files.
The primary function here is to export and interact with my firefox history. Functionality for Chrome is vestigial, and I've left it there in case someone wants to mess with it. I recommend you take a look at promnesia if you want immediate support for that.
See here for how firefox stores its history.
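For orientation, here is a minimal sketch of reading visits straight out of a copy of places.sqlite with only the standard library. It assumes the usual Firefox schema (moz_historyvisits joined to moz_places, with visit_date stored as microseconds since the Unix epoch). The name read_visits is made up for illustration and is not part of ffexport.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterator, Tuple

# Assumed Firefox schema: each moz_historyvisits row points at a moz_places row.
QUERY = """
SELECT p.url, v.visit_date, v.visit_type
FROM moz_historyvisits AS v
JOIN moz_places AS p ON v.place_id = p.id
ORDER BY v.visit_date
"""


def read_visits(db_path: str) -> Iterator[Tuple[str, datetime, int]]:
    """Yield (url, visit_date, visit_type) tuples from a copy of places.sqlite."""
    conn = sqlite3.connect(db_path)
    try:
        for url, visit_date_us, visit_type in conn.execute(QUERY):
            # Assumption: visit_date is microseconds since the epoch.
            when = datetime.fromtimestamp(visit_date_us / 1_000_000, tz=timezone.utc)
            yield url, when, visit_type
    finally:
        conn.close()
```

It is probably safest to point this at a backup rather than the live database, since Firefox may hold a lock on the live file while it is running.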
Install
pip3 install ffexport
Requires python3.6+
Usage
Usage: ffexport [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
inspect Extracts history/site metadata from one sqlite database.
merge Extracts history/site metadata from multiple sqlite databases.
save Backs up the current firefox sqlite history file.
The inspect and merge commands also accept a --json flag, which dumps the result to STDOUT as JSON. Dates are serialized to epoch time.
Logs are hidden by default. To show the debug logs, run export FFEXPORT_LOGS=10 (the value is a Python logging level; 10 is DEBUG).
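Since FFEXPORT_LOGS takes a numeric level from Python's logging module, 10 corresponds to DEBUG. The sketch below shows one way a tool could turn such a variable into a logger level. The helper name level_from_env and the WARNING default are assumptions for illustration, not ffexport's actual code.

```python
import logging
import os


def level_from_env(var: str = "FFEXPORT_LOGS", default: int = logging.WARNING) -> int:
    """Read a numeric logging level from the environment, falling back to a default."""
    raw = os.environ.get(var)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Non-numeric values are ignored in this sketch.
        return default


logger = logging.getLogger("example")
logger.setLevel(level_from_env())
```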
save
Usage: ffexport save [OPTIONS]
Backs up the current firefox sqlite history file.
Options:
--browser [firefox|chrome] Provide either 'firefox' or 'chrome' [defaults
to firefox]
--profile TEXT Use to pick the correct profile to back up. If
unspecified, will assume a single profile
--to PATH Directory to store backup to [required]
Since firefox (and browsers in general) seem to remove old history unpredictably, I'd recommend running the following periodically:
$ ffexport save --to ~/data/firefox
[D 200828 15:30:58 save_hist:67] backing up /home/sean/.mozilla/firefox/jfkdfwx.dev-edition-default/places.sqlite to /home/sean/data/firefox/places-20200828223058.sqlite
[D 200828 15:30:58 save_hist:71] done!
That atomically copies the firefox sqlite database, which contains your history, to the backup directory given by --to.
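To give a rough idea of what such a backup involves, here is a sketch that snapshots a sqlite file into a timestamped copy. It is not ffexport's implementation: it swaps in the standard library's Connection.backup (Python 3.7+), and the function name backup_places is hypothetical. The UTC timestamp in the file name mirrors what the log output above appears to show.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def backup_places(live_db: Path, backup_dir: Path) -> Path:
    """Copy a sqlite database into backup_dir under a timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    target = backup_dir / f"places-{stamp}.sqlite"
    # Open the source read-only so the sketch never modifies the original file.
    src = sqlite3.connect(live_db.resolve().as_uri() + "?mode=ro", uri=True)
    dst = sqlite3.connect(str(target))
    try:
        # Connection.backup copies a consistent snapshot of the source database.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```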
inspect
Usage: ffexport inspect [OPTIONS] SQLITE_DB
Extracts history/site metadata from one sqlite database.
Provide a firefox history sqlite databases as the first argument. Drops
you into a REPL to access the data.
Options:
--json Print result to STDOUT as JSON
As an example:
$ ffexport inspect ~/data/firefox/places-20200828231237.sqlite
[D 200828 17:08:23 parse_db:73] Parsing visits from /home/sean/data/firefox/places-20200828231237.sqlite...
[D 200828 17:08:23 parse_db:92] Parsing sitedata from /home/sean/data/firefox/places-20200828231237.sqlite...
Demo: Your most common sites....
[('github.com', 13778),
('www.youtube.com', 8114),
('duckduckgo.com', 8054),
('www.google.com', 6542),
('discord.com', 6141),
('stackoverflow.com', 2528),
('gitlab.com', 1608),
('trakt.tv', 1362),
('letterboxd.com', 1053),
('www.reddit.com', 708)]
Use mvis or msite to access raw visits/site data, vis for the merged data
In [1]: ....
That drops you into a REPL with access to the history from that database (vis and mvis/msite).
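The "most common sites" demo above can be approximated by counting hostnames. A minimal sketch, assuming each visit exposes a url attribute (as the Visit objects in the Library Usage section do); the function name most_common_sites is made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def most_common_sites(visits: Iterable, n: int = 10) -> List[Tuple[str, int]]:
    """Count visits per hostname for any objects that expose a .url attribute."""
    counts = Counter(urlparse(v.url).netloc for v in visits)
    return counts.most_common(n)
```

Inside the REPL, something like most_common_sites(vis) should then give a list of (hostname, count) pairs similar to the demo.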
merge
Similar to inspect, but accepts multiple database backups, merging the Visits together and dropping you into a REPL.
Usage: ffexport merge [OPTIONS] SQLITE_DB...
Extracts history/site metadata from multiple sqlite databases.
Provide multiple sqlite databases as positional arguments, e.g.: ffexport
merge ~/data/firefox/dbs/*.sqlite
Provides a similar interface to inspect; drops you into a REPL to access
the data.
Options:
--include-live In addition to any provided databases, copy
current (firefox) history to /tmp and merge it
as well
--json Print result to STDOUT as JSON
merge also accepts the --browser and --profile flags, like the save command. Provide those if you have multiple profiles and are using the --include-live flag.
Example:
$ ffexport merge --include-live ~/data/firefox/*.sqlite
[D 200828 18:53:54 save_hist:67] backing up to /tmp/tmp8tvyotv9/places-20200829015354.sqlite
[D 200828 18:53:54 save_hist:71] done!
[D 200828 18:53:54 merge_db:52] merging information from 3 databases...
[D 200828 18:53:54 parse_db:71] Parsing visits from /home/sean/data/firefox/places-20200828223058.sqlite...
[D 200828 18:53:55 parse_db:90] Parsing sitedata from /home/sean/data/firefox/places-20200828223058.sqlite...
[D 200828 18:53:56 parse_db:71] Parsing visits from /home/sean/data/firefox/places-20200828231237.sqlite...
[D 200828 18:53:56 parse_db:90] Parsing sitedata from /home/sean/data/firefox/places-20200828231237.sqlite...
[D 200828 18:53:57 parse_db:71] Parsing visits from /tmp/tmp8tvyotv9/places-20200829015354.sqlite...
[D 200828 18:53:58 parse_db:90] Parsing sitedata from /tmp/tmp8tvyotv9/places-20200829015354.sqlite...
[D 200828 18:53:59 merge_db:64] Summary: removed 183,973 duplicates...
[D 200828 18:53:59 merge_db:65] Summary: returning 92,066 visit entries...
Python 3.8.5 (default, Jul 27 2020, 08:42:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.17.0 -- An enhanced Interactive Python. Type '?' for help.
Use merged_vis to access merged data from all databases
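Because consecutive backups overlap heavily, most of the work in a merge is dropping duplicates, as the summary lines above suggest. The sketch below shows one way to do that. The choice of (url, visit_date) as the duplicate key, and the names SimpleVisit and merge_unique, are assumptions for illustration and may differ from what ffexport does internally.

```python
from typing import Iterable, Iterator, NamedTuple, Set, Tuple


class SimpleVisit(NamedTuple):
    """Hypothetical stand-in for a parsed visit."""
    url: str
    visit_date: int  # any hashable timestamp works here


def merge_unique(*sources: Iterable[SimpleVisit]) -> Iterator[SimpleVisit]:
    """Yield visits from all sources, skipping ones already seen."""
    seen: Set[Tuple[str, int]] = set()
    for source in sources:
        for visit in source:
            # Assumed duplicate key: same URL visited at the same instant.
            key = (visit.url, visit.visit_date)
            if key in seen:
                continue
            seen.add(key)
            yield visit
```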
To dump all that info to json:
$ ffexport merge --include-live --json ~/data/firefox/*.sqlite > ./data.json
[D 201029 02:46:19 save_hist:66] backing up /home/sean/.mozilla/firefox/lsinsptf.dev-edition-default/places.sqlite to /tmp/tmpdvi8kir1/places-20201029094619.sqlite
[D 201029 02:46:19 save_hist:70] done!
[D 201029 02:46:19 merge_db:48] merging information from 3 databases...
[D 201029 02:46:19 parse_db:69] Parsing visits from /home/sean/data/firefox/places-20200828223058.sqlite...
[D 201029 02:46:20 parse_db:88] Parsing sitedata from /home/sean/data/firefox/places-20200828223058.sqlite...
[D 201029 02:46:20 parse_db:69] Parsing visits from /home/sean/data/firefox/places-20201010031025.sqlite...
[D 201029 02:46:21 parse_db:88] Parsing sitedata from /home/sean/data/firefox/places-20201010031025.sqlite...
[D 201029 02:46:21 parse_db:69] Parsing visits from /tmp/tmpdvi8kir1/places-20201029094619.sqlite...
[D 201029 02:46:22 parse_db:88] Parsing sitedata from /tmp/tmpdvi8kir1/places-20201029094619.sqlite...
[D 201029 02:46:22 merge_db:60] Summary: removed 220,876 duplicates...
[D 201029 02:46:22 merge_db:61] Summary: returning 149,649 visit entries...
$ du -h ./data.json
41M data.json
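Once dumped, the JSON can be loaded back and the epoch dates converted to datetimes. This is only a sketch: it assumes the dump is a JSON array of objects with a visit_date key holding epoch seconds, which you should check against your own data.json before relying on it. The function name load_visits is hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def load_visits(raw_json: str) -> List[Dict[str, Any]]:
    """Parse dumped visits and convert visit_date to timezone-aware datetimes."""
    visits = json.loads(raw_json)
    for visit in visits:
        # Assumption: visit_date is epoch seconds; adjust if your dump differs.
        visit["visit_date"] = datetime.fromtimestamp(visit["visit_date"], tz=timezone.utc)
    return visits
```

With the file from the example above, the input would be the contents of ./data.json read as a string.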
Library Usage
You can also import ffexport as a library and pass it database files from your own Python code.
>>> import ffexport, glob
>>> visits = list(ffexport.read_and_merge(*glob.glob('data/firefox/*.sqlite'))) # note the splat, read_and_merge accepts variadic arguments
>>> visits[10000]
Visit(
url="https://github.com/python-mario/mario",
visit_date=datetime.datetime(2020, 6, 24, 2, 23, 32, 482000, tzinfo=<UTC>),
visit_type=1,
title="python-mario/mario: Powerful Python pipelines for your shell",
description="Powerful Python pipelines for your shell . Contribute to python-mario/mario development by creating an account on GitHub.",
preview_image="https://repository-images.githubusercontent.com/185277224/2ce27080-b915-11e9-8abc-088ab263dbd9",
)
For another example, see my HPI integration.
Notes
See here for what the visit_type enum means.
I considered using cachew, but because of the volume of the data it ends up being slower than reading directly from the sqlite database exports. Both the visits and sitedata functions are cachew compliant though; you'd just have to wrap them yourself. See here for more info.
save_hist.py/initial structure is modified from karlicoss/promnesia
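For the visit_type integer mentioned in the first note, a small enum can make code that filters visits easier to read. The values below follow Firefox's TRANSITION_* constants as I understand them, so verify them against the linked reference. The class name VisitType is hypothetical and not something ffexport exports.

```python
from enum import IntEnum


class VisitType(IntEnum):
    """Assumed mapping of Firefox visit_type values (TRANSITION_* constants)."""
    LINK = 1
    TYPED = 2
    BOOKMARK = 3
    EMBED = 4
    REDIRECT_PERMANENT = 5
    REDIRECT_TEMPORARY = 6
    DOWNLOAD = 7
    FRAMED_LINK = 8
    RELOAD = 9
```

With that in place, VisitType(visit.visit_type).name gives a readable label for a visit.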
Testing
git clone https://github.com/seanbreckenridge/ffexport
cd ./ffexport
pip install '.[testing]'
mypy ./ffexport
pytest
File details
Details for the file ffexport-0.1.5.tar.gz.
File metadata
- Download URL: ffexport-0.1.5.tar.gz
- Upload date:
- Size: 18.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.9.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1b99c19d34080426f5a315a86ab3fa7d4904db301795c1c9e2cce92e03f1aa50 |
| MD5 | 8cb53d3d7848c84ca68a6bd9097dd16b |
| BLAKE2b-256 | c5a0706a5a8255d17883b50a66fd4baae399ce3d4ae9894a048f884ef4155626 |
File details
Details for the file ffexport-0.1.5-py3-none-any.whl.
File metadata
- Download URL: ffexport-0.1.5-py3-none-any.whl
- Upload date:
- Size: 14.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/53.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.9.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad6fa90e46fbcda8e5ab4459389baa471b0a34591ff02167ae05b782019992a0 |
| MD5 | ace8d882496b2d26476feb181b099625 |
| BLAKE2b-256 | 0d5df201a1a25a57b57f21dce5332e21c238736ec01c7e961f62a93c12c008a6 |