A bookmark.html file parser, merger, and data viewer using pure Python
Reason this release was yanked:
setup.py was incorrect causing import to fail
PyBookmark
A bookmark.html parser, merger and viewer using pure python
- parse bookmark.html files from browsers with html structure included
- merge the parsed bookmark.html files
- export the parsed and merged bookmarks as a JSON archive
- GUI to view, edit, and add to the bookmarks stored in JSON archive
Package Justification
PyBookmark exists to solve a problem you may not have. Read the following to understand the trade-offs.
Why
You should use PyBookmark if:
- you have many different bookmark html files saved over time
- you wish to merge your bookmark history from multiple computers or files into one view
- you wish to separate the bookmark manager from the browser
- you wish to reduce the possibility of fingerprint tracking (what bookmarks exist, unique icon file checksums or URLs)
- you wish to reduce clutter in bookmarks (icons)
- you are tired of Firefox (or other browsers) changing which fields can be viewed or edited
  - example: description, keywords, and tags are only intermittently viewable
- you are tired of Firefox (or other browsers) breaking or changing how bookmark editing works
  - example: Firefox recently made it so edits in the bookmark organizer did not save
- you want a more powerful bookmark search method
- you like control
Why Not
You should not use PyBookmark if:
- you are happy with native browser bookmark management
- you have very few bookmarks or all your bookmarks are in one file already
- you need or want in-application multi-device synchronization or cloud backup support
- you primarily browse the internet using a smartphone or proprietary platform apps (Facebook/Reddit)
- you do not use bookmarks (why did you read this far?)
- you have no interest in understanding code or data structure
- you are not willing to debug import failures: eventually a browser change will mean the file format you try to import won't work, and you will have to figure out why
Implementation Details
Assumptions
- Bookmark data is stored in HTML format. The package could be extended to merge JSON and other backup formats, but that has not been the focus.
- Bookmark data has additional folder structure that
  - is important
  - indicates relationships between bookmarks
- These assumptions are why complex parsing with Beautiful Soup is implemented to extract the URLs and related content
- Colons are useful separators of descriptive location in bookmark labels (not the URL)
- Duplicate bookmarks are bad but merging should be controlled
- You intend to migrate to a separate bookmark manager
- You will always be on a platform that can read the output json structure
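To illustrate why the folder structure matters during parsing, here is a minimal sketch of recovering the folder path of each link from a browser-exported bookmark file. It deliberately uses the standard library `html.parser` module as a stand-in, not the Beautiful Soup code that PyBookmark itself uses. The class name `FolderAwareParser`, the `found` attribute, and the sample HTML are all hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class FolderAwareParser(HTMLParser):
    """Collect (url, label, folder path) from a Netscape-style bookmark file.

    Hypothetical illustration only; this is not PyBookmark's parser.
    """

    def __init__(self):
        super().__init__()
        self.folders = []    # stack of folder names for the open <DL> blocks
        self.pending = None  # most recent <H3> name, waiting for its <DL>
        self.capture = None  # "h3" or "a" while text is being collected
        self.text = []
        self.href = None
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag in ("h3", "a"):
            self.capture = tag
            self.text = []
            if tag == "a":
                self.href = dict(attrs).get("href")
        elif tag == "dl":
            # A <DL> opens the folder named by the preceding <H3>, if any.
            self.folders.append(self.pending or "")
            self.pending = None

    def handle_data(self, data):
        if self.capture:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self.capture == "h3":
            self.pending = "".join(self.text).strip()
            self.capture = None
        elif tag == "a" and self.capture == "a":
            path = [name for name in self.folders if name]
            self.found.append((self.href, "".join(self.text).strip(), path))
            self.capture = None
        elif tag == "dl" and self.folders:
            self.folders.pop()


SAMPLE = """
<DL><p>
<DT><H3>Python</H3>
<DL><p>
<DT><A HREF="https://docs.python.org/3/">Docs</A>
<DT><H3>Packaging</H3>
<DL><p>
<DT><A HREF="https://pypi.org/">PyPI</A>
</DL><p>
</DL><p>
<DT><A HREF="https://example.com/">Example</A>
</DL><p>
"""

parser = FolderAwareParser()
parser.feed(SAMPLE)
for url, label, path in parser.found:
    print(url, label, path)
```

Each link keeps the list of folders it was nested in, which is the relationship information that a flat list of URLs would lose.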
Run Options (How to Use)
- parse single file
  - library: pybookmark.bookmarks_parse.py
- merge files
  - scripts: scripts.bookmarks_merge.py
    - parses single or multiple bookmark.html files using the pybookmark.bookmarks_parse.py library
    - merges bookmarks across html files
    - reduces duplication of information based on user-defined mappings
    - you only need to do this once if you start using the viewer as your bookmark manager
- viewer
  - allows viewing, editing, adding, and removing entries in the JSON bookmark collection
  - library: pybookmark.pybookmarkjsonviewer.py
    - can be called from the command line
    - $ python pybookmarkjsonviewer.py -f /path_to_json_file/sample.json
  - script: scripts.PyBookmark_viewer.py
    - runs against a predefined YAML configuration in the same path
  - uses Tk to provide the GUI
  - note: running from a desktop launcher on Linux may require a separate shell script with interactive mode enabled (see reference)
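As a rough illustration of the merge step, the sketch below combines two URL-keyed collections so that each URL appears once. It assumes the field order listed under Data Structures and a simple policy (the first entry seen for a URL wins, tags are unioned). The function name `merge_collections`, that policy, and the sample values are assumptions for this example. The sketch does not reproduce the user-defined mappings that scripts.bookmarks_merge.py applies.

```python
LABEL, AGE, TAGS = 0, 1, 2  # positions assumed from the Data Structures section


def merge_collections(*collections):
    """Merge URL-keyed dicts; the first entry per URL wins, tags are unioned.

    Hypothetical policy for illustration, not PyBookmark's merge logic.
    """
    merged = {}
    for collection in collections:
        for url, fields in collection.items():
            if url not in merged:
                # Copy the entry and its tag list so the inputs stay unchanged.
                entry = list(fields)
                entry[TAGS] = list(fields[TAGS])
                merged[url] = entry
                continue
            for tag in fields[TAGS]:
                if tag not in merged[url][TAGS]:
                    merged[url][TAGS].append(tag)
    return merged


# Made-up sample data; the age values are placeholders with an assumed format.
laptop = {
    "https://pypi.org/": ["PyPI", "2020", ["python"], "Dev", "", "laptop.html"],
}
desktop = {
    "https://pypi.org/": ["Python Package Index", "2021", ["packages"], "Code", "", "desktop.html"],
    "https://example.com/": ["Example", "2021", [], "Misc", "", "desktop.html"],
}

merged = merge_collections(laptop, desktop)
for url, fields in merged.items():
    print(url, fields[LABEL], fields[TAGS])
```

Keying on the URL makes duplicate detection a dictionary lookup, and leaving the conflict policy in one function keeps the merge controlled rather than automatic.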
File Layout
- Data contains
  - reference YAML configurations
  - example input bookmark.html files
  - example output json files
- pybookmark
  - where the library code is; see run options above for types
  - where the icon file is
- scripts
  - where command line tools live
  - see run options above for more details
Data Structures
The core data structure is AddrStruct.
addrStruct: dictionary of url keys with list of list values
key = URL address
[0] = label
[1] = age
[2] = tags
[3] = location
[4] = description
[5] = file location
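A minimal sketch of what one entry might look like, assuming each URL maps to a six-element list in the order given above and that tags are stored as a nested list. The variable name `addr_struct`, the index constants, and every sample value (including the age format) are hypothetical. The JSON round trip at the end only shows that a structure of this shape can be archived as JSON; it is not the package's export code.

```python
import json

# Index names for the six positions listed above (the names are assumptions).
LABEL, AGE, TAGS, LOCATION, DESCRIPTION, FILE_LOCATION = range(6)

addr_struct = {
    "https://docs.python.org/3/": [
        "Python docs",             # [0] label
        "1577836800",              # [1] age (format assumed)
        ["python", "reference"],   # [2] tags
        "Programming",             # [3] location (folder)
        "Official documentation",  # [4] description
        "bookmarks_laptop.html",   # [5] file location (source file)
    ],
}

entry = addr_struct["https://docs.python.org/3/"]
print(entry[LABEL], entry[TAGS])

# Strings and lists survive a JSON round trip unchanged.
archive = json.dumps(addr_struct, indent=2)
restored = json.loads(archive)
```

Named index constants avoid scattering bare numbers such as `entry[3]` through calling code.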
Requirements Overview
Created using Python 3.7 or higher and Beautiful Soup 4.
Version History
Version | Description |
---|---|
1.0.0 | first release |
1.1.0 | refactored to use classes |