Detect duplicated pages in a Notion database and optionally delete them
Purpose
Detect duplicated pages in a Notion database and optionally delete them.
What's a duplicated page?
It's a page with both the same title and the same last_edited_time as another page.
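The definition above can be sketched in a few lines of Python. This is only an illustration, not the package's actual code: the function name find_duplicates and the flat dictionaries with "title" and "last_edited_time" keys are assumptions made for the example (the real Notion API nests the title inside the page properties).

```python
from typing import Iterable, Iterator


def find_duplicates(pages: Iterable[dict]) -> Iterator[dict]:
    """Yield every page whose (title, last_edited_time) pair was already seen."""
    seen = set()
    for page in pages:
        # Hypothetical flat page shape, used here for illustration only.
        key = (page["title"], page["last_edited_time"])
        if key in seen:
            yield page  # same title AND same timestamp as an earlier page
        else:
            seen.add(key)


pages = [
    {"title": "Recipe", "last_edited_time": "2013-07-05T01:34:00.000Z"},
    {"title": "Recipe", "last_edited_time": "2013-07-05T01:34:00.000Z"},
    {"title": "Recipe", "last_edited_time": "2014-01-01T00:00:00.000Z"},
]
dupes = list(find_duplicates(pages))
print(len(dupes))
```

Only the second page is reported: the third one shares the title but was edited at a different time, so it is not considered a dupe.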
Motivation
I recently decided to move away from Evernote (after being a subscriber since 2008). My reason? They started to jack up their price to a level that wasn't justifiable to me.
The price of the yearly subscription went from $35 in 2022, to $50 in 2023 and for this year they want $130!
</RANT>
After I imported many pages from Evernote, I ended up with 100s if not 1000s of duplicated pages.
This script solved the problem!
Install
pip install notion-duplicates
Prerequisites
You first need to create an integration in Notion, which generates a token:
- Click on [ + New Integration ]
- Specify the name, say: notion_duplicates
- Click on Show under Internal Integration Secret and copy the secret, which looks like:
  secret_WhGbvv7jUxt88WXYZDlhxoiBtgtzGXBqPrVSA00aaBo
- That's the value to use as NOTION_TOKEN
Next, you need to connect the notion_duplicates integration with your Notion database:
- Navigate to your Notion database such as: https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f
- Click on the . . . at the top right of the page
- Select Connect To and select notion_duplicates from the list, and confirm
Finally, you need your database_id that can easily be extracted from your database URL:
It's the 32 characters between the last / and the ?. See the example below, where database_id=a769a042d8f544ce860ba408d295ab28
https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
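If you'd rather not count characters by hand, the extraction can be sketched like this. The helper name extract_database_id is made up for this example and is not part of the package. It assumes the id is the run of 32 hexadecimal characters at the end of the URL path, which matches the example URL above.

```python
import re
from urllib.parse import urlparse


def extract_database_id(url: str) -> str:
    """Return the 32-character id found at the end of the URL path."""
    # urlparse drops the "?v=..." query string, leaving only the path.
    path = urlparse(url).path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    # Assumption: the id is the trailing run of 32 hex characters.
    match = re.search(r"[0-9a-fA-F]{32}$", last_segment)
    if match is None:
        raise ValueError(f"No 32-character database id found in: {url}")
    return match.group(0)


url = "https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f"
print(extract_database_id(url))
```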
Usage
Help (-h)
notion_duplicates -h
usage: notion_duplicates [-h] [-m [MAX_PAGE_COUNT]] [-D] [-M [MAX_DELETE_PAGE_COUNT]] database_id
Detect duplicated pages in a Notion database and optionally delete them
positional arguments:
database_id Notion database on which to conduct the duplicate search. See README.md for more details
optional arguments:
-h, --help show this help message and exit
-m [MAX_PAGE_COUNT], --max_page_count [MAX_PAGE_COUNT]
Maximum number of pages to scan for duplicated pages (default: None)
-D, --delete Do the actual deletion (set in_trash=True) (default: False)
-M [MAX_DELETE_PAGE_COUNT], --max_delete_page_count [MAX_DELETE_PAGE_COUNT]
Maximum number of pages to delete (default: None)
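For readers curious how options like these fit together, here is a minimal argparse sketch that mirrors the help text above. It is not the package's source. The int type for the two counters is an assumption, and ArgumentDefaultsHelpFormatter is only a guess based on the "(default: ...)" suffixes in the help output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mimics the documented command line (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="notion_duplicates",
        description="Detect duplicated pages in a Notion database and optionally delete them",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "database_id",
        help="Notion database on which to conduct the duplicate search",
    )
    parser.add_argument(
        "-m", "--max_page_count", nargs="?", type=int, default=None,
        help="Maximum number of pages to scan for duplicated pages",
    )
    parser.add_argument(
        "-D", "--delete", action="store_true",
        help="Do the actual deletion (set in_trash=True)",
    )
    parser.add_argument(
        "-M", "--max_delete_page_count", nargs="?", type=int, default=None,
        help="Maximum number of pages to delete",
    )
    return parser


# Scan at most 500 pages and report dupes without deleting anything.
args = build_parser().parse_args(["-m", "500", "a769a042d8f544ce860ba408d295ab28"])
print(args.database_id, args.max_page_count, args.delete)
```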
Example with no duplicate
notion_duplicates a769a042d8f544ce860ba408d295ab28
Iterated over 3 pages in the database:a769a042d8f544ce860ba408d295ab28. Found 0 duplicated page(s) and deleted 0 page(s)
Elapased time:0.12 seconds
Example showing duplicates only (no deletion)
notion_duplicates 5ae487a972e345b09450c181150a7AAA
Scanned 100 in 0.61 secs or 164 pages/sec
Scanned 200 in 1.52 secs or 131 pages/sec
Scanned 300 in 2.22 secs or 135 pages/sec
Scanned 400 in 3.02 secs or 132 pages/sec
Scanned 500 in 3.63 secs or 138 pages/sec
This page is a dupe -> title:(1) Facebook | last_edited:2013-07-05T01:34:00.000Z | url:https://www.notion.so/1-Facebook-a7df306435694572be8460ac45b75950
This page is a dupe -> title:Patio Lounger RE 11.2in Nicollet : Target | last_edited:2013-07-04T23:09:00.000Z | url:https://www.notion.so/Patio-Lounger-RE-11-2in-Nicollet-Target-706e30effb4345b4b50ee0db3328ebbb
This page is a dupe -> title:ÄPPLARÖ Drop-leaf table - IKEA | last_edited:2013-07-04T23:03:00.000Z | url:https://www.notion.so/PPLAR-Drop-leaf-table-IKEA-9fe474b0f5424c499f3fe78aeb005deb
Reached max page count
Iterated over 521 pages in the database:5ae487a972e345b09450c181150a7AAA. Found 3 duplicated page(s) and deleted 0 page(s)
Elapased time:4.52 seconds
Example deleting duplicates (use -D)
notion_duplicates -D 5ae487a972e345b09450c181150a7AAA
Scanned 100 in 0.61 secs or 164 pages/sec
Scanned 200 in 1.52 secs or 131 pages/sec
Scanned 300 in 2.22 secs or 135 pages/sec
Scanned 400 in 3.02 secs or 132 pages/sec
Scanned 500 in 3.63 secs or 138 pages/sec
DELETING dupe page -> title:(1) Facebook | last_edited:2013-07-05T01:34:00.000Z | url:https://www.notion.so/1-Facebook-a7df306435694572be8460ac45b75950
DELETING dupe page -> title:Patio Lounger RE 11.2in Nicollet : Target | last_edited:2013-07-04T23:09:00.000Z | url:https://www.notion.so/Patio-Lounger-RE-11-2in-Nicollet-Target-706e30effb4345b4b50ee0db3328ebbb
DELETING dupe page -> title:ÄPPLARÖ Drop-leaf table - IKEA | last_edited:2013-07-04T23:03:00.000Z | url:https://www.notion.so/PPLAR-Drop-leaf-table-IKEA-9fe474b0f5424c499f3fe78aeb005deb
Iterated over 521 pages in the database:5ae487a972e345b09450c181150a7AAA. Found 3 duplicated page(s) and deleted 3 page(s)
Elapased time:4.77 seconds
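As the help text says, deleting a page means setting in_trash=True on it, so deleted pages end up in Notion's trash. Below is a rough sketch of what such a request could look like against Notion's public "update page" endpoint, using only the standard library. The helper name build_trash_request and the Notion-Version value are assumptions for illustration, and this is not necessarily how the script itself talks to Notion. Older API versions may expect archived instead of in_trash.

```python
import json
import urllib.request

NOTION_API = "https://api.notion.com/v1"
NOTION_VERSION = "2022-06-28"  # assumption: any version that accepts in_trash


def build_trash_request(page_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that moves a page to the trash."""
    body = json.dumps({"in_trash": True}).encode("utf-8")
    return urllib.request.Request(
        url=f"{NOTION_API}/pages/{page_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",  # the NOTION_TOKEN secret
            "Notion-Version": NOTION_VERSION,
            "Content-Type": "application/json",
        },
    )


# Placeholder token; nothing is sent over the network here.
req = build_trash_request("a7df306435694572be8460ac45b75950", "secret_example")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen(req) would actually send it, so try it on a throwaway page first.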