better-moles-patent-finder

A tool designed to enhance patent discovery by leveraging MongoDB for efficient storage, querying, and analysis of patent data. This repository includes features to streamline patent searches, improve retrieval accuracy, and support advanced filtering and indexing capabilities.


Overview

This project combines patent search with a MongoDB backend to store, retrieve, and analyze patent data. Users can search for patents associated with chemical compounds using SMILES, InChI, and other molecular representations, and can filter results by molecular structure, patent ID, and other criteria.


This project is based on the PatCID paper, which focuses on identifying and classifying patent data related to molecular structures. It builds on PatCID's ability to match molecular structures with relevant patent information, applying chemical informatics and query techniques to improve the efficiency and accuracy of patent searches. To check out their incredible work, visit the PatCID GitHub repository.


Key Features:

  • Patent Search: Search patents by their ID or associated molecular properties.
  • Advanced Filtering: Filter patents based on molecular structure, chemical formula, and other relevant fields.
  • Efficient Querying: Use MongoDB's indexing and querying capabilities to retrieve patents quickly.
  • Data Model: The system stores patents and associated molecules in a structured format, making it easy to extend and scale.
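The indexing mentioned above could be set up along these lines. This is a minimal sketch, not the package's own code: the index plan and the `ensure_indexes` helper are hypothetical, and the field paths are taken from the document format described in the Mongo Documents Format section. The function accepts any object that exposes pymongo's `Collection.create_index` method.

```python
from typing import Any, List

# Hypothetical index plan: one single-field index per lookup path.
# The dotted paths assume the document layout shown in this README.
INDEX_FIELDS = [
    "molecule.inchikey",  # exact-match lookups by InChIKey
    "molecule.smiles",    # exact-match lookups by SMILES string
    "patents.id",         # reverse lookups from a patent ID to molecules
]


def ensure_indexes(collection: Any) -> List[str]:
    """Create every planned index and return the names the driver reports.

    `collection` is expected to behave like a pymongo Collection, whose
    create_index() is a no-op when an identical index already exists.
    """
    return [collection.create_index(field) for field in INDEX_FIELDS]
```

With pymongo this would be called as `ensure_indexes(client["some_db"]["some_collection"])`, where the database and collection names are placeholders.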


What is it?

This tool helps researchers and patent professionals find patents related to chemical compounds using molecular representations such as SMILES and InChI. With MongoDB as the backend, it stores and indexes large volumes of patent and molecular data, so users can query patents, filter by molecular structure, and retrieve precise results quickly.


Mongo Documents Format

The MongoDB documents used by this project follow the structure below, which includes information about the molecule (using SMILES, InChI, etc.) and the associated patent IDs:

{
  "molecule": {
    "smiles": "Brc1cc(-c2ccccc2)nc(-c2ccc3c4ccccc4c4ccccc4c3c2)c1",
    "inchi": "InChI=1S/C29H18BrN/c30-21-17-28(19-8-2-1-3-9-19)31-29(18-21)20-14-15-26-24-12-5-4-10-22(24)23-11-6-7-13-25(23)27(26)16-20/h1-18H",
    "inchikey": "UPAWJZOAEGLCFP-UHFFFAOYSA-N",
    "sum_formula": "C29H18BrN",
    "conf": 0.57
  },
  "patents": [
    {"id": "US20200136057A1"},
    {"id": "US20200136057"}
  ]
}

Fields:

  • molecule: Contains the molecular data (SMILES, InChI, InChIKey, sum formula) and the conf value shown in the example above.
  • patents: A list of objects, each holding the ID of a patent associated with the molecule.
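To illustrate how documents of this shape might be queried and read, here is a small sketch. The helper names are hypothetical and not part of the package. It also assumes that a trailing letter plus optional digit on a patent ID is a kind code (as in `US20200136057A1` versus `US20200136057` in the example), which may not hold for every patent office.

```python
import re
from typing import Any, Dict, List

# Assumption: a kind code is one uppercase letter plus an optional digit,
# directly after the numeric part of the ID (e.g. "A1", "B2").
_KIND_CODE = re.compile(r"(?<=\d)[A-Z]\d?$")


def inchikey_filter(inchikey: str) -> Dict[str, str]:
    """Build a MongoDB filter matching one molecule by its InChIKey."""
    return {"molecule.inchikey": inchikey}


def strip_kind_code(patent_id: str) -> str:
    """Drop a trailing kind code, if one is present."""
    return _KIND_CODE.sub("", patent_id)


def unique_patent_ids(document: Dict[str, Any]) -> List[str]:
    """Return the document's patent IDs without kind codes, in first-seen order."""
    seen: List[str] = []
    for entry in document.get("patents", []):
        base = strip_kind_code(entry["id"])
        if base not in seen:
            seen.append(base)
    return seen


doc = {
    "molecule": {"inchikey": "UPAWJZOAEGLCFP-UHFFFAOYSA-N"},
    "patents": [{"id": "US20200136057A1"}, {"id": "US20200136057"}],
}
print(unique_patent_ids(doc))  # ['US20200136057']
```

The filter returned by `inchikey_filter` can be passed to pymongo's `Collection.find_one` to fetch the matching document.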

Usage

Installation

You can install the better-moles-patent-finder package via pip from PyPI or clone the repository to run locally:

Install from PyPI:

pip install better-moles-patent-finder

Basic Usage

Once installed, you can query patents through the Python API or by running the package as a script.

Running as a Script

You can run the project as a script by passing a configuration file path:

better-moles-patent-finder --config-path /path/to/config_file.yaml

Using the Python API

from patent_finder.patent_finder_mongo_db import PatentFinderMongoDB

# Create a PatentFinder instance
pf = PatentFinderMongoDB()

# Search for patents by molecule structure (SMILES)
result = pf.search_by_smiles('Brc1cc(-c2ccccc2)nc(-c2ccc3c4ccccc4c4ccccc4c3c2)c1')

# Print the result
print(result)

MongoDB Connection

Ensure MongoDB is running and accessible. The default connection string is configured in the project. You can modify it if necessary in the mongo_connector.py file.

from better_moles_patent_finder import MongoConnector

# Connect to the MongoDB database
mongo = MongoConnector()
mongo.connect()

# Perform queries and operations
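If you need to change the connection string, the sketch below shows one way to assemble a standard `mongodb://` URI with safely escaped credentials. The `build_mongo_uri` helper and the example database name are hypothetical. How `MongoConnector` actually reads its connection string is not documented here, so check mongo_connector.py before relying on this.

```python
from typing import Optional
from urllib.parse import quote_plus


def build_mongo_uri(
    host: str = "localhost",
    port: int = 27017,
    username: Optional[str] = None,
    password: Optional[str] = None,
    database: str = "",
) -> str:
    """Assemble a mongodb:// URI, percent-escaping the credentials.

    Characters such as '@' or '/' in a username or password would otherwise
    be parsed as URI delimiters.
    """
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{host}:{port}/{database}"


# "patents" is a placeholder database name, not the project's default.
print(build_mongo_uri(username="app", password="p@ss/word", database="patents"))
# mongodb://app:p%40ss%2Fword@localhost:27017/patents
```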

License

This project is licensed under the terms of the GNU General Public License, Version 3.
