NLQL (Natural Language Query Language) is a tool for searching text with simple, SQL-like commands. Just as SQL helps you find information in databases, NLQL helps you find information in plain text.
NLQL (Natural Language Query Language)
A SQL-like query language designed specifically for natural language processing and text retrieval.
Overview
NLQL is a query language that brings the power and simplicity of SQL to natural language processing. It provides a structured way to query and analyze unstructured text data, making it particularly useful for RAG (Retrieval-Augmented Generation) systems and large language models.
Key Features
- SQL-like syntax for intuitive querying
- Multiple text unit support (character, word, sentence, paragraph, document)
- Rich set of operators for text analysis
- Semantic search capabilities
- Vector embedding support
- Extensible plugin system
- Performance optimizations with indexing and caching
Basic Syntax
SELECT <UNIT>
[FROM <SOURCE>]
[WHERE <CONDITIONS>]
[GROUP BY <FIELD>]
[ORDER BY <FIELD>]
[LIMIT <NUMBER>]
Query Units
- CHAR: Character level
- WORD: Word level
- SENTENCE: Sentence level
- PARAGRAPH: Paragraph level
- DOCUMENT: Document level
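To make the units concrete, the sketch below splits a string at each granularity using only the standard library. The helper name `split_units` and the splitting rules (blank lines between paragraphs, end punctuation between sentences, whitespace between words) are assumptions for illustration; NLQL's own segmentation may differ.

```python
import re

def split_units(text: str, unit: str) -> list:
    """Split text into the requested unit (hypothetical helper, not NLQL's API)."""
    unit = unit.upper()
    if unit == "DOCUMENT":
        return [text.strip()]
    if unit == "PARAGRAPH":
        # Assumption: paragraphs are separated by one or more blank lines.
        return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    if unit == "SENTENCE":
        # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if unit == "WORD":
        return text.split()
    if unit == "CHAR":
        return list(text)
    raise ValueError(f"unknown unit: {unit}")

sample = "NLQL queries text. It looks like SQL!\n\nUnits range from CHAR to DOCUMENT."
for name in ("DOCUMENT", "PARAGRAPH", "SENTENCE", "WORD"):
    print(name, len(split_units(sample, name)))
```

A real implementation would likely need a proper sentence segmenter, since abbreviations such as "e.g." defeat the punctuation rule used here.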
Basic Operators
CONTAINS("text") -- Contains specified text
STARTS_WITH("text") -- Starts with specified text
ENDS_WITH("text") -- Ends with specified text
LENGTH(<|>|=|<=|>=) number -- Length conditions
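One way to read these operators is as predicates applied to each text unit. The sketch below is a plain-Python approximation that does not call the nlql package. The function names, the case-sensitive matching, and the choice to pass the comparison symbol as a string are all assumptions.

```python
import operator

# Comparison symbols accepted by the LENGTH condition, mapped to Python functions.
COMPARATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
    "<=": operator.le,
    ">=": operator.ge,
}

def contains(unit: str, text: str) -> bool:
    """CONTAINS("text"): the unit includes the given text."""
    return text in unit

def starts_with(unit: str, text: str) -> bool:
    """STARTS_WITH("text"): the unit begins with the given text."""
    return unit.startswith(text)

def ends_with(unit: str, text: str) -> bool:
    """ENDS_WITH("text"): the unit ends with the given text."""
    return unit.endswith(text)

def length(unit: str, symbol: str, number: int) -> bool:
    """LENGTH <symbol> number: compare the unit's character count to number."""
    return COMPARATORS[symbol](len(unit), number)

sentence = "NLQL brings SQL-style queries to plain text."
print(contains(sentence, "SQL-style"))
print(starts_with(sentence, "NLQL"))
print(length(sentence, "<", 100))
```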
Semantic Operators
SIMILAR_TO("text", threshold) -- Semantic similarity
TOPIC_IS("topic") -- Topic matching
SENTIMENT_IS("positive"|"negative"|"neutral") -- Sentiment analysis
Vector Operators
EMBEDDING_DISTANCE("text", threshold) -- Vector distance
VECTOR_SIMILAR("vector", threshold) -- Vector similarity
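As a rough illustration of threshold-based vector matching, the sketch below filters candidates by cosine similarity. The metric, the function names, and the hand-written two-dimensional vectors are assumptions; this document does not say which metric or embedding model NLQL uses.

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def vector_similar(candidates: dict, query_vector, threshold: float) -> list:
    """Return the texts whose vector is at least `threshold` similar to the query."""
    return [
        text
        for text, vector in candidates.items()
        if cosine_similarity(vector, query_vector) >= threshold
    ]

# Toy embeddings written by hand; a real system would get these from a model.
embeddings = {
    "cats purr": [1.0, 0.0],
    "dogs bark": [0.0, 1.0],
    "kittens purr": [0.9, 0.1],
}
print(vector_similar(embeddings, [1.0, 0.0], 0.8))
```

A distance-based operator such as EMBEDDING_DISTANCE would presumably flip the comparison, keeping items whose distance falls below the threshold.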
Usage Examples
Basic Queries
-- Find sentences containing "artificial intelligence"
SELECT SENTENCE WHERE CONTAINS("artificial intelligence")
-- Find paragraphs with fewer than 100 characters
SELECT PARAGRAPH WHERE LENGTH < 100
Advanced Queries
-- Find semantically similar sentences
SELECT SENTENCE
WHERE SIMILAR_TO("How to improve productivity", 0.8)
-- Find positive sentences about innovation
SELECT SENTENCE
WHERE CONTAINS("innovation")
AND SENTIMENT_IS("positive")
-- LENGTH is not a keyword here; register it manually first: nlql.register_metadata_extractor("LENGTH", lambda x: len(x))
ORDER BY LENGTH
LIMIT 10
Implementation
The system is implemented with three main components:
- Tokenizer: Breaks down query strings into tokens
- Parser: Converts tokens into an abstract syntax tree (AST)
- Executor: Executes the query and returns results
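The three stages can be sketched for a very small subset of the language: `SELECT SENTENCE`, `CONTAINS(...)` conditions joined by `AND`, and `LIMIT`. Everything below (`tokenize`, `parse`, `execute`, the `Query` class, the sentence-splitting rule) is a hypothetical illustration of the architecture, not NLQL's actual code.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# One alternative per token kind; the name of the matching group is the kind.
TOKEN_RE = re.compile(
    r"""\s*(?:(?P<STRING>"[^"]*"|'[^']*')|(?P<NUMBER>\d+)|(?P<WORD>[A-Za-z_]+)|(?P<PUNCT>[()]))"""
)

def tokenize(query: str) -> list:
    """Tokenizer: turn the query string into (kind, value) pairs."""
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind, value = match.lastgroup, match.group(match.lastgroup)
        if kind == "STRING":
            value = value[1:-1]    # drop the surrounding quotes
        elif kind == "WORD":
            value = value.upper()  # keywords are case-insensitive in this sketch
        tokens.append((kind, value))
        pos = match.end()
    return tokens

@dataclass
class Query:
    """A deliberately tiny AST: one unit, AND-ed CONTAINS terms, optional limit."""
    unit: str
    contains: list = field(default_factory=list)
    limit: Optional[int] = None

def parse(tokens: list) -> Query:
    """Parser: check the token order and build a Query."""
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        tok_kind, tok_value = tokens[pos]
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"expected {value or kind}, got {tok_value!r}")
        pos += 1
        return tok_value

    def next_is(word):
        return pos < len(tokens) and tokens[pos] == ("WORD", word)

    expect("WORD", "SELECT")
    query = Query(unit=expect("WORD"))
    if next_is("WHERE"):
        pos += 1
        while True:
            expect("WORD", "CONTAINS")
            expect("PUNCT", "(")
            query.contains.append(expect("STRING"))
            expect("PUNCT", ")")
            if not next_is("AND"):
                break
            pos += 1
    if next_is("LIMIT"):
        pos += 1
        query.limit = int(expect("NUMBER"))
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return query

def execute(query: Query, text: str) -> list:
    """Executor: split the text into sentences and keep those matching every term."""
    if query.unit != "SENTENCE":
        raise NotImplementedError("this sketch only supports SENTENCE")
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    matches = [s for s in sentences if all(term in s for term in query.contains)]
    return matches if query.limit is None else matches[: query.limit]

def run(query_text: str, text: str) -> list:
    return execute(parse(tokenize(query_text)), text)

doc = ("NLP is a branch of artificial intelligence. It is widely used. "
       "Artificial intelligence is not magic.")
print(run("SELECT SENTENCE WHERE CONTAINS('artificial intelligence')", doc))
```

Keeping the stages separate means a new operator only touches the parser and executor, while the tokenizer stays unchanged as long as no new token kinds are needed.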
Performance Optimizations
- Inverted index for text search
- Vector index for semantic search
- Query result caching
- Parallel processing for large datasets
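The first item can be illustrated with a minimal inverted index: a map from each word to the set of sentences that contain it, so a lookup only inspects candidate sentences instead of scanning all of them. The function names and the lowercase, word-level tokenization are assumptions; the document does not describe how NLQL builds its index.

```python
import re
from collections import defaultdict

def build_inverted_index(sentences: list) -> dict:
    """Map each lowercase word to the set of sentence positions containing it."""
    index = defaultdict(set)
    for position, sentence in enumerate(sentences):
        for word in re.findall(r"\w+", sentence.lower()):
            index[word].add(position)
    return index

def lookup(index: dict, phrase: str, sentences: list) -> list:
    """Find sentences containing the phrase, using the index to narrow candidates."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    # Only sentences that contain every word of the phrase can match.
    candidates = set.intersection(*(index.get(word, set()) for word in words))
    # Confirm the words appear as a contiguous phrase (case-insensitive here).
    return [
        sentences[i]
        for i in sorted(candidates)
        if phrase.lower() in sentences[i].lower()
    ]

corpus = [
    "Machine learning is fun.",
    "Deep learning needs data.",
    "Data is everywhere.",
]
index = build_inverted_index(corpus)
print(lookup(index, "deep learning", corpus))
```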
Extension System
NLQL supports custom extensions through a plugin system, which lets you:
- Register custom operators
- Add new query units
- Implement custom functions
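A common way to support custom operators is a name-to-function registry. The sketch below shows that general pattern only; `OperatorRegistry`, its methods, and the `WORD_COUNT_AT_LEAST` operator are hypothetical and are not part of the nlql package, whose registration API may look quite different.

```python
from typing import Callable

class OperatorRegistry:
    """Keeps custom operators by (case-insensitive) name. Hypothetical, for illustration."""

    def __init__(self):
        self._operators = {}

    def register(self, name: str) -> Callable:
        """Decorator that stores a predicate under the given operator name."""
        def decorator(func: Callable) -> Callable:
            key = name.upper()
            if key in self._operators:
                raise ValueError(f"operator already registered: {key}")
            self._operators[key] = func
            return func
        return decorator

    def call(self, name: str, unit: str, *args) -> bool:
        """Look up an operator by name and apply it to one text unit."""
        try:
            func = self._operators[name.upper()]
        except KeyError:
            raise LookupError(f"unknown operator: {name}") from None
        return func(unit, *args)

registry = OperatorRegistry()

@registry.register("WORD_COUNT_AT_LEAST")
def word_count_at_least(unit: str, minimum: int) -> bool:
    # Example custom operator: true when the unit has at least `minimum` words.
    return len(unit.split()) >= minimum

print(registry.call("word_count_at_least", "NLQL queries plain text", 3))
```

With a registry like this, an executor could resolve operator names found in a query at run time instead of hard-coding each one.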
Getting Started
- Install the package:
pip install nlql
- Basic usage:
from nlql import NLQL
# Initialize NLQL
nlql = NLQL()
# Add text for querying
raw_text = """
Natural Language Processing (NLP) is a branch of artificial intelligence
that helps computers understand human language. This technology is used
in many applications. For example, virtual assistants use NLP to
understand your commands.
"""
nlql.text(raw_text)
# Execute query
results = nlql.execute("SELECT SENTENCE WHERE CONTAINS('artificial intelligence')")
# Print results
for result in results:
    print(result)
Contributing
We welcome contributions! Please see our contributing guidelines for more details.
License
This project is licensed under the MIT License - see the LICENSE file for details.