Wrapper for interacting with Nanowire platform
nanowire-service-py
Usage
This library is designed for tight integration with the Nanowire platform (created by Spotlight Data).
The library does not have a hard requirement for a specific web server, but I'd recommend FastAPI for its simplicity and speed.
The primary code logic should be placed in a subclass of BaseHandler. The user is expected to implement the validate_args and handle_body methods:
import os
from dotenv import load_dotenv
from fastapi import FastAPI, Response
from pydantic import BaseModel, validator
from typing import Any, List, Optional
import pandas as pd
from nanowire_service_py import BaseHandler, create, TaskBody
from toolbox import ClusterTool

load_dotenv()

allowed_methods = ["HDBSCAN", "DBSCAN"]

# pydantic is used to verify the function body
class Arguments(BaseModel):
    contentUrl: str
    textCol: str
    indexCol: str
    clusterSize: float = 0.2
    nLabels: int = 10
    method: str = "DBSCAN"
    customStops: Optional[List[str]] = []
    maxVocab: int = 5000
    memSave: bool = False
    withAnomalous: bool = False

    @validator('method')
    def method_check(cls, method):
        if method not in allowed_methods:
            raise ValueError("Method has to be one of: {}, received: {}".format(",".join(allowed_methods), method))
        return method

# Our custom handler
class MyHandler(BaseHandler):
    def __init__(self, *args):
        super().__init__(*args)
        self.cluster_tool = ClusterTool(self.logger)

    def validate_args(self, args: Any, task_id: str) -> Arguments:
        return Arguments(**args)

    def handle_body(self, args: Arguments, meta: Any, task_id: str):
        df = pd.read_csv(args.contentUrl, dtype='unicode')
        # RuntimeError marks the task as failed without a retry
        if args.textCol not in df.columns:
            raise RuntimeError("Could not find text column '{}' in CSV".format(args.textCol), { "origin": "CSV"})
        if args.indexCol not in df.columns:
            raise RuntimeError("Could not find index column '{}' in CSV".format(args.indexCol), { "origin": "CSV"})
        result = self.cluster_tool.main(df, args)
        return (result, meta)

# Always handled by the library, pass environment directly
executor = create(os.environ, MyHandler)

app = FastAPI()

# Lets Dapr know which topics should be subscribed to
@app.get("/dapr/subscribe")
def subscribe():
    return executor.subscriptions

# Primary endpoint, where requests will be delivered
# The TaskBody type here verifies the POST body
@app.post("/subscription")
def subscription(body: TaskBody, response: Response):
    status = executor.handle_request(body.data.id)
    # FastAPI's Response exposes the HTTP status as status_code
    response.status_code = status
    # Return an empty body so Dapr doesn't freak out
    return {}

# Start heartbeat thread
executor.heartbeat()
Assuming the file is named main.py, the server can then be started via uvicorn main:app
Handling failure
The primary validation happens within the validate_args function, using pydantic models. This is where anything related to input should be checked.
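As a rough illustration of what that validation step catches, here is a cut-down version of the Arguments model from the usage example. The failed_fields helper is hypothetical and exists only for this sketch; how the library itself reacts to a pydantic ValidationError is not covered here.

```python
from pydantic import BaseModel, ValidationError, validator

ALLOWED_METHODS = ["HDBSCAN", "DBSCAN"]


# Trimmed copy of the Arguments model from the usage example
class Arguments(BaseModel):
    contentUrl: str
    method: str = "DBSCAN"

    @validator("method")
    def method_check(cls, method):
        if method not in ALLOWED_METHODS:
            raise ValueError("Method has to be one of: {}".format(",".join(ALLOWED_METHODS)))
        return method


# Hypothetical helper (not part of nanowire-service-py):
# returns the names of the fields that failed validation
def failed_fields(raw_args):
    try:
        Arguments(**raw_args)
    except ValidationError as err:
        return [str(e["loc"][0]) for e in err.errors()]
    return []
```

Calling failed_fields with a valid payload returns an empty list, while an unsupported method or a missing contentUrl is reported by field name before any processing starts.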
If at any point you want the current task to fail, raise RuntimeError. This tells the library that the task should fail and must not be retried. For example:
- CSV missing columns or having incorrect text format
- Not enough data passed
Anything else that raises an unexpected exception should be retried automatically.
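The distinction between the two kinds of failure can be sketched as follows. This is not the library's implementation: classify_outcome and the two sample tasks are hypothetical names that only mirror the rule described above (RuntimeError means fail without retry, any other exception means retry).

```python
# Hypothetical sketch of the failure rule; not code from nanowire-service-py
def classify_outcome(task):
    """Run `task` and report how the rule above would treat the result."""
    try:
        result = task()
    except RuntimeError as err:
        # Deliberate failure: the task should fail and not be retried
        return ("failed", err.args)
    except Exception as err:
        # Unexpected error: the task is a candidate for a retry
        return ("retry", err.args)
    return ("done", result)


# Sample task: bad input, so retrying cannot help
def missing_column():
    raise RuntimeError("Could not find text column 'text' in CSV", {"origin": "CSV"})


# Sample task: transient problem, so a retry may succeed
def flaky_download():
    raise ConnectionError("storage temporarily unreachable")
```

Here classify_outcome(missing_column) is classified as "failed" and keeps the extra {"origin": "CSV"} argument, whereas classify_outcome(flaky_download) is classified as "retry".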
Contributing
Read CONTRIBUTING.md
🛡 License
This project is licensed under the terms of the MIT license. See LICENSE for more details.
Credits
This project was generated with python-package-template.