Skip to main content

A tool for detecting anomalies in XML data

Project description

Introduction

This repo. serves to develop the strategy to assess whether user has entered anomalous value while filling the expenses.

It assumes xml that will be exported from BPM solution and extracts meaning ful data from it, later uses language model to assess and detect anomalies.

Only for live devvelopment of the project

pip install -e . 
uvicorn xmlAnomalyDetection.app:app  --port 8357 --reload 

Build the project and run server

Following will build the project binaries so that it could be installed via pip

ist create a .env file with key GROQ_API_KEY , (get the free api key from groq server)

python setup.py bdist_wheel
python setup.py sdist 

This will create some binaries in dist and build folders, install it as package like:

cd dist
pip install xmlAnomalyDetection-2.0-py3-none-any.whl

Run server

Once the project is built , run following in the terminal, it will run the server:

xml_anomaly_detection

Run without building

Binaries are already pushed to pypi, do following to just pull them and install them:

pip install xmlAnomalyDetection==2.0

Then run following in the terminal, it will run the server:

xml_anomaly_detection

Test request

curl -X 'POST' \
  'http://0.0.0.0:8357/detect_anomaly' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'xml_file=@xml_sample.xml;type=text/xml'

It will return response like:

[
  {
    "user_entered_values": [
      "2022-11-08",
      "Travel | Car Rental",
      "200",
      "21"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  },
  {
    "user_entered_values": [
      "2022-11-08",
      "Transportation | Fuel",
      "200",
      "212"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  }
]

If is_anomaly is True in any of the returned records, it means there is anomaly. Rest of the items in record are metadata.

New possible prompt ✍️

# Expense Report Anomaly Detection

You are an AI system designed to analyze expense reports and detect potential anomalies or irregularities. Given an expense report, examine each entry and the report as a whole for the following types of issues:

1. Duplicate entries: Identify multiple entries for the same expense type on the same date.

2. Excessive amounts: Flag expenses that exceed typical or policy-defined limits for their category.

3. Unusual patterns: Detect atypical expense patterns, such as multiple meals of the same type in one day.

4. Documentation issues: Highlight entries lacking required receipts or supporting documents.

5. Verification problems: Note expenses marked as unverified or failing standard verification processes.

6. Date inconsistencies: Identify expenses outside the stated trip dates or clustered unusually.

7. Round number anomalies: Flag suspicious occurrences of round numbers or repeating patterns in amounts.

8. Error indicators: Pay attention to any system-generated error flags or comments.

9. Category mismatches: Detect expenses that seem miscategorized based on their description or amount.

10. Missing expected expenses: Note the absence of typical expense categories for the type of trip.

11. Policy violations: Identify any expenses that directly contradict stated company policies.

12. Unusual descriptions: Flag expenses with vague, inappropriate, or suspicious descriptions.

For each potential anomaly detected, provide:
- The specific entry or entries involved
- The type of anomaly detected
- A brief explanation of why it's considered anomalous
- A suggestion for further investigation or correction if applicable

Analyze the given expense report and provide a detailed list of any detected anomalies, following the guidelines above.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xmlanomalydetection-8.0.tar.gz (51.5 kB view details)

Uploaded Source

Built Distribution

xmlAnomalyDetection-8.0-py3-none-any.whl (52.1 kB view details)

Uploaded Python 3

File details

Details for the file xmlanomalydetection-8.0.tar.gz.

File metadata

  • Download URL: xmlanomalydetection-8.0.tar.gz
  • Upload date:
  • Size: 51.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.5

File hashes

Hashes for xmlanomalydetection-8.0.tar.gz
Algorithm Hash digest
SHA256 8d2ef39d7a3a401a423bac0b20851deaea9ee3099360f81cf8948e7f49a6fd06
MD5 ad05158eaa259bb7ef0932f4db150bc4
BLAKE2b-256 a5ab7937dee1d1a2de4c3e2aadd7acd93ed20abcc093c9cddaf2644d958dfa66

See more details on using hashes here.

File details

Details for the file xmlAnomalyDetection-8.0-py3-none-any.whl.

File metadata

File hashes

Hashes for xmlAnomalyDetection-8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d67be5f1c2f028dd8692e1033dd5ec4b569467451d6e5cbfaeeaa84f2e08684a
MD5 0835b2cf15c63eac967a4bdb8b25b381
BLAKE2b-256 ecc54d174475bb5d2412bae4fa0dc8366a19f6e1d12692fbfd53d18524082cef

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page