A tool for detecting anomalies in XML data

Introduction

This repository develops a strategy for assessing whether a user has entered an anomalous value while filling in an expense report.

It expects XML exported from a BPM solution, extracts the meaningful data from it, and then uses a language model to assess the entries and detect anomalies.
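The extraction step could be sketched roughly as below. The XML layout (`<expenses>`, `<expense>`, and the four child tags) is hypothetical, since the real BPM export schema is not shown here; only the output keys and field descriptions are taken from the sample response later in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout: the real BPM schema is not documented in this README.
SAMPLE_XML = """
<expenses>
  <expense>
    <date>2022-11-08</date>
    <type>Travel | Car Rental</type>
    <amount>200</amount>
    <description>21</description>
  </expense>
</expenses>
"""

# Assumed tag names mapped to the descriptions used in the sample response.
FIELDS = {
    "date": "Date of expense",
    "type": "Type of expense",
    "amount": "Expense amount",
    "description": "Brief description",
}


def extract_records(xml_text):
    """Return one dict per <expense> element with values and their descriptions."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for expense in root.iter("expense"):
        # Missing tags become empty strings rather than raising.
        values = [(expense.findtext(tag) or "").strip() for tag in FIELDS]
        records.append({
            "user_entered_values": values,
            "entered_values_decriptions": list(FIELDS.values()),
        })
    return records


records = extract_records(SAMPLE_XML)
```

Records in this shape could then be passed to the language model for assessment; how the project actually does this is not shown here.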

For live development of the project only

pip install -e . 
uvicorn xmlAnomalyDetection.app:app  --port 8357 --reload 

Build the project and run server

The following builds the project distributions so that the package can be installed via pip.

First, create a .env file containing the key GROQ_API_KEY (a free API key can be obtained from Groq).

python setup.py bdist_wheel
python setup.py sdist 

This creates build artifacts in the dist and build folders. Install the wheel as a package like this:

cd dist
pip install xmlAnomalyDetection-2.0-py3-none-any.whl

Run server

Once the project is built, run the following in the terminal to start the server:

xml_anomaly_detection
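Because the server depends on GROQ_API_KEY, it may be worth confirming the key is visible before starting it. The check below is only a sketch using the standard library: `read_env_file` and `has_groq_key` are hypothetical helper names, and how the package itself loads the .env file is not shown in this README.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_groq_key(path=".env"):
    """True if GROQ_API_KEY is set in the environment or in the .env file."""
    in_environment = os.environ.get("GROQ_API_KEY")
    in_file = read_env_file(path).get("GROQ_API_KEY")
    return bool(in_environment or in_file)
```

Calling `has_groq_key()` from the directory that holds the .env file returns False when the key is missing from both places.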

Run without building

The package is already published on PyPI. Run the following to download and install it:

pip install xmlAnomalyDetection==2.0

Then run the following in the terminal to start the server:

xml_anomaly_detection

Test request

curl -X 'POST' \
  'http://0.0.0.0:8357/detect_anomaly' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'xml_file=@xml_sample.xml;type=text/xml'
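The same request can be sent from Python. The sketch below uses only the standard library and builds the multipart body by hand; `build_multipart` and `detect_anomaly` are illustrative names, not part of the package, and the default URL assumes the server is reachable on the loopback address at the port used above.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, content, content_type="text/xml"):
    """Encode one file as multipart/form-data; returns (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def detect_anomaly(xml_path, url="http://127.0.0.1:8357/detect_anomaly"):
    """POST the XML file as the xml_file form field and return the decoded JSON."""
    with open(xml_path, "rb") as fh:
        body, content_type = build_multipart(
            "xml_file", os.path.basename(xml_path), fh.read()
        )
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type, "accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the server running, `detect_anomaly("xml_sample.xml")` would return the parsed list of records.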

It returns a response like:

[
  {
    "user_entered_values": [
      "2022-11-08",
      "Travel | Car Rental",
      "200",
      "21"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  },
  {
    "user_entered_values": [
      "2022-11-08",
      "Transportation | Fuel",
      "200",
      "212"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  }
]

If is_anomaly is True in any of the returned records, that record contains an anomaly. The remaining items in each record are metadata.
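Note that the sample response carries is_anomaly as the string "False" rather than a JSON boolean, so a plain truthiness test would treat every record as anomalous. A minimal sketch for filtering the flagged records, with hypothetical helper names and a made-up flagged record for illustration:

```python
def is_flagged(record):
    """Accept both the string "True" and the boolean True as an anomaly flag."""
    return str(record.get("is_anomaly", "")).strip().lower() == "true"


def anomalous_records(records):
    """Return only the records the server marked as anomalous."""
    return [record for record in records if is_flagged(record)]


def describe(record):
    """Pair each field description with the value the user entered."""
    return dict(zip(record["entered_values_decriptions"], record["user_entered_values"]))


descriptions = ["Date of expense", "Type of expense", "Expense amount", "Brief description"]
response = [
    {
        "user_entered_values": ["2022-11-08", "Travel | Car Rental", "200", "21"],
        "entered_values_decriptions": descriptions,
        "is_anomaly": "False",
        "reason": [],
    },
    {
        # Made-up record, shown only to exercise the flagged path.
        "user_entered_values": ["2022-11-08", "Transportation | Fuel", "200000", "212"],
        "entered_values_decriptions": descriptions,
        "is_anomaly": "True",
        "reason": ["illustrative reason"],
    },
]

for record in anomalous_records(response):
    print(describe(record), record["reason"])
```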

New possible prompt ✍️

# Expense Report Anomaly Detection

You are an AI system designed to analyze expense reports and detect potential anomalies or irregularities. Given an expense report, examine each entry and the report as a whole for the following types of issues:

1. Duplicate entries: Identify multiple entries for the same expense type on the same date.

2. Excessive amounts: Flag expenses that exceed typical or policy-defined limits for their category.

3. Unusual patterns: Detect atypical expense patterns, such as multiple meals of the same type in one day.

4. Documentation issues: Highlight entries lacking required receipts or supporting documents.

5. Verification problems: Note expenses marked as unverified or failing standard verification processes.

6. Date inconsistencies: Identify expenses outside the stated trip dates or clustered unusually.

7. Round number anomalies: Flag suspicious occurrences of round numbers or repeating patterns in amounts.

8. Error indicators: Pay attention to any system-generated error flags or comments.

9. Category mismatches: Detect expenses that seem miscategorized based on their description or amount.

10. Missing expected expenses: Note the absence of typical expense categories for the type of trip.

11. Policy violations: Identify any expenses that directly contradict stated company policies.

12. Unusual descriptions: Flag expenses with vague, inappropriate, or suspicious descriptions.

For each potential anomaly detected, provide:
- The specific entry or entries involved
- The type of anomaly detected
- A brief explanation of why it's considered anomalous
- A suggestion for further investigation or correction if applicable

Analyze the given expense report and provide a detailed list of any detected anomalies, following the guidelines above.
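Some of the checks in this prompt, such as duplicate entries (item 1), could also be run deterministically alongside the language model. The sketch below is not the project's method. It assumes records in the response layout shown earlier, with the date first and the expense type second in user_entered_values, and `find_duplicates` is a hypothetical helper.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group record indexes by (date, type) and keep groups with more than one entry."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        # Assumed field order: date first, expense type second.
        date, expense_type = record["user_entered_values"][:2]
        groups[(date, expense_type)].append(index)
    return {key: indexes for key, indexes in groups.items() if len(indexes) > 1}


entries = [
    {"user_entered_values": ["2022-11-08", "Travel | Car Rental", "200", "21"]},
    {"user_entered_values": ["2022-11-08", "Transportation | Fuel", "200", "212"]},
    {"user_entered_values": ["2022-11-08", "Travel | Car Rental", "180", "33"]},
]

duplicates = find_duplicates(entries)
```

Here the first and third entries share a date and expense type, so they are reported together under one key. The second entry is left out.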
