A tool for detecting anomalies in XML data

Project description

Introduction

This repository develops a strategy for assessing whether a user has entered an anomalous value while filling in expenses.

It assumes XML exported from a BPM solution, extracts the meaningful data from it, and then uses a language model to assess the data and detect anomalies.
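The extraction step can be pictured with a minimal sketch. The element names (`expenses`, `expense`, `date`, `type`, `amount`, `description`) and the helper `extract_expenses` are assumptions made for illustration only; the real BPM export schema and the package's own parser are not shown in this description.

```python
import xml.etree.ElementTree as ET

# Hypothetical export layout: the actual BPM schema may differ.
SAMPLE_XML = """
<expenses>
  <expense>
    <date>2022-11-08</date>
    <type>Travel | Car Rental</type>
    <amount>200</amount>
    <description>21</description>
  </expense>
</expenses>
"""

# Assumed child element names, in the order they should be reported.
FIELDS = ["date", "type", "amount", "description"]


def extract_expenses(xml_text):
    """Return one list of field values per <expense> element."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for expense in root.iter("expense"):
        # findtext falls back to the default when a child element is missing
        values = [expense.findtext(name, default="").strip() for name in FIELDS]
        records.append(values)
    return records


records = extract_expenses(SAMPLE_XML)
print(records)
```

Each inner list would then be handed to the language model together with a description of what every value means.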

For live development of the project only:

pip install -e . 
uvicorn xmlAnomalyDetection.app:app  --port 8357 --reload 

Build the project and run server

The following steps build the project distribution files so that the package can be installed via pip.

First, create a .env file containing the key GROQ_API_KEY (a free API key is available from Groq).
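Before building or starting the server, it can help to confirm that the .env file really contains the key. The sketch below is a hand-rolled check using only the standard library; `parse_env` is a hypothetical helper, and how the package itself loads the file is not shown in this description.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


env_path = Path(".env")
settings = parse_env(env_path.read_text()) if env_path.exists() else {}
if not settings.get("GROQ_API_KEY"):
    print("GROQ_API_KEY is missing from .env")
```

With the key in place, the build commands follow.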

python setup.py bdist_wheel
python setup.py sdist 

This creates files in the dist and build folders. Install the wheel as a package:

cd dist
pip install xmlAnomalyDetection-2.0-py3-none-any.whl

Run server

Once the project is built and installed, run the following in the terminal to start the server:

xml_anomaly_detection

Run without building

The package is already published on PyPI. To download and install it without building:

pip install xmlAnomalyDetection==2.0

Then run the following in the terminal to start the server:

xml_anomaly_detection

Test request

curl -X 'POST' \
  'http://0.0.0.0:8357/detect_anomaly' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'xml_file=@xml_sample.xml;type=text/xml'
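The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_multipart` and `detect_anomaly` are made up for illustration, and the address `127.0.0.1` is an assumption for a server running on the local machine. The endpoint, port, and `xml_file` field name are taken from the curl command above.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, content, content_type):
    """Encode one file as multipart/form-data; returns (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def detect_anomaly(xml_path, base_url="http://127.0.0.1:8357"):
    """POST the XML file to /detect_anomaly and return the decoded JSON list."""
    with open(xml_path, "rb") as fh:
        body, content_type = build_multipart(
            "xml_file", os.path.basename(xml_path), fh.read(), "text/xml"
        )
    request = urllib.request.Request(
        f"{base_url}/detect_anomaly",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Requires the server to be running:
# records = detect_anomaly("xml_sample.xml")
```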

It returns a response like:

[
  {
    "user_entered_values": [
      "2022-11-08",
      "Travel | Car Rental",
      "200",
      "21"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  },
  {
    "user_entered_values": [
      "2022-11-08",
      "Transportation | Fuel",
      "200",
      "212"
    ],
    "entered_values_decriptions": [
      "Date of expense",
      "Type of expense",
      "Expense amount",
      "Brief description"
    ],
    "is_anomaly": "False",
    "reason": []
  }
]

If is_anomaly is "True" in any of the returned records, that record contains an anomaly. The remaining items in the record are metadata.
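Note that in the sample response `is_anomaly` is the string "False", not a JSON boolean, so a plain truthiness check would treat every record as anomalous. A minimal sketch of filtering the records, assuming the flag always arrives as "True" or "False"; the second record and its reason text are hypothetical, added only to show a flagged entry.

```python
import json

response_text = """
[
  {
    "user_entered_values": ["2022-11-08", "Travel | Car Rental", "200", "21"],
    "entered_values_decriptions": ["Date of expense", "Type of expense",
                                   "Expense amount", "Brief description"],
    "is_anomaly": "False",
    "reason": []
  },
  {
    "user_entered_values": ["2022-11-08", "Transportation | Fuel", "200", "212"],
    "entered_values_decriptions": ["Date of expense", "Type of expense",
                                   "Expense amount", "Brief description"],
    "is_anomaly": "True",
    "reason": ["hypothetical reason text"]
  }
]
"""


def anomalous_records(records):
    """Keep only records whose is_anomaly flag reads as true."""
    # Compare as text so both the string "True" and a real boolean are handled
    return [r for r in records if str(r.get("is_anomaly")).lower() == "true"]


flagged = anomalous_records(json.loads(response_text))
for record in flagged:
    print(record["user_entered_values"], record["reason"])
```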

New possible prompt ✍️

# Expense Report Anomaly Detection

You are an AI system designed to analyze expense reports and detect potential anomalies or irregularities. Given an expense report, examine each entry and the report as a whole for the following types of issues:

1. Duplicate entries: Identify multiple entries for the same expense type on the same date.

2. Excessive amounts: Flag expenses that exceed typical or policy-defined limits for their category.

3. Unusual patterns: Detect atypical expense patterns, such as multiple meals of the same type in one day.

4. Documentation issues: Highlight entries lacking required receipts or supporting documents.

5. Verification problems: Note expenses marked as unverified or failing standard verification processes.

6. Date inconsistencies: Identify expenses outside the stated trip dates or clustered unusually.

7. Round number anomalies: Flag suspicious occurrences of round numbers or repeating patterns in amounts.

8. Error indicators: Pay attention to any system-generated error flags or comments.

9. Category mismatches: Detect expenses that seem miscategorized based on their description or amount.

10. Missing expected expenses: Note the absence of typical expense categories for the type of trip.

11. Policy violations: Identify any expenses that directly contradict stated company policies.

12. Unusual descriptions: Flag expenses with vague, inappropriate, or suspicious descriptions.

For each potential anomaly detected, provide:
- The specific entry or entries involved
- The type of anomaly detected
- A brief explanation of why it's considered anomalous
- A suggestion for further investigation or correction if applicable

Analyze the given expense report and provide a detailed list of any detected anomalies, following the guidelines above.

Download files

Source Distribution

xmlanomalydetection-6.0.tar.gz (51.7 kB)

Uploaded Source

Built Distribution

xmlAnomalyDetection-6.0-py3-none-any.whl (52.5 kB)

Uploaded Python 3

File details

Details for the file xmlanomalydetection-6.0.tar.gz.

File metadata

  • Download URL: xmlanomalydetection-6.0.tar.gz
  • Upload date:
  • Size: 51.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.5

File hashes

Hashes for xmlanomalydetection-6.0.tar.gz
Algorithm Hash digest
SHA256 52bbb78960fbd665d7576b28683cd54ff9c94a836f78d430bc32df18f7828a36
MD5 ebaa15d6ee1430c092e07fa5f684f6f9
BLAKE2b-256 2fc167a6bb6f1626165e09f681137c67825473956ce9b6e54ea2b199ec938f29

File details

Details for the file xmlAnomalyDetection-6.0-py3-none-any.whl.

File hashes

Hashes for xmlAnomalyDetection-6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f154b64c06ef22ccab7978592d923a947c45bded8587d4a2432719400a8467d3
MD5 06d76ac46ef9c08b14ddc473b9e2026a
BLAKE2b-256 402da8b4a9d637f61879190df1ecb217b501b13793e71be6a452083629535afb
