Tag Generation and Text Generation Inference for Network Packets using Transformers

NIDS Transformers

nids-transformers is a Python package used for generating tags and descriptive text for network packets. This is part of our research project utilizing transformer models in network security and Network Intrusion Detection Systems (NIDS). We have developed the PADEC (Packet Describer) module that generates tags and text for network packets using BERT and Falcon LLMs (Large Language Models).

Installation

Install the package with pip:

pip install nids-transformers

Usage

First, import the PADEC module and initialize the models and tokenizers:

from nids_transformers import PADEC

padec = PADEC()

To generate only tags (no descriptive text), you can pass text=False.

Preparing the Input

Prepare the flow information (from the Wireshark Conversations Window). Use 0 if flow information is not available:

forward_packets_per_second = 0
backward_packets_per_second = 4
bytes_transferred_per_second = 5493
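
The Conversations window lists packet and byte counts per direction together with a duration, so the three rates can be obtained by dividing by that duration. The sketch below is only an illustration: it assumes that "forward" means A → B, that "backward" means B → A, and that the byte rate uses the total bytes of the conversation, none of which this page states explicitly. The helper name flow_rates and the sample figures are hypothetical.

```python
def flow_rates(packets_a_to_b, packets_b_to_a, total_bytes, duration_seconds):
    """Return (forward_pps, backward_pps, bytes_per_second) for one conversation.

    Assumption: forward = A -> B and backward = B -> A.
    All rates fall back to 0 when the duration is missing or zero,
    in line with the "use 0 if not available" advice above.
    """
    if not duration_seconds or duration_seconds <= 0:
        return 0, 0, 0
    # Rounded to integers to match the example values above.
    return (
        round(packets_a_to_b / duration_seconds),
        round(packets_b_to_a / duration_seconds),
        round(total_bytes / duration_seconds),
    )


# Hypothetical figures read off the Conversations window
(forward_packets_per_second,
 backward_packets_per_second,
 bytes_transferred_per_second) = flow_rates(
    packets_a_to_b=0, packets_b_to_a=8, total_bytes=10986, duration_seconds=2.0
)
print(forward_packets_per_second, backward_packets_per_second, bytes_transferred_per_second)
```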

Prepare the packet data in hexadecimal (from the Wireshark Hexadecimal View, copy as Hex Stream):

packet_hex = '...'  # your hexadecimal data here
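
Hex copied from a capture tool can pick up spaces, line breaks, or colon separators along the way. The following sketch cleans and validates a hex stream before it is passed on; normalize_hex_stream is a hypothetical helper written for this example and is not part of nids-transformers.

```python
import string


def normalize_hex_stream(text):
    """Strip whitespace and colon separators, then validate the result as hex."""
    cleaned = "".join(ch for ch in text if ch not in " \t\r\n:").lower()
    if not cleaned:
        raise ValueError("hex stream is empty")
    if len(cleaned) % 2 != 0:
        raise ValueError("hex stream has an odd number of digits")
    if any(ch not in string.hexdigits for ch in cleaned):
        raise ValueError("hex stream contains non-hexadecimal characters")
    return cleaned


packet_hex = normalize_hex_stream("3c a6 f6 08\n49 b9")
print(packet_hex)  # 3ca6f60849b9
```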

Generating Tags

You can generate tags for a network packet with the GenerateTags function:

tags = padec.GenerateTags(packet_hex_stream=packet_hex,
                          forward_packets_per_second=forward_packets_per_second,
                          backward_packets_per_second=backward_packets_per_second,
                          bytes_transferred_per_second=bytes_transferred_per_second,
                          total_tags=10)

This returns a dictionary that maps each generated tag to its score.

To control tag generation, you can also pass context_similarity_factor (default: 0), output_words_ngram (default: 0), uncased_lemmatization (default: True), single_word_split (default: False), and output_filter_factor (default: 1). Refer to the BERTSimilar package for more details about these parameters.
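
Because the result is a plain dictionary of tag → score, it can be ranked and filtered with ordinary Python. The sketch below uses a few scores copied (and rounded) from the Demo output on this page; top_tags is a hypothetical helper, not a function of the package.

```python
# A few tag scores copied (rounded) from the Demo output on this page
tags = {
    "typical": 0.9613,
    "considered regular expected": 0.9684,
    "baseline": 0.9543,
    "malicious intent": 0.9616,
}


def top_tags(tag_scores, min_score=0.0, limit=None):
    """Return (tag, score) pairs sorted by descending score.

    Pairs scoring below `min_score` are dropped; `limit=None` keeps all.
    """
    ranked = sorted(tag_scores.items(), key=lambda item: item[1], reverse=True)
    ranked = [(tag, score) for tag, score in ranked if score >= min_score]
    return ranked[:limit]


for tag, score in top_tags(tags, min_score=0.96):
    print(f"{score:.4f}  {tag}")
```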

Generating Text

You can generate a descriptive text explaining the generated tags:

text = padec.GenerateText(explain_tags=True, max_new_tokens=250)

You can control the randomness of the generated text by passing a temperature between 0 and 1: values near 0 give less random output and values near 1 give more random output (default: temperature=0).

You can also generate a descriptive text explaining the network packet:

text = padec.GenerateText(explain_packet=True, max_new_tokens=250)
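
The temperature option mentioned above is a common sampling control for language models. As a generic illustration only (this is not PADEC's internal code, which this page does not show), the sketch below divides the scores by the temperature before sampling and treats temperature 0 as a greedy pick of the highest score. The function name and the toy scores are hypothetical.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng=random):
    """Pick an index from `logits`; temperature 0 degenerates to greedy argmax."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5]  # toy scores for three candidate tokens
print(sample_with_temperature(logits, temperature=0))    # always 0 (greedy)
print(sample_with_temperature(logits, temperature=1.0))  # varies between runs
```

With temperature=0 the same input always yields the same choice, which is the usual reason to keep it as the default when reproducible descriptions are wanted.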

Demo

Input

The following is an example of how to prepare and use data for the PADEC module:

from nids_transformers import PADEC

# Initialize PADEC
padec = PADEC()

# Flow Information (From Wireshark Conversations Window)
forward_packets_per_second = 0
backward_packets_per_second = 4
bytes_transferred_per_second = 5493

# Packet Data in Hexadecimal (From Wireshark Hexadecimal View. Copy as Hex Stream)
packet_hex = (
    "3ca6f60849b920b39957e74b0800450005c881dc0000f506c2790d235d2b86588b3301bbf95a94eccbfa554bbac980100085d54400000101080abcb794b10c6ab7722057d82613cc2c721b879ef00e6d925bca92a02d529fd587fd8e5a9cb93dd2a405d8315612500d7179cf7c01ca5e18cd137fe2044fe15898d5b42722f9e79bbc7431ce711171aa63a6b779367d745a0b5432fa326e8e7238d15033da601a4bb9c9bea464f6ca54b64698f31493d9da42fa6e0904a15fb1f944b96de8c55909f7e8780be2de10786b0ff623e503f94276a694bbf823686654ebcdafbfce9f5677e3d21ac1d25426a2be1badeadc5f29449a024419bba4d350ce7494563e9dabaa2c405e21a5fc918586193499139bd967d06ad188e8446ce0ddd406a336847bb64e1e70a73aaffdd1fdfc8cddd89b73433fe0fdcfe11dffa208710e0ecec840b632071872bb688353f59740f45d1efec153e2cc2b69f756b871073a8af9ca923eb213df7c1a67f5679d64e3e758394695fa486c32fd43d454bacc5b5f733eb5e28f70d605ff0947cf68e27dd51081b08ee083976d6b6eb277bd5e8787cb80e0bd574b6f6493e626999467e098ec329fd049d7d20ddc18547e2284e5560509692ce6e86fee5ece2997757697279dbbe418c37a86a79829b34cf8cb52e07e389c61373eff20705d8906aa6d98d5169bb316e963c6a85c8a4f5aea12d6e9a5402cb2aacc63be2b5a845bb5be1f416e19764f44b57837a854d233b764cbb8849f49a5c3deb77a0208cb512d973034c36d90870efdbad00c55fc3d85ef76fd275c21cf0cfbd6cf3cebbd0c62d3c4e8cb21a65b0983c1ed24d9f0a2bd1831316d62aeb6ec9e14a998803671b12d4dcf37151b75b69ec28cca72a36f67b5d3ec3f02606f94ebf941c0f705fd3ba39a154dcb20b1929df10c2ced9db7de3f2bfca59528e699591436b605ae5c174e3c3d7a237c72a0cce22d4cc370767d78a7ed485eb5fc96f6ae45e7e3114ecb1aab59acdcc14a7303b4f49484c2b834f8289e006bd4c6ae38018db9c48ea09caa095b25a0e626486713e07ca409ff52918d6bd390903db3b3a5f823cb91dab2d515c34f459c58dd242529322bc10428786451bd7c2d899f0398c9ffc37302b0d2dca95569d29db478705ed7c85a27ec00cb827c4671424ee33a49a80ec1e63b3a810af84ea42bdac72b6c9a5aa5438bdc4461a9bf3dafc676457072918c6c6a65aaed79a1be272f006edf7c2e930919a53a2eae0749d98cdd9c1b482d4db4adb7a9865ac613bb9a9d8110a72f3f4f40a58fe9fa8eec36e1eee61124d84e92001c617fb025e48e250a173e031552575b48e67d67c988c432364e945e5b3845d61090ccbb628504aac0d453a91c75fa23d6d59b65eadf"
    "e79c10f9878715780b9c5b68df37234ddd723b0023611c647f17fddaf0266eec2faa7e745fb06017cbcba1608fd3a9903036d3c5505a3185d0b31f512106509a4cc5582fe13283a18d817b95feb25a61782f2a571722c24979fb39efaf823be465483271e4c4dcc39a8cbc930492ed1b224aa37c50dc19e67b4f1117f92d0bd6ef81cbc72ac2189e27d893b838a19d7a2b8a9b46a6786fdbcfa3749cf564b0038440418a7c9fe2f477458ef743270aeafe0bf510f043a7e7d54787ab92ba80f97d75e06f4bc25cb521d54d221fd089d408d7c9166268376c5c2de1c2f44dc6c0402c35a0f55b2f3ea13f80a11a80f65d41bcb63dac7ae9cfa063a8c749231d6d2cd9b5a83252972f0dd424efa79b72bf558d1648dd2c78c202e7398eef6b8adeab334227e92534e7f3dd26bdaa856ce1feba77f87005e4ed87a6dae4c2bb2c72eecfaaf9e1299cb2f0ff1f3f8cff459e30396bf595d7c08a9a704a394211cc459e01a939cb6cbf8627ceefebb1b338d47079e3958009d2388b86e38a9a5c51f2134c304f98c21d00951c8aa15d3f47e9ba61fa43606d91698000bb7427365ef8b485d11bcdfcea0d52e40af2b76e9f3d372b15c9463b18660f23cd5f04e660f727467a34d8994b22f713f1bfaaf2cb1a0b2aaaa3b1caacd6955ec3e96fde2ca82b5caedc45521cb3978a7c3d65b4076ec96f069608"
)

# Generate 10 Tags
tags = padec.GenerateTags(packet_hex_stream=packet_hex,
                          forward_packets_per_second=forward_packets_per_second,
                          backward_packets_per_second=backward_packets_per_second,
                          bytes_transferred_per_second=bytes_transferred_per_second,
                          total_tags=10)

print(tags)

tags_text = padec.GenerateText(explain_tags=True, max_new_tokens=250)

print(tags_text)

packet_text = padec.GenerateText(explain_packet=True, max_new_tokens=250)

print(packet_text)

Output

# tags
{
  'considered regular expected': 0.9683609588041633,  
  'malicious intent': 0.9615794189652873,  
  'typical': 0.961300843268933,  
  'reference point': 0.9590238971149821,  
  'label signifies normal network behavior': 0.9575988655384243,  
  'standard network protocols': 0.9566183076210247,  
  'baseline': 0.9542804079487445,  
  'Average': 0.9535212932036061,  
  'abnormal traffic patterns enabling': 0.9528368377318245,  
  'expected traffic patterns': 0.952349645227717
}

# tags_text
Based on the majority of the tags, the network packet appears to be a normal packet. It is considered regular and expected, with typical behavior and standard network protocols. The label signifies normal network behavior, and the packet follows expected traffic patterns. There is no indication of malicious intent or abnormal traffic patterns. Overall, it is a regular and normal packet. However, further analysis may be required to ensure its security.

# packet_text
This network packet is an IPv4 packet with a length of 1480 bytes. The packet has a Time-to-Live (TTL) value of 245, indicating that it has been forwarded through 245 routers. The packet is using the TCP protocol and has a source IP address of 13.35.93.43 and a destination IP address of 134.88.139.51. The source port is 443 and the destination port is 63834. The packet has the ACK flag set, indicating that it is acknowledging a previous packet. The payload of the packet contains various words and phrases, such as "VPqy," "qqcy," "KdTFBn," and "DmY." These words do not provide much context or meaning, but they could be part of a message or data being transmitted. Overall, there are no abnormalities in the packet, and it appears to be a normal TCP packet with a specific payload. However, further analysis would be required to determine if any security issues or anomalies are present. The packet does not exhibit any suspicious or malicious behavior. The IP version is 4, indicating that it is an IPv4 packet.
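
The header fields named in the generated description can be cross-checked by decoding the start of the demo hex stream by hand. The sketch below uses only the Python standard library and the first 54 bytes of the demo packet (Ethernet, IPv4, and the fixed part of the TCP header). parse_headers is a hypothetical helper for this check and is not part of nids-transformers; it assumes an Ethernet II frame carrying IPv4 and TCP.

```python
import ipaddress
import struct

# First 54 bytes of the demo packet:
# Ethernet (14 bytes) + IPv4 (20 bytes) + fixed TCP header (20 bytes)
HEADER_HEX = (
    "3ca6f60849b920b39957e74b0800"
    "450005c881dc0000f506c2790d235d2b86588b33"
    "01bbf95a94eccbfa554bbac980100085d5440000"
)


def parse_headers(hex_stream):
    """Decode selected Ethernet/IPv4/TCP fields from a hex stream."""
    raw = bytes.fromhex(hex_stream)
    (ethertype,) = struct.unpack("!H", raw[12:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip = raw[14:]
    header_len = (ip[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if ip[9] != 6:
        raise ValueError("not a TCP segment")
    tcp = ip[header_len:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    return {
        "version": ip[0] >> 4,
        "total_length": struct.unpack("!H", ip[2:4])[0],
        "ttl": ip[8],
        "src_ip": str(ipaddress.IPv4Address(ip[12:16])),
        "dst_ip": str(ipaddress.IPv4Address(ip[16:20])),
        "src_port": src_port,
        "dst_port": dst_port,
        "ack": bool(tcp[13] & 0x10),  # ACK is bit 0x10 of the flags byte
    }


fields = parse_headers(HEADER_HEX)
print(fields)
```

The decoded version, length, TTL, addresses, ports, and ACK flag agree with the generated description. One caveat when reading such text: the TTL is a counter that each router decrements, so 245 is the value remaining on arrival rather than the number of routers traversed.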

Notes

Make sure you have sufficient disk space and GPU memory before using this package, since it relies on BERT and Falcon language models.
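
A quick way to see what is available before initializing the models is sketched below. It reports free disk space with the standard library and, if PyTorch happens to be installed, the total memory of the first CUDA device. That PyTorch is the relevant backend is an assumption, and this page gives no concrete size requirements, so the numbers are informational only. Both helper names are hypothetical.

```python
import shutil


def free_disk_gb(path="."):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def gpu_memory_gb():
    """Total memory of the first CUDA device in GiB, or None if unavailable."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


print(f"Free disk space: {free_disk_gb():.1f} GiB")
gpu = gpu_memory_gb()
print("GPU memory: not detected" if gpu is None else f"GPU memory: {gpu:.1f} GiB")
```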

Download files

Source Distribution

nids_transformers-0.1.1.tar.gz (5.8 MB)

Uploaded Source

Built Distribution

nids_transformers-0.1.1-py3-none-any.whl (5.8 MB)

Uploaded Python 3

File details

Details for the file nids_transformers-0.1.1.tar.gz.

File metadata

  • Download URL: nids_transformers-0.1.1.tar.gz
  • Size: 5.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.7

File hashes

Hashes for nids_transformers-0.1.1.tar.gz
  • SHA256: 0c95d2ee876830c045d4a14a6d3be870d99a94313f66273d204790198fc87e41
  • MD5: 8442c789159097ffc323aca5abd5326a
  • BLAKE2b-256: 46f72edb8adefa0ade73659f938e225e3e069016e29045b96f6f538cb33cf61c

File details

Details for the file nids_transformers-0.1.1-py3-none-any.whl.

File hashes

Hashes for nids_transformers-0.1.1-py3-none-any.whl
  • SHA256: 8e5956a286d6ed8fe23a4f8782d61f6ca66fd47bb6ee72aad830ab5a03c10ea4
  • MD5: 743233eb8d11c8068f40d47577e47aef
  • BLAKE2b-256: 6650becdc3ddb6c5381c66e24477dc4f7f1c5ea070509b4ce730d37e65fbce74
