A transformer model with advanced features for causal language modeling.

These details have not been verified by PyPI

Project links

Homepage

Project description

TNSA Curiosity

TNSA Stable Curiosity is a transformer-based model architecture designed for causal language modeling tasks. It is an enhancement of the BERT model, optimized for NLP tasks such as text classification, token classification, and language generation. The architecture supports optional gradient checkpointing, which reduces memory use during training and helps it scale to larger models.

Installation

To install the package, use pip from PyPI. The distribution is published as tnsaai; the import package is tnsa:

pip install tnsaai
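After installing, it can help to confirm which release pip actually resolved. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, and it assumes the distribution name is `tnsaai`, as it appears in the file listing on this page (`tnsaai-7.3.1.tar.gz`).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: "tnsaai" is the distribution name, taken from the file listing.
    found = installed_version("tnsaai")
    print(found if found is not None else "tnsaai is not installed")
```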

How to use Curiosity OpenModel Architecture (Based on ARCH-X 9)

from tnsa.stable.curiosity import TNSAforCasualLM

# Initialize the model
model = TNSAforCasualLM(
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    intermediate_act_fn='gelu',  # Can also use other activations like 'relu'
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    initializer_range=0.02,
)

# Example input
input_tensor = ...  # Your input tensor here, with shape [batch_size, seq_length, hidden_size]
attention_mask = ...  # Your attention mask tensor here

# Forward pass through the model
output = model(input_tensor=input_tensor, attention_mask=attention_mask)

print(output)

# Initialize your training loop. You can keep the default parameters to re-create NGen2-Nano Base on OpenWEB.
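The usage snippet leaves `attention_mask` as a placeholder and does not say which layout the model expects. As a rough illustration only, the sketch below builds the two layouts most commonly used with transformer models, as plain nested lists: a padding mask of shape [batch_size, seq_length] and a causal (lower-triangular) mask of shape [seq_length, seq_length]. Both function names are hypothetical, and which layout (if either) the package accepts is an assumption to check against its source.

```python
from typing import List


def padding_mask(lengths: List[int], seq_length: int) -> List[List[int]]:
    """1 marks a real token, 0 marks padding; shape [batch_size, seq_length]."""
    return [[1 if pos < n else 0 for pos in range(seq_length)] for n in lengths]


def causal_mask(seq_length: int) -> List[List[int]]:
    """Lower-triangular mask: position i may attend to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(seq_length)] for i in range(seq_length)]


if __name__ == "__main__":
    # Two sequences with 2 and 3 real tokens, padded to length 3.
    print(padding_mask([2, 3], seq_length=3))
    print(causal_mask(3))
```

The nested lists would still need converting to whatever tensor type the model consumes before being passed as `attention_mask`.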

Key Parameters

hidden_size: The size of the hidden layers. Defaults to 768 (same as BERT's base).

num_hidden_layers: The number of transformer layers. Defaults to 12.

num_attention_heads: The number of attention heads in each layer. Defaults to 12.

intermediate_size: The size of the intermediate (feedforward) layer. Defaults to 3072.

intermediate_act_fn: The activation function to use in the intermediate layer. Default is gelu.

hidden_dropout_prob: Dropout probability for hidden layers. Default is 0.1.

attention_probs_dropout_prob: Dropout probability for attention layers. Default is 0.1.

initializer_range: The standard deviation of the initializer. Default is 0.02.

use_gradient_checkpointing: A boolean flag to enable or disable gradient checkpointing for memory efficiency. Default is False.
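The defaults listed above can be gathered in one place and sanity-checked before a model is built. The sketch below is not part of the package: `CuriosityConfigSketch` is a hypothetical container whose defaults are copied from this list. The divisibility check assumes that, as in BERT, `hidden_size` is split evenly across attention heads; the package does not state this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuriosityConfigSketch:
    # Hypothetical container; defaults copied from the Key Parameters list.
    hidden_size: int = 768
    num_hidden_layers: int = 12
    num_attention_heads: int = 12
    intermediate_size: int = 3072
    intermediate_act_fn: str = "gelu"
    hidden_dropout_prob: float = 0.1
    attention_probs_dropout_prob: float = 0.1
    initializer_range: float = 0.02
    use_gradient_checkpointing: bool = False

    def __post_init__(self) -> None:
        # Assumption: heads share hidden_size evenly, as in BERT.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        for name in ("hidden_dropout_prob", "attention_probs_dropout_prob"):
            p = getattr(self, name)
            if not 0.0 <= p < 1.0:
                raise ValueError(f"{name} must be in [0, 1), got {p}")

    @property
    def head_dim(self) -> int:
        """Per-head width under the even-split assumption."""
        return self.hidden_size // self.num_attention_heads


if __name__ == "__main__":
    cfg = CuriosityConfigSketch()
    print(cfg.head_dim)
```

With the defaults, each of the 12 heads would work on 768 / 12 = 64 dimensions. Whether the model constructor accepts every field shown here as a keyword argument is not confirmed by the usage example, which omits `use_gradient_checkpointing`.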

How Curiosity `OpenModelArchitecture` Differs from ARCH-X 9 `(Closed Source)`

The Curiosity architecture is based on the standard transformer architecture used in NGen2, with the following enhancements:

Gradient Checkpointing: An optional feature to enable gradient checkpointing, allowing for more efficient memory usage during training. This is particularly useful when working with large models.

Improved Attention Mechanism: The attention mechanism has been fine-tuned for better handling of long-range dependencies and more accurate attention distributions.

Optimized Architecture: Custom improvements to layer normalization and dropout mechanisms help improve the model's performance on various NLP tasks.
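To give a feel for why gradient checkpointing saves memory, here is a simplified cost model, not a measurement of this package. It assumes every layer's activation costs one unit, that a checkpoint is kept at the start of each segment of `segment_size` layers, and that one segment at a time is recomputed during the backward pass. The function name and the cost model are illustrative assumptions.

```python
import math


def stored_activations(num_layers: int, segment_size: int = 0) -> int:
    """Approximate peak count of layer activations held during backprop.

    segment_size <= 0 means checkpointing is off: every layer is stored.
    Otherwise: one checkpoint per segment, plus the activations of the
    single segment being recomputed.
    """
    if segment_size <= 0:
        return num_layers
    checkpoints = math.ceil(num_layers / segment_size)
    return checkpoints + min(segment_size, num_layers)


if __name__ == "__main__":
    layers = 12  # the default num_hidden_layers
    print("no checkpointing:", stored_activations(layers))
    for k in (2, 3, 4, 6):
        print(f"segment_size={k}:", stored_activations(layers, k))
```

Under this model the count is smallest when the segment size is near the square root of the layer count. The trade-off is extra computation, since each segment's forward pass runs a second time during the backward pass.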

Model Performance

While Curiosity is similar to NGen2, it has been fine-tuned to outperform NGen2 in some language modeling tasks by using a more efficient memory usage pattern, which makes it better suited for tasks with large datasets or longer sequences.

License

The code is licensed under the NGen2Community License. Please review the LICENSE file for more details. The base of the code is still closed source. You (the user or developer) should use it to develop custom models, but not copy or modify the code itself.

Copyrighted and Licensed by:

Project details

Release history

7.3.1 (this version), Nov 15, 2024

0.1.0, Jun 12, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tnsaai-7.3.1.tar.gz (4.8 kB)

Uploaded Nov 15, 2024 Source

Built Distribution

tnsaai-7.3.1-py3-none-any.whl (4.6 kB)

Uploaded Nov 15, 2024 Python 3

File details

Details for the file tnsaai-7.3.1.tar.gz.

File metadata

Download URL: tnsaai-7.3.1.tar.gz
Upload date: Nov 15, 2024
Size: 4.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for tnsaai-7.3.1.tar.gz
Algorithm	Hash digest
SHA256	`e6da25ea887e127ca1e2c639954b344360c0d12f53483eb0232817aa63367655`
MD5	`f30817d80297cf0c60a45d8624190fa1`
BLAKE2b-256	`a5c6091c6366bc68a806679660891df698c896e21d0a315452e165dc2851bd90`
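The digests above can be checked against a downloaded file with Python's standard `hashlib` module. This is a minimal sketch: the helper names are illustrative, and it assumes the archive has been saved locally under its published filename.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for tnsaai-7.3.1.tar.gz.
EXPECTED_SHA256 = "e6da25ea887e127ca1e2c639954b344360c0d12f53483eb0232817aa63367655"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("tnsaai-7.3.1.tar.gz")  # assumed local download location
    if archive.exists():
        print("digest matches" if matches_expected(archive) else "digest MISMATCH")
    else:
        print(f"{archive} not found")
```

pip can perform the same check at install time when a requirements file pins hashes and `--require-hashes` is used.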

File details

Details for the file tnsaai-7.3.1-py3-none-any.whl.

File metadata

Download URL: tnsaai-7.3.1-py3-none-any.whl
Upload date: Nov 15, 2024
Size: 4.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for tnsaai-7.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`379e7756ce44e40285f083ea61a4326a6323cdfd3b0b107ba4caa9f5c4c3ee8e`
MD5	`8bb98bd625deff7a2e3d655fe0b83655`
BLAKE2b-256	`16cc34c2456ef8e95ed353d0df832c178008be197213e6168721c52e6cccc116`
