A deep learning package, implemented from scratch, that is easy to learn and integrate into projects
PyDeepFlow
Author & Creator: ravin-d-27
Overview
This documentation covers the development, structure, and features of PyDeepFlow, a library created for deep learning workflows (currently, only multi-class classification is supported). The model is implemented from scratch using Python and NumPy and offers flexibility in architecture, activation functions, and training methods. Planned enhancements are also outlined, showing how the model can evolve to meet more complex and diverse requirements.
Table of Contents
- Introduction
- Model Architecture
- Implementation Details
- Features and Functionality
- Future Enhancement Plans
- How to Use the Model
- Example Code
- Conclusion
1. Introduction
This custom Multi-Layer Artificial Neural Network was built by ravin-d-27 to solve binary classification problems. The model is designed to be easy to use, extendable, and adaptable. With this custom-built architecture, users can define the number of layers and the activation functions, and control the learning process. Additionally, the ANN supports various loss functions, allowing flexibility depending on the specific problem at hand.
While the current version focuses on binary classification, several enhancements are planned to expand its capabilities, making it a more comprehensive neural network tool for both binary and multi-class classification tasks.
2. Model Architecture
The architecture of the neural network consists of several layers:
Input Layer:
- The number of neurons in the input layer is automatically determined by the number of features in the training data.
Hidden Layers:
- The number of hidden layers and neurons per layer is configurable.
- The model supports different activation functions for each layer, making it flexible to different types of tasks.
Output Layer:
- A single neuron with a sigmoid activation function is used in the output layer to output a probability score for binary classification (0 or 1).
The forward pass includes activation functions like ReLU, Leaky ReLU, Sigmoid, and Tanh. The backpropagation uses derivatives of these activation functions to update the weights during training.
Activation Functions Implemented:
- ReLU (Rectified Linear Unit)
- Leaky ReLU
- Sigmoid
- Tanh
- Softmax (planned for multi-class tasks)
3. Implementation Details
The ANN is structured into three core modules:
activations.py:
This file includes a set of commonly used activation functions and their derivatives, which are essential for forward propagation and backpropagation. The functions implemented are:
- ReLU and its derivative
- Leaky ReLU and its derivative
- Sigmoid and its derivative
- Tanh and its derivative
- Softmax (though only used for future multi-class implementations)
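To make this concrete, here is a minimal NumPy sketch of how two of these functions and their derivatives can be written (illustrative only; the exact names and signatures in activations.py may differ):

import numpy as np

# Illustrative sketch; the actual implementations in activations.py may differ.
def relu(x):
    return np.maximum(0, x)

def relu_derivative(x):
    return (x > 0).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)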
losses.py:
The losses.py module handles different types of loss functions. It allows users to select from multiple loss functions based on their task requirements. Available loss functions include:
- Binary Crossentropy: Used for binary classification.
- Mean Squared Error (MSE): Useful for regression tasks.
- Categorical Crossentropy: Will be needed for multi-class classification.
- Hinge Loss: Typically used for support vector machines (SVM).
- Huber Loss: A hybrid loss function that is less sensitive to outliers.
Each loss function has its corresponding derivative, which is essential for backpropagation.
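For example, binary cross-entropy and its derivative with respect to the predictions can be sketched as follows (illustrative only; the exact names and signatures in losses.py may differ):

import numpy as np

# Illustrative sketch; the actual implementations in losses.py may differ.
def binary_crossentropy(y_true, y_pred, eps=1e-8):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def binary_crossentropy_derivative(y_true, y_pred, eps=1e-8):
    # Gradient of the mean loss with respect to y_pred
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return (y_pred - y_true) / (y_pred * (1 - y_pred) * y_true.shape[0])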
model.py:
The model.py file contains the core Multi_Layer_ANN class, which implements:
- Weight Initialization: Weights are initialized using He initialization, which scales the weights relative to the number of neurons feeding into each layer to avoid exploding/vanishing gradients (a small sketch follows this list).
- Forward Propagation: Handles the feed-forward phase of the network.
- Backpropagation: Computes the gradients of the loss with respect to weights and biases using the chain rule.
- Training Loop: Manages the optimization process over a set number of epochs, adjusting weights and biases based on the calculated gradients.
- Prediction Method: Outputs predictions after training by thresholding the sigmoid output for binary classification.
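For instance, He initialization draws each weight from a normal distribution scaled by sqrt(2 / n_in), where n_in is the number of neurons feeding into the layer. A minimal sketch, assuming a simple list-of-matrices representation (the function and variable names are illustrative, not the exact internals of model.py):

import numpy as np

# Illustrative He initialization for a fully connected network; names and
# layer sizes are examples, not the exact internals of model.py.
def he_initialize(layer_sizes, seed=0):
    rng = np.random.default_rng(seed)
    weights, biases = [], []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights.append(rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in))
        biases.append(np.zeros((1, n_out)))
    return weights, biases

# Example: 4 input features, two hidden layers of 5 neurons, 1 output neuron
weights, biases = he_initialize([4, 5, 5, 1])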
Training Loop Details
The training loop provides feedback on the loss and accuracy at regular intervals, making it easy to monitor performance. The model is trained using stochastic gradient descent (SGD) with a customizable learning rate. During training, the model:
- Executes forward propagation.
- Computes the loss using the specified loss function.
- Performs backpropagation to adjust weights and biases.
- Repeats this process over the defined number of epochs.
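A minimal, self-contained sketch of this loop, using a tiny one-hidden-layer network on toy data (it mirrors the four steps above but is not the actual Multi_Layer_ANN implementation):

import numpy as np

# Toy data: 100 examples, 3 features, binary labels
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# He-style initialization for one hidden layer of 5 neurons
W1 = rng.standard_normal((3, 5)) * np.sqrt(2 / 3)
b1 = np.zeros((1, 5))
W2 = rng.standard_normal((5, 1)) * np.sqrt(2 / 5)
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

learning_rate = 0.1
for epoch in range(1000):
    # 1. Forward propagation
    a1 = np.maximum(0, X @ W1 + b1)        # ReLU hidden layer
    y_hat = sigmoid(a1 @ W2 + b2)          # sigmoid output
    # 2. Loss (binary cross-entropy)
    loss = -np.mean(y * np.log(y_hat + 1e-8) + (1 - y) * np.log(1 - y_hat + 1e-8))
    # 3. Backpropagation (chain rule)
    dz2 = (y_hat - y) / len(X)
    dW2, db2 = a1.T @ dz2, dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ W2.T) * (a1 > 0)
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0, keepdims=True)
    # 4. Gradient descent update
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    if epoch % 200 == 0:
        print(f"epoch {epoch}: loss = {loss:.4f}")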
4. Features and Functionality
Core Features:
- Configurable Hidden Layers: The architecture can include any number of hidden layers with varying neuron counts and activation functions.
- Binary Classification Support: The current model is built for binary classification tasks, utilizing sigmoid activation in the output layer.
- Customizable Loss Functions: The model allows users to specify different loss functions depending on the task.
- Training Feedback: Detailed feedback on loss and accuracy during training, displayed at regular intervals.
Model Metrics:
- Accuracy: Calculated based on how well the model's predictions match the actual labels.
- Loss: Calculated using the chosen loss function, guiding the optimization process.
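For binary classification, accuracy can be computed by thresholding the output probabilities at 0.5 and comparing them with the true labels, roughly as follows (toy values; not the exact internal code):

import numpy as np

# Illustrative accuracy computation for binary classification
probabilities = np.array([0.1, 0.8, 0.6, 0.3])   # sigmoid outputs
y_true = np.array([0, 1, 1, 1])
y_pred = (probabilities >= 0.5).astype(int)
accuracy = np.mean(y_pred == y_true)              # 0.75 for this toy example
print(f"Accuracy: {accuracy:.2f}")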
5. Future Enhancement Plans
To further extend the model’s functionality, the following enhancements will be added:
- Regularization Techniques:
- L2 Regularization: Penalizes large weights to reduce overfitting.
- Dropout: Randomly disables neurons during training, increasing generalization ability.
- Advanced Optimizers:
- Support for optimizers like Adam, RMSprop, and AdaGrad to enhance convergence speed and improve performance on complex datasets.
- Learning Rate Scheduling:
- Dynamic adjustment of the learning rate over time (e.g., learning rate decay or cyclic learning rates) to improve training stability.
- Early Stopping:
- Stop training when there is no significant improvement in validation loss, to prevent overfitting and reduce unnecessary computation (a short sketch of this idea follows this list).
- Support for Multi-Class Classification:
- Modify the output layer and implement softmax activation for tasks requiring multi-class predictions.
- Model Checkpointing:
- Save the model’s weights at optimal points during training to prevent loss of progress, especially for long training times.
- Batch Normalization:
- Add batch normalization layers between the hidden layers to stabilize and speed up training.
- Gradient Clipping:
- Prevent exploding gradients by limiting the magnitude of gradient updates during backpropagation.
- Visualization Tools:
- Introduce functions for plotting training metrics (loss, accuracy) over time for better tracking and debugging.
- Hyperparameter Tuning Framework:
- Implement support for grid search or random search to allow for efficient exploration of hyperparameter combinations.
- Additional Activation Functions:
- Add support for more advanced activation functions like Swish and ELU for better performance on deeper networks.
- Cross-Validation Support:
- Add support for k-fold cross-validation to ensure the model generalizes well across different subsets of data.
- Dynamic Model Architecture:
- Allow for more flexibility in configuring the architecture dynamically by specifying activation functions, number of layers, and neuron counts directly.
- Support for Convolutional Layers:
- Add support for convolutional layers (CNN) to extend the model for image data processing.
- Output Probabilities in Predictions:
- Modify the predict() method to return not just binary labels but also the predicted probability scores, providing more interpretability.
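To illustrate the early-stopping idea mentioned above, here is a minimal sketch (this is a planned feature, not part of the current release; the validation losses are simulated toy values rather than real model output):

# Illustrative early-stopping logic; in practice the validation loss would come
# from evaluating the model on a held-out set after each epoch.
val_losses = [0.9, 0.7, 0.6, 0.55, 0.54, 0.56, 0.57, 0.58, 0.59, 0.60]  # toy values
patience = 3
best_loss, wait = float("inf"), 0
for epoch, val_loss in enumerate(val_losses):
    if val_loss < best_loss - 1e-4:   # significant improvement
        best_loss, wait = val_loss, 0
    else:
        wait += 1
        if wait >= patience:
            print(f"Stopping early at epoch {epoch}: no improvement for {patience} epochs")
            break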
6. How to Use the Model
Prerequisites:
- Python 3.x
- NumPy
- Pandas
- Scikit-learn
- tqdm (for progress bars during training)
- colorama (for colored console outputs)
Steps to Use:
- Prepare Data:
- The input data should be a 2D array where rows represent examples and columns represent features.
- The target (label) should be a binary value (0 or 1) for binary classification.
- Split Data:
- Use train_test_split to divide your dataset into training and testing sets.
- Standardize Data:
- Scale the features using StandardScaler to normalize the input data.
- Initialize the ANN:
- Specify the architecture by defining the hidden layers and the activation functions.
- Train the Model:
- Train the model by calling the fit() method with the desired number of epochs and learning rate.
- Make Predictions:
- Use the predict() method to classify new data after training.
7. Example Code
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from pydeepflow.model import Multi_Layer_ANN  # import path may differ depending on the installed version

# Load your dataset into a DataFrame (file name is a placeholder);
# the last column is assumed to hold the integer class labels
df = pd.read_csv("your_dataset.csv")

X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values
# One-hot encode the integer labels for multi-class training
y_one_hot = np.eye(len(np.unique(y)))[y]
X_train, X_test, y_train, y_test = train_test_split(X, y_one_hot, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define the architecture
hidden_layers = [5, 5]
activations = ['relu', 'relu']
ann = Multi_Layer_ANN(X_train, y_train, hidden_layers, activations, loss='categorical_crossentropy')
ann.fit(epochs=1000, learning_rate=0.01)
# Make predictions
y_pred = ann.predict(X_test)
print(y_pred)
# Convert predictions back to original labels
y_test_labels = np.argmax(y_test, axis=1)
# Calculate accuracy
accuracy = np.mean(y_pred == y_test_labels)
print(f"Test Accuracy: {accuracy * 100:.2f}%")
9. References
- Neural Networks and Deep Learning: Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
- Python for Data Analysis: McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media.
- Scikit-learn Documentation: Scikit-learn. (n.d.). Scikit-learn: Machine Learning in Python.
- TQDM Documentation: TQDM. (n.d.). TQDM: A fast, extensible progress bar for Python and CLI.
- Colorama Documentation: Colorama. (n.d.). Colorama: Simple cross-platform print formatting.
10. Contributions
Contributions to this project are welcome! Here’s how you can contribute:
- Fork the Repository: Make a personal copy of the repository.
- Create a Feature Branch: Use a descriptive name for the branch that outlines the feature being added (e.g., feature/dropout).
- Make Your Changes: Implement the changes you wish to contribute.
- Commit Your Changes: Write clear, concise commit messages.
- Push to Your Fork: Push your changes back to your personal fork of the repository.
- Open a Pull Request: Describe the changes and the reasoning behind them.
Issues
If you encounter any bugs or have suggestions for improvement, please open an issue in the repository to discuss it!
11. Acknowledgments
- Inspirations: This project is inspired by numerous resources available in the machine learning community, particularly literature on neural networks and deep learning.
- Community Support: Thanks to the contributors and open-source community for their valuable insights and discussions that shaped this project.
12. License
This project is licensed under the MIT License - see the LICENSE file for details.
13. Contact
For any inquiries or suggestions regarding this project, please feel free to contact me:
- GitHub: ravin-d-27