
Project description



FlagAI aims to help researchers and developers to freely train and test large-scale models for NLP tasks.

  • It now supports GLM, BERT, RoBERTa, GPT-2, T5, and models from Hugging Face Transformers.

  • It provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub.

  • These models can be applied to text for tasks like text classification, information extraction, question answering, summarization, and text generation, especially in Chinese.

  • FlagAI is backed by the three most popular data/model parallel libraries (PyTorch, DeepSpeed, and Megatron-LM), with seamless integration between them. You can parallelize your training/testing process with less than ten lines of code.
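
To give a feel for the DeepSpeed side of this integration, a minimal ZeRO configuration file might look like the following. This is an illustrative sketch using standard DeepSpeed config keys; the exact values and file layout are assumptions, not taken from FlagAI itself:

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 1,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "contiguous_gradients": true,
    "overlap_comm": true
  }
}
```

Stage 2 partitions optimizer states and gradients across GPUs; stage 3 additionally partitions the model parameters at the cost of more communication.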

The code is partially based on Transformers and DeepSpeedExamples.

Requirements and Installation

  • PyTorch version >= 1.8.0
  • Python version >= 3.8
  • For training/testing models on GPUs, you'll also need to install CUDA and NCCL

To install FlagAI and develop locally:

git clone https://github.com/BAAI-WuDao/Sailing.git
cd Sailing
python setup.py install
  • [Optional] For faster training install NVIDIA's apex
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  • [Optional] For ZeRO optimizers install DEEPSPEED
git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 pip install -e .
ds_report # check the DeepSpeed status
  • [Tips] For single-node Docker environments, you need to set up a port for SSH, e.g., root@127.0.0.1 with port 7110
>>> vim ~/.ssh/config
Host 127.0.0.1
    Hostname 127.0.0.1
    Port 7110
    User root
  • [Tips] For multi-node Docker environments, generate SSH keys and copy the public key to all nodes (in ~/.ssh/)
>>> ssh-keygen -t rsa -C "xxx@xxx.com"

Quick Start

We provide many models trained to perform different tasks. You can load these models with AutoLoader to make predictions.

Load model and tokenizer

We provide the AutoLoader class to load the model and tokenizer quickly, for example:

import torch
from flagai.auto_model.auto_loader import AutoLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

auto_loader = AutoLoader(
    task_name="seq2seq",
    model_name="bert_title_generation_en"
)
model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

This example is for the title_generation task; you can also load models for other tasks by modifying the task_name. Then you can use the model and tokenizer to fine-tune or test.

Predictor

We provide the Predictor class to predict for different tasks, for example:

import torch
from flagai.model.predictor.predictor import Predictor
predictor = Predictor(model, tokenizer)
test_data = [
    "Four minutes after the red card, Emerson Royal nodded a corner into the path of the unmarked Kane at the far post, who nudged the ball in for his 12th goal in 17 North London derby appearances. Arteta's misery was compounded two minutes after half-time when Kane held the ball up in front of goal and teed up Son to smash a shot beyond a crowd of defenders to make it 3-0.The goal moved the South Korea talisman a goal behind Premier League top scorer Mohamed Salah on 21 for the season, and he looked perturbed when he was hauled off with 18 minutes remaining, receiving words of consolation from Pierre-Emile Hojbjerg.Once his frustrations have eased, Son and Spurs will look ahead to two final games in which they only need a point more than Arsenal to finish fourth.",
]

for text in test_data:
    print(
        predictor.predict_generate_beamsearch(text,
                                              out_max_length=50,
                                              beam_size=3))
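
Under the hood, predict_generate_beamsearch performs beam-search decoding: at each step it extends every partial output with candidate tokens, then keeps only the beam_size highest-scoring sequences. The core idea can be sketched with a toy scorer; the names and the hand-built probability table below are illustrative, not FlagAI internals:

```python
import math

def beam_search(score_next, start, beam_size, max_len):
    """Generic beam search.

    score_next(seq) returns a dict {token: log_prob} for the next step;
    sequences ending in "<eos>" are carried forward unchanged.
    """
    beams = [([start], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "<eos>":
                candidates.append((seq, score))
                continue
            for tok, logp in score_next(seq).items():
                candidates.append((seq + [tok], score + logp))
        # keep only the beam_size highest-scoring sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]

# toy language model: after "a" prefer "b", after "b" prefer "<eos>"
table = {
    "a": {"b": math.log(0.7), "c": math.log(0.3)},
    "b": {"<eos>": math.log(0.9), "a": math.log(0.1)},
    "c": {"<eos>": math.log(1.0)},
}
best = beam_search(lambda seq: table[seq[-1]], "a", beam_size=3, max_len=3)
print(best)  # ['a', 'b', '<eos>']
```

A real predictor scores candidates with the model's output distribution instead of a lookup table, and out_max_length plays the role of max_len here.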

Pretrained Models and examples

This section explains how the base NLP classes work, how you can load pre-trained models to tag your text, how you can embed your text with different word or document embeddings, and how you can train your own language models, sequence labeling models, and text classification models. Let us know if anything is unclear.

Tutorials

We provide a set of quick tutorials to get you started with the library:

Learn More About FlagAI

Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with our contributor guidelines and then check these open issues for specific tasks.

License

Copyright [2022] [BAAI]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Project details


Download files

Download the file for your platform.

Source Distribution

flagai-1.0.0b1.tar.gz (191.9 kB)

Uploaded Source

Built Distribution


flagai-1.0.0b1-py3-none-any.whl (232.9 kB)

Uploaded Python 3

File details

Details for the file flagai-1.0.0b1.tar.gz.

File metadata

  • Download URL: flagai-1.0.0b1.tar.gz
  • Upload date:
  • Size: 191.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.7

File hashes

Hashes for flagai-1.0.0b1.tar.gz

  • SHA256: 7b7b3ec5983c8ea1c4729180419db6fd2a26705041e34f61ee8dc4ff6cb540ec
  • MD5: 368afc6322d03ab8cce68f484915b42a
  • BLAKE2b-256: b877f1b57a21e885740701aaab89d101385cfa74578b71b825b1abeca9ac164a

See more details on using hashes here.
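
To verify a downloaded file against the hashes listed on this page, Python's standard hashlib module can be used. This is a generic sketch; the file path in the comment is a placeholder for wherever you saved the download:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# compare against the SHA256 listed above, e.g. for the sdist:
# sha256_of("flagai-1.0.0b1.tar.gz") == "7b7b3ec5983c8ea1c4729180419db6fd2a26705041e34f61ee8dc4ff6cb540ec"
```

Reading in chunks keeps memory use constant regardless of file size. pip can also enforce this automatically via its hash-checking mode.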

File details

Details for the file flagai-1.0.0b1-py3-none-any.whl.

File metadata

  • Download URL: flagai-1.0.0b1-py3-none-any.whl
  • Upload date:
  • Size: 232.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.7

File hashes

Hashes for flagai-1.0.0b1-py3-none-any.whl

  • SHA256: 89e5c731dc169ee17a1be214ac3be8eb400ae3a6aec2e5350a3cf19b78d414dd
  • MD5: c596e33a23d8f091e5f3523fa341e3ad
  • BLAKE2b-256: 900adcaac8e8cb41c49a38ef2334b7dba220c74a1d61649ccd63a691835ad9ad

