MASA SDK - Masa's AI Software Architecture

Masa AI Software Architecture

MASA is a project for data retrieval, quality control, and orchestration. It currently provides tools to retrieve data from Twitter using the Masa Protocol Node API, with plans to expand to other data sources and functionalities in the future.

This SDK currently requires a Masa Protocol Node running on the same system. For instructions on installing and running a node, see the Masa Protocol documentation.

Quick Start

Install the MASA package:
```
pip install masa-ai
```

Create a request_list.json file with the queries you'd like to process. This file can be placed anywhere on your system. Here is an example of what the request_list.json might look like:

```
[
    {
        "query": "#example",
        "max_results": 100
    },
    {
        "query": "from:example_user",
        "max_results": 50
    }
]
```
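
If the list of queries is generated by a script rather than written by hand, the file can be produced programmatically. The following is a minimal sketch that assumes only the two fields shown in the example above (`query` and `max_results`); the helper names `build_request` and `save_request_list` are hypothetical and are not part of the masa-ai package.

```python
import json
from pathlib import Path


def build_request(query: str, max_results: int) -> dict:
    """Return one entry in the shape used by the example request_list.json."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if max_results <= 0:
        raise ValueError("max_results must be a positive integer")
    return {"query": query, "max_results": max_results}


def save_request_list(requests: list, path) -> Path:
    """Write the entries to `path` as a JSON array and return the path."""
    path = Path(path)
    path.write_text(json.dumps(requests, indent=4) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    requests = [
        build_request("#example", 100),
        build_request("from:example_user", 50),
    ]
    # Print instead of writing, so the sketch has no side effects.
    print(json.dumps(requests, indent=4))
```

Calling `save_request_list(requests, "request_list.json")` writes the file, which can then be passed to the CLI.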

An example request_list.json file is included in the package. You can find it using the following command:

```
EXAMPLE_PATH=$(pip show masa-ai | grep Location | awk '{print $2"/masa_ai/request_list.json"}')
echo "Example request_list.json path: $EXAMPLE_PATH"
```
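
The same lookup can be done from Python without parsing `pip show` output. This is a sketch, assuming the import name is `masa_ai` and that the example file sits at the package root, as the shell command above implies; `find_example_request_list` is a hypothetical helper, not a function provided by the package.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_example_request_list(package: str = "masa_ai") -> Optional[Path]:
    """Return the path to request_list.json inside an installed package.

    Returns None when the package is not installed or the file is absent.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "request_list.json"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_example_request_list()
    if path is None:
        print("masa_ai is not installed or ships no request_list.json")
    else:
        print(f"Example request_list.json path: {path}")
```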

Use the MASA CLI:
```
masa-ai-cli <action> [arguments]
```
Available actions:
- process [path_to_requests_json]: Process all requests (both resumed and new)
- --docs [page_name]: Rebuild and view the documentation for the specified page
- --data: List the scraped data files
For example:
```
masa-ai-cli process /path/to/request_list.json
masa-ai-cli --docs usage
masa-ai-cli --data
```
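
When the CLI needs to be driven from a script, it can be wrapped with `subprocess`. The sketch below assumes only the three actions listed above and that `masa-ai-cli` is on the `PATH`; `build_command` and `run_cli` are hypothetical helpers, not part of the package.

```python
import shutil
import subprocess
from typing import List, Optional

CLI_NAME = "masa-ai-cli"
KNOWN_ACTIONS = {"process", "--docs", "--data"}  # actions listed in this README


def build_command(action: str, argument: Optional[str] = None) -> List[str]:
    """Return the argument vector for one CLI invocation."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    command = [CLI_NAME, action]
    if argument is not None:
        command.append(argument)
    return command


def run_cli(action: str, argument: Optional[str] = None) -> int:
    """Run the CLI and return its exit code."""
    if shutil.which(CLI_NAME) is None:
        raise FileNotFoundError(f"{CLI_NAME} was not found on PATH")
    return subprocess.run(build_command(action, argument), check=False).returncode


if __name__ == "__main__":
    # Only print the command here; call run_cli(...) to actually execute it.
    print(" ".join(build_command("process", "/path/to/request_list.json")))
```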
Accessing Scraped Data:

The data that is scraped is saved within the package directory under the data folder. To list all scraped data files, use the following command:
```
masa-ai-cli --data
```
This will display the structure of the data folder and list all the files contained within it.
Recommendations for Accessing and Using Scraped Data:
IMPORTANT: The data folder is not included in the package. It is created only when you run the masa-ai-cli process [path_to_requests_json] command.
- Command Line: You can navigate to the data folder from the command line to view and manipulate the files directly. Start by locating the package directory:
```
# Find the installation path of the masa package
PACKAGE_PATH=$(pip show masa-ai | grep Location | awk '{print $2"/masa_ai"}')
echo "Masa package path: $PACKAGE_PATH"
```
The scraped data is stored in the data folder under this path. You can use it for further processing, analysis, and use with agents.
For detailed usage instructions, please refer to the Usage Guide.
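
For programmatic access, the data folder can be walked with `pathlib`. This sketch assumes only what is stated above: that scraped files live in a `data` folder under the package directory and that the folder may not exist yet. It makes no assumption about file names or formats; `list_data_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def list_data_files(package_dir) -> List[Path]:
    """Return every file under <package_dir>/data, or [] if the folder is absent."""
    data_dir = Path(package_dir) / "data"
    if not data_dir.is_dir():
        # The folder is only created after the first `process` run.
        return []
    return sorted(p for p in data_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Replace "." with the package path printed by the shell snippet above.
    for file_path in list_data_files("."):
        print(file_path)
```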

Configuration

The project uses YAML files for configuration:

- configs/settings.yaml: Main configuration file containing settings for the Twitter API, request management, and logging.
- configs/.secrets.yaml: (Optional) File for storing sensitive information such as API keys. Not currently in use.

The settings.yaml file is loaded using Dynaconf, which allows for easy environment-based configuration management.

Advanced Twitter Search

The Masa Protocol Node API provides advanced search capabilities for retrieving Twitter data. Some of the available search options include:

- Hashtag Search: #hashtag
- Mention Search: @username
- From User Search: from:username
- Keyword Exclusion: -keyword
- OR Operator: term1 OR term2
- Geo-location Based Search: geocode:latitude,longitude,radius
- Language-Specific Search: lang:language_code

For more details, refer to the Masa Protocol Twitter Docs.
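
The operators listed above can be combined into a single `query` string for request_list.json. The following sketch composes them by joining the parts with spaces; `build_query` and its parameter names are hypothetical, and how the node interprets combined operators (for example, precedence of OR) is not covered here, so check the resulting string against the Masa Protocol Twitter Docs.

```python
from typing import Optional, Sequence, Tuple


def build_query(
    keywords: Sequence[str] = (),
    any_of: Sequence[str] = (),
    hashtags: Sequence[str] = (),
    mentions: Sequence[str] = (),
    from_user: Optional[str] = None,
    exclude: Sequence[str] = (),
    geocode: Optional[Tuple[float, float, str]] = None,
    lang: Optional[str] = None,
) -> str:
    """Compose a search string from the operators listed in this section."""
    parts = list(keywords)
    if any_of:
        parts.append(" OR ".join(any_of))  # term1 OR term2
    parts.extend("#" + tag.lstrip("#") for tag in hashtags)
    parts.extend("@" + user.lstrip("@") for user in mentions)
    if from_user:
        parts.append("from:" + from_user.lstrip("@"))
    parts.extend("-" + word for word in exclude)
    if geocode is not None:
        latitude, longitude, radius = geocode
        parts.append(f"geocode:{latitude},{longitude},{radius}")
    if lang:
        parts.append(f"lang:{lang}")
    if not parts:
        raise ValueError("at least one search term or operator is required")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(hashtags=["example"], exclude=["spam"], lang="en"))
```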

Project Structure

masa_ai/: Main package directory
- configs/: Configuration files
- connections/: API connection handlers
- tools/: Core functionality modules
  - qc/: Quality control tools
  - retrieve/: Data retrieval tools
  - utils/: Utility functions
- orchestration/: Request management and processing

Dependencies

Key dependencies include:

- Data processing: numpy, pandas
- API interaction: requests
- Configuration: dynaconf
- Quality control: colorlog
- Natural Language Processing: langchain, openai
- Data visualization: matplotlib, streamlit

For a full list of dependencies, refer to pyproject.toml.

Contributing

We welcome contributions! Please see our Contributing Guidelines for more information on how to get started, including documentation best practices.

Documentation

The MASA project uses Sphinx to generate its documentation. The documentation is automatically rebuilt and viewed when using the --docs option with the masa-ai-cli command.

To view the documentation:
```
masa-ai-cli --docs [page_name]
```
The page name is optional. If it is omitted, the documentation for the entire project is displayed.

License

This project is licensed under the MIT License. See the LICENSE file for details.
