Atlas is a search engine that parses a collection of web pages and makes them searchable.
Atlas
A search engine for the internet.
Atlas stands for Atila's Tool for Learning any Subject. To start, Atlas focuses on effectively indexing and searching crypto- and web3-related content, with plans to add other subjects in the future. However, you're welcome to use Atlas to index and search any type of content you want.
Installation
pip install atila-atlas
Set your environment variables:
export ATLAS_ALGOLIA_APPLICATION_ID=""
export ATLAS_ALGOLIA_API_KEY=""
export ATLAS_ALGOLIA_INDEX_NAME=""
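An empty or missing variable is easy to overlook, so a quick check before running any Atlas command can save some debugging. The sketch below only uses the three variable names listed above; `missing_atlas_vars` is a hypothetical helper written for this example and is not part of the atila-atlas package.

```python
import os

# Variable names taken from the installation steps above.
REQUIRED_VARS = (
    "ATLAS_ALGOLIA_APPLICATION_ID",
    "ATLAS_ALGOLIA_API_KEY",
    "ATLAS_ALGOLIA_INDEX_NAME",
)


def missing_atlas_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    # Hypothetical helper, not part of the atila-atlas package.
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_atlas_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Atlas environment variables are set.")
```

Passing a plain dict instead of relying on `os.environ` makes the check easy to exercise without touching your real shell environment.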
Development Installation
source install.sh
pip install -e .
Quickstart
atlas initialize_index
atlas add_content --file data/urls_to_parse.txt
atlas add_content --urls https://ethereum.org/en/nft,https://en.wikipedia.org/wiki/Ethereum
atlas search "what is an nft"
atlas get_inbound_links --min-inbound-links=2
from atlas.content_parser import ContentParser, ContentIndex
sample_urls = [
    "https://ethereum.org/en/nft",
    "https://en.wikipedia.org/wiki/Ethereum",
    "https://linda.mirror.xyz/df649d61efb92c910464a4e74ae213c4cab150b9cbcc4b7fb6090fc77881a95d",
    "https://chain.link/education/nfts",
    "https://medium.com/superrare/no-cryptoartists-arent-harming-the-planet-43182f72fc61",
    "https://andrewsteinwold.substack.com/p/-quick-overview-of-the-nft-ecosystem",
]
content_bot = ContentParser(urls=sample_urls)
content_bot.parse_all_content()
content_bot.save_to_file()
content_index = ContentIndex()
content_index.initialize_index()
results = content_index.search("what is an nft")
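The get_inbound_links command in the Quickstart takes a --min-inbound-links threshold. To illustrate what such a threshold could mean, here is a minimal sketch that counts how many distinct pages link to each URL and keeps those at or above the threshold. This is an assumption about the general idea, not Atlas's actual implementation; the function names and the example.com-style link graph are made up for illustration.

```python
from collections import Counter


def count_inbound_links(outbound_links):
    """Count how many distinct pages link to each URL.

    outbound_links maps a page URL to the list of URLs found on that page.
    """
    counts = Counter()
    for source, targets in outbound_links.items():
        # set() so that a page linking to the same URL twice counts once.
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def filter_by_min_inbound(counts, min_inbound_links=2):
    """Keep only URLs with at least min_inbound_links inbound links."""
    return {url: n for url, n in counts.items() if n >= min_inbound_links}


# Made-up link graph, for illustration only.
pages = {
    "https://a.example/": ["https://c.example/", "https://b.example/"],
    "https://b.example/": ["https://c.example/", "https://c.example/"],
    "https://c.example/": ["https://a.example/"],
}
popular = filter_by_min_inbound(count_inbound_links(pages), min_inbound_links=2)
print(popular)  # {'https://c.example/': 2}
```

URLs that many indexed pages point to are reasonable candidates for the next round of add_content.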
Development Quickstart
Note: Make sure you've put your environment variables into the newly created .env file that was copied from shared.env.
# 1. Parse and index your content:
python atlas/content_parser.py
# 2. Initialize your index:
python atlas/content_index.py
# 3. Run the API
python api/api.py
# 4. Send a GET request to your API
curl --location --request GET 'http://127.0.0.1:8080/api/search?q=what+is+an+NFT'
# or open your browser to `http://127.0.0.1:8080/api/search?q=<your_search_term>`
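The same request can be sent from Python using only the standard library. The sketch below reuses the host, port, and path from the curl example above; the function names are hypothetical, and decoding the response as JSON is an assumption, since the response format is not described here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host, port, and path taken from the curl example above.
API_BASE = "http://127.0.0.1:8080/api/search"


def build_search_url(query, base=API_BASE):
    """Return the search URL with the query safely URL-encoded."""
    return base + "?" + urlencode({"q": query})


def search(query, timeout=10):
    """Send the GET request and decode the body (assumed to be JSON)."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(build_search_url("what is an NFT"))
# http://127.0.0.1:8080/api/search?q=what+is+an+NFT
```

Call `search("what is an NFT")` once the API from step 3 is running. Building the URL with `urlencode` avoids hand-escaping spaces and special characters in the search term.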
Publishing Package to PyPI
python -m build
- Upload to the TestPyPI server first to practice:
python -m twine upload --repository testpypi dist/*
- Note the use of the no-dependencies flag, --no-deps, when installing from TestPyPI with pip. This is because the dependencies might not be available on TestPyPI.
- Set your username to __token__
- Set your password to the token value, including the pypi- prefix
  - Test: https://test.pypi.org/manage/account/#api-tokens
  - Prod: https://pypi.org/manage/account/#api-tokens
- Upload to the real PyPI server:
python -m twine upload dist/*
Troubleshooting
ModuleNotFoundError: No module named 'atlas'
Set your $PYTHONPATH. See this SO answer
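To see why the import fails, it can help to print the directories Python actually searches. The sketch below uses only the standard library; `is_importable` is a hypothetical helper written for this example. If the project directory that contains the atlas package is not in the printed list, adding it to $PYTHONPATH should resolve the error.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if Python can locate a top-level module on sys.path."""
    return importlib.util.find_spec(module_name) is not None


if not is_importable("atlas"):
    # sys.path includes the script directory, $PYTHONPATH entries,
    # and the site-packages directories.
    print("'atlas' was not found. Directories searched:")
    for entry in sys.path:
        print("  " + entry)
```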