Atlas is a search engine for parsing a collection of web pages and making it searchable.

Project description

Atlas

A search engine for the internet.

Atlas stands for Atila's Tool for Learning any Subject. To start, Atlas focuses on indexing and searching crypto- and web3-related content, with plans to add other subjects in the future. However, you're welcome to use Atlas to index and search any type of content you want.

Installation

pip install atila-atlas

Set your environment variables:

export ATLAS_ALGOLIA_APPLICATION_ID=""
export ATLAS_ALGOLIA_API_KEY=""
export ATLAS_ALGOLIA_INDEX_NAME=""
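
Before running any Atlas command, it can help to confirm that all three variables are actually set. The following is a minimal sketch that assumes only the variable names listed above; the helper name `missing_atlas_env_vars` is hypothetical and is not part of the atila-atlas package.

```python
import os

# Variable names taken from the installation instructions above.
REQUIRED_ENV_VARS = (
    "ATLAS_ALGOLIA_APPLICATION_ID",
    "ATLAS_ALGOLIA_API_KEY",
    "ATLAS_ALGOLIA_INDEX_NAME",
)


def missing_atlas_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper, not part of atila-atlas.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_atlas_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All Atlas environment variables are set.")
```

An empty string counts as missing here, since the export lines above ship with empty placeholder values.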

Development Installation

source install.sh
pip install -e .

Quickstart

atlas initialize_index
atlas add_content --file data/urls_to_parse.txt
atlas add_content --urls https://ethereum.org/en/nft,https://en.wikipedia.org/wiki/Ethereum
atlas search "what is an nft"
atlas get_inbound_links --min-inbound-links=2

from atlas.content_parser import ContentParser, ContentIndex

sample_urls = [
   "https://ethereum.org/en/nft",
   "https://en.wikipedia.org/wiki/Ethereum",
   "https://linda.mirror.xyz/df649d61efb92c910464a4e74ae213c4cab150b9cbcc4b7fb6090fc77881a95d",
   "https://chain.link/education/nfts",
   "https://medium.com/superrare/no-cryptoartists-arent-harming-the-planet-43182f72fc61",
   "https://andrewsteinwold.substack.com/p/-quick-overview-of-the-nft-ecosystem"
]

content_bot = ContentParser(urls=sample_urls)
content_bot.parse_all_content()
content_bot.save_to_file()

content_index = ContentIndex()
content_index.initialize_index()
results = content_index.search("what is an nft")
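
If your URLs live in a text file such as data/urls_to_parse.txt, you may want to load them yourself before handing them to ContentParser. This sketch assumes the file holds one URL per line, which is not confirmed by this README; `load_urls` is a hypothetical helper and not part of the atila-atlas package.

```python
from pathlib import Path


def load_urls(path):
    """Read URLs from a text file, one per line (assumed format).

    Blank lines and repeated URLs are dropped; first-seen order is kept.
    Hypothetical helper, not part of atila-atlas.
    """
    seen = set()
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Skip empty lines and URLs that were already collected.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The result could then be passed along as `ContentParser(urls=load_urls("data/urls_to_parse.txt"))`.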

Development Quickstart

Note: Make sure you've put your environment variables into the newly created .env file, which is copied from shared.env.

# 1. Parse your content:
python atlas/content_parser.py

# 2. Initialize your index:
python atlas/content_index.py

# 3. Run the API
python api/api.py

# 4. Send a GET request to your api
curl --location --request GET 'http://127.0.0.1:8080/api/search?q=what+is+an+NFT'
# or open your browser to `http://127.0.0.1:8080/api/search?q=<your_search_term>` 
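
The same GET request can be sent from Python using only the standard library. This is a rough sketch: the host, port, and `q` parameter come from the curl example above, while the function names are hypothetical. The response format is not documented here, so the body is returned as raw text (assumed to be UTF-8) rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
API_BASE_URL = "http://127.0.0.1:8080/api/search"


def build_search_url(query, base_url=API_BASE_URL):
    """Return the search URL with the query URL-encoded as the `q` parameter."""
    return base_url + "?" + urlencode({"q": query})


def search(query, base_url=API_BASE_URL, timeout=10):
    """Send a GET request and return the raw response body as text.

    Requires the local API (python api/api.py) to be running.
    """
    with urlopen(build_search_url(query, base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the API running, `print(search("what is an NFT"))` prints whatever the endpoint returns for that query.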

Publishing the Package to PyPI

  1. Build the package: python -m build
  2. Upload to the TestPyPI server first to practice: python -m twine upload --repository testpypi dist/*
    1. Set your username to __token__
    2. Set your password to the token value, including the pypi- prefix
    3. TestPyPI tokens: https://test.pypi.org/manage/account/#api-tokens
    4. PyPI tokens: https://pypi.org/manage/account/#api-tokens
    5. When installing the package from TestPyPI to check the upload, pass pip's --no-deps flag, because the dependencies might not be available on TestPyPI
  3. Upload to the real PyPI server: python -m twine upload dist/*

Troubleshooting

ModuleNotFoundError: No module named 'atlas'

Set your $PYTHONPATH so that it includes the directory containing the atlas package (the repository root). See this Stack Overflow answer.
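
This error usually means Python cannot find the atlas package on its import path. For a one-off script, a possible alternative to setting $PYTHONPATH is to extend `sys.path` at runtime before importing atlas. This is only a sketch: `add_to_python_path` is a hypothetical helper, and the directory you pass is assumed to be the one that contains the atlas package.

```python
import sys
from pathlib import Path


def add_to_python_path(directory):
    """Prepend `directory` to sys.path if it is not already there.

    Hypothetical helper; returns the resolved absolute path.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved
```

For example, calling `add_to_python_path(".")` from the repository root before `import atlas` has a similar effect to exporting $PYTHONPATH for that one process.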
