Extract links from bio link services like Linktree

Biosites

A Python package for extracting links from bio link aggregator services like Linktree, inpock, lit.link, and others.

Features

  • Extract links from 10+ popular bio link services
  • Automatic redirect resolution for shortened URLs
  • Configurable user agents
  • Async/await support (the Python API is async-only)
  • Type-safe with Pydantic models
  • Command-line interface

Supported Services

  • Linktree (linktr.ee, linktree.com)
  • Lit.link (lit.link)
  • Littly (litt.ly)
  • Inpock (link.inpock.co.kr)
  • Bio.site (bio.site)
  • Instabio (instabio.cc)
  • LinkBio (linkbio.co)
  • Link.me (link.me)
  • Generic HTML link extraction for unsupported services
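As a rough illustration of how a URL could be matched against the domains listed above, the sketch below maps hostnames to service labels using only the standard library. The SERVICE_DOMAINS table, the label strings, and the guess_service function are hypothetical names made up for this example; they are not part of the biosites API, and the package's own matching may work differently.

```python
from urllib.parse import urlparse

# Hypothetical lookup table built from the domains listed above.
# The label strings are illustrative, not values used by biosites.
SERVICE_DOMAINS = {
    "linktr.ee": "linktree",
    "linktree.com": "linktree",
    "lit.link": "litlink",
    "litt.ly": "littly",
    "link.inpock.co.kr": "inpock",
    "bio.site": "biosite",
    "instabio.cc": "instabio",
    "linkbio.co": "linkbio",
    "link.me": "linkme",
}


def guess_service(url: str) -> str:
    """Return a service label for the URL, or "generic" if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www.linktree.com" the same as "linktree.com".
    if host.startswith("www."):
        host = host[4:]
    return SERVICE_DOMAINS.get(host, "generic")
```

Matching on the parsed hostname rather than on the whole URL string avoids false positives when a supported domain merely appears in the path or query of an unrelated URL.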

Installation

pip install biosites

Or with poetry/uv:

poetry add biosites
# or
uv add biosites

Usage

Command Line

Extract links from a bio page:

biosites https://linktr.ee/username

Extract from a shortened URL (redirects are followed automatically):

biosites https://bit.ly/shortened-link

Python API

import asyncio
from biosites import LinkExtractor

async def main():
    extractor = LinkExtractor()

    # Extract links from a bio page
    result = await extractor.extract("https://linktr.ee/username")

    # Access extracted links
    for link in result.links:
        print(f"{link.title}: {link.url}")
        if link.metadata:
            print(f"  Metadata: {link.metadata}")

    # Check service type
    print(f"Service: {result.service_type}")

asyncio.run(main())
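Because the API is async, several bio pages can be fetched concurrently. The helper below is a minimal sketch of that pattern: extract_many is a hypothetical function written for this example, and it only assumes that the object passed in has an awaitable extract(url) method, as LinkExtractor does in the example above. Whether a single LinkExtractor instance is safe to share across concurrent calls is not stated in this README, so treat that as an assumption to verify.

```python
import asyncio
from typing import Any


async def extract_many(extractor: Any, urls: list[str], limit: int = 5) -> dict[str, Any]:
    """Run extractor.extract(url) for every URL, at most `limit` at a time.

    Returns a dict mapping each URL to its result, or to the exception it raised.
    """
    semaphore = asyncio.Semaphore(limit)

    async def one(url: str) -> Any:
        async with semaphore:
            return await extractor.extract(url)

    # return_exceptions=True keeps one failing page from cancelling the others.
    results = await asyncio.gather(*(one(u) for u in urls), return_exceptions=True)
    return dict(zip(urls, results))
```

With the package installed, this could be called as asyncio.run(extract_many(LinkExtractor(), urls)). The semaphore keeps the number of simultaneous requests bounded, which is usually the polite choice when scraping third-party pages.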

Custom User Agent

from biosites import LinkExtractor

extractor = LinkExtractor(
    user_agent="MyBot/1.0 (https://example.com/bot)"
)

Check if URL is Supported

from biosites import LinkExtractor

extractor = LinkExtractor()
supported, service = extractor.can_handle("https://linktr.ee/username")
if supported:
    print(f"URL will be handled by {service}")

Handle Redirects

The package automatically follows redirects from URL shorteners to the final bio link page:

import asyncio
from biosites import LinkExtractor

async def main():
    extractor = LinkExtractor()
    # The short link is followed to the final bio link service
    result = await extractor.extract("https://bit.ly/shortened")

    # Access redirect information
    if result.metadata and "redirect_chain" in result.metadata:
        print(f"Original URL: {result.metadata['original_url']}")
        print(f"Redirect chain: {result.metadata['redirect_chain']}")

asyncio.run(main())
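The snippet above checks for "redirect_chain" but then also reads "original_url". A slightly more defensive way to read the same metadata is sketched below. describe_redirects is a hypothetical helper, and it assumes that result.metadata is a plain dict (or None) and that "redirect_chain" holds a list of URLs; neither detail is confirmed beyond what the example above shows.

```python
from typing import Any, Optional


def describe_redirects(metadata: Optional[dict[str, Any]]) -> str:
    """Summarise redirect info from a result's metadata without raising KeyError."""
    if not metadata:
        return "no redirect information"
    # Assumption: "redirect_chain" is a list of URLs, as the example above suggests.
    chain = metadata.get("redirect_chain") or []
    if not chain:
        return "no redirect information"
    original = metadata.get("original_url", "<unknown>")
    hops = " -> ".join(str(hop) for hop in chain)
    return f"{original} resolved via {len(chain)} hop(s): {hops}"
```

It would be called as print(describe_redirects(result.metadata)) after an extract() call.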

Development

Setup

# Clone the repository
git clone https://github.com/yourusername/biosites.git
cd biosites

# Install dependencies
uv venv
uv pip install -e ".[dev]"

Running Tests

pytest

Type Checking

mypy biosites

Linting

ruff check biosites tests
ruff format biosites tests

Architecture

The package uses a modular architecture:

  • Base Extractor: Abstract base class defining the interface
  • Service Extractors: Specialized extractors for each bio service
  • Link Extractor: Main entry point that routes to appropriate extractor
  • Redirect Handler: Handles URL shorteners and redirects
  • Models: Pydantic models for type safety

Each extractor implements:

  • can_handle(url): Check if the extractor supports the URL
  • extract_links(html, url): Extract links from the HTML content
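The routing described above can be sketched as a registry of extractor classes that is searched in order, with a generic fallback at the end. This is a simplified illustration and not the package's code: BaseExtractor, ExampleServiceExtractor, GenericExtractor, REGISTRY, and pick_extractor are all made-up names, the domain example-bio.test is fictional, and extract_links is synchronous here to keep the sketch short.

```python
from abc import ABC, abstractmethod


class BaseExtractor(ABC):
    """Illustrative stand-in for the abstract base class described above."""

    @classmethod
    @abstractmethod
    def can_handle(cls, url: str) -> bool:
        """Return True if this extractor supports the URL."""

    @abstractmethod
    def extract_links(self, html: str, url: str) -> list[str]:
        """Return the links found in the page HTML."""


class ExampleServiceExtractor(BaseExtractor):
    @classmethod
    def can_handle(cls, url: str) -> bool:
        return "example-bio.test" in url.lower()  # fictional service domain

    def extract_links(self, html: str, url: str) -> list[str]:
        return []  # service-specific parsing would go here


class GenericExtractor(BaseExtractor):
    @classmethod
    def can_handle(cls, url: str) -> bool:
        return True  # fallback: accepts any URL

    def extract_links(self, html: str, url: str) -> list[str]:
        return []  # generic anchor extraction would go here


# Order matters: specific extractors first, the generic fallback last.
REGISTRY: list[type[BaseExtractor]] = [ExampleServiceExtractor, GenericExtractor]


def pick_extractor(url: str) -> type[BaseExtractor]:
    """Return the first registered class whose can_handle() accepts the URL."""
    for extractor_cls in REGISTRY:
        if extractor_cls.can_handle(url):
            return extractor_cls
    raise LookupError(f"no extractor registered for {url}")
```

Keeping can_handle() a classmethod means the router can decide which extractor to use without instantiating any of them.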

Adding New Services

To add support for a new bio link service:

  1. Create a new extractor in biosites/extractors/
  2. Inherit from BaseLinkExtractor
  3. Implement can_handle() and extract_links() methods
  4. Register the extractor in biosites/extractor.py

Example:

from biosites.base import BaseLinkExtractor
from biosites.models import ExtractedLink

class NewServiceExtractor(BaseLinkExtractor):
    @classmethod
    def can_handle(cls, url: str) -> bool:
        return "newservice.com" in url.lower()

    async def extract_links(self, html: str, url: str) -> list[ExtractedLink]:
        # Parse HTML and extract links
        links: list[ExtractedLink] = []
        # ... extraction logic ...
        return links
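The "extraction logic" placeholder above depends on the target service's markup. As one possible starting point, the sketch below collects every anchor with an href using only the standard library's html.parser. It is a stand-in for illustration: the package lists beautifulsoup4 and selectolax as requirements and presumably uses those instead, and the Link dataclass and parse_anchor_links function are hypothetical names that mirror only the title and url fields shown in the usage example.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


@dataclass
class Link:
    """Illustrative stand-in for ExtractedLink (title and url only)."""
    title: str
    url: str


class _AnchorParser(HTMLParser):
    """Collects <a href=...> elements together with their visible text."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[Link] = []
        self._href: Optional[str] = None  # set while inside an <a> that has an href
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative hrefs against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            title = " ".join("".join(self._text).split())
            self.links.append(Link(title=title, url=self._href))
            self._href = None


def parse_anchor_links(html: str, base_url: str) -> list[Link]:
    """Return one Link per anchor that carries an href attribute."""
    parser = _AnchorParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Inside a real extractor, extract_links() would convert these into ExtractedLink objects and would usually filter out navigation, footer, and tracking links that are specific to the service.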

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/new-service)
  3. Write tests for your changes
  4. Ensure all tests pass and type checking is clean
  5. Commit your changes with descriptive messages
  6. Push to your branch and create a Pull Request

Requirements

  • Python 3.10+
  • aiohttp
  • pydantic
  • beautifulsoup4
  • selectolax
  • click
