
RCLCO Python Library

A Python library for RCLCO data access and analytics: core data connectivity for databases, APIs, storage, and SharePoint.

Quick Start

The library uses a geography-first approach: load shapes, enrich them with data, then analyze or visualize.

from rclco.shapes import Shape

# Load census tracts
shapes = Shape.from_census("tract", state="NC", county="Wake")

# Enrich with Census income distribution data
shapes.enrich.census_table("B19001")

# Visualize on an interactive map
shapes.map(color_by="$200,000_or_more", cmap="YlOrRd")

Loading Shapes

Every workflow starts by loading shapes from one of several sources.

From Census Geography

Load tracts, block groups, counties, or other census units by state/county name:

from rclco.shapes import Shape

# Tracts in a county
shapes = Shape.from_census("tract", state="NC", county="Wake")

# Block groups for a whole state
shapes = Shape.from_census("block_group", state="CA")

# Specific geographies by GEOID
shapes = Shape.from_census("tract", geoids=["37183052802", "37183052803"])

Supported geo_unit values: tract, block_group, county, state, zip, place, cbsa, msa

From Coordinates, Geometry, or Files

# From a point (optionally buffered to a polygon)
shapes = Shape.from_point(35.9, -78.9, buffer_meters=1000)

# From WKT
shapes = Shape.from_wkt(
    "POLYGON((-78.9 35.9, -78.8 35.9, -78.8 35.8, -78.9 35.8, -78.9 35.9))",
    properties={"name": "Study Area"}
)

# From GeoJSON (dict, Feature, or FeatureCollection)
shapes = Shape.from_geojson({
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-78.9, 35.9]},
    "properties": {"name": "Downtown"}
})

# From a file (GeoJSON, Shapefile, GeoPackage)
shapes = Shape.from_file("boundaries.geojson")

# From an existing GeoDataFrame
shapes = Shape.from_geodataframe(gdf)

# From a named shape in the database
shapes = Shape.from_database("Downtown Raleigh Boundary")

Enriching Shapes with Data

Enrichment adds data columns to your shapes. Each enrichment is stored as a named Polars DataFrame in shapes.datasets so you can pull multiple datasets without column-name collisions. Data is automatically merged for visualization.

Census Bureau Data (ACS)

Individual Variables

Use human-readable names or raw Census codes. The library includes a curated catalog with friendly names.

shapes.enrich.census(
    variables=["median_household_income", "total_population"],
    year=2023
)

# Data is stored as a Polars DataFrame
shapes.datasets["census"]

Pre-Built Variable Groups

For common analysis patterns, use the built-in variable groups instead of listing variables one by one:

from rclco.data.census import Census

census = Census()

# See all available groups
census.groups.basic_demographics    # population, age, households
census.groups.race_ethnicity        # race and ethnicity breakdown
census.groups.income_poverty        # income, poverty, gini index
census.groups.housing_basics        # units, tenure, rent, home value
census.groups.housing_detailed      # structure types (SF, duplex, multifamily)
census.groups.education             # educational attainment
census.groups.employment            # labor force, earnings
census.groups.commute               # transportation mode, travel time

# Use a group to enrich
shapes.enrich.census(variables=census.groups.housing_basics, name="housing")
shapes.enrich.census(variables=census.groups.income_poverty, name="income")

# Access each dataset separately
shapes.datasets["housing"]
shapes.datasets["income"]

Entire Census Tables

When you need all the cross-tabulated detail from a table (e.g., every income bracket, every age group):

# Full income distribution (all 16 brackets)
shapes.enrich.census_table("B19001")

# Full age-by-sex breakdown
shapes.enrich.census_table("B01001")

# Housing structure types
shapes.enrich.census_table("B25024")

# Access the data
shapes.datasets["B19001"]

Race/Ethnicity Breakdowns

Many Census tables have race-specific variants (suffixes A-I). Fetch them individually or combine them all at once:

# Single race variant
shapes.enrich.census_table("B19001I", name="income_hispanic")  # Hispanic
shapes.enrich.census_table("B19001B", name="income_black")     # Black

# Or combine ALL race variants into one dataset with prefixed columns
shapes.enrich.census_table("B19001", add_race_variants=True)
# Creates columns like: white_total, black_total, hispanic_total, ...

Race/ethnicity suffixes: A=White, B=Black, C=AIAN, D=Asian, E=NHPI, F=Other, G=Two+, H=White Non-Hispanic, I=Hispanic
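
The suffix mapping above can be written as a small lookup table. The sketch below is a hypothetical helper (not part of the library) showing how prefixed column names from add_race_variants line up with the table suffixes; the short labels for suffixes other than white, black, and hispanic are illustrative:

```python
# Suffix -> label lookup, per the A-I list above (short labels are illustrative)
RACE_SUFFIXES = {
    "A": "white", "B": "black", "C": "aian", "D": "asian", "E": "nhpi",
    "F": "other", "G": "two_plus", "H": "white_nh", "I": "hispanic",
}

def prefixed(table_id: str, column: str) -> str:
    """Hypothetical sketch: build the prefixed column name for a race variant."""
    suffix = table_id[-1]              # e.g. "B19001I" -> "I"
    return f"{RACE_SUFFIXES[suffix]}_{column}"

print(prefixed("B19001I", "total"))    # hispanic_total
print(prefixed("B19001B", "total"))    # black_total
```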

Discovering Variables

The Census helper makes it easy to find variables without memorizing codes:

from rclco.data.census import Census

census = Census()

# Browse by topic
census.list_topics()
# -> ['education', 'employment', 'health', 'households', 'housing',
#     'income', 'population', 'race', 'transportation']

# List variables within a topic
census.list_variables("housing")

# Search across all variables
census.search("median rent")

# Look up a specific variable
census.variable_info("median_household_income")
# -> {'name': 'median_household_income', 'code': 'B19013_001E', 'topic': 'income', ...}

# Find available detailed tables
census.list_tables("income")
# -> [{'id': 'B19001', 'description': 'Household Income in Past 12 Months', ...}, ...]

# See the column labels for a table before fetching
census.get_table_labels("B25024")

Using the Census Client Directly

For analysis outside the Shape workflow, the Census class can fetch data independently:

from rclco.data.census import Census

census = Census(year=2023)

# Get a Polars DataFrame
df = census.get(
    variables=["median_household_income", "total_population"],
    state="NC", county="Wake", geo_level="tract"
)

# Get an entire table
df = census.get_table("B19001", state="CA", geo_level="county")

# Get data for specific GEOIDs
df = census.get_for_geoids(
    variables=["median_household_income"],
    geoids=["37183052802", "37183052803"],
    geo_level="tract"
)

# Fetch geometry directly (returns GeoDataFrame via pygris)
gdf = census.get_shapes("tract", state="NC", county="Wake")

Esri Demographics

Enrich shapes with Esri demographic data from the RCLCO database:

# Default demographics dataset
shapes.enrich.demographics(year=2024)

# Income by age (current year and future year)
shapes.enrich.demographics(dataset="income_by_age_cy", year=2024, name="income_cy")
shapes.enrich.demographics(dataset="income_by_age_fy", year=2024, name="income_fy")

# Compare years
shapes.enrich.demographics(dataset="demographics", year=2020, name="demo_2020")
shapes.enrich.demographics(dataset="demographics", year=2024, name="demo_2024")

Available Esri datasets: demographics, income_by_age_cy, income_by_age_fy

Custom Data

Join your own DataFrames by attribute or spatial relationship:

# Attribute join on a shared column
shapes.enrich.custom(my_dataframe, on="geoid")

# Spatial join (requires a GeoDataFrame)
shapes.enrich.custom(points_gdf, spatial=True)
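
A spatial join assigns each incoming row to the shape whose geometry contains it. Stripped of the geospatial machinery, the idea is a per-point containment test; this pure-Python sketch uses bounding boxes for illustration (a real spatial join uses full geometry predicates, and all names here are made up):

```python
# Toy "shapes": name -> bounding box (minx, miny, maxx, maxy)
shapes = {
    "tract_A": (0.0, 0.0, 1.0, 1.0),
    "tract_B": (1.0, 0.0, 2.0, 1.0),
}
# Toy "points" to join: (name, x, y)
points = [("site1", 0.5, 0.5), ("site2", 1.5, 0.25)]

def contains(bbox, x, y):
    minx, miny, maxx, maxy = bbox
    return minx <= x <= maxx and miny <= y <= maxy

# Assign each point to the tract whose box contains it
assigned = {name: tract for name, x, y in points
            for tract, bbox in shapes.items() if contains(bbox, x, y)}
print(assigned)  # {'site1': 'tract_A', 'site2': 'tract_B'}
```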

Accessing Enriched Data

Each enrichment is stored as a separate Polars DataFrame in shapes.datasets:

# List all enriched datasets
print(shapes.datasets.keys())  # e.g., dict_keys(['B19001', 'B25024', 'demographics'])

# Access a specific dataset
income_df = shapes.datasets["B19001"]

# The base GeoDataFrame (geometry + any columns from loading)
shapes.data

When you call shapes.map() or shapes.plot(), all datasets are automatically merged onto the GeoDataFrame via GEOID for visualization.
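
Conceptually, that merge resembles a left join of each enriched dataset onto the base frame by GEOID. This pandas sketch, with made-up values, shows the shape of the operation (the library does this for you; it is not something you need to call):

```python
import pandas as pd

# Base frame standing in for the shapes' attribute table
base = pd.DataFrame({"GEOID": ["37183052802", "37183052803"]})

# Two "enriched datasets", each keyed by GEOID (values are illustrative)
datasets = {
    "income": pd.DataFrame({"GEOID": ["37183052802", "37183052803"],
                            "median_household_income": [72000, 85000]}),
    "housing": pd.DataFrame({"GEOID": ["37183052802", "37183052803"],
                             "renter_occupied": [410, 523]}),
}

# Left-join each dataset onto the base by the shared key
merged = base
for df in datasets.values():
    merged = merged.merge(df, on="GEOID", how="left")

print(list(merged.columns))
# ['GEOID', 'median_household_income', 'renter_occupied']
```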


Spatial Operations

All spatial methods return a new Shape, so you can chain them:

# Find census tracts that intersect a study area
study_area = Shape.from_wkt("POLYGON((...))")
tracts = study_area.intersecting(geo_unit="tract")

# Buffer (distance in meters)
buffered = shapes.buffer(1000)

# Dissolve all shapes into one, or group by a column
merged = shapes.dissolve()
by_county = shapes.dissolve(by="county")

# Clip to a boundary
clipped = shapes.clip(boundary)

# Other operations
centroids = shapes.centroid()
unified = shapes.union()
simplified = shapes.simplify(tolerance=0.001)
hulls = shapes.convex_hull()
boxes = shapes.envelope()
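
The chaining works because each method is non-mutating: it builds and returns a fresh Shape, leaving the original intact. This toy class (not the library's implementation) illustrates the pattern by recording the operations applied:

```python
class MiniShape:
    """Toy stand-in for Shape: every method returns a NEW instance."""
    def __init__(self, ops=()):
        self.ops = tuple(ops)  # record of operations, for illustration only

    def buffer(self, meters):
        return MiniShape(self.ops + (f"buffer({meters})",))

    def dissolve(self):
        return MiniShape(self.ops + ("dissolve",))

    def simplify(self, tolerance):
        return MiniShape(self.ops + (f"simplify({tolerance})",))

base = MiniShape()
chained = base.buffer(1000).dissolve().simplify(tolerance=0.001)
print(chained.ops)  # ('buffer(1000)', 'dissolve', 'simplify(0.001)')
print(base.ops)     # () -- the original is unchanged
```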

Filtering and Slicing

Shape supports DataFrame-like operations:

# Filter by condition
high_income = shapes.filter(shapes.data["median_household_income"] > 100000)

# Slice
first_ten = shapes.head(10)
last_five = shapes.tail(5)
random_sample = shapes.sample(20)

# Sort
sorted_shapes = shapes.sort_values("median_household_income", ascending=False)

# Drop or rename columns
cleaned = shapes.drop_columns(["unwanted_col"])
renamed = shapes.rename_columns({"old_name": "new_name"})

# Index and iterate
single = shapes[0]
subset = shapes[5:10]
for shape in shapes:
    print(shape.data)

Visualization

Interactive Maps (Folium)

# Basic boundary map
shapes.map()

# Choropleth colored by a data column (from any enriched dataset)
shapes.map(color_by="median_household_income", cmap="YlOrRd")

# Preview first N shapes
shapes.preview(5)

Static Plots (Matplotlib)

# Simple plot
shapes.plot()

# Choropleth with legend
shapes.plot(column="renter_occupied", cmap="Blues", legend=True)

Standalone Chart Functions

from rclco.viz import choropleth, bar_chart, histogram, scatter

# Choropleth from a GeoDataFrame
choropleth(gdf, column="population", title="Population by Tract")

# Bar chart, histogram, scatter
bar_chart(df, x="county", y="median_income", top_n=10)
histogram(df, column="rent", bins=30)
scatter(df, x="income", y="rent", color_by="county")

Export

# GeoDataFrame (with geometry)
gdf = shapes.to_geodataframe()

# Polars DataFrame (no geometry)
df = shapes.to_dataframe()

# pandas DataFrame (no geometry)
pdf = shapes.to_pandas()

# Files
shapes.to_file("output.geojson")
shapes.to_file("output.shp")
shapes.to_file("output.gpkg", layer="tracts")

# Serialization
wkt_list = shapes.to_wkt()
geojson_dict = shapes.to_geojson()
wkb_list = shapes.to_wkb()
records = shapes.to_dict()

End-to-End Example: Multifamily Market Analysis

This example mirrors a real RCLCO workflow — loading multi-county tract data, enriching with Census tables, and computing derived metrics:

from rclco.shapes import Shape

# Define study area
counties = {"Los Angeles": "Los Angeles", "Orange": "Orange"}
census_tables = ["B25003", "B25024", "B19001", "B19013", "B01001", "B23025", "B08301"]

# Load and enrich each county
county_shapes = {}
for name, county in counties.items():
    shapes = Shape.from_census("tract", state="CA", county=county)
    for table_id in census_tables:
        shapes.enrich.census_table(table_id, year=2023)
    county_shapes[name] = shapes

# Combine into a single GeoDataFrame for analysis
import geopandas as gpd
import pandas as pd

all_gdfs = []
for county_name, shapes in county_shapes.items():
    gdf = shapes._get_merged_data()
    gdf["county"] = county_name
    all_gdfs.append(gdf)

gdf_all = gpd.GeoDataFrame(pd.concat(all_gdfs, ignore_index=True))

# Now compute derived metrics (renter share, affordability, etc.)
gdf_all["renter_pct"] = (
    gdf_all["renter_occupied"] /
    (gdf_all["owner_occupied"] + gdf_all["renter_occupied"]) * 100
)

Properties Reference

Property Returns Description
shapes.data GeoDataFrame Underlying GeoDataFrame
shapes.geometry GeoSeries Geometry column
shapes.columns list[str] Column names
shapes.crs CRS Coordinate reference system
shapes.bounds DataFrame Per-shape bounding boxes
shapes.total_bounds tuple Overall bounding box (minx, miny, maxx, maxy)
shapes.datasets dict[str, DataFrame] Named Polars DataFrames from enrichment
shapes.enrich ShapeEnricher Enrichment interface
len(shapes) int Number of shapes

Installation

Install from PyPI:

pip install rclco

Or using uv:

uv pip install rclco

Development

This project uses uv for dependency management. uv is an extremely fast Python package manager written in Rust that replaces pip, poetry, pyenv, and virtualenv.

Installing uv

Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Alternative (via pip):

pip install uv

After installation, restart your terminal so that uv is available on your PATH (on Windows with Chocolatey installed, refreshenv also works).

Getting Started (Full Workflow)

  1. Clone the repository:

git clone https://github.com/RCLCO-RFA/python-rclco.git
cd python-rclco

  2. Install all dependencies (including dev dependencies):

uv sync --all-extras

This command will:

  • Create a virtual environment in .venv (if it doesn't exist)
  • Install all project dependencies
  • Install the package in editable mode

  3. Activate the virtual environment (optional):

uv commands automatically use the virtual environment, but if you want to activate it manually:

# Windows PowerShell
.venv\Scripts\Activate.ps1

# Windows Command Prompt
.venv\Scripts\activate.bat

# macOS/Linux
source .venv/bin/activate

Common uv Commands

Task Command
Install all dependencies uv sync
Install with dev dependencies uv sync --all-extras
Add a new dependency uv add <package>
Add a dev dependency uv add --dev <package>
Remove a dependency uv remove <package>
Update all dependencies uv lock --upgrade then uv sync
Run a command in the venv uv run <command>
Run Python uv run python
Run tests uv run pytest

Adding Dependencies

Add a runtime dependency:

uv add requests

Add a dev-only dependency:

uv add --dev black ruff mypy

Add a dependency with version constraints:

uv add "pandas>=2.0"

After adding dependencies, the pyproject.toml and uv.lock files will be updated automatically. Commit both files to version control.

Running Tests

uv run pytest

To run with verbose output:

uv run pytest -v

Building and Publishing

This project uses tag-based versioning with hatch-vcs. The version is automatically derived from git tags — no need to manually edit version strings in code.

How Versioning Works

  • The version is determined by git tags (e.g., v0.1.2 → version 0.1.2)
  • During development, the version includes git metadata (e.g., 0.1.2.dev3+g1234567)
  • When you build from a tagged commit, you get a clean version (e.g., 0.1.2)
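
The mapping from git state to version string can be sketched as a small parser over git describe --tags output. This is a simplified illustration of what hatch-vcs computes, using the example versions above, not the actual implementation:

```python
import re

def version_from_describe(describe: str) -> str:
    """Illustrative only: map `git describe --tags` output to a
    version string in the style shown above."""
    m = re.fullmatch(r"v(\d+\.\d+\.\d+)(?:-(\d+)-g([0-9a-f]+))?", describe)
    if m is None:
        raise ValueError(f"unrecognized describe output: {describe}")
    base, distance, commit = m.groups()
    if distance is None:
        return base                             # built from a tagged commit
    return f"{base}.dev{distance}+g{commit}"    # N commits past the tag

print(version_from_describe("v0.1.2"))             # 0.1.2
print(version_from_describe("v0.1.2-3-g1234567"))  # 0.1.2.dev3+g1234567
```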

Creating a Release

  1. Ensure all changes are committed and pushed to main

  2. Create and push a version tag:

git tag v0.2.0
git push origin v0.2.0

  3. GitHub Actions automatically:
    • Runs all tests
    • Builds the package
    • Creates a GitHub Release with auto-generated release notes
    • Publishes to PyPI

Manual Build (for testing)

Build the package locally:

uv build

This creates distribution files in the dist/ directory.

Publish manually (if needed):

uv publish --token YOUR_PYPI_TOKEN

Setting Up PyPI Publishing (for maintainers)

To enable automatic publishing to PyPI:

  1. Create a PyPI API Token:

  2. Add the token to GitHub Secrets:

    • Go to your repo → Settings → Secrets and variables → Actions
    • Click New repository secret
    • Name: PYPI_TOKEN
    • Value: paste your PyPI token
    • Click Add secret

Version Tag Format

Use semantic versioning with a v prefix:

Tag Version Description
v0.1.0 0.1.0 Initial release
v0.1.1 0.1.1 Patch release (bug fixes)
v0.2.0 0.2.0 Minor release (new features)
v1.0.0 1.0.0 Major release (breaking changes)

Git Workflow for Contributors

This section outlines our team's Git workflow. It's designed to be simple and safe for developers of all experience levels while maintaining code quality.

Overview: Branch-Based Development

We use a feature branch workflow:

main (protected) ← Pull Requests ← feature branches
  • main is our stable branch — always deployable
  • All changes go through feature branches and Pull Requests (PRs)
  • No one pushes directly to main — changes are merged via PR only

Quick Reference Card

What you want to do Command(s)
Start new work git checkout main && git pull && git checkout -b your-name/feature-description
Save your work git add . && git commit -m "description"
Push to GitHub git push -u origin your-branch-name (first time) or git push (after)
Update your branch with latest main git checkout main && git pull && git checkout your-branch && git merge main
See what's changed git status
See your branches git branch
Switch branches git checkout branch-name

Step-by-Step Workflow

1. Before Starting Any New Work

Always start from an up-to-date main branch:

# Switch to main branch
git checkout main

# Get the latest changes from GitHub
git pull origin main

2. Create a Feature Branch

Create a new branch for your work. Use this naming convention:

your-name/short-description

Examples:

  • sarah/add-census-data-loader
  • john/fix-api-timeout
  • maria/update-documentation

# Create and switch to your new branch
git checkout -b your-name/feature-description

3. Make Your Changes

Edit code, add files, etc. Check your changes often:

# See what files have changed
git status

# See the actual changes in files
git diff

4. Save Your Work (Commit Often!)

Commit your changes frequently with clear messages:

# Stage all your changes
git add .

# Or stage specific files
git add path/to/file.py

# Commit with a descriptive message
git commit -m "Add function to load census data from API"

Good commit messages:

  • ✅ "Add census data loader function"
  • ✅ "Fix timeout error in API requests"
  • ✅ "Update README with installation steps"

Avoid vague messages:

  • ❌ "Fixed stuff"
  • ❌ "WIP"
  • ❌ "Changes"

5. Push Your Branch to GitHub

# First time pushing this branch
git push -u origin your-name/feature-description

# After the first push, just use
git push

6. Create a Pull Request (PR)

  1. Go to the repository on GitHub
  2. You'll see a prompt to "Compare & pull request" — click it
  3. Fill out the PR template:
    • Title: Brief description of what this PR does
    • Description: Explain what you changed and why
  4. Request a review from a teammate
  5. Click "Create pull request"

7. Address Review Feedback

If reviewers request changes:

# Make the requested changes in your code
# Then commit and push
git add .
git commit -m "Address review feedback: improve error handling"
git push

The PR will automatically update with your new commits.

8. Merge Your PR

Once approved:

  1. Click "Squash and merge" (recommended) or "Merge pull request"
  2. Delete your branch when prompted

9. Clean Up Locally

After your PR is merged:

# Switch back to main
git checkout main

# Get the merged changes
git pull origin main

# Delete your local feature branch (optional but recommended)
git branch -d your-name/feature-description

Handling Common Situations

Your Branch is Behind Main

If main has been updated while you were working:

# Make sure your changes are committed first
git add .
git commit -m "Your commit message"

# Get the latest main
git checkout main
git pull origin main

# Go back to your branch and merge main into it
git checkout your-name/feature-description
git merge main

If there are merge conflicts, see the section below.

Resolving Merge Conflicts

When Git can't automatically merge changes, you'll see conflict markers in files:

<<<<<<< HEAD
# Your version of the code
def load_data():
    return pd.read_csv("data.csv")
=======
# The other version of the code
def load_data():
    return pd.read_excel("data.xlsx")
>>>>>>> main

To resolve:

  1. Open the conflicted file(s)
  2. Decide which code to keep (or combine both)
  3. Remove the conflict markers (<<<<<<<, =======, >>>>>>>)
  4. Save the file
  5. Stage and commit:
git add .
git commit -m "Resolve merge conflicts with main"
git push

If you're unsure how to resolve a conflict, ask for help! It's better to ask than to accidentally delete someone's work.

Oops! I Made Changes on Main

If you accidentally started working on main:

# Create a new branch with your changes
git checkout -b your-name/accidental-changes

# Your changes are now on the new branch
# Push and create a PR as normal
git push -u origin your-name/accidental-changes

Oops! I Need to Undo My Last Commit

If you haven't pushed yet:

# Undo the commit but keep the changes
git reset --soft HEAD~1

If you already pushed, ask for help before trying to undo — it's more complicated.

I Want to See What's on Another Branch

# List all branches
git branch -a

# Switch to another branch (make sure your work is committed first!)
git checkout branch-name

Best Practices

Do ✅

  • Pull before you start — always git pull on main before creating a branch
  • Commit often — small, frequent commits are easier to understand and undo
  • Write clear commit messages — your future self will thank you
  • Push your branch daily — protects your work and lets others see your progress
  • Ask for help — Git can be confusing; there's no shame in asking
  • Run tests before pushing — uv run pytest

Don't ❌

  • Don't push directly to main — always use a PR
  • Don't force push (git push --force) — unless you really know what you're doing
  • Don't commit sensitive data — API keys, passwords, etc. (use .env files)
  • Don't commit large data files — use .gitignore for CSVs, Excel files, etc.

Git Glossary

Term What it means
Repository (repo) The project folder tracked by Git
Branch An independent line of development
Commit A saved snapshot of your changes
Push Upload your commits to GitHub
Pull Download changes from GitHub
Merge Combine changes from one branch into another
Pull Request (PR) A request to merge your branch into main
Clone Download a copy of a repository
Stage Mark files to be included in the next commit
HEAD The current commit you're working on
Origin The default name for the remote GitHub repository

Getting Help

Git command help:

git help <command>
# Example: git help merge

Check repository status:

git status

See commit history:

git log --oneline -10

Still stuck? Ask the team! Git mistakes are almost always recoverable.


Development Workflow Summary

1. Clone repo          → git clone ... && cd python-rclco
2. Install deps        → uv sync --all-extras
3. Update main         → git checkout main && git pull
4. Create branch       → git checkout -b your-name/feature-description
5. Make changes        → edit code
6. Add dependencies    → uv add <package> or uv add --dev <package>
7. Run tests           → uv run pytest
8. Commit changes      → git add . && git commit -m "..."
9. Push branch         → git push -u origin your-name/feature-description
10. Open PR            → merge to main after review
11. Release (maintainer) → git tag v0.2.0 && git push origin v0.2.0

License

See LICENSE file for details.
