# RCLCO Python Library

Core data connectivity for databases, APIs, storage, and SharePoint: a Python library for RCLCO data access and analytics.
## Quick Start

The library uses a geography-first approach: load shapes, enrich them with data, then analyze or visualize.

```python
from rclco.shapes import Shape

# Load census tracts
shapes = Shape.from_census("tract", state="NC", county="Wake")

# Enrich with Census income distribution data
shapes.enrich.census_table("B19001")

# Visualize on an interactive map
shapes.map(color_by="$200,000_or_more", cmap="YlOrRd")
```
## Loading Shapes

Every workflow starts by loading shapes from one of several sources.

### From Census Geography

Load tracts, block groups, counties, or other census units by state/county name:

```python
from rclco.shapes import Shape

# Tracts in a county
shapes = Shape.from_census("tract", state="NC", county="Wake")

# Block groups for a whole state
shapes = Shape.from_census("block_group", state="CA")

# Specific geographies by GEOID
shapes = Shape.from_census("tract", geoids=["37183052802", "37183052803"])
```

Supported `geo_unit` values: `tract`, `block_group`, `county`, `state`, `zip`, `place`, `cbsa`, `msa`
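For reference, the example GEOIDs above are concatenated FIPS codes: a 2-digit state, a 3-digit county, and a 6-digit tract. A small helper (illustrative only, not part of rclco) makes that structure explicit:

```python
def split_tract_geoid(geoid: str) -> dict:
    """Split an 11-digit census tract GEOID into its FIPS components.

    Tract GEOIDs concatenate the 2-digit state FIPS, the 3-digit
    county FIPS, and the 6-digit tract code.
    """
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit tract GEOID, got {geoid!r}")
    return {"state": geoid[:2], "county": geoid[2:5], "tract": geoid[5:]}

print(split_tract_geoid("37183052802"))
# {'state': '37', 'county': '183', 'tract': '052802'}
```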
### From Coordinates, Geometry, or Files

```python
# From a point (optionally buffered to a polygon)
shapes = Shape.from_point(35.9, -78.9, buffer_meters=1000)

# From WKT
shapes = Shape.from_wkt(
    "POLYGON((-78.9 35.9, -78.8 35.9, -78.8 35.8, -78.9 35.8, -78.9 35.9))",
    properties={"name": "Study Area"}
)

# From GeoJSON (dict, Feature, or FeatureCollection)
shapes = Shape.from_geojson({
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-78.9, 35.9]},
    "properties": {"name": "Downtown"}
})

# From a file (GeoJSON, Shapefile, GeoPackage)
shapes = Shape.from_file("boundaries.geojson")

# From an existing GeoDataFrame
shapes = Shape.from_geodataframe(gdf)

# From a named shape in the database
shapes = Shape.from_database("Downtown Raleigh Boundary")
```
## Enriching Shapes with Data

Enrichment adds data columns to your shapes. Each enrichment is stored as a named Polars DataFrame in `shapes.datasets`, so you can pull multiple datasets without column-name collisions. Data is automatically merged for visualization.
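To make the collision point concrete, here is a toy sketch of the layout, with plain dicts standing in for Polars DataFrames (not actual rclco internals):

```python
# Two enrichments that both produce a "total" column; because each lives
# under its own dataset name, the columns never collide.
datasets = {}
datasets["housing"] = {"geoid": ["37183052802"], "total": [1500]}  # e.g., housing units
datasets["income"] = {"geoid": ["37183052802"], "total": [2100]}   # e.g., households

print(datasets["housing"]["total"][0], datasets["income"]["total"][0])
# 1500 2100
```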
### Census Bureau Data (ACS)

#### Individual Variables

Use human-readable names or raw Census codes. The library includes a curated catalog with friendly names.

```python
shapes.enrich.census(
    variables=["median_household_income", "total_population"],
    year=2023
)

# Data is stored as a Polars DataFrame
shapes.datasets["census"]
```
#### Pre-Built Variable Groups

For common analysis patterns, use the built-in variable groups instead of listing variables one by one:

```python
from rclco.data.census import Census

census = Census()

# See all available groups
census.groups.basic_demographics  # population, age, households
census.groups.race_ethnicity      # race and ethnicity breakdown
census.groups.income_poverty      # income, poverty, Gini index
census.groups.housing_basics      # units, tenure, rent, home value
census.groups.housing_detailed    # structure types (SF, duplex, multifamily)
census.groups.education           # educational attainment
census.groups.employment          # labor force, earnings
census.groups.commute             # transportation mode, travel time

# Use a group to enrich
shapes.enrich.census(variables=census.groups.housing_basics, name="housing")
shapes.enrich.census(variables=census.groups.income_poverty, name="income")

# Access each dataset separately
shapes.datasets["housing"]
shapes.datasets["income"]
```
#### Entire Census Tables

When you need all the cross-tabulated detail from a table (e.g., every income bracket, every age group):

```python
# Full income distribution (all 16 brackets)
shapes.enrich.census_table("B19001")

# Full age-by-sex breakdown
shapes.enrich.census_table("B01001")

# Housing structure types
shapes.enrich.census_table("B25024")

# Access the data
shapes.datasets["B19001"]
```
#### Race/Ethnicity Breakdowns

Many Census tables have race-specific variants (suffixes A-I). Fetch them individually or combine them all at once:

```python
# Single race variant
shapes.enrich.census_table("B19001I", name="income_hispanic")  # Hispanic
shapes.enrich.census_table("B19001B", name="income_black")     # Black

# Or combine ALL race variants into one dataset with prefixed columns
shapes.enrich.census_table("B19001", add_race_variants=True)
# Creates columns like: white_total, black_total, hispanic_total, ...
```

Race/ethnicity suffixes: A=White, B=Black, C=AIAN, D=Asian, E=NHPI, F=Other, G=Two+, H=White Non-Hispanic, I=Hispanic
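Since the suffix scheme is regular, building the variant IDs is mechanical. This sketch (a hypothetical helper with illustrative labels, not rclco API) derives each race-variant table ID from the mapping above:

```python
# Suffix-to-label mapping from the list above (label spellings are illustrative)
RACE_SUFFIXES = {
    "A": "white", "B": "black", "C": "aian", "D": "asian", "E": "nhpi",
    "F": "other", "G": "two_or_more", "H": "white_non_hispanic", "I": "hispanic",
}

def race_variant_ids(table_id: str) -> dict:
    """Map each race/ethnicity label to its variant table ID, e.g. B19001 -> B19001I."""
    return {label: f"{table_id}{suffix}" for suffix, label in RACE_SUFFIXES.items()}

print(race_variant_ids("B19001")["hispanic"])  # B19001I
```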
#### Discovering Variables

The Census helper makes it easy to find variables without memorizing codes:

```python
from rclco.data.census import Census

census = Census()

# Browse by topic
census.list_topics()
# -> ['education', 'employment', 'health', 'households', 'housing',
#     'income', 'population', 'race', 'transportation']

# List variables within a topic
census.list_variables("housing")

# Search across all variables
census.search("median rent")

# Look up a specific variable
census.variable_info("median_household_income")
# -> {'name': 'median_household_income', 'code': 'B19013_001E', 'topic': 'income', ...}

# Find available detailed tables
census.list_tables("income")
# -> [{'id': 'B19001', 'description': 'Household Income in Past 12 Months', ...}, ...]

# See the column labels for a table before fetching
census.get_table_labels("B25024")
```
#### Using the Census Client Directly

For analysis outside the Shape workflow, the Census class can fetch data independently:

```python
from rclco.data.census import Census

census = Census(year=2023)

# Get a Polars DataFrame
df = census.get(
    variables=["median_household_income", "total_population"],
    state="NC", county="Wake", geo_level="tract"
)

# Get an entire table
df = census.get_table("B19001", state="CA", geo_level="county")

# Get data for specific GEOIDs
df = census.get_for_geoids(
    variables=["median_household_income"],
    geoids=["37183052802", "37183052803"],
    geo_level="tract"
)

# Fetch geometry directly (returns GeoDataFrame via pygris)
gdf = census.get_shapes("tract", state="NC", county="Wake")
```
### Esri Demographics

Enrich shapes with Esri demographic data from the RCLCO database:

```python
# Default demographics dataset
shapes.enrich.demographics(year=2024)

# Income by age (current year and future year)
shapes.enrich.demographics(dataset="income_by_age_cy", year=2024, name="income_cy")
shapes.enrich.demographics(dataset="income_by_age_fy", year=2024, name="income_fy")

# Compare years
shapes.enrich.demographics(dataset="demographics", year=2020, name="demo_2020")
shapes.enrich.demographics(dataset="demographics", year=2024, name="demo_2024")
```

Available Esri datasets: `demographics`, `income_by_age_cy`, `income_by_age_fy`
### Custom Data

Join your own DataFrames by attribute or spatial relationship:

```python
# Attribute join on a shared column
shapes.enrich.custom(my_dataframe, on="geoid")

# Spatial join (requires a GeoDataFrame)
shapes.enrich.custom(points_gdf, spatial=True)
```
### Accessing Enriched Data

Each enrichment is stored as a separate Polars DataFrame in `shapes.datasets`:

```python
# List all enriched datasets
print(shapes.datasets.keys())  # e.g., dict_keys(['B19001', 'B25024', 'demographics'])

# Access a specific dataset
income_df = shapes.datasets["B19001"]

# The base GeoDataFrame (geometry + any columns from loading)
shapes.data
```

When you call `shapes.map()` or `shapes.plot()`, all datasets are automatically merged onto the GeoDataFrame via GEOID for visualization.
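As a rough illustration of what that GEOID merge amounts to (plain Python standing in for the real GeoDataFrame/Polars machinery):

```python
# Base shapes keyed by GEOID, plus one enriched dataset to fold in
base = [
    {"geoid": "37183052802", "name": "Tract 528.02"},
    {"geoid": "37183052803", "name": "Tract 528.03"},
]
income_by_geoid = {"37183052802": 85000, "37183052803": 62000}

# Attribute join on GEOID: each shape row picks up its matching data value
merged = [
    {**row, "median_household_income": income_by_geoid.get(row["geoid"])}
    for row in base
]
print(merged[0])
# {'geoid': '37183052802', 'name': 'Tract 528.02', 'median_household_income': 85000}
```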
## Spatial Operations

All spatial methods return a new Shape, so you can chain them:

```python
# Find census tracts that intersect a study area
study_area = Shape.from_wkt("POLYGON((...))")
tracts = study_area.intersecting(geo_unit="tract")

# Buffer (distance in meters)
buffered = shapes.buffer(1000)

# Dissolve all shapes into one, or group by a column
merged = shapes.dissolve()
by_county = shapes.dissolve(by="county")

# Clip to a boundary
clipped = shapes.clip(boundary)

# Other operations
centroids = shapes.centroid()
unified = shapes.union()
simplified = shapes.simplify(tolerance=0.001)
hulls = shapes.convex_hull()
boxes = shapes.envelope()
```
## Filtering and Slicing

Shape supports DataFrame-like operations:

```python
# Filter by condition
high_income = shapes.filter(shapes.data["median_household_income"] > 100000)

# Slice
first_ten = shapes.head(10)
last_five = shapes.tail(5)
random_sample = shapes.sample(20)

# Sort
sorted_shapes = shapes.sort_values("median_household_income", ascending=False)

# Drop or rename columns
cleaned = shapes.drop_columns(["unwanted_col"])
renamed = shapes.rename_columns({"old_name": "new_name"})

# Index and iterate
single = shapes[0]
subset = shapes[5:10]
for shape in shapes:
    print(shape.data)
```
## Visualization

### Interactive Maps (Folium)

```python
# Basic boundary map
shapes.map()

# Choropleth colored by a data column (from any enriched dataset)
shapes.map(color_by="median_household_income", cmap="YlOrRd")

# Preview first N shapes
shapes.preview(5)
```

### Static Plots (Matplotlib)

```python
# Simple plot
shapes.plot()

# Choropleth with legend
shapes.plot(column="renter_occupied", cmap="Blues", legend=True)
```
### Standalone Chart Functions

```python
from rclco.viz import choropleth, bar_chart, histogram, scatter

# Choropleth from a GeoDataFrame
choropleth(gdf, column="population", title="Population by Tract")

# Bar chart, histogram, scatter
bar_chart(df, x="county", y="median_income", top_n=10)
histogram(df, column="rent", bins=30)
scatter(df, x="income", y="rent", color_by="county")
```
## Export

```python
# GeoDataFrame (with geometry)
gdf = shapes.to_geodataframe()

# Polars DataFrame (no geometry)
df = shapes.to_dataframe()

# pandas DataFrame (no geometry)
pdf = shapes.to_pandas()

# Files
shapes.to_file("output.geojson")
shapes.to_file("output.shp")
shapes.to_file("output.gpkg", layer="tracts")

# Serialization
wkt_list = shapes.to_wkt()
geojson_dict = shapes.to_geojson()
wkb_list = shapes.to_wkb()
records = shapes.to_dict()
```
## End-to-End Example: Multifamily Market Analysis

This example mirrors a real RCLCO workflow: loading multi-county tract data, enriching with Census tables, and computing derived metrics.

```python
import geopandas as gpd
import pandas as pd

from rclco.shapes import Shape

# Define study area
counties = {"Los Angeles": "Los Angeles", "Orange": "Orange"}
census_tables = ["B25003", "B25024", "B19001", "B19013", "B01001", "B23025", "B08301"]

# Load and enrich each county
county_shapes = {}
for name, county in counties.items():
    shapes = Shape.from_census("tract", state="CA", county=county)
    for table_id in census_tables:
        shapes.enrich.census_table(table_id, year=2023)
    county_shapes[name] = shapes

# Combine into a single GeoDataFrame for analysis
all_gdfs = []
for county_name, shapes in county_shapes.items():
    gdf = shapes._get_merged_data()
    gdf["county"] = county_name
    all_gdfs.append(gdf)
gdf_all = gpd.GeoDataFrame(pd.concat(all_gdfs, ignore_index=True))

# Now compute derived metrics (renter share, affordability, etc.)
gdf_all["renter_pct"] = (
    gdf_all["renter_occupied"]
    / (gdf_all["owner_occupied"] + gdf_all["renter_occupied"])
    * 100
)
```
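One caveat with a derived metric like renter share: tracts with zero occupied units divide by zero. A guarded helper (illustrative only, not part of rclco) sidesteps that:

```python
def renter_share(owner_occupied, renter_occupied):
    """Renter share of occupied units (%), or None for tracts with no occupied units."""
    occupied = owner_occupied + renter_occupied
    if occupied == 0:
        return None
    return renter_occupied / occupied * 100

print(renter_share(600, 400))  # 40.0
print(renter_share(0, 0))      # None
```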
## Properties Reference

| Property | Returns | Description |
|---|---|---|
| `shapes.data` | GeoDataFrame | Underlying GeoDataFrame |
| `shapes.geometry` | GeoSeries | Geometry column |
| `shapes.columns` | list[str] | Column names |
| `shapes.crs` | CRS | Coordinate reference system |
| `shapes.bounds` | DataFrame | Per-shape bounding boxes |
| `shapes.total_bounds` | tuple | Overall bounding box (minx, miny, maxx, maxy) |
| `shapes.datasets` | dict[str, DataFrame] | Named Polars DataFrames from enrichment |
| `shapes.enrich` | ShapeEnricher | Enrichment interface |
| `len(shapes)` | int | Number of shapes |
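To illustrate the difference between `bounds` and `total_bounds`: the overall box is just the fold of the per-shape boxes. A plain-Python sketch (not the library's actual implementation):

```python
def overall_bounds(per_shape_boxes):
    """Fold per-shape (minx, miny, maxx, maxy) boxes into one overall box,
    mirroring the relationship between shapes.bounds and shapes.total_bounds."""
    minx = min(b[0] for b in per_shape_boxes)
    miny = min(b[1] for b in per_shape_boxes)
    maxx = max(b[2] for b in per_shape_boxes)
    maxy = max(b[3] for b in per_shape_boxes)
    return (minx, miny, maxx, maxy)

boxes = [(-78.9, 35.8, -78.8, 35.9), (-78.95, 35.85, -78.85, 35.95)]
print(overall_bounds(boxes))  # (-78.95, 35.8, -78.8, 35.95)
```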
## Installation

Install from PyPI:

```bash
pip install rclco
```

Or using uv:

```bash
uv pip install rclco
```

## Development

This project uses uv for dependency management. uv is an extremely fast Python package manager written in Rust that replaces pip, poetry, pyenv, and virtualenv.

### Installing uv

Windows (PowerShell):

```powershell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

macOS/Linux:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Alternative (via pip):

```bash
pip install uv
```

After installation, restart your terminal (or, on Windows, run `refreshenv`) so that uv is available on your PATH.
### Getting Started (Full Workflow)

1. Clone the repository:

   ```bash
   git clone https://github.com/RCLCO-RFA/python-rclco.git
   cd python-rclco
   ```

2. Install all dependencies (including dev dependencies):

   ```bash
   uv sync --all-extras
   ```

   This command will:
   - Create a virtual environment in `.venv` (if it doesn't exist)
   - Install all project dependencies
   - Install the package in editable mode

3. Activate the virtual environment (optional):

   uv commands automatically use the virtual environment, but if you want to activate it manually:

   ```bash
   # Windows PowerShell
   .venv\Scripts\Activate.ps1

   # Windows Command Prompt
   .venv\Scripts\activate.bat

   # macOS/Linux
   source .venv/bin/activate
   ```
### Common uv Commands

| Task | Command |
|---|---|
| Install all dependencies | `uv sync` |
| Install with dev dependencies | `uv sync --all-extras` |
| Add a new dependency | `uv add <package>` |
| Add a dev dependency | `uv add --dev <package>` |
| Remove a dependency | `uv remove <package>` |
| Update all dependencies | `uv lock --upgrade`, then `uv sync` |
| Run a command in the venv | `uv run <command>` |
| Run Python | `uv run python` |
| Run tests | `uv run pytest` |
### Adding Dependencies

Add a runtime dependency:

```bash
uv add requests
```

Add a dev-only dependency:

```bash
uv add --dev black ruff mypy
```

Add a dependency with version constraints:

```bash
uv add "pandas>=2.0"
```

After adding dependencies, `pyproject.toml` and `uv.lock` are updated automatically. Commit both files to version control.

### Running Tests

```bash
uv run pytest
```

To run with verbose output:

```bash
uv run pytest -v
```
## Building and Publishing

This project uses tag-based versioning with hatch-vcs. The version is automatically derived from git tags, so there is no need to manually edit version strings in code.

### How Versioning Works

- The version is determined by git tags (e.g., tag `v0.1.2` → version `0.1.2`)
- During development, the version includes git metadata (e.g., `0.1.2.dev3+g1234567`)
- When you build from a tagged commit, you get a clean version (e.g., `0.1.2`)
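For clean release tags, the tag-to-version mapping is just a prefix strip. A toy sketch of that rule (hatch-vcs itself also derives the `.devN+g...` metadata versions):

```python
def version_from_tag(tag: str) -> str:
    """Toy sketch of the tag-to-version rule for clean release tags: 'v0.1.2' -> '0.1.2'."""
    if not tag.startswith("v"):
        raise ValueError(f"expected a v-prefixed tag, got {tag!r}")
    return tag[1:]

print(version_from_tag("v0.1.2"))  # 0.1.2
```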
### Creating a Release

1. Ensure all changes are committed and pushed to main.
2. Create and push a version tag:

   ```bash
   git tag v0.2.0
   git push origin v0.2.0
   ```

3. GitHub Actions then automatically:
   - Runs all tests
   - Builds the package
   - Creates a GitHub Release with auto-generated release notes
   - Publishes to PyPI
### Manual Build (for testing)

Build the package locally:

```bash
uv build
```

This creates distribution files in the `dist/` directory.

Publish manually (if needed):

```bash
uv publish --token YOUR_PYPI_TOKEN
```
### Setting Up PyPI Publishing (for maintainers)

To enable automatic publishing to PyPI:

1. Create a PyPI API token:
   - Go to https://pypi.org/manage/account/token/
   - Create a token scoped to the `rclco` project
   - Copy the token (starts with `pypi-`)
2. Add the token to GitHub Secrets:
   - Go to your repo → Settings → Secrets and variables → Actions
   - Click "New repository secret"
   - Name: `PYPI_TOKEN`
   - Value: paste your PyPI token
   - Click "Add secret"
### Version Tag Format

Use semantic versioning with a `v` prefix:

| Tag | Version | Description |
|---|---|---|
| `v0.1.0` | 0.1.0 | Initial release |
| `v0.1.1` | 0.1.1 | Patch release (bug fixes) |
| `v0.2.0` | 0.2.0 | Minor release (new features) |
| `v1.0.0` | 1.0.0 | Major release (breaking changes) |
## Git Workflow for Contributors

This section outlines our team's Git workflow. It's designed to be simple and safe for developers of all experience levels while maintaining code quality.

### Overview: Branch-Based Development

We use a feature branch workflow:

```
main (protected) ← Pull Requests ← feature branches
```

- `main` is our stable branch, always deployable
- All changes go through feature branches and Pull Requests (PRs)
- No one pushes directly to `main`; changes are merged via PR only
### Quick Reference Card

| What you want to do | Command(s) |
|---|---|
| Start new work | `git checkout main` → `git pull` → `git checkout -b your-name/feature-description` |
| Save your work | `git add .` → `git commit -m "description"` |
| Push to GitHub | `git push -u origin your-branch-name` (first time), then `git push` |
| Update your branch with latest main | `git checkout main` → `git pull` → `git checkout your-branch` → `git merge main` |
| See what's changed | `git status` |
| See your branches | `git branch` |
| Switch branches | `git checkout branch-name` |
### Step-by-Step Workflow

#### 1. Before Starting Any New Work

Always start from an up-to-date main branch:

```bash
# Switch to main branch
git checkout main

# Get the latest changes from GitHub
git pull origin main
```

#### 2. Create a Feature Branch

Create a new branch for your work, using the naming convention `your-name/short-description`. Examples:

- `sarah/add-census-data-loader`
- `john/fix-api-timeout`
- `maria/update-documentation`

```bash
# Create and switch to your new branch
git checkout -b your-name/feature-description
```
#### 3. Make Your Changes

Edit code, add files, etc. Check your changes often:

```bash
# See what files have changed
git status

# See the actual changes in files
git diff
```

#### 4. Save Your Work (Commit Often!)

Commit your changes frequently with clear messages:

```bash
# Stage all your changes
git add .

# Or stage specific files
git add path/to/file.py

# Commit with a descriptive message
git commit -m "Add function to load census data from API"
```
Good commit messages:

- ✅ "Add census data loader function"
- ✅ "Fix timeout error in API requests"
- ✅ "Update README with installation steps"

Avoid vague messages:

- ❌ "Fixed stuff"
- ❌ "WIP"
- ❌ "Changes"

#### 5. Push Your Branch to GitHub

```bash
# First time pushing this branch
git push -u origin your-name/feature-description

# After the first push, just use
git push
```
#### 6. Create a Pull Request (PR)

1. Go to the repository on GitHub
2. You'll see a prompt to "Compare & pull request"; click it
3. Fill out the PR template:
   - Title: brief description of what this PR does
   - Description: explain what you changed and why
4. Request a review from a teammate
5. Click "Create pull request"
#### 7. Address Review Feedback

If reviewers request changes:

```bash
# Make the requested changes in your code, then commit and push
git add .
git commit -m "Address review feedback: improve error handling"
git push
```

The PR will automatically update with your new commits.

#### 8. Merge Your PR

Once approved:

1. Click "Squash and merge" (recommended) or "Merge pull request"
2. Delete your branch when prompted

#### 9. Clean Up Locally

After your PR is merged:

```bash
# Switch back to main
git checkout main

# Get the merged changes
git pull origin main

# Delete your local feature branch (optional but recommended)
git branch -d your-name/feature-description
```
### Handling Common Situations

#### Your Branch is Behind Main

If main has been updated while you were working:

```bash
# Make sure your changes are committed first
git add .
git commit -m "Your commit message"

# Get the latest main
git checkout main
git pull origin main

# Go back to your branch and merge main into it
git checkout your-name/feature-description
git merge main
```

If there are merge conflicts, see the section below.
#### Resolving Merge Conflicts

When Git can't automatically merge changes, you'll see conflict markers in files:

```
<<<<<<< HEAD
# Your version of the code
def load_data():
    return pd.read_csv("data.csv")
=======
# The other version of the code
def load_data():
    return pd.read_excel("data.xlsx")
>>>>>>> main
```

To resolve:

1. Open the conflicted file(s)
2. Decide which code to keep (or combine both)
3. Remove the conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`)
4. Save the file
5. Stage and commit:

   ```bash
   git add .
   git commit -m "Resolve merge conflicts with main"
   git push
   ```

If you're unsure how to resolve a conflict, ask for help! It's better to ask than to accidentally delete someone's work.
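If you want a quick sanity check that no markers survived your edits, a few lines of Python (a hypothetical helper, not part of this repo's tooling) can scan a file's text:

```python
CONFLICT_MARKERS = ("<<<<<<<", "=======", ">>>>>>>")

def leftover_conflict_lines(text: str) -> list:
    """Return 1-based line numbers that still begin with a Git conflict marker."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if line.startswith(CONFLICT_MARKERS)
    ]

sample = "def load_data():\n<<<<<<< HEAD\n    return 1\n"
print(leftover_conflict_lines(sample))  # [2]
```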
#### Oops! I Made Changes on Main

If you accidentally started working on main:

```bash
# Create a new branch with your changes
git checkout -b your-name/accidental-changes

# Your changes are now on the new branch; push and create a PR as normal
git push -u origin your-name/accidental-changes
```

#### Oops! I Need to Undo My Last Commit

If you haven't pushed yet:

```bash
# Undo the commit but keep the changes
git reset --soft HEAD~1
```

If you already pushed, ask for help before trying to undo; it's more complicated.

#### I Want to See What's on Another Branch

```bash
# List all branches
git branch -a

# Switch to another branch (make sure your work is committed first!)
git checkout branch-name
```
### Best Practices

Do ✅

- Pull before you start: always `git pull` on main before creating a branch
- Commit often: small, frequent commits are easier to understand and undo
- Write clear commit messages: your future self will thank you
- Push your branch daily: protects your work and lets others see your progress
- Ask for help: Git can be confusing; there's no shame in asking
- Run tests before pushing: `uv run pytest`

Don't ❌

- Don't push directly to main: always use a PR
- Don't force push (`git push --force`) unless you really know what you're doing
- Don't commit sensitive data: API keys, passwords, etc. (use `.env` files)
- Don't commit large data files: use `.gitignore` for CSVs, Excel files, etc.
### Git Glossary
| Term | What it means |
|---|---|
| Repository (repo) | The project folder tracked by Git |
| Branch | An independent line of development |
| Commit | A saved snapshot of your changes |
| Push | Upload your commits to GitHub |
| Pull | Download changes from GitHub |
| Merge | Combine changes from one branch into another |
| Pull Request (PR) | A request to merge your branch into main |
| Clone | Download a copy of a repository |
| Stage | Mark files to be included in the next commit |
| HEAD | The current commit you're working on |
| Origin | The default name for the remote GitHub repository |
### Getting Help

Git command help:

```bash
git help <command>
# Example: git help merge
```

Check repository status:

```bash
git status
```

See commit history:

```bash
git log --oneline -10
```

Still stuck? Ask the team! Git mistakes are almost always recoverable.
## Development Workflow Summary

1. Clone repo → `git clone ... && cd python-rclco`
2. Install deps → `uv sync --all-extras`
3. Update main → `git checkout main && git pull`
4. Create branch → `git checkout -b your-name/feature-description`
5. Make changes → edit code
6. Add dependencies → `uv add <package>` or `uv add --dev <package>`
7. Run tests → `uv run pytest`
8. Commit changes → `git add . && git commit -m "..."`
9. Push branch → `git push -u origin your-name/feature-description`
10. Open PR → merge to main after review
11. Release (maintainer) → `git tag v0.2.0 && git push origin v0.2.0`
## License

See the LICENSE file for details.