A Python package for working with IPEDS data

Note: As of v0.0.8, scipeds includes 2024 data.

Read the full documentation here.

Quickstart

Option 1: Use Colab (no installation required)

Click the link below to launch a pre-configured Google Colab notebook for scipeds:

Open in Colab

Open the Colab notebook using the link above (also here), and then follow the instructions in the notebook to explore and use scipeds in a cloud environment. This approach does not require you to install anything on your computer.

If you want to keep using scipeds this way, you'll need to make a copy of the notebook into your own Google Drive.

Option 2: Install scipeds on your computer

Alternatively, you can install scipeds on your own computer and work from there.

Install scipeds

Open a terminal and type:

pip install scipeds

Download the pre-processed database

You can download the pre-processed database in two ways.

Either from the shell:

scipeds download-db

or from within Python (e.g., in an interactive shell or a notebook):

import scipeds

scipeds.download_db()

Query completions data using the corresponding query engine

Now you are ready to try scipeds's functionality!

For example, you can look at completions data by gender:

from scipeds.data.completions import CompletionsQueryEngine
from scipeds.data.queries import (
    FieldTaxonomy,
    QueryFilters, 
)

engine = CompletionsQueryEngine()

Use a pre-baked query:

gender_df = engine.field_totals_by_grouping(
    grouping="gender", 
    taxonomy=FieldTaxonomy.ncses_field_group,
    query_filters=QueryFilters()
)
gender_df.head()

or write your own query using DuckDB SQL syntax:

from scipeds.constants import COMPLETIONS_TABLE

df = engine.get_df_from_query(f"""
    SELECT * 
    FROM {COMPLETIONS_TABLE}
    LIMIT 10;
""")
df.head()
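
The same custom-query path can also carry aggregations. The sketch below only builds a SQL string and does not touch the database. `build_totals_query` is a hypothetical helper, not part of scipeds, and the column names `year` and `n_awards` are placeholders rather than confirmed scipeds columns, so check the actual column names with the `SELECT *` query above before adapting it.

```python
def build_totals_query(table: str, group_col: str, value_col: str, limit: int = 10) -> str:
    """Build a SQL string that sums value_col for each distinct group_col value.

    Hypothetical helper for illustration only; not part of scipeds.
    """
    # Column names are interpolated into the SQL text, so accept only
    # simple identifiers to avoid building a malformed query.
    for name in (group_col, value_col):
        if not name.isidentifier():
            raise ValueError(f"not a simple column name: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # The table name is passed through unchanged; it is expected to come
    # from a trusted constant such as COMPLETIONS_TABLE.
    return (
        f"SELECT {group_col}, SUM({value_col}) AS total\n"
        f"FROM {table}\n"
        f"GROUP BY {group_col}\n"
        f"ORDER BY {group_col}\n"
        f"LIMIT {int(limit)};"
    )


# "completions_table", "year" and "n_awards" are placeholder names (assumptions).
sql = build_totals_query("completions_table", "year", "n_awards")
print(sql)
```

With real column names substituted, the resulting string could be passed to `engine.get_df_from_query(...)` in the same way as the query above, using `COMPLETIONS_TABLE` as the table argument.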

For more detailed usage, see the Usage page or the engine API Reference.

About scipeds

What is scipeds?

scipeds is a Python package for working with data from IPEDS. Specifically, scipeds makes it easier for people to analyze data from IPEDS by pre-processing and standardizing IPEDS data into a database and providing some Python tooling for querying that database.

scipeds is not a tool for working with raw IPEDS data. For that, you should download data directly from IPEDS.

Full scipeds documentation can be found at this link, and the source code is available on GitHub.

Currently supported IPEDS surveys

scipeds currently supports the following datasets / survey components:

  • IPEDS Completions by program (6-digit CIP code), award level, race/ethnicity, and gender from 1984-2023
  • IPEDS Institutional Characteristics Directory Information from 2011-2023

Completions data preprocessing

We provide functionality to reproduce our pre-processing of the IPEDS data. To recreate the pre-processed database, you can clone the scipeds repository, download the raw data, and re-run the pipeline code in pipeline/. Decisions about how to convert / crosswalk data across different years and handle other edge cases such as missing data are contained in the pipeline code.
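
To make the idea of a crosswalk concrete, here is a minimal sketch of mapping codes from an older reporting scheme onto a newer one before summing counts. It is not the scipeds pipeline: the codes, the `CROSSWALK` mapping, and the `harmonize` function are all hypothetical placeholders, and the real decisions live in the pipeline code.

```python
from collections import defaultdict

# Hypothetical crosswalk from old-scheme codes to new-scheme codes.
# These are placeholders, not real CIP codes.
CROSSWALK = {"OLD-A": "NEW-1", "OLD-B": "NEW-1", "OLD-C": "NEW-2"}


def harmonize(rows, crosswalk):
    """Sum counts per (year, new code) and collect codes with no mapping."""
    totals = defaultdict(int)
    unmapped = set()
    for year, code, count in rows:
        new_code = crosswalk.get(code)
        if new_code is None:
            # Keep track of unmapped codes instead of silently dropping them.
            unmapped.add(code)
            continue
        totals[(year, new_code)] += count
    return dict(totals), unmapped


# Toy rows of (year, old code, count); the values are made up.
rows = [
    (1990, "OLD-A", 5),
    (1990, "OLD-B", 7),
    (1990, "OLD-C", 2),
    (1990, "OLD-Z", 1),
]
totals, unmapped = harmonize(rows, CROSSWALK)
print(totals)    # {(1990, 'NEW-1'): 12, (1990, 'NEW-2'): 2}
print(unmapped)  # {'OLD-Z'}
```

Returning the unmapped codes separately is one way to surface edge cases such as missing or retired codes, rather than letting them disappear from the totals.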

Why does scipeds exist?

While IPEDS provides a large volume of data about higher education in the United States, working with IPEDS data can be challenging! Many things have changed over the years that data has been reported to IPEDS, which makes it non-trivial to join datasets from different time periods and measure changes over time consistently.

In the process of their own work, the authors found it useful to create tools to make it easier to analyze IPEDS data and hoped that the tools they created would be useful to others as well.

Who created scipeds?

scipeds was created by Science for America (in collaboration with DrivenData) as part of its mission to address urgent challenges in STEM education.

The scipeds logo was designed by Adrianna Mena.
