Extract tool for retrieving student data from Google Classroom

Google Classroom Extractor

This tool queries the Google Classroom API to retrieve students, active sections, assignments, and submissions, and writes them out to CSV files. For more information on this tool and its output files, please see the main repository readme.

Getting Started

  1. Download the latest code from the project homepage by clicking on the green "CODE" button and choosing an appropriate option. If choosing the Zip option, extract the file contents using your favorite zip tool.

  2. Open a command prompt (e.g. cmd.exe, PowerShell, bash) and change to this file's directory.

  3. Ensure you have Python 3.8+ and Poetry.

  4. At a command prompt, install all required dependencies:

    poetry install
    
  5. Optional: make a copy of the .env.example file, named simply .env, and customize the settings as described in the Configuration section below.

  6. Place the service-account.json file described below into the root directory of this project.

  7. Run the extractor one of two ways:

    • Execute the extractor with minimum command line arguments:

      poetry run python edfi_google_classroom_extractor -a [admin account email]
      
    • Alternatively, run with environment variables or a .env file:

      poetry run python edfi_google_classroom_extractor
      
    • For detailed help, execute poetry run python edfi_google_classroom_extractor -h.

Configuration

Application configuration is provided through environment variables or command line interface (CLI) arguments. CLI arguments take precedence over environment variables. Environment variables can be set the normal way, or by using a dedicated .env file like:

CLASSROOM_ACCOUNT=[email address of the Google Classroom admin account, required]
LOG_LEVEL=[Log level, optional]
OUTPUT_PATH=[The output directory for the csv files, optional]
START_DATE=[start date for usage data pull in yyyy-mm-dd format, optional]
END_DATE=[end date for usage data pull in yyyy-mm-dd format, optional]
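The precedence rule above (CLI arguments win over environment variables) can be sketched with argparse by using the environment value as each argument's default. This is a minimal illustration, not the extractor's own code: the function name parse_settings is hypothetical, and only two of the settings are shown.

```python
import argparse
import os

VALID_LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def parse_settings(argv, environ=None):
    """Resolve settings so that CLI arguments override environment variables.

    Hypothetical helper for illustration; the real extractor may differ.
    """
    env = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Precedence sketch")
    # The environment supplies the default; an explicit flag replaces it.
    parser.add_argument(
        "-a", "--classroom-account",
        default=env.get("CLASSROOM_ACCOUNT"),
    )
    # choices validates values typed on the command line.
    parser.add_argument(
        "-l", "--log-level",
        default=env.get("LOG_LEVEL", "INFO"),
        choices=VALID_LOG_LEVELS,
    )
    args = parser.parse_args(argv)
    if not args.classroom_account:
        parser.error("CLASSROOM_ACCOUNT or -a/--classroom-account is required")
    return args
```

Calling parse_settings(["-a", "cli@example.com"], {"CLASSROOM_ACCOUNT": "env@example.com"}) yields the CLI value, while parse_settings([], {"CLASSROOM_ACCOUNT": "env@example.com"}) falls back to the environment.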

Supported parameters:

Description | Required | Command Line Argument | Environment Variable
The email address of the Google Classroom admin account. | yes | -a or --classroom-account | CLASSROOM_ACCOUNT
The log level for the tool.** | no (default: INFO) | -l or --log-level | LOG_LEVEL
The output directory for the generated csv files. | no (default: data/) | see -h output | OUTPUT_PATH
Start date*, yyyy-mm-dd format | no (default: today) | -s or --usage-start-date | START_DATE
End date*, yyyy-mm-dd format | no (default: today) | -e or --usage-end-date | END_DATE
Number of retry attempts for failed API calls | no (default: 4) | none | REQUEST_RETRY_COUNT
Timeout window for retry attempts, in seconds | no (default: 60 seconds) | none | REQUEST_RETRY_TIMEOUT_SECONDS

* Start Date and End Date are used in pulling system activity (usage) data and could span any relevant date range.
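As an illustration of the yyyy-mm-dd format and the "default: today" behavior, the following sketch parses a start and end date and checks their order. The function names are hypothetical, and the ordering check is an assumption; the extractor's own validation may differ.

```python
from datetime import date, datetime


def parse_usage_date(value=None):
    """Parse a yyyy-mm-dd string; fall back to today's date when no value is given."""
    if not value:
        return date.today()
    return datetime.strptime(value, "%Y-%m-%d").date()


def usage_range(start=None, end=None):
    """Return (start_date, end_date), rejecting a start that falls after the end."""
    start_date = parse_usage_date(start)
    end_date = parse_usage_date(end)
    if start_date > end_date:
        raise ValueError("start date must not be after end date")
    return start_date, end_date
```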

** Valid values for the optional log level:

  • DEBUG
  • INFO (default)
  • WARNING
  • ERROR
  • CRITICAL
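The REQUEST_RETRY_COUNT and REQUEST_RETRY_TIMEOUT_SECONDS settings in the table above bound how failed API calls are retried. How the extractor spaces its retries is not documented here, so the sketch below assumes simple exponential backoff; call_with_retry is a hypothetical name, not part of the tool.

```python
import time


def call_with_retry(func, retry_count=4, timeout_seconds=60, sleep=time.sleep):
    """Call func, retrying on any exception.

    Stops after retry_count retries, or earlier if the next wait would
    run past the timeout window. Backoff strategy is an assumption.
    """
    deadline = time.monotonic() + timeout_seconds
    delay = 1.0
    retries = 0
    while True:
        try:
            return func()
        except Exception:
            retries += 1
            # Give up when retries are exhausted or the window would close.
            if retries > retry_count or time.monotonic() + delay > deadline:
                raise
            sleep(delay)
            delay *= 2  # exponential backoff (assumed, not confirmed)
```

With the defaults, a call that keeps failing is attempted five times in total (one initial call plus four retries) before the last exception is re-raised.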

Note: for the extractor to work, you still need to configure your service-account.json file, as described in the next section, API Permissions.

API Permissions

In order to extract data, the Google Classroom APIs must be enabled, and the application must be granted permission.

A Google Classroom administrator will need to enable both the Google Classroom API and the Admin SDK. This can be done in the Google Cloud Console.

Next, the administrator will need to create a Service Account and API key. This is the account the application will use for access. This can be done on the Service Accounts page of the Google Cloud Console.

  1. Give the new service account a name like "Ed-Fi Extractor" and click Create.
  2. Grant the service account the Project Viewer role and click Continue then Done.
  3. The new service account will be displayed in a table. Click on the three dots for the account and select Create Key.
  4. Choose JSON and click Create.
  5. A JSON file will be downloaded from your browser, which is the API key. Rename it to service-account.json. Save this into the project directory.
  6. Finally, click on the service account to view details and copy the Unique ID field for the next step.
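Before running the extractor, it can help to confirm that the downloaded key file is in place and looks like a service account key. The sketch below is a hypothetical pre-flight check, not part of the tool; it assumes the standard fields Google writes into a JSON key (type, client_email, client_id, private_key), where client_id normally matches the Unique ID copied in step 6.

```python
import json
from pathlib import Path

# Fields assumed to be present in a standard Google service account key file.
REQUIRED_FIELDS = ("type", "client_email", "client_id", "private_key")


def read_service_account(path="service-account.json"):
    """Load the key file and return its contents after basic sanity checks."""
    info = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [name for name in REQUIRED_FIELDS if name not in info]
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"{path} is not a service account key")
    return info
```

For example, read_service_account()["client_id"] returns the value to compare against the Unique ID shown in the console.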

Last, the administrator will need to specify the scope of access for the service account. This can be done in the Google Admin console, in the domain-wide delegation settings.

  1. Add a new API client and provide the service account Unique ID in the Client ID field.
  2. Paste the following scopes into the OAuth scopes field and click Authorize:

https://www.googleapis.com/auth/admin.directory.orgunit, https://www.googleapis.com/auth/admin.reports.usage.readonly, https://www.googleapis.com/auth/classroom.courses, https://www.googleapis.com/auth/classroom.coursework.students, https://www.googleapis.com/auth/classroom.profile.emails, https://www.googleapis.com/auth/classroom.rosters, https://www.googleapis.com/auth/classroom.student-submissions.students.readonly, https://www.googleapis.com/auth/admin.reports.audit.readonly
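The OAuth scopes field takes the scopes as one comma-separated string, while client code usually wants them as a list. The following sketch converts between the two forms and guards against duplicates; the names REQUIRED_SCOPES, scopes_field and parse_scopes_field are hypothetical.

```python
# The eight scopes listed above, kept as a list for readability.
REQUIRED_SCOPES = [
    "https://www.googleapis.com/auth/admin.directory.orgunit",
    "https://www.googleapis.com/auth/admin.reports.usage.readonly",
    "https://www.googleapis.com/auth/classroom.courses",
    "https://www.googleapis.com/auth/classroom.coursework.students",
    "https://www.googleapis.com/auth/classroom.profile.emails",
    "https://www.googleapis.com/auth/classroom.rosters",
    "https://www.googleapis.com/auth/classroom.student-submissions.students.readonly",
    "https://www.googleapis.com/auth/admin.reports.audit.readonly",
]


def scopes_field(scopes=REQUIRED_SCOPES):
    """Build the comma-separated string to paste into the OAuth scopes field."""
    return ", ".join(scopes)


def parse_scopes_field(text):
    """Split a comma-separated scopes string back into a de-duplicated list."""
    seen = []
    for scope in (part.strip() for part in text.split(",")):
        if scope and scope not in seen:
            seen.append(scope)
    return seen
```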

Generate LMS UDM CSV Files

To pull data from Google Classroom and generate CSV files, run poetry run python edfi_google_classroom_extractor from the root directory of this project. CSV files are output into the data/ed-fi-udm-lms directory.
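After a run, a quick way to check what was produced is to walk the output directory and count the data rows in each CSV file. This is a minimal sketch: the helper name is hypothetical, and it assumes each file is UTF-8 encoded with a single header row, which this document does not confirm.

```python
import csv
from pathlib import Path


def summarize_csv_output(root="data"):
    """Map each CSV file under root (as a relative path) to its data-row count."""
    root_path = Path(root)
    summary = {}
    for path in sorted(root_path.rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the assumed header row
            rows = sum(1 for _ in reader)
        summary[path.relative_to(root_path).as_posix()] = rows
    return summary
```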

TLS/SSL proxying

Users on a corporate network that intercepts TLS/SSL traffic will need to have a copy of the corporate root certificate on file, and then add an environment variable pointing to this file: HTTPLIB2_CA_CERTS=[absolute path to certificate]. NOTE: this does not load properly through the .env file, and must be set as an actual environment variable.
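Because HTTPLIB2_CA_CERTS has to be a real environment variable, one option is to launch the extractor from a small wrapper that sets it. The sketch below only builds the environment; env_with_ca_certs is a hypothetical helper, and the existence check is an added safeguard rather than something the tool requires.

```python
import os
from pathlib import Path


def env_with_ca_certs(cert_path, base_env=None):
    """Return a copy of the environment with HTTPLIB2_CA_CERTS set."""
    env = dict(os.environ if base_env is None else base_env)
    cert = Path(cert_path)
    # The variable is documented as needing an absolute path.
    if not cert.is_absolute():
        raise ValueError("HTTPLIB2_CA_CERTS needs an absolute path")
    if not cert.is_file():
        raise FileNotFoundError(str(cert))
    env["HTTPLIB2_CA_CERTS"] = str(cert)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when starting the extractor command.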

Logging and Exit Codes

Log statements are written to the standard output. If you wish to capture log details, then be sure to redirect the output to a file. For example:

    poetry run python edfi_google_classroom_extractor > 2020-12-07-15-43.log
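For scheduled runs, the timestamped file name and the redirect can be produced from a script. The following sketch uses hypothetical helper names; the command to run is passed in as a list, so nothing here depends on how the extractor is invoked.

```python
import subprocess
from datetime import datetime


def log_file_name(now=None):
    """Build a name like 2020-12-07-15-43.log from the given or current time."""
    moment = now or datetime.now()
    return moment.strftime("%Y-%m-%d-%H-%M") + ".log"


def run_and_capture(command, log_path):
    """Run command with standard output redirected to log_path; return the exit code."""
    with open(log_path, "w", encoding="utf-8") as log_file:
        completed = subprocess.run(command, stdout=log_file, check=False)
    return completed.returncode
```

As with the shell redirect, only standard output goes to the file; anything written to standard error still appears on the console.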

If any errors occurred during the script run, then there will be a final print message to the standard error handler as an additional mechanism for calling attention to the error: "A fatal error occurred, please review the log output for more information."

The application will exit with status code 1 if there were any log messages at the ERROR or CRITICAL level, otherwise it will exit with status code 0.
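One way to obtain this exit-code behavior is a logging handler that records whether anything at ERROR level or above was logged. The sketch below shows that general technique only; ErrorTracker and exit_code_for are hypothetical names, and the extractor's actual implementation is not shown in this document.

```python
import logging


class ErrorTracker(logging.Handler):
    """Remember whether any record at ERROR level or above was handled."""

    def __init__(self):
        # The handler level filters out everything below ERROR.
        super().__init__(level=logging.ERROR)
        self.error_seen = False

    def emit(self, record):
        self.error_seen = True


def exit_code_for(tracker):
    """Return 1 if an ERROR or CRITICAL message was logged, otherwise 0."""
    return 1 if tracker.error_seen else 0
```

A script would attach the tracker with logging.getLogger().addHandler(tracker) at startup and finish with sys.exit(exit_code_for(tracker)).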

Developer Operations

  1. Style check: poetry run flake8
  2. Static typing check: poetry run mypy .
  3. Run unit tests: poetry run pytest
  4. Run unit tests with code coverage: poetry run coverage run -m pytest
  5. View code coverage: poetry run coverage report

Also see build.py for use of the build script.

Visual Studio Code (Optional)

To work in Visual Studio Code, install the Python Extension. Then press Ctrl-Shift-P, choose Python: Select Interpreter, and pick the environment that includes .venv in the name.

Legal Information

Copyright (c) 2021 Ed-Fi Alliance, LLC and contributors.

Licensed under the Apache License, Version 2.0 (the "License").

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

See NOTICES for additional copyright and license notifications.

File details

Details for the file edfi-google-classroom-extractor-1.0.0a2.tar.gz.

File metadata

  • Download URL: edfi-google-classroom-extractor-1.0.0a2.tar.gz
  • Size: 21.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.8.7

File hashes

Hashes for edfi-google-classroom-extractor-1.0.0a2.tar.gz
Algorithm Hash digest
SHA256 be1beb5b27cb8bb56d1cc890a39b008d2740c2de1e9f046f4272ff1c9218a8fb
MD5 cadf74d7c29d570b3448c4bca5b1c42f
BLAKE2b-256 362071d67e1d2e00ace48b62faeb406d019f2a779e7d107de92ab36f3a57ceb6

File details

Details for the file edfi_google_classroom_extractor-1.0.0a2-py3-none-any.whl.

File hashes

Hashes for edfi_google_classroom_extractor-1.0.0a2-py3-none-any.whl
Algorithm Hash digest
SHA256 d0dd3396993cb8aa0552eff7920a782a17204ac2c25409e3721925d979a517f5
MD5 ba51825ef2df999670827f8c851e8839
BLAKE2b-256 a89e8d8a7b9871fcc461f5a66efb52ab6606e7b9fd0b6c52c546d58b7d025208
