Extract tool for retrieving student data from Google Classroom
Google Classroom Extractor
This tool retrieves students, active sections, assignments, and submissions by querying the Google Classroom API and writes them out to CSV. For more information on this tool and its output files, please see the main repository readme.
Getting Started
- Download the latest code from the project homepage by clicking on the green "CODE" button and choosing an appropriate option. If choosing the Zip option, extract the file contents using your favorite zip tool.
- Open a command prompt (e.g. cmd.exe, PowerShell, bash) and change to this file's directory.
- Ensure you have Python 3.9+ and Poetry.
- At a command prompt, install all required dependencies: `poetry install`
- Optional: make a copy of the `.env.example` file, named simply `.env`, and customize the settings as described in the Configuration section below.
- Place the service-account.json file described below into the root directory of this project.
- Run the extractor one of two ways:
  - Execute the extractor with minimum command line arguments: `poetry run python edfi_google_classroom_extractor -a [admin account email] -f assignments`
  - Alternately, run with environment variables or `.env` file: `poetry run python edfi_google_classroom_extractor`
- For detailed help, execute `poetry run python edfi_google_classroom_extractor -h`.
Configuration
Module Configuration
Application configuration is provided through environment variables or command
line interface (CLI) arguments. CLI arguments take precedence over environment
variables. Environment variables can be set the normal way, or by using a
dedicated `.env` file like:
CLASSROOM_ACCOUNT=[email address of the Google Classroom admin account, required]
LOG_LEVEL=[Log level, optional]
OUTPUT_DIRECTORY=[The output directory for the csv files, optional]
START_DATE=[start date for usage data pull in yyyy-mm-dd format, optional]
END_DATE=[end date for usage data pull in yyyy-mm-dd format, optional]
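The precedence rule described above (CLI arguments override environment variables) can be sketched with `argparse` by using the environment value as each option's default. This is a minimal illustration under stated assumptions, not the extractor's actual parsing code: the function name `load_settings` is hypothetical, and only the `-a`/`--classroom-account` and `-l`/`--log-level` options with their `CLASSROOM_ACCOUNT` and `LOG_LEVEL` variables come from this README.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def load_settings(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> argparse.Namespace:
    """Resolve settings so that CLI arguments override environment variables.

    Hypothetical helper for illustration; not part of the extractor.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Hypothetical settings loader")
    # The environment value (if any) becomes the default, so an explicit
    # command line argument always replaces it.
    parser.add_argument(
        "-a", "--classroom-account",
        default=env.get("CLASSROOM_ACCOUNT"),
    )
    parser.add_argument(
        "-l", "--log-level",
        default=env.get("LOG_LEVEL", "INFO"),
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )
    args = parser.parse_args(argv)
    # The admin account is required from one source or the other.
    if not args.classroom_account:
        parser.error("the classroom account is required (-a or CLASSROOM_ACCOUNT)")
    return args
```

Calling `load_settings(["-a", "cli@example.org"], env={"CLASSROOM_ACCOUNT": "env@example.org"})` yields the CLI value, while omitting `-a` falls back to the environment value.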
Supported parameters:
Description | Required | Command Line Argument | Environment Variable |
---|---|---|---|
The email address of the Google Classroom admin account. | yes | -a or --classroom-account | CLASSROOM_ACCOUNT |
The log level for the tool. ** | no (default: INFO) | -l or --log-level | LOG_LEVEL |
The output directory for the generated csv files. | no (default: [working directory]/data) | -o or --output-directory | OUTPUT_PATH |
Sync database directory | no (default: [working directory]/data) | -d or --sync-database-directory | SYNC_DATABASE_DIRECTORY |
Start date*, yyyy-mm-dd format | no (default: today) | -s or --usage-start-date | START_DATE |
End date*, yyyy-mm-dd format | no (default: today) | -e or --usage-end-date | END_DATE |
Number of retry attempts for failed API calls | no (default: 4) | none | REQUEST_RETRY_COUNT |
Timeout window for retry attempts, in seconds | no (default: 60 seconds) | none | REQUEST_RETRY_TIMEOUT_SECONDS |
Feature*** | no (default: core, not removable) | -f or --feature | FEATURE |
* Start Date and End Date are used in pulling system activity (usage) data and could span any relevant date range.
** Valid values for the optional log level:
- DEBUG
- INFO (default)
- WARNING
- ERROR
- CRITICAL
*** When there's no specified feature, the extractor will always process Users, Sections, and Section Associations, which are considered the core feature. Other features (can combine two or more):
- assignments (Enables the extraction of assignments and submissions)
- activities (Enables the extraction of section activities and system activities) - EXPERIMENTAL, subject to breaking changes
- grades (Enables the extraction of grades) - COMING SOON
When setting features via `.env` file or through environment variable, combine
features by using a bracketed comma-separated list, e.g. `FEATURE=[activities, attendance, assignments, grades]`. To combine features at the command line,
simply list them together: `--feature activities, attendance, assignments, grades`.
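As an illustration of the bracketed list format, the sketch below turns a `FEATURE` value into a list of feature names. It is only an assumption about how such a value could be read, not the extractor's own parser, and the function name `parse_features` is hypothetical.

```python
from typing import List


def parse_features(raw: str) -> List[str]:
    """Parse a value such as "[activities, assignments]" into feature names.

    Hypothetical helper for illustration; not part of the extractor.
    """
    text = raw.strip()
    # Drop the surrounding brackets if they are present.
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    # Split on commas, trim whitespace, and skip empty entries.
    return [part.strip() for part in text.split(",") if part.strip()]
```

A single unbracketed value such as `assignments` is also accepted and returned as a one-item list.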
Note: in order to make the extractor work, you still need to configure your
`service-account.json` file. To do so, read the next section, API Permissions.
API Permissions
In order to extract data, the Google Classroom APIs must be enabled, and the application must be granted permission.
A Google Classroom administrator will need to enable both the Google Classroom API and the Admin SDK. This can be done here.
Next, the administrator will need to create a Service Account and API key. This is the account the application will use for access. This can be done here.
- Give the new service account a name like "Ed-Fi Extractor" and click Create.
- Grant the service account the "Viewer" role and click `Continue` then Done, skipping step 3: "Grant users access to this service account".
- The new service account will be displayed in a table. Click on the three dots for the account and select Manage Keys.
- On the next page, click the `Add Key` button, then choose JSON and click `Create` in the dialog box.
- A JSON file will be downloaded from your browser, which is the API key. Rename it to `service-account.json`. Save this into the project directory.
- Finally, click on the service account to view details and copy the Unique ID field for the next step.
Finally, the administrator will need to specify the scope of access for the service account. This can be done here.
- Add a new API client and provide the service account Unique ID (`client_id` in the json file) in the `Client ID` field.
- Paste the following scopes into the OAuth scopes field and click `Authorize`:
https://www.googleapis.com/auth/admin.directory.orgunit, https://www.googleapis.com/auth/admin.reports.usage.readonly, https://www.googleapis.com/auth/classroom.courses, https://www.googleapis.com/auth/classroom.coursework.students, https://www.googleapis.com/auth/classroom.profile.emails, https://www.googleapis.com/auth/classroom.rosters, https://www.googleapis.com/auth/classroom.student-submissions.students.readonly, https://www.googleapis.com/auth/admin.reports.audit.readonly
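If it helps to double-check the value to paste into the Client ID field, the following standard-library sketch reads `client_id` from the downloaded key file. The function name `read_client_id` is hypothetical, and the check on the `type` field assumes the usual layout of a Google service account JSON key.

```python
import json
from pathlib import Path


def read_client_id(key_path: str = "service-account.json") -> str:
    """Return the client_id (the service account Unique ID) from a key file.

    Hypothetical helper for illustration; not part of the extractor.
    """
    key = json.loads(Path(key_path).read_text(encoding="utf-8"))
    # Assumption: service account key files carry "type": "service_account".
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service account key")
    client_id = key.get("client_id")
    if not client_id:
        raise ValueError(f"{key_path} has no client_id field")
    return str(client_id)
```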
Generate LMS UDM CSV Files
To pull data from Google Classroom and generate csv files, run
`poetry run python edfi_google_classroom_extractor` from the root
directory of this project. CSV files are output into the
`data/ed-fi-udm-lms` directory.
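After a run, a quick way to confirm that files were produced is to count the rows in each CSV under the output directory. The sketch below is an assumption-laden convenience, not part of the extractor: the function name `count_csv_rows` is hypothetical, it assumes each file is UTF-8 with a header row, and it searches recursively because the subdirectory layout is not described here.

```python
import csv
from pathlib import Path
from typing import Dict


def count_csv_rows(output_dir: str = "data/ed-fi-udm-lms") -> Dict[str, int]:
    """Map each CSV file under output_dir to its number of data rows.

    Hypothetical helper for illustration; not part of the extractor.
    """
    counts: Dict[str, int] = {}
    root = Path(output_dir)
    for csv_path in sorted(root.rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # assumption: the first row is a header
            counts[csv_path.relative_to(root).as_posix()] = sum(1 for _ in reader)
    return counts
```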
TLS/SSL proxying
Users on a corporate network that intercepts TLS/SSL traffic will need to have a
copy of the corporate root certificate on file, and then add an environment
variable pointing to this file: `HTTPLIB2_CA_CERTS=[absolute path to certificate]`.
NOTE: this does not load properly through the `.env` file, and
must be set as an actual environment variable.
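One way to set the variable as a real environment variable for a single run is to launch the extractor from a small wrapper. This is a sketch under stated assumptions: the helper names `env_with_ca_certs` and `run_extractor` are hypothetical, and the command is the one shown elsewhere in this README.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional


def env_with_ca_certs(
    cert_path: str, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy the environment and point HTTPLIB2_CA_CERTS at the certificate."""
    env = dict(os.environ if base is None else base)
    # The README asks for an absolute path to the certificate file.
    env["HTTPLIB2_CA_CERTS"] = os.path.abspath(cert_path)
    return env


def run_extractor(cert_path: str, extra_args: Optional[List[str]] = None) -> int:
    """Run the extractor with the variable set and return its exit code."""
    command = ["poetry", "run", "python", "edfi_google_classroom_extractor"]
    completed = subprocess.run(
        command + (extra_args or []), env=env_with_ca_certs(cert_path)
    )
    return completed.returncode
```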
Logging and Exit Codes
Log statements are written to the standard output. If you wish to capture log details, then be sure to redirect the output to a file. For example:
poetry run python edfi_google_classroom_extractor > 2020-12-07-15-43.log
If any errors occurred during the script run, then there will be a final print
message to the standard error handler as an additional mechanism for calling
attention to the error: "A fatal error occurred, please review the log output for more information."
The application will exit with status code `1` if there were any log messages at
the ERROR or CRITICAL level, otherwise it will exit with status code `0`.
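For scheduled runs, the redirect and the exit code check can be combined in a small wrapper. The sketch below is one possible approach, not something shipped with the extractor: the function name `run_and_capture` is hypothetical, and the default log file name simply mirrors the timestamp pattern in the example above.

```python
import subprocess
import sys
from datetime import datetime
from typing import List, Optional


def run_and_capture(command: List[str], log_path: Optional[str] = None) -> int:
    """Run command, write its standard output to a log file, return the exit code.

    Hypothetical helper for illustration; not part of the extractor.
    """
    if log_path is None:
        # e.g. 2020-12-07-15-43.log
        log_path = datetime.now().strftime("%Y-%m-%d-%H-%M.log")
    with open(log_path, "w", encoding="utf-8") as log_file:
        completed = subprocess.run(command, stdout=log_file)
    # A non-zero status means ERROR or CRITICAL messages were logged.
    if completed.returncode != 0:
        print(f"Extractor reported errors; see {log_path}", file=sys.stderr)
    return completed.returncode
```

A typical call would be `run_and_capture(["poetry", "run", "python", "edfi_google_classroom_extractor"])`, with the return value passed on to the scheduler.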
Course Aliases and SIS Section Identifiers
Course Aliases in Google Classroom are expected to be used to provide a mapping from a Google Classroom Course to a SIS section using a SIS section identifier. To enable this behavior, create a domain-scoped Course Alias in Google Classroom for each Course with the prefix "EdFiLMS." followed by the SIS section identifier. For example, the domain-scoped Course Alias "d:EdFiLMS.ALG-123" would be used for SIS section identifier "ALG-123". Only the first Course Alias found with the "EdFiLMS." prefix will be used.
See course.aliases for more information on Course Aliases.
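The alias convention can be illustrated with a short sketch that picks the first matching alias and strips the prefix. This is an assumption about the matching logic rather than the extractor's own code: the function name `sis_section_id` is hypothetical, and skipping aliases that are not domain-scoped (those without the `d:` prefix) is an assumed reading of "domain-scoped" above.

```python
from typing import Iterable, Optional

ALIAS_PREFIX = "EdFiLMS."
DOMAIN_SCOPE = "d:"


def sis_section_id(aliases: Iterable[str]) -> Optional[str]:
    """Return the SIS section identifier from the first matching alias.

    Hypothetical helper for illustration; not part of the extractor.
    """
    for alias in aliases:
        # Assumption: only domain-scoped aliases ("d:...") are considered.
        if not alias.startswith(DOMAIN_SCOPE):
            continue
        name = alias[len(DOMAIN_SCOPE):]
        if name.startswith(ALIAS_PREFIX):
            return name[len(ALIAS_PREFIX):]
    return None
```

With the aliases `["d:other", "d:EdFiLMS.ALG-123", "d:EdFiLMS.GEO-9"]`, only the first prefixed alias is used, giving `ALG-123`.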
Developer Operations
- Style check: `poetry run flake8`
- Static typing check: `poetry run mypy .`
- Run unit tests: `poetry run pytest`
- Run unit tests with code coverage: `poetry run coverage run -m pytest`
- View code coverage: `poetry run coverage report`
Also see build.py for use of the build script.
Visual Studio Code (Optional)
To work in Visual Studio Code, install the Python Extension.
Then type `Ctrl-Shift-P`, then choose `Python:Select Interpreter`,
then choose the environment that includes `.venv` in the name.
Legal Information
Copyright (c) 2022 Ed-Fi Alliance, LLC and contributors.
Licensed under the Apache License, Version 2.0 (the "License").
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
See NOTICES for additional copyright and license notifications.