openedx-completion-aggregator is a Django app that aggregates block level completion data for different block types for Open edX.

What does that mean?

A standard Open edX installation can track the completion of individual XBlocks in a course, which is done using the completion library. This completion tracking is what powers the green checkmarks shown in the course outline and course navigation as the learner completes each unit in the course:

[Screenshot: docs/completion.png, showing the green completion checkmarks in the course outline]

When completion tracking is enabled (and green checkmarks are showing, as seen above), it is only tracked at the XBlock level. You can use the Course Blocks API to check the completion status of any individual XBlock in the course, for a single user. For example, to get the completion of the XBlock with usage ID block-v1:OpenCraft+completion+demo+type@html+block@demo_block on the LMS instance courses.opencraft.com by the user MyUsername, you could call this REST API:

GET https://courses.opencraft.com/api/courses/v1/blocks/block-v1:OpenCraft+completion+demo+type@html+block@demo_block?username=MyUsername&requested_fields=completion

The response will include a completion value between 0 and 1.
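As a rough illustration, the request URL above could be assembled in Python as follows. The helper name build_block_completion_url is hypothetical (it is not part of any Open edX library), and authentication, which a real LMS will require, is left out of this sketch.

```python
from urllib.parse import quote, urlencode


def build_block_completion_url(lms_root: str, usage_id: str, username: str) -> str:
    """Build the Course Blocks API URL for one block and one user.

    Hypothetical helper for illustration only.
    """
    # Keep ':', '+', and '@' literal so the usage ID reads as in the example above.
    path = "/api/courses/v1/blocks/" + quote(usage_id, safe=":+@")
    query = urlencode({"username": username, "requested_fields": "completion"})
    return lms_root.rstrip("/") + path + "?" + query


url = build_block_completion_url(
    "https://courses.opencraft.com",
    "block-v1:OpenCraft+completion+demo+type@html+block@demo_block",
    "MyUsername",
)
print(url)
```

The resulting URL can then be requested with any HTTP client, using whatever credentials your LMS expects.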

However, what if you want to know the overall % completion of an entire course? (“Alex, you have completed 45% of Introduction to Statistics”) Or what if you as an instructor want to get a report of how much of Section 1 every student in a course has completed? Those queries are either not possible or too slow using the APIs built into the LMS and completion.

This Open edX plugin, openedx-completion-aggregator, watches course activity and asynchronously updates database tables with “aggregate” completion data. “Aggregate” data means completion data summed over all XBlocks in a course and aggregated at higher levels, such as the subsection, section, and course level. The completion aggregator provides a REST API that can give near-instant answers to queries such as:

  • What % complete is each of the courses that I’m enrolled in?

  • What % of each section in Course X have my students completed?

  • What is the average completion % among all enrolled students in a course?

Notes:

  • This service only provides data, via a REST API. There is no user interface.

  • On production instances, the answers to these “aggregate” questions may be slightly out of date, because they are computed asynchronously (see below). How often they are updated is configurable.

Synchronous vs. Asynchronous calculations

openedx-completion-aggregator operates in one of two modes: synchronous or asynchronous.

With synchronous aggregation, each time a student completes a block, the aggregator code will re-calculate the aggregate completion values immediately. You will always have the freshest results from this API, but at a huge performance cost. Synchronous aggregation is only for development purposes and is not suitable for production. Synchronous aggregation can cause deadlocks when users complete XBlocks, leading to a partial outage of the LMS. Do not use it on a production site.

With asynchronous aggregation, the aggregator code will re-calculate the aggregate completion values asynchronously, at periodic intervals (e.g. every hour). How often the update can and should be run depends on many factors, so you will have to experiment to find what works best and what is possible for your specific Open edX installation. (Running this too often can clog the Celery task queue, which might require manual intervention.)

It’s important to note that in both modes the single-user, single-course API endpoints will always return up-to-date data. However, data that covers multiple users or multiple courses can be slightly out of date, until the aggregates are updated asynchronously.

API Details

For details about how the completion aggregator’s REST APIs can be used, please refer to the docstrings in views.py.

Event tracking

Like other parts of Open edX, the completion aggregator emits “tracking logs” events whenever completion aggregator records are created or updated by this plugin. These events are transformed into xAPI and routed using edx-event-routing-backends so they can be used for analytics, for example to track learner progress in a course.

Event tracking is enabled by default for edx-platform, and so event tracking is also enabled by default in the completion aggregator. This can generate a lot of events. For example, when a user completes the final block in a course, aggregator completion events will be generated for the containing unit, subsection, section, and course.

You can restrict which aggregator events are emitted by setting COMPLETION_AGGREGATOR_TRACKING_EVENT_TYPES to the block types (course, chapter, sequential, vertical) that should cause tracking events to be emitted. To disable sending any completion aggregator tracking events, set COMPLETION_AGGREGATOR_TRACKING_EVENT_TYPES = None.
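For instance, a Django settings override along these lines might restrict events to course-level and chapter-level aggregators. This is only a sketch: the setting name and the block type names come from the text above, but the exact container type the plugin expects (a list is used here) is an assumption, so check the plugin's default settings before relying on it.

```python
# Sketch of a Django settings override (assumed format: a collection of
# block type names; verify against the plugin's own default settings).
COMPLETION_AGGREGATOR_TRACKING_EVENT_TYPES = ["course", "chapter"]

# To disable all completion aggregator tracking events instead, the text
# above says to use:
# COMPLETION_AGGREGATOR_TRACKING_EVENT_TYPES = None
```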

Installation and Configuration

openedx-completion-aggregator uses the pluggable Django app pattern to ease installation. To use it in edx-platform, do the following:

  1. Install the app into your virtualenv:

    $ pip install openedx-completion-aggregator

  2. By default, aggregate data is re-computed synchronously (with each created or updated BlockCompletion). While that is often useful for development, on production instances you should calculate aggregations asynchronously, as explained above. To enable asynchronous calculation for your installation, set the following in your lms.yml file:

    ...
    COMPLETION_AGGREGATOR_ASYNC_AGGREGATION: true
    ...

    Then configure a pair of cron jobs to run ./manage.py run_aggregator_service and ./manage.py run_aggregator_cleanup as often as desired. (Start with hourly and daily, respectively, if you are unsure.) The run_aggregator_service task updates any aggregate completion values that have become out of date since it was last run (it will in turn enqueue Celery tasks to do the actual updating). The cleanup task deletes old database entries that were used to coordinate the aggregation updates; these build up over time but are no longer needed.

  3. If the aggregator is installed on an existing instance, it is sometimes desirable to fill in “Aggregate” data for the existing courses. The reaggregate_course management command prepares data that will be aggregated during the next run_aggregator_service run. However, aggregating data for existing courses can place extremely high loads on both your Celery workers and your MySQL database, so on large instances this process must be planned with great care. For starters, we recommend you disable any associated cron jobs, scale up your Celery worker pool significantly, and scale up your database cluster and storage.

Design: Technical Details

The completion aggregator is designed to facilitate working with course-level, chapter-level, and other aggregated percentages of course completion as represented by the BlockCompletion model (from the edx-completion Django app). By storing these values in the database, we are able to quickly return information for all users in a course.

Each type of XBlock (or XModule) is assigned a completion mode of “Completable”, “Aggregator”, or “Excluded”.

A “completable” block is one that can directly be completed, either by viewing it on the screen, by submitting a response, or by some custom defined means. When completed, a BlockCompletion is created for that user with a value of 1.0 (any value between 0.0 and 1.0 is allowed). Completable blocks always have a maximum possible value of 1.0.

An “excluded” block is ignored for the purposes of completion. It always has a completion value of 0.0, and a maximum possible value of 0.0. If an excluded block has children, those are also ignored for the purposes of completion.

An “aggregator” block is one that contains other blocks. It cannot be directly completed, but has an aggregate completion value equal to the sum of the completion values of its immediate children, and a maximum possible value equal to the sum of the maximum possible values of its immediate children (1.0 for completable blocks, 0.0 for excluded blocks, and the calculated maximum for any contained aggregators). If an aggregator has a maximum possible value of 0.0 (either it has no children, or all its children are excluded), it is always considered complete.

To calculate aggregations for a user, the course graph is retrieved from the modulestore (using block transformers) to determine which blocks are contained by each aggregator, and values are summed recursively from the course block on down. Values for every node in the whole tree can be calculated in a single traversal. These calculations can either be performed “read-only” (to get the latest data for each user), or “read-write” to store that data in the completion_aggregator.Aggregator model.
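The rules above can be sketched as a single recursive traversal. This is a simplified illustration, not the plugin's actual code: the Block class, the aggregate and fraction_complete functions, and the plain dictionary of completions are all hypothetical stand-ins for the modulestore course graph and the BlockCompletion model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

COMPLETABLE = "completable"
AGGREGATOR = "aggregator"
EXCLUDED = "excluded"


@dataclass
class Block:
    """Hypothetical stand-in for a node in the course graph."""
    block_id: str
    mode: str
    children: List["Block"] = field(default_factory=list)


def aggregate(
    block: Block,
    completions: Dict[str, float],
    results: Dict[str, Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (earned, possible) for a block, recording every aggregator in results."""
    if block.mode == EXCLUDED:
        # Excluded blocks and everything beneath them are ignored.
        return 0.0, 0.0
    if block.mode == COMPLETABLE:
        # A completable block is worth at most 1.0.
        return completions.get(block.block_id, 0.0), 1.0
    earned = possible = 0.0
    for child in block.children:
        child_earned, child_possible = aggregate(child, completions, results)
        earned += child_earned
        possible += child_possible
    results[block.block_id] = (earned, possible)
    return earned, possible


def fraction_complete(earned: float, possible: float) -> float:
    """An aggregator with nothing to complete is always considered complete."""
    return 1.0 if possible == 0.0 else earned / possible


course = Block("course", AGGREGATOR, [
    Block("chapter_1", AGGREGATOR, [
        Block("unit_1", AGGREGATOR, [
            Block("html_a", COMPLETABLE),
            Block("html_b", COMPLETABLE),
            Block("discussion", EXCLUDED),
        ]),
    ]),
    Block("chapter_2", AGGREGATOR, [Block("notes", EXCLUDED)]),
])
results: Dict[str, Tuple[float, float]] = {}
totals = aggregate(course, {"html_a": 1.0, "html_b": 0.5}, results)
print(totals, fraction_complete(*totals))
```

One pass fills results with values for every aggregator in the tree, which corresponds to the "single traversal" described above. A "read-write" variant would persist those values rather than only returning them.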

During regular course interaction, aggregations for a learner are calculated on the fly so that the learner always sees the latest information. However, on-the-fly calculations are too expensive when performed for all users in a course, so periodically (e.g. every hour, but this is configurable), a task is run to calculate all aggregators that have gone out of date since the last run, and store those values in the database. These stored values are then used for reporting on course-wide completion (for course admin views).

By tracking which blocks have been changed recently (in the StaleCompletion table), these stored values can also be used to shortcut calculations for portions of the course graph that are known to be up to date. If a user has only completed blocks in chapter 3 of a three-chapter course since the last time aggregations were stored, there is no need to redo the calculation for chapter 1 or chapter 2. The course-level aggregation can just sum the already-stored values for chapter 1 and chapter 2 with a freshly calculated value for chapter 3.
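The three-chapter example might look like this in outline. The function course_totals and its arguments are hypothetical: plain dictionaries and sets stand in for the stored aggregates and the StaleCompletion table, whose real schema is not shown here.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Totals = Tuple[float, float]  # (earned, possible)


def course_totals(
    chapter_ids: Iterable[str],
    stored: Dict[str, Totals],
    stale: Set[str],
    recompute: Callable[[str], Totals],
) -> Totals:
    """Sum chapter totals, recalculating only the chapters marked stale."""
    earned = possible = 0.0
    for chapter_id in chapter_ids:
        if chapter_id in stale:
            chapter_earned, chapter_possible = recompute(chapter_id)
        else:
            # Known to be up to date, so reuse the stored value.
            chapter_earned, chapter_possible = stored[chapter_id]
        earned += chapter_earned
        possible += chapter_possible
    return earned, possible


recomputed: List[str] = []


def fresh_totals(chapter_id: str) -> Totals:
    """Pretend to walk the chapter's subtree; records which chapters were redone."""
    recomputed.append(chapter_id)
    return 1.0, 5.0


stored_totals = {
    "chapter_1": (3.0, 4.0),
    "chapter_2": (2.0, 2.0),
    "chapter_3": (0.0, 5.0),  # out of date
}
course = course_totals(
    ["chapter_1", "chapter_2", "chapter_3"],
    stored_totals,
    {"chapter_3"},
    fresh_totals,
)
print(course, recomputed)
```

Only chapter 3 is recalculated; the stored values for chapters 1 and 2 are reused as they are.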

Currently, the major bottleneck in these calculations is creating the course graph for each user. We are caching the graph locally to speed things up, but this stresses the memory capabilities of the servers.

License

The code in this repository is licensed under the AGPL 3.0 unless otherwise noted.

Please see LICENSE.txt for details.

How To Contribute

Contributions are very welcome.

Please read our Contributing Guideline for details.

Reporting Security Issues

Please do not report security issues in public. Please email help@opencraft.com.

Getting Help

Have a question about this repository, or about Open edX in general? Please refer to this list of resources if you need any assistance.

Change Log

Unreleased

[4.2.0] - 2024-06-21

  • Transform openedx.completion_aggregator.progress.* tracking log events into xAPI using edx-event-routing-backends so they can be included in Aspects analytics data.

[4.1.0] - 2024-06-18

  • Emit openedx.completion_aggregator.progress.* tracking log events for the various block/course types

[4.0.3] - 2023-10-24

  • Replace xblockutils.* imports with xblock.utils.*. The old imports are used as a fallback for compatibility with older releases.

  • Remove xblockutils dependency.

[4.0.2] - 2023-03-03

  • Update GitHub workflows.

  • Update requirements to logically organize them and allow scheduled requirements updates.

  • Add base requirements to setup.py.

[4.0.1] - 2022-07-13

  • Add COMPLETION_AGGREGATOR_AGGREGATE_UNRELEASED_BLOCKS setting, which enables the use of course blocks with a release date set to a future date in the course completion calculation.

[4.0.0] - 2022-06-17

  • Add Maple support.

  • Drop support for Python 2.

  • Drop support for Django 2.X.

  • Replace Travis CI with GitHub Actions.

  • Fix docs quality checks.

  • Fix pylint quality checks.

  • Fix the build & release pipeline.

[3.2.0] - 2021-11-26

  • Add Lilac support.

[3.1.0] - 2021-04-28

  • Add Koa support.

  • Upgrade Python to 3.8.

[2.2.1] - 2020-06-05

  • Fix handling of invalid keys.

[2.1.3] - 2020-05-08

  • Fix all option in reaggregate_course.

[2.1.1] - 2020-04-20

  • Pass user.username to Celery task instead of user.

  • Convert course_key string to CourseKey in reaggregate_course.

[2.1.0] - 2020-04-17

  • Add locking mechanism to batch operations.

  • Replace course_key with course in reaggregate_course management command.

[2.0.1] - 2020-04-17

  • Convert course_key to string before sending it to Celery task.

[1.0.0] - 2018-01-04

  • First release on PyPI.

  • On-demand asynchronous aggregation of xblock completion.

  • Provides an API to retrieve aggregations for one or many users, for one or many courses.
