Task Crawler and Reporter with Dashboard for Atlassian Confluence On-Prem Installations

Why?

Chances are you are reading this because you have the same problem as many other projects around the globe: Confluence has become not only your knowledge management tool, but also your task management tool. Of course there is JIRA for developers and teams, but tasks in Confluence are used by everyone else who doesn't work directly in a development team yet still matters to the project, and for tasks that are not concrete enough to warrant a JIRA issue.

A few months into the journey you realize that not everybody keeps their tasks up to date. Occasionally you find outdated tasks in old meeting minutes. You think: "If I found this one, there are most probably more!" So you fire up the built-in task list feature for all users. You can't believe what you see: hundreds or thousands of overdue tasks, some of them assigned to colleagues who are not even on the project anymore. That is far too many to contact everybody individually, and Atlassian's UX in this area doesn't help.

Either you have a good Project Office that will chase each individual, or you need another solution. And even if they chase each person, how do you check whether the measure worked? A weekly report, filled in manually by your Project Office colleagues? Don't they have more valuable things to do?

Look no further! Help is here!

Annotated Screenshot of Version 0.0.5

Confluence-Task-Reporting

Reporting for tasks (overdue, due soon, etc.) from a Confluence on-prem server.

Confluence's built-in task viewer is great for a single person. Once you have thousands of users working in hundreds of spaces, it becomes tedious to keep track of overdue or low-quality tasks.

This little script is here to help. Most probably there are paid plugins/add-ons for Confluence out there, but they usually come with steep license costs. On the other hand, they are natively integrated into Confluence and would provide much faster and more accurate results. So if you can spend the money on such plugins: go ahead.

Then again: if people don't deal with their tasks anyway, why would you need expensive real-time reporting on overdue tasks? With this script you can have a daily update. Usually that's more than enough.

Important notice

This tool accesses only the information that the user named in the .env file is authorized to view! If this user doesn't have the rights to call the Confluence APIs, the result will be 0 entries.

The various API calls only ever return pages that this user is allowed to see anyway. So this tool neither helps with nor supports doing things that you couldn't do manually.

How does it work?

Powered by a little local database we scan the users of the Confluence instance. Then we scan all their open tasks. Having gathered all the needed information we can create beautiful, customized reports. Those reports can be sent via E-Mail as PDF or stored in a Confluence-Page.

Another option to consume the results is via a nice little dashboard.
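
To make the flow above a bit more tangible, here is a minimal sketch of how users and their tasks could be kept in a small local database and queried for a report. The table names, column names and demo rows are assumptions made up for illustration; the project's actual schema may look quite different.

```python
import sqlite3
from datetime import date

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    space TEXT,
    due_date TEXT,          -- ISO date, NULL when the task has no date
    is_done INTEGER DEFAULT 0
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up users and tasks."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany(
        "INSERT INTO users (id, username) VALUES (?, ?)",
        [(1, "alice"), (2, "bob")],
    )
    con.executemany(
        "INSERT INTO tasks (user_id, space, due_date, is_done) VALUES (?, ?, ?, ?)",
        [
            (1, "PROJ", "2022-01-10", 0),  # open, due in the past
            (1, "PROJ", "2022-03-01", 0),  # open, due in the future
            (2, "OPS", None, 0),           # open, no due date
            (2, "OPS", "2022-01-05", 1),   # already done
        ],
    )
    return con


def overdue_per_user(con, today):
    """Return (username, count) for open tasks with a due date before `today`."""
    return con.execute(
        """
        SELECT u.username, COUNT(*)
        FROM tasks t JOIN users u ON u.id = t.user_id
        WHERE t.is_done = 0 AND t.due_date IS NOT NULL AND t.due_date < ?
        GROUP BY u.username
        ORDER BY u.username
        """,
        (today.isoformat(),),
    ).fetchall()


demo_con = build_demo_db()
print(overdue_per_user(demo_con, date(2022, 2, 1)))
```

Storing due dates as ISO strings keeps the comparison in plain SQL, since ISO dates sort the same way as the dates they represent.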

How to start?

Check prerequisites

  • You know what Python is
  • You have a Confluence on-prem instance
  • You have a working username and password for this instance
  • Python >= 3.6 on your computer. To check, fire up your console: python -V

Install

  • Hopefully you're comfortable with using the console or command prompt. Otherwise you won't make it. Sorry.
  • Get the repository: git clone https://github.com/Athos1972/Confluence-Task-Reporting
  • Create a file named exactly .env in the root of the downloaded repository.
  • Enter CONF_USER=<your_user_name> and CONF_PWD=<your_password_for_confluence> into the file, each on its own line
  • Enter CONF_BASE_URL="https://<path_to_your_confluence_instance>" into the .env file
  • Create a virtual environment (e.g. virtualenv venv)
    • then activate it by typing source venv/bin/activate on Linux/Mac or venv\Scripts\activate on Windows
    • Install the needed dependencies: pip install -r requirements.txt
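
Before starting the crawlers it can be handy to check that the .env file contains all three settings. The following is a minimal sketch using only the standard library; the helper names parse_env_text and missing_keys are made up for this example, and the tool itself may well read the file differently (for instance through a dotenv library). The values shown are placeholders.

```python
# The three keys named in the install steps above.
REQUIRED_KEYS = ("CONF_USER", "CONF_PWD", "CONF_BASE_URL")


def parse_env_text(text):
    """Parse KEY=VALUE lines; skip blank lines and '#' comments; strip quotes."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder content; in practice you would read the real file, e.g.
# parse_env_text(open(".env", encoding="utf-8").read())
example_env = """
CONF_USER=jane.doe
CONF_PWD=not-a-real-password
CONF_BASE_URL="https://confluence.example.com"
"""

settings = parse_env_text(example_env)
print("Missing settings:", missing_keys(settings))
```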

First steps

  • Start python user_crawler.py. This will do quite some stuff. It will initialize the database, connect to your Confluence instance and read all users (that your User is authorized to read). It may take some time depending on the size of your installation.
  • Start python user_task_crawler.py. This will run even longer. For all the users that were loaded in the previous step we'll search for their tasks. We'll also scan the pages that contain those tasks and derive the due date of each task as well as the space name.
    • TIP: For permanent crawling it might be good to set the command line parameter OUWT (Only Users With Tasks) by calling python user_task_crawler.py -OUWT=1. This will (you guessed it) only crawl users who already had some tasks. In the majority of installations you'll find thousands of users but only a few hundred with tasks.
    • TIP: If you don't want to consume too much bandwidth, consider setting sleep_between_crawl_tasks in config.toml to a value of around 2-5 seconds. This will also look less suspicious to people analyzing network traffic.
  • Start the dashboard: python dashboard.py. Navigate to URL http://127.0.0.1:8050/ and see the results
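
The two tips above (crawl only users who already have tasks, and pause between requests) can be sketched as a small loop. This is an illustration under assumptions, not the crawler's actual code: crawl_users, the known_tasks field and fake_fetch are hypothetical names, and the real script talks to the Confluence API instead of a stand-in function.

```python
import time


def crawl_users(users, fetch_tasks, delay_seconds=2.0,
                only_users_with_tasks=False, sleep=time.sleep):
    """Fetch tasks per user, pausing `delay_seconds` between two requests."""
    if only_users_with_tasks:
        # Mirrors the idea behind -OUWT=1: skip users without known tasks.
        candidates = [user for user in users if user["known_tasks"] > 0]
    else:
        candidates = list(users)

    results = {}
    for index, user in enumerate(candidates):
        if index > 0:
            sleep(delay_seconds)  # throttle, like sleep_between_crawl_tasks
        results[user["name"]] = fetch_tasks(user["name"])
    return results


def fake_fetch(username):
    """Stand-in for a real API call; returns made-up task titles."""
    return [f"task of {username}"]


demo_users = [
    {"name": "alice", "known_tasks": 3},
    {"name": "bob", "known_tasks": 0},
    {"name": "carol", "known_tasks": 1},
]

recorded_pauses = []  # record the pauses instead of actually sleeping
crawled = crawl_users(demo_users, fake_fetch, delay_seconds=3,
                      only_users_with_tasks=True, sleep=recorded_pauses.append)
print(sorted(crawled), recorded_pauses)
```

Passing the sleep function as a parameter keeps the sketch quick to try out; with the default time.sleep the loop really waits between users.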

Other stuff

Statistics writer

If you want time series statistics showing how open/overdue tasks per space, company and user evolve over time (e.g. because you started a new initiative to clear out overdue tasks and want to see whether it works), you can activate a statistics writer. Run it regularly, e.g. daily, weekly or monthly. (Recommendation: use your computer's scheduler for that. Don't rely on remembering to do it yourself.)

python update_statistics.py

You should run that after you crawled all users and their tasks.
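
The idea behind such a statistics run can be sketched as follows: on each run, count the open and overdue tasks per space and keep the counts together with the run date, so that successive runs form a time series. The function name snapshot_by_space and the task fields are assumptions for illustration; update_statistics.py may aggregate differently (it also covers company and user).

```python
from collections import Counter
from datetime import date


def snapshot_by_space(tasks, today):
    """Count open and overdue tasks per space for one snapshot date."""
    open_counts = Counter()
    overdue_counts = Counter()
    for task in tasks:
        if task["done"]:
            continue
        open_counts[task["space"]] += 1
        if task["due"] is not None and task["due"] < today:
            overdue_counts[task["space"]] += 1
    return [
        {
            "date": today.isoformat(),
            "space": space,
            "open": open_counts[space],
            "overdue": overdue_counts[space],
        }
        for space in sorted(open_counts)
    ]


# Made-up demo data
demo_tasks = [
    {"space": "PROJ", "due": date(2022, 1, 10), "done": False},
    {"space": "PROJ", "due": None, "done": False},
    {"space": "OPS", "due": date(2022, 5, 1), "done": False},
    {"space": "OPS", "due": date(2022, 1, 1), "done": True},
]

history = []  # one batch of rows per run builds up the time series
history.extend(snapshot_by_space(demo_tasks, date(2022, 2, 1)))
for row in history:
    print(row)
```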

Additional crawlers

  • tasks_recrawl_by_page.py recrawls tasks from previously crawled pages.
  • task_recrawl_by_duedate.py goes through all tasks in the database, sorted by last crawl date, and analyses those tasks again.

Reports

  • To receive an Excel-Sheet of all tasks simply run python tasks_to_excel.py and you'll find a file task_report_<date>.xlsx in the folder.
    • TIP: To export only overdue tasks: python tasks_to_excel.py -OO=1 or python tasks_to_excel.py --onlyOverdue=1
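
How a switch like -OO / --onlyOverdue might feed into the export can be sketched with argparse and a simple filter. Only the two flag spellings come from the text above; parse_args, select_tasks and the task fields are hypothetical, and writing the actual .xlsx file is left out to keep the sketch free of third-party packages.

```python
import argparse
from datetime import date


def parse_args(argv):
    """Accept -OO=1 or --onlyOverdue=1, as in the tip above."""
    parser = argparse.ArgumentParser(description="Task export (sketch)")
    parser.add_argument("-OO", "--onlyOverdue", type=int, default=0,
                        choices=(0, 1), help="1 = export only overdue tasks")
    return parser.parse_args(argv)


def select_tasks(tasks, only_overdue, today):
    """Return all tasks, or only those with a due date before `today`."""
    if not only_overdue:
        return list(tasks)
    return [t for t in tasks if t["due"] is not None and t["due"] < today]


# Made-up demo data
demo_rows = [
    {"title": "Update minutes", "due": date(2022, 1, 10)},
    {"title": "Plan workshop", "due": date(2022, 6, 1)},
    {"title": "No date yet", "due": None},
]

args = parse_args(["--onlyOverdue=1"])
selected = select_tasks(demo_rows, args.onlyOverdue, date(2022, 2, 1))
print([row["title"] for row in selected])
```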

Distribution

  • Currently no distribution of results

Future enhancements/developments

If there's a feature missing that would make Confluence-Task-Reporter more useful for you: Create an issue right here on GitHub. Thank you!

Near future

  • Export of the task list to Google Sheets and CSV via script. Any takers?
  • Export graph elements and grid as Confluence-Page (either update fixed page-id or create a new page on each execution)

Future

  • Build a Windows executable and have the crawlers called from the dashboard
  • Update task contents from the app (click on a task, add a comment)
  • Reminder function via E-Mail (choose entries from the grid in the dashboard and click on "Send Reminder") to automatically send reminder E-Mails
    • That's a bit tricky, as most Atlassian customers are corporations and there will be all kinds of E-Mail systems to deal with. We also need an easy way to maintain a template text.
    • Additional option: "Mailbomb" mode that sends the user one E-Mail for each overdue task

Changelog

0.0.5:

  • Crawler
    • Added a new method to crawl E-Mail addresses, as the old experimental API was deactivated during the latest security patch
    • Fixed a bug with previously crawled tasks where the second date was removed
  • Dashboard
    • Button "Download Selection" downloads an XLSX-File with the current contents of the data grid to your computer
    • Changed the age graph to show the date as well, not only the number of days in the future/past, which makes it easier to filter the data table for those tasks
    • New screen "Timeseries data" with a graph; prefill_database.py updated accordingly

0.0.4:

  • Crawler
    • Tasks from personal spaces are not considered any longer
  • Dashboard
    • Age distribution graph showing the age of tasks (future and past)
    • Space/Company as multiple select
    • Checkbox "Tasks without date"
    • Filtering enabled
  • Tests
    • Prefill database also with tasks without date (like in real world installations)

0.0.3:

  • Statistics table added. Also report to fill it.
  • Fixed bug in user_task_crawler in -OUWT-Mode.

