A package that uses natural language processing to convert PDF calendars to JSONs and to convert Google Calendar events to Excel files
This Python package takes Arthur Murray Vernon’s Google Calendar events and arranges them in a calendar layout in an Excel file. That layout can then be copied and pasted into Microsoft Office Publisher to create a printable PDF calendar.
For those who want to go from a printable PDF calendar to a digital one, you’re in luck! I use machine learning to parse printable PDF calendars and create JSONs from them, where each event has a title, a dance_style, and a time (if applicable), ready to be turned into Google Calendar events.
While this project is geared towards use at Arthur Murray Dance Studios, feel free to take a look at the source code and modify it for your own calendar’s needs.
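To give a feel for the JSON events described above, here's a rough sketch. Only the three keys (title, dance_style, time) come from this README; the sample values, the use of null for a missing time, and the events_with_time() helper are made up for illustration and may not match the package's real output.

```python
import json

# Hypothetical sample data: the keys follow the README, the values are invented.
sample_events = [
    {"title": "Beginner Group Class", "dance_style": "Waltz", "time": "7:00 PM"},
    {"title": "Studio Closed", "dance_style": None, "time": None},
]


def events_with_time(events):
    """Keep only the events that carry a time, since the time is optional."""
    return [event for event in events if event.get("time")]


timed = events_with_time(sample_events)

# Pretty-print the timed events the way they might look in a JSON file.
print(json.dumps(timed, indent=2))
```

Events that pass a filter like this one are the natural candidates for becoming timed Google Calendar events.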
Setup from source code (GitHub)
- Clone the repo: git clone https://github.com/vincentchov/amvernon-cal.git
- Install Python 3.x with pip.
- Install Java 8.
- Create and activate a virtual environment.
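- Install the TextBlob corpora: python -m textblob.download_corpora. This needs TextBlob to be installed already, so if the command fails, run it again once the package's dependencies are installed.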
- Install dependencies: pip install -r requirements.txt.
Setup from PyPI (Pip)
- Follow steps 2-5 from above.
- Install amvernoncal from PyPI: pip install amvernoncal
How to go from Google Calendar to an Excel file
- Activate the Google Calendar API for your account and obtain your client_secret.json file.
- Activate your virtual environment.
- Import the module that will use your client secret: from amvernoncal.gcal_to_xlsx import gcal_events_to_xlsx.
- Give the gcal_events_to_xlsx() function the month and year to search and the name of the Google Calendar you’re converting from, making sure to put each of the two arguments in quotes. Example: gcal_events_to_xlsx('September 2017', 'Classes')
- That will create three folders: JSONs, PDFs, and Output. Your Excel file will be in the Output folder.
Alternatively, you can skip the Python prompt and run amvernon_gcal_to_xlsx in a terminal. It invokes gcal_events_to_xlsx() for you and comes with a help screen, thanks to docopt.
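To picture what a "calendar structure" means here, the sketch below lays events out in a grid of weeks using only the standard library. It is not the package's actual code: build_month_grid(), the event dictionaries, and their date key are hypothetical names chosen for this example, and the real Excel layout may differ.

```python
import calendar
from datetime import date


def build_month_grid(year, month, events):
    """Return a list of weeks, each a list of 7 cells (Monday to Sunday).

    A cell is None for days outside the month, otherwise a tuple of
    (day_number, [event titles on that day]).
    """
    # Group the titles of this month's events by day of the month.
    by_day = {}
    for event in events:
        d = event["date"]
        if d.year == year and d.month == month:
            by_day.setdefault(d.day, []).append(event["title"])

    # calendar.monthcalendar() pads weeks with 0 for days outside the month.
    grid = []
    for week in calendar.monthcalendar(year, month):
        grid.append(
            [None if day == 0 else (day, by_day.get(day, [])) for day in week]
        )
    return grid


# Made-up events for illustration only.
events = [
    {"title": "Beginner Waltz", "date": date(2017, 9, 5)},
    {"title": "Practice Party", "date": date(2017, 9, 5)},
    {"title": "Salsa", "date": date(2017, 10, 1)},  # ignored: different month
]

grid = build_month_grid(2017, 9, events)
for week in grid:
    print(week)
```

Each inner list corresponds to one row of the calendar, which is roughly the shape you would expect to see in the spreadsheet before pasting it into Publisher.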
How to go from a printable PDF calendar to a JSON
- Follow steps 1 and from above.
- Import the function that will parse your calendar: from amvernoncal.pdfproc.pdf_to_json import parse_calendar
- Give the parse_calendar() function the path to your calendar, with the file named after its month and year, and tell it whether to save the result to a JSON file or just return the JSON. Example: parse_calendar('september_2017.pdf', to_file=True)
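Since the PDF has to be named after its month and year, a small helper can build that name for you. This is only a sketch: calendar_pdf_name() is a hypothetical function, and the lowercase month_year.pdf pattern is inferred from the single 'september_2017.pdf' example above, so check it against what the package actually expects.

```python
from datetime import datetime


def calendar_pdf_name(month_year):
    """Turn 'September 2017' into 'september_2017.pdf'.

    The naming pattern is an assumption based on the README's example.
    Raises ValueError if the text is not a full month name plus a year.
    """
    # strptime validates the input, e.g. it rejects 'Septembre 2017'.
    parsed = datetime.strptime(month_year, "%B %Y")
    return f"{parsed.strftime('%B').lower()}_{parsed.year}.pdf"


pdf_name = calendar_pdf_name("September 2017")
print(pdf_name)
```

The resulting name can then be passed as the first argument to parse_calendar(), as in the example above.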