This library extracts the talks given at the general conferences of The Church of Jesus Christ of Latter-day Saints, from April 1971 through the most recent conference.
general-conference-extractor
Install
pip install general_conference_extractor
How to Use
Example 1 - Just One Talk URL
Here’s what you could do with just one talk URL:
from general_conference_extractor.GeneralConferenceTalk import GeneralConferenceTalk
url = "https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng"
talk = GeneralConferenceTalk(url, title=True, author=True, calling=True)
# Print the metadata and the first 300 characters of the extracted text
print("**** Metadata **** \n")
print(talk.metadata)
print("\n")
print("**** Extracted Text **** \n")
print(talk.text[0:300])
**** Metadata ****
{'title': 'Pillars and Rays', 'author': 'Alexander Dushku', 'calling': 'Of the Seventy', 'year': 2024, 'month': 4, 'url': 'https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng'}
**** Extracted Text ****
Pillars and Rays
By Elder Alexander Dushku
Of the Seventy
My message is for those who worry about their testimony because they haven’t had overwhelming spiritual experiences. I pray that I can provide some peace and assurance.
The Restoration of the gospel of Jesus Christ began with an explosion
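The year and month in the metadata above match the path segments of the talk URL. As a minimal sketch of how those two values could be read from a URL with the standard library alone: `conference_date_from_url` is a hypothetical helper, not part of general_conference_extractor, and it assumes the path always has the form /study/general-conference/<year>/<month>/<talk>.

```python
from typing import Tuple
from urllib.parse import urlparse


def conference_date_from_url(url: str) -> Tuple[int, int]:
    """Return (year, month) read from a general conference talk URL.

    Hypothetical helper, not part of general_conference_extractor.
    Assumes the path is /study/general-conference/<year>/<month>/<talk>.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # The year and month follow the fixed "general-conference" segment.
    idx = parts.index("general-conference")
    return int(parts[idx + 1]), int(parts[idx + 2])


url = "https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng"
print(conference_date_from_url(url))  # (2024, 4)
```

This can be handy for grouping a list of talk URLs by conference before downloading anything.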
Example 2 - Get All the Talks for One General Conference
Or, here’s how to extract every talk from a specific general conference (April 2017 in this example):
from general_conference_extractor.extract_URLs import generate_conference_url, extract_talk_urls
from general_conference_extractor.data_output import extract_conference_talks
# Step 1 - Get the URLs for the talks
# get the page URL that shows all the talks for that specific General Conference
gen_conf_page_url = generate_conference_url(2017, '04')
# get all the URLs for the talks that were given for that conference
talk_urls = extract_talk_urls(gen_conf_page_url)
# Step 2 - Save the talks as txt docs in folders, and their metadata in a separate csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'
# Uncomment the next line to produce the folders and documents
# extract_conference_talks(talk_urls, output_folder, metadata_csv_path)
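To make the "txt docs plus one metadata csv" idea concrete, here is a rough sketch that does something similar for talks already held in memory. It is not how `extract_conference_talks` is implemented: `save_talks`, the one-folder-per-conference layout, and the file naming are all assumptions made for illustration. The only things taken from the example output above are the metadata keys.

```python
import csv
import re
import tempfile
from pathlib import Path


def save_talks(talks, output_folder, metadata_csv_path):
    """Write talk texts to .txt files and their metadata to one CSV file.

    Hypothetical helper: `talks` is a list of (metadata, text) pairs.
    The folder layout and file names are assumptions, not the library's.
    Returns the list of .txt paths that were written.
    """
    fields = ["title", "author", "calling", "year", "month", "url"]
    written = []
    with open(metadata_csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for metadata, text in talks:
            # One folder per conference, for example 2024-04
            folder = Path(output_folder) / f"{metadata['year']}-{metadata['month']:02d}"
            folder.mkdir(parents=True, exist_ok=True)
            # Keep only characters that are safe in a file name.
            safe_title = re.sub(r"[^\w\- ]", "", metadata["title"]).strip()
            path = folder / f"{safe_title}.txt"
            path.write_text(text, encoding="utf-8")
            writer.writerow(metadata)
            written.append(path)
    return written


# Demo in a temporary directory, using the metadata shown in Example 1.
sample_metadata = {
    "title": "Pillars and Rays",
    "author": "Alexander Dushku",
    "calling": "Of the Seventy",
    "year": 2024,
    "month": 4,
    "url": "https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng",
}
with tempfile.TemporaryDirectory() as tmp:
    paths = save_talks(
        [(sample_metadata, "Pillars and Rays\nBy Elder Alexander Dushku\n")],
        Path(tmp) / "conference_talks",
        Path(tmp) / "metadata.csv",
    )
    print(paths[0].relative_to(tmp))
```

With real data, the `(metadata, text)` pairs could come from `talk.metadata` and `talk.text` on `GeneralConferenceTalk` objects, as in Example 1.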
Example 3 - Get All the Talks for a Specific Year
from general_conference_extractor.extract_URLs import extract_multiconference_talk_urls
from general_conference_extractor.data_output import extract_conference_talks
# Step 1 - Get the talk URLs for every conference in 2017 (the start and end year are the same)
multiconference_talk_urls = extract_multiconference_talk_urls(2017, 2017)
# Step 2 - Save the talks as txt docs and their metadata in a csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'
# Uncomment the next line to produce the folders and documents
# extract_conference_talks(multiconference_talk_urls, output_folder, metadata_csv_path)
Example 4 - Get All the Talks for a Specific Decade
from general_conference_extractor.extract_URLs import extract_multiconference_talk_urls
from general_conference_extractor.data_output import extract_conference_talks
# Step 1 - Get the talk URLs for every conference from 2010 through 2019
multiconference_talk_urls = extract_multiconference_talk_urls(2010, 2019)
# Step 2 - Save the talks as txt docs and their metadata in a csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'
# Uncomment the next line to produce the folders and documents
# extract_conference_talks(multiconference_talk_urls, output_folder, metadata_csv_path)
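To get a feel for how much a year range covers before downloading, the conferences in that range can be listed up front. The sketch below assumes the two year arguments form an inclusive range, as the decade example suggests, and relies on general conference being held twice a year, in April and October. `conference_months` is a hypothetical helper, not part of the library, and this is not necessarily how `extract_multiconference_talk_urls` works internally.

```python
def conference_months(start_year, end_year):
    """List (year, month) pairs for every conference in an inclusive year range.

    Hypothetical helper. Assumes two conferences per year: April ('04')
    and October ('10').
    """
    return [
        (year, month)
        for year in range(start_year, end_year + 1)
        for month in ("04", "10")
    ]


decade = conference_months(2010, 2019)
print(len(decade))            # 20
print(decade[0], decade[-1])  # (2010, '04') (2019, '10')
```

Each pair has the same shape as the arguments passed to `generate_conference_url(2017, '04')` in Example 2, so the list could drive a loop that fetches one conference at a time.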