This library extracts the talks given at the general conferences of The Church of Jesus Christ of Latter-day Saints, from April 1971 through the most recent conference.

general-conference-extractor

Install

pip install general_conference_extractor

How to Use

Example 1 - Just One Talk URL

Here’s what you could do with just one talk URL:

from general_conference_extractor.GeneralConferenceTalk import GeneralConferenceTalk

url = "https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng"

talk = GeneralConferenceTalk(url, title=True, author=True, calling=True)

# Print the talk's metadata
print("**** Metadata ****  \n")
print(talk.metadata)
print("\n")

# Print the first 300 characters of the extracted text
print("**** Extracted Text **** \n")
print(talk.text[0:300])
**** Metadata ****  

{'title': 'Pillars and Rays', 'author': 'Alexander Dushku', 'calling': 'Of the Seventy', 'year': 2024, 'month': 4, 'url': 'https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng'}


**** Extracted Text **** 

Pillars and Rays

By Elder Alexander Dushku

Of the Seventy

My message is for those who worry about their testimony because they haven’t had overwhelming spiritual experiences. I pray that I can provide some peace and assurance.

The Restoration of the gospel of Jesus Christ began with an explosion
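
The metadata above carries `year` and `month` alongside the `url`, and the URL itself encodes both. As a minimal sketch, the snippet below recovers them from a talk URL using only the standard library, which may be handy for grouping URLs before downloading anything. `parse_talk_url` is a hypothetical helper, not part of general_conference_extractor, and it assumes the `/study/general-conference/<year>/<month>/<slug>` path layout seen in the example URL.

```python
from urllib.parse import urlparse


def parse_talk_url(url):
    """Return (year, month, slug) parsed from a talk URL.

    Hypothetical helper, not part of general_conference_extractor.
    Assumes the path layout /study/general-conference/<year>/<month>/<slug>.
    """
    # Split the path into its non-empty segments; the query string is ignored.
    parts = [p for p in urlparse(url).path.split("/") if p]
    try:
        idx = parts.index("general-conference")
        year, month, slug = parts[idx + 1 : idx + 4]
    except ValueError as exc:
        # Raised when the marker segment is missing or the path is too short.
        raise ValueError(f"Not a general conference talk URL: {url}") from exc
    return int(year), int(month), slug


url = "https://www.churchofjesuschrist.org/study/general-conference/2024/04/15dushku?lang=eng"
print(parse_talk_url(url))  # (2024, 4, '15dushku')
```

The values agree with the `year` and `month` fields shown in the metadata for this talk.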

Example 2 - Get All the Talks for One General Conference

Or, here’s an example of extracting every talk from a specific General Conference (April 2017 in this instance):

from general_conference_extractor.extract_URLs import generate_conference_url, extract_talk_urls
from general_conference_extractor.data_output import extract_conference_talks

# Step 1 - Get the URLs for the talks

# get the page URL that shows all the talks for that specific General Conference
gen_conf_page_url = generate_conference_url(2017, '04')

# get all the URLs for the talks that were given for that conference
talk_urls = extract_talk_urls(gen_conf_page_url)

# Step 2 - Save the talks as txt docs in folders and their respective metadata in a separate csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'

# Uncomment the next line to produce the respective folders and documents
# extract_conference_talks(talk_urls, output_folder, metadata_csv_path)
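
To make Step 1 more concrete, here is a minimal sketch of how a conference index URL could be assembled from a year and a month. It is not the library's `generate_conference_url`: `build_conference_url` and `BASE_URL` are hypothetical names, and the URL pattern is an assumption inferred from the talk URL in Example 1.

```python
# Assumed base path, inferred from the talk URL shown in Example 1.
BASE_URL = "https://www.churchofjesuschrist.org/study/general-conference"


def build_conference_url(year, month, lang="eng"):
    """Build the (assumed) index URL for one general conference.

    Hypothetical stand-in for illustration only; the real library
    function may build its URL differently.
    """
    month = int(month)  # accepts '04' as well as 4
    if month not in (4, 10):
        raise ValueError("General conferences are held in April (4) and October (10)")
    if int(year) < 1971:
        raise ValueError("This sketch only covers conferences from April 1971 onward")
    return f"{BASE_URL}/{int(year)}/{month:02d}?lang={lang}"


print(build_conference_url(2017, "04"))
```

Validating the month up front avoids requesting a page for a month in which no conference was held.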

Example 3 - Get All the Talks for a Specific Year

from general_conference_extractor.extract_URLs import extract_multiconference_talk_urls
from general_conference_extractor.data_output import extract_conference_talks

# Step 1 - Get the talk URLs for every conference held in 2017
# (the start year and end year are the same)
multiconference_talk_urls = extract_multiconference_talk_urls(2017, 2017)

# Step 2 - Save the talks as txt docs and their metadata in a csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'

# Uncomment the next line to produce the respective folders and documents
# extract_conference_talks(multiconference_talk_urls, output_folder, metadata_csv_path)

Example 4 - Get All the Talks for a Specific Decade

from general_conference_extractor.extract_URLs import extract_multiconference_talk_urls
from general_conference_extractor.data_output import extract_conference_talks

# Step 1 - Get the talk URLs for every conference held from 2010 through 2019
multiconference_talk_urls = extract_multiconference_talk_urls(2010, 2019)

# Step 2 - Save the talks as txt docs and their metadata in a csv file
output_folder = './conference_talks'
metadata_csv_path = './metadata.csv'

# Uncomment the next line to produce the respective folders and documents
# extract_conference_talks(multiconference_talk_urls, output_folder, metadata_csv_path)
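
A year range expands to two conferences per year, one in April and one in October. The sketch below shows that expansion, which can help estimate how many conference pages a range covers before starting a long extraction. `conference_periods` is a hypothetical helper, not part of the library, and it assumes both bounds are inclusive, as the calls in Examples 3 and 4 suggest.

```python
def conference_periods(start_year, end_year):
    """List the (year, month) pairs for every conference in a year range.

    Hypothetical helper for illustration; assumes both bounds are inclusive.
    """
    if start_year > end_year:
        raise ValueError("start_year must not be greater than end_year")
    # Two conferences per year: April ('04') and October ('10').
    return [
        (year, month)
        for year in range(start_year, end_year + 1)
        for month in ("04", "10")
    ]


print(conference_periods(2017, 2017))        # [(2017, '04'), (2017, '10')]
print(len(conference_periods(2010, 2019)))   # 20
```

Under these assumptions, a full decade covers 20 conferences, so the extraction in Example 4 can take considerably longer than the single-conference case in Example 2.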
