
A helper pip package for analyzing presidential speech records (basic data included)

Project description

president-speech

  • Speeches of the presidents of the Republic of Korea
  • Data is provided as a Parquet file and an SQLite db file
  • Comes with a simple CLI

SUMMARY OF PROVIDED DATA

  • Each record can be viewed at the following URL, with {division_number} replaced by the record's division_number

  • https://www.pa.go.kr/research/contents/speech/index.jsp?spMode=view&catid=c_pa02062&artid={division_number}

  • For some records the date value is empty or contains only the year

    president size min(date) max(date)
    이승만 998 1948.07.24 1959.03.10
    윤보선 3 1960.08.13 1960.09.15
    박정희 1270 1963.12.17 1979.10.26
    최규하 58 1979.10.27 1980.08.16
    전두환 602 1980.06.05 1987.02.16
    노태우 601 1988.02.25 1992.10.05
    김영삼 728 1993.01.09 1998.01.23
    김대중 822 1998.02.25 2003.02.17
    노무현 780 2003.02.25 2008.01.28
    이명박 1027 2008.02.25 2013.02.07
    박근혜 493 2013.02.24 2016.10.26
    문재인 1389 2017.05.10 2022.03.30
    >>> df.info()
    
    <class 'pandas.core.frame.DataFrame'>
    Index: 8771 entries, 5368 to 102591
    Data columns (total 7 columns):
     #   Column           Non-Null Count  Dtype 
    ---  ------           --------------  ----- 
     0   division_number  8771 non-null   int64 
     1   president        8771 non-null   object
     2   title            8771 non-null   object
     3   date             8771 non-null   object
     4   location         8771 non-null   object
     5   kind             8771 non-null   object
     6   speech_text      8771 non-null   object
    dtypes: int64(1), object(6)
    memory usage: 548.2+ KB
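
The per-record URL pattern listed above can be filled in programmatically. The following is a minimal sketch using only the standard library; the helper name `record_url` is hypothetical and is not part of the package.

```python
from urllib.parse import urlencode

# Base address and fixed query parameters copied from the URL pattern in this README.
BASE_URL = "https://www.pa.go.kr/research/contents/speech/index.jsp"


def record_url(division_number: int) -> str:
    """Build the archive URL for one speech (hypothetical helper, not in the package)."""
    query = urlencode(
        {"spMode": "view", "catid": "c_pa02062", "artid": division_number}
    )
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    # 1305368 is the division_number of the first sample row shown in this README.
    print(record_url(1305368))
```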
    

Usage

$ pip install president-speech
>>> from president_speech.db.parquet_interpreter import read_parquet, get_parquet_full_path
>>> get_parquet_full_path()
'/Users/f16/code/edu/president-speech/.venv/lib/python3.8/site-packages/president_speech/db/parquet/president_speech_ko.parquet'
>>> read_parquet().head(3)
      division_number president                title        date location kind                                        speech_text
5368          1305368       박정희  제5대 대통령 취임식 대통령 취임사  1963.12.17       국내  취임사  \n\n\n단군성조가 천혜의  강토 위에 국기를 닦으신지 반만년, 연면히 이어온 ...
5369          1305369       박정희            국회 개회식 치사  1963.12.17       국내  기념사   존경하는 국회의장, 의원제위 그리고 내외귀빈 여러분! 오늘  뜻깊은 제3공화국의...
5370          1305370       박정희               신년 메시지  1964.01.01       국내  신년사   친애하는 국내외의 동포 여러분! 혁명의 고된 시련을 겪고 민정이양으로 매듭을 지은...
>>> 
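
Since the date column is stored as text and some values are empty or year-only, it may help to normalize it before sorting or filtering. The sketch below assumes the `YYYY.MM.DD` layout seen in the sample rows and assumes year-only values look like `YYYY`; the helper name `parse_speech_date` is hypothetical and not part of the package.

```python
import re
from typing import Optional, Tuple

# Matches "YYYY" or "YYYY.MM.DD" (the year-only form is an assumption).
_DATE_RE = re.compile(r"^(\d{4})(?:\.(\d{1,2})\.(\d{1,2}))?$")

ParsedDate = Tuple[int, Optional[int], Optional[int]]


def parse_speech_date(raw: Optional[str]) -> Optional[ParsedDate]:
    """Return (year, month, day); month and day are None for year-only values.

    Returns None for empty or unrecognized input instead of raising.
    """
    text = (raw or "").strip()
    if not text:
        return None
    match = _DATE_RE.match(text)
    if match is None:
        return None
    year, month, day = match.groups()
    return (
        int(year),
        int(month) if month else None,
        int(day) if day else None,
    )
```

With a DataFrame loaded through `read_parquet()`, this could be applied as `df["date"].map(parse_speech_date)`.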

CLI usage

$ ps-word-count -h
usage: ps-word-count [-h] [-t | -p] word

Word frequency output from previous presidential speeches

positional arguments:
  word         Search word

optional arguments:
  -h, --help   show this help message and exit
  -t, --table  Table Format Output
  -p, --plot   Format Output

$ ps-word-count -p 독립
문재인  [954]  ****************************************
이승만  [430]  ******************
박정희  [361]  ****************
이명박  [176]  ********
김대중  [171]  ********
전두환  [169]  ********
노무현  [167]  *******
노태우  [131]  ******
김영삼  [114]  *****
박근혜  [ 71]  ***
최규하  [  4]  *
윤보선  [  0]
$ ps-word-count -t 독립
|    | president   |   mention |
|---:|:------------|----------:|
|  0 | 문재인      |       954 |
|  1 | 이승만      |       430 |
|  2 | 박정희      |       361 |
|  3 | 이명박      |       176 |
|  4 | 김대중      |       171 |
|  5 | 전두환      |       169 |
|  6 | 노무현      |       167 |
|  7 | 노태우      |       131 |
|  8 | 김영삼      |       114 |
|  9 | 박근혜      |        71 |
| 10 | 최규하      |         4 |
| 11 | 윤보선      |         0 |
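
To illustrate the idea behind this output, here is a rough sketch of counting a word per president and drawing star bars. It is not the package's implementation: the function names, the plain substring counting, and the round-up scaling to a 40-character bar are all assumptions, so the bar lengths may differ from the real tool.

```python
import math
from typing import Dict, Iterable, List, Tuple


def count_mentions(rows: Iterable[Tuple[str, str]], word: str) -> Dict[str, int]:
    """Sum substring occurrences of `word` per president.

    `rows` yields (president, speech_text) pairs.
    """
    counts: Dict[str, int] = {}
    for president, text in rows:
        counts[president] = counts.get(president, 0) + text.count(word)
    return counts


def render_bars(counts: Dict[str, int], width: int = 40) -> List[str]:
    """Render one line per president, highest count first, scaled to `width` stars."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    top = ranked[0][1] if ranked else 0
    lines = []
    for name, n in ranked:
        # Round up so that any non-zero count gets at least one star.
        stars = math.ceil(n / top * width) if top else 0
        lines.append(f"{name}  [{n:>3}]  {'*' * stars}".rstrip())
    return lines


if __name__ == "__main__":
    sample = [("A", "x y x"), ("B", "x"), ("C", "z")]  # made-up data
    print("\n".join(render_bars(count_mentions(sample, "x"), width=4)))
```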

Ref

Development environment setup

$ git clone ...
$ cd president-speech
$ pdm venv create
$ source .venv/bin/activate
$ pdm install
$ pdm add -dG test pytest pytest-cov
$ pdm test
$ pdm ptest

$ pdm ctest
---------- coverage: platform darwin, python 3.9.18-final-0 ----------
Name                                             Stmts   Miss  Cover
--------------------------------------------------------------------
src/president_speech/__init__.py                     0      0   100%
src/president_speech/db/__init__.py                  0      0   100%
src/president_speech/db/connection_manager.py       17      3    82%
src/president_speech/db/parquet_interpreter.py      25      1    96%
src/president_speech/db/search.py                   15      1    93%
tests/__init__.py                                    0      0   100%
tests/test_parquet_interpreter.py                   11      0   100%
tests/test_search.py                                 5      0   100%
--------------------------------------------------------------------
TOTAL                                               73      5    93%

Deploy to fly.io with Docker

$ docker build -t president-speech-webapp .
$ docker run -it --rm -p 7942:8051 president-speech-webapp

$ fly deploy
Visit your newly deployed app at https://president-speech.fly.dev/


Give it a try. Feedback is always welcome, and so are pull requests.

Download files


Source Distribution

president-speech-0.9.1.tar.gz (73.3 MB view details)

Uploaded Source

Built Distribution

president_speech-0.9.1-py3-none-any.whl (73.6 MB view details)

Uploaded Python 3

File details

Details for the file president-speech-0.9.1.tar.gz.

File metadata

  • Download URL: president-speech-0.9.1.tar.gz
  • Upload date:
  • Size: 73.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.9.1 CPython/3.8.17

File hashes

Hashes for president-speech-0.9.1.tar.gz
Algorithm Hash digest
SHA256 7646de9735bf49a6da29b7763e0f2608f568b68b9a59f015ec28b35aff76f1cc
MD5 068a853e1e572682c98055b581446aa2
BLAKE2b-256 0a439fc95a460a3a78c8f5260db40b06ad0bd715715010ea99819d4d1969e366
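
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch with `hashlib`; the helper names `sha256_of` and `matches_published` are hypothetical, and the file is read in chunks because the archive is about 73 MB.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("president-speech-0.9.1.tar.gz", "7646de9735bf49a6da29b7763e0f2608f568b68b9a59f015ec28b35aff76f1cc")` should return True for an intact download of the source distribution.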


File details

Details for the file president_speech-0.9.1-py3-none-any.whl.

File hashes

Hashes for president_speech-0.9.1-py3-none-any.whl
Algorithm Hash digest
SHA256 3c0546f06f0f445f53c5e90e3b654c86fa8fdbe392f2bb44c5058f4f0d8b1f44
MD5 05f633a7114a8a8c5a9086186cce1bf4
BLAKE2b-256 ebf6de87f88419bb68146cf00caa69940e2fa6d5b58b5917c366ed22a0eb4b17

