Terry toolkit tkitreadability

Project description

A library for extracting the main body text from HTML.

from tkitreadability import tkitReadability
html = """

        <div class="full-component-wrapper">
        
                <div class="component component--text-image image-position--right" data-id="45290" data-type="c_sideimagetext_ttt">
      <div class="text-image--component-wrapper twb-container">
        <div class="text-image--content-wrapper row">
    
        
                  <div class="text-image--image col-12 col-xl-7 order-2 order-xl-3">
              
                <div class="field field--name-field-c-image field--type-entity-reference field--label-hidden field__item">   
                 <picture>
                      <source srcset="/sites/default/files/styles/ttt_image_690/public/2021-07/border-collie.webp?itok=1oyChjVg 2x" media="all and (min-width: 1140px)" type="image/webp">
                  <source srcset="/sites/default/files/styles/ttt_image_930/public/2021-07/border-collie.webp?itok=QxWrubxE 1x" media="all and (min-width: 992px)" type="image/webp">
                  <source srcset="/sites/default/files/styles/ttt_image_690/public/2021-07/border-collie.webp?itok=1oyChjVg 1x" media="all and (min-width: 768px)" type="image/webp">
                  <source srcset="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.webp?itok=jhilnwqZ 1x" media="all and (min-width: 576px)" type="image/webp">
                  <source srcset="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.webp?itok=jhilnwqZ 1x" type="image/webp">
                  <source srcset="/sites/default/files/styles/ttt_image_690/public/2021-07/border-collie.jpg?itok=1oyChjVg 2x" media="all and (min-width: 1140px)" type="image/jpeg">
                  <source srcset="/sites/default/files/styles/ttt_image_930/public/2021-07/border-collie.jpg?itok=QxWrubxE 1x" media="all and (min-width: 992px)" type="image/jpeg">
                  <source srcset="/sites/default/files/styles/ttt_image_690/public/2021-07/border-collie.jpg?itok=1oyChjVg 1x" media="all and (min-width: 768px)" type="image/jpeg">
                  <source srcset="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.jpg?itok=jhilnwqZ 1x" media="all and (min-width: 576px)" type="image/jpeg">
                  <source srcset="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.jpg?itok=jhilnwqZ 1x" type="image/jpeg">
                      <img src="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.jpg?itok=jhilnwqZ" alt="Border Collie" typeof="foaf:Image" loading="lazy">
    
      </picture>
    
    </div>
          
            </div>
    <img src="/sites/default/files/styles/ttt_image_510/public/2021-07/border-collie.jpg?itok=jhilnwqZ" alt="Border Collie" typeof="foaf:Image" loading="lazy">
            <div class="text-image--text-wrapper col-12 col-xl-5 order-3 order-xl-2">
              
              <div class="text-image--text">
                
                <div class="clearfix text-formatted field field--name-field-c-sideimagetext-summary field--type-text-long field--label-hidden field__item"><h2>Pet Card</h2>
    
    <ul>
        <li><strong>Living Considerations:</strong> Not hypoallergenic, suitable for apartment living, good with older children</li>
        <li><strong>Size:</strong> Medium</li>
        <li><strong>Height:</strong> Males - 48 to 56 centimetres at the withers, Females - 45 to 53 centimetres at the withers</li>
        <li><strong>Weight:</strong> Males -13 to 20 kilograms, Females - 12 to 19 kilograms</li>
        <li><strong>Coat:</strong> Medium/Long</li>
        <li><strong>Energy:</strong> High</li>
        <li><strong>Colour:</strong> All colours or colour combinations</li>
        <li><strong>Activities:</strong> Agility, Conformation, Herding, Obedience, Rally Obedience, Tracking</li>
        <li><strong>Indoor/Outdoor:</strong> Both</li>
    </ul>
    </div>
          
              </div>
    
                      </div>
              </div>
      </div>
    </div>
    
    
    
          
    
          </div>


"""
# Create the extractor
Readability = tkitReadability()
# Extract the main content from the HTML as text
content = Readability.html2text(html)
print(content)
# Output as HTML
print(Readability.markdown2Html(content))
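
To make the idea of "extracting body text from HTML" concrete, here is a minimal sketch that uses only the Python standard library. It is not tkitreadability's implementation, and it does no scoring of candidate content blocks; the class `TextExtractor` and the function `extract_text` are hypothetical names introduced for illustration. It simply keeps the visible text and emits one line per block-level element.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, one line per block-level element.

    Hypothetical helper for illustration; not part of tkitreadability.
    """

    BLOCK_TAGS = {"p", "div", "li", "ul", "ol", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.lines = []    # completed lines of text
        self._buffer = []  # text fragments belonging to the current line

    def _flush(self):
        # Join the fragments, collapse runs of whitespace, keep non-empty lines.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.lines.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)


def extract_text(html):
    """Return the visible text of `html`, one block element per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block element
    return "\n".join(parser.lines)


sample = """
<div><h2>Pet Card</h2>
<ul>
  <li><strong>Size:</strong> Medium</li>
  <li><strong>Energy:</strong> High</li>
</ul></div>
"""
print(extract_text(sample))
```

Inline tags such as `<strong>` do not start a new line, so each label stays on the same line as its value.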

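Updates

Version 0.0.0.4

Added Markdown-to-HTML conversion.

Documentation: https://docs.terrychan.org/tkitreadability/

Quick upload

The upload script automatically finds the dependencies and then uploads the package:

sh upload.sh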

Download files

Source Distribution

tkitreadability-0.0.0.5.3.tar.gz (10.0 kB)

Uploaded Source

Built Distribution

tkitreadability-0.0.0.5.3-py2.py3-none-any.whl (9.8 kB)

Uploaded Python 2 Python 3

File details

Details for the file tkitreadability-0.0.0.5.3.tar.gz.

File metadata

  • Download URL: tkitreadability-0.0.0.5.3.tar.gz
  • Size: 10.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.5

File hashes

Hashes for tkitreadability-0.0.0.5.3.tar.gz
Algorithm Hash digest
SHA256 fd6054f63eb1d89a05ed662f778d386a2115e624a6603a4d2776708b4b151e21
MD5 575b26760b07214b9f758681e18b2c3b
BLAKE2b-256 5b9110092029365fc555acce42221f7e023a015de8c6b1ab2a732200b4f33902
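
The published digests can be used to check a downloaded file before installing it. The sketch below uses only the standard-library `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the constant is the SHA256 value listed above for the source distribution.

```python
import hashlib

# SHA256 digest listed above for tkitreadability-0.0.0.5.3.tar.gz
EXPECTED_SDIST_SHA256 = (
    "fd6054f63eb1d89a05ed662f778d386a2115e624a6603a4d2776708b4b151e21"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the archive has been downloaded to the current directory, `matches_published_hash("tkitreadability-0.0.0.5.3.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact file.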

File details

Details for the file tkitreadability-0.0.0.5.3-py2.py3-none-any.whl.

File hashes

Hashes for tkitreadability-0.0.0.5.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 ac5a3f1d52d7dc28e24752055c07ad6d4305a08578508fd49d55a64de333748b
MD5 64430dff81c33c63039a812c822d083e
BLAKE2b-256 1a373aec2a21014bda20fb09efa337b3302a3cb68fdcd981edfcbe4bacb777ed
