General extractor of news pages.

Project description

GNE: General News Extractor, a general-purpose extractor for the main body of news pages

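GeneralNewsExtractor (GNE) is a general-purpose module for extracting the main body of news web pages. Given the HTML of a news page, it outputs the body text, title, author, publication time, the URLs of images in the body, and the source code of the tag that contains the body. GNE performs very well on hundreds of Chinese news sites, including Toutiao, NetEase News, Gamersky, Guancha, ifeng.com, Tencent News, ReadHub, and Sina News, with accuracy approaching 100%.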

Usage is simple:

from gne import GeneralNewsExtractor

extractor = GeneralNewsExtractor()
html = 'page source HTML'
result = extractor.extract(html)
print(result)
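
Because extract() takes the page source as a string, a page saved to disk has to be decoded before it is passed in. Some Chinese news sites are still served in GBK-family encodings rather than UTF-8, so a small fallback can help. The following is a minimal sketch under stated assumptions, not part of GNE: read_html and extract_file are hypothetical helper names, and the list of candidate encodings is an assumption you may need to adjust.

```python
from pathlib import Path


def read_html(path, encodings=("utf-8", "gb18030")):
    """Decode a saved page, trying each candidate encoding in order.

    Hypothetical helper; the candidate encodings are an assumption.
    """
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: decode with replacement characters rather than fail.
    return raw.decode(encodings[0], errors="replace")


def extract_file(path, extractor):
    """Run any object that has an extract(html) method on a saved page."""
    return extractor.extract(read_html(path))
```

With gne installed, calling extract_file('article.html', GeneralNewsExtractor()) would hand the decoded page to the extractor shown above; 'article.html' is a placeholder file name.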

Installation

pip install gne

Documentation

https://generalnewsextractor.readthedocs.io/

Help make GNE better

https://github.com/kingname/GeneralNewsExtractor

Release history

Version	Release date
0.4.3 (this version)	Mar 8, 2026
0.4.2	Mar 8, 2026
0.4.1	Mar 2, 2026
0.4.0	Mar 2, 2026
0.3.1	Apr 17, 2024
0.3.0	Oct 7, 2021
0.2.6	Feb 17, 2021
0.2.5	Dec 21, 2020
0.2.4	Oct 6, 2020
0.2.3	Sep 15, 2020
0.2.2	Aug 2, 2020
0.2.1	Jun 27, 2020
0.2.0	Jun 27, 2020
0.1.9	Jun 6, 2020
0.1.8	Mar 11, 2020
0.1.7	Feb 21, 2020
0.1.6	Feb 13, 2020
0.1.5	Jan 4, 2020
0.1.4	Dec 31, 2019
0.1.3	Dec 31, 2019

Download files

Download the file for your platform.

Source Distribution

gne-0.4.3.tar.gz (32.7 MB)

Uploaded Mar 8, 2026 Source

Built Distribution

gne-0.4.3-py3-none-any.whl (31.1 kB)

Uploaded Mar 8, 2026 Python 3

File details

Details for the file gne-0.4.3.tar.gz.

File metadata

Download URL: gne-0.4.3.tar.gz
Upload date: Mar 8, 2026
Size: 32.7 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for gne-0.4.3.tar.gz
Algorithm	Hash digest
SHA256	`26ed77fc2b96d5e9dd28b288ebfdeaf6bf7f034a92b8f75067c64eefffd0b4a9`
MD5	`b0d62367596d8dd1aa25f8c05fc31a7c`
BLAKE2b-256	`8e41d72fc42048fafcda9e5b88a3e1648cfeb0f1b59c813856b4d6422df70c90`
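
The digests above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; sha256_of and matches_published_hash are hypothetical helper names, and PUBLISHED_SHA256 is copied from the SHA256 row listed for gne-0.4.3.tar.gz.

```python
import hashlib

# SHA256 digest listed above for gne-0.4.3.tar.gz.
PUBLISHED_SHA256 = "26ed77fc2b96d5e9dd28b288ebfdeaf6bf7f034a92b8f75067c64eefffd0b4a9"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Reading in chunks keeps memory use flat for a large archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash('gne-0.4.3.tar.gz', PUBLISHED_SHA256) should return True for an intact download, assuming the file sits in the current directory.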

File details

Details for the file gne-0.4.3-py3-none-any.whl.

File metadata

Download URL: gne-0.4.3-py3-none-any.whl
Upload date: Mar 8, 2026
Size: 31.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.4

File hashes

Hashes for gne-0.4.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d4f53218ebc70fcc0e69f695894720d4f2c335ed1c77ad2bdf6c6cb0db4a7830`
MD5	`83f0776b42df6b97a1fd3cbba5aad77a`
BLAKE2b-256	`31f507f65c68fab99b22b5b948d2790e2fe0d7ff4f444fb650f4a14c75855b16`
