HTML-first Yahoo Finance news scraping with broader ticker coverage

SimpleYNews

SimpleYNews is a small Python package for scraping Yahoo Finance news from the quote/news HTML page. The package is intentionally HTML-first so it can cover news that is often missing from the narrower Yahoo Finance JSON endpoint that libraries such as yfinance rely on.

Important legal notice

Yahoo!, Y!Finance, and Yahoo Finance are trademarks of Yahoo, Inc. SimpleYNews is not affiliated with, endorsed by, or vetted by Yahoo. Use of Yahoo's name is solely for the purpose of accurately describing which service this tool interacts with (nominative fair use).

Terms of Service. Users should be aware that Yahoo's Terms of Service (Section 2(d)(ix)) state that users must not access or collect data from Yahoo services using automated means without Yahoo's express prior permission, must not interfere with the services, and must not build a competing or substitute service from Yahoo content. Users are solely responsible for reviewing those terms, determining whether their particular use is permitted, and ceasing use if Yahoo objects or changes its access controls or terms.

robots.txt. Yahoo's robots.txt for finance.yahoo.com restricts automated access and explicitly blocks numerous scraping and AI user-agents. Under US law, robots.txt is not a legally binding instrument per se, but it evidences the site operator's wishes. Under EU and German law, robots.txt may qualify as a machine-readable opt-out from text and data mining under the DSM Directive (2019/790) Art. 4 and the German Urheberrechtsgesetz (UrhG) Section 44b. This tool does not fetch or check robots.txt before making requests.

Cookie consent and contractual risk. When accessing Yahoo Finance pages from jurisdictions subject to GDPR, Yahoo presents a cookie consent form. This tool submits that form programmatically as part of HTTP session handling. Users should be aware that such submission may constitute acceptance of Yahoo's Terms of Service on behalf of the user under clickwrap contract principles (cf. CJEU, Ryanair Ltd v. PR Aviation BV, C-30/14, 2015). Users should independently assess whether programmatic interaction with consent mechanisms creates contractual obligations and whether their intended use complies with any such obligations.

No circumvention. This tool does not bypass login requirements, password gates, CAPTCHAs, rate-limit enforcement mechanisms, or encryption. It accesses only publicly reachable pages that require no authentication. It sends standard HTTP requests comparable to those a web browser would make.

Because of the foregoing, SimpleYNews is an experimental tool intended only for personal research and educational use. If you need a production-grade or contractually clean solution, use a licensed data provider.

Legal notices last updated: 2026-03-09.

Why this exists

  • yfinance news coverage is not broad enough for all tickers.
  • Yahoo's page HTML often contains richer news data for non-US listings.
  • This package keeps the familiar Ticker(...).news API while using a wider extraction strategy.

Project scope

The defensible position for this project is narrow:

  • It is a local library, not a hosted service.
  • It returns article metadata and links, not full article bodies.
  • It is meant for personal research workflows and compatibility experiments.
  • It is not intended to mirror Yahoo Finance, replace Yahoo Finance, or resell Yahoo-derived data.
  • It should be used conservatively, with low request volume and no bulk crawl.
  • It does not bypass login requirements, password gates, CAPTCHAs, or encryption.

That scope does not eliminate legal or contractual risk. It only explains why the project exists technically: the human-facing Yahoo Finance quote/news page can expose broader coverage than the narrower endpoints used by other libraries.

Statutory and regulatory landscape

Depending on jurisdiction, automated data collection may implicate one or more of the following legal frameworks. This list is illustrative, not exhaustive.

United States. Computer Fraud and Abuse Act (18 U.S.C. Section 1030); DMCA anti-circumvention provisions (17 U.S.C. Section 1201); federal copyright law (17 U.S.C. Section 101 et seq.); state common-law torts including trespass to chattels and unfair competition.

European Union. Database Directive (96/9/EC) sui generis database right; Digital Single Market Directive (2019/790) Art. 3-4 text and data mining exceptions and Art. 15 press publisher right; General Data Protection Regulation (2016/679); ePrivacy Directive (2002/58/EC).

Germany. Urheberrechtsgesetz (UrhG) Sections 44b and 60d (text and data mining); Sections 87a-87e (database protection); Gesetz gegen den unlauteren Wettbewerb (UWG).

United Kingdom. Computer Misuse Act 1990; Copyright, Designs and Patents Act 1988 (CDPA); Copyright and Rights in Databases Regulations 1997 (database right); UK GDPR (retained EU law).

Other jurisdictions. Users in Canada, Australia, and elsewhere should be aware that comparable computer misuse, copyright, database protection, and privacy statutes may apply.

The legality of automated data collection varies significantly across jurisdictions. Conduct that is permissible in one country may be prohibited, or even criminal, in another. Users are solely responsible for assessing and complying with the laws applicable to their use.

Data protection

News metadata returned by this tool may include personal data as defined by GDPR Art. 4(1), such as journalist or author names and publisher identifiers.

  • Under GDPR Art. 4(7), the end user who runs this tool — not the tool's developer — is the data controller who determines the purposes and means of processing.
  • Users processing personal data of individuals in the EU/EEA must ensure a valid legal basis (Art. 6 GDPR) and comply with transparency obligations (Art. 13-14 GDPR), including informing data subjects when personal data has not been obtained directly from them.
  • All data processing occurs locally on the user's machine. This tool does not transmit scraped data to the developer or any third party.
  • Users operating at scale should consider whether a Data Protection Impact Assessment (DPIA) is required under Art. 35 GDPR.

Responsible use

  • Keep request volume low. Do not bulk-crawl Yahoo Finance.
  • Do not redistribute, resell, or sublicense scraped data.
  • Do not use scraped data to build a service that competes with or substitutes for Yahoo Finance.
  • Respect any cease-and-desist communication or access restriction from Yahoo.
  • Yahoo may change its website structure, access controls, or terms at any time without notice.
  • Thumbnail URLs in results point to Yahoo-hosted images. Embedding or redistributing those images outside of personal, local use may constitute hotlinking or unauthorized use of Yahoo's content delivery resources.
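
One way to keep request volume low is to enforce a minimum pause between consecutive lookups. The following is a minimal sketch, not part of SimpleYNews: the class name `MinIntervalThrottle`, its parameters, and the five-second interval in the usage comment are assumptions chosen for illustration.

```python
import time


class MinIntervalThrottle:
    """Block until at least `min_interval_seconds` have passed since the last call.

    Hypothetical helper for illustration; SimpleYNews does not ship this class.
    """

    def __init__(self, min_interval_seconds, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = float(min_interval_seconds)
        self._clock = clock    # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous permitted call, if any

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch (commented out because it needs the package and network access):
#
# from simpleynews import SimpleYNews
# throttle = MinIntervalThrottle(5.0)   # assumed interval; choose conservatively
# for symbol in ["BMW.DE", "SAP.DE"]:
#     throttle.wait()
#     items = SimpleYNews.Ticker(symbol).news
```

The clock and sleep functions are injected so the pacing logic can be checked without real delays. A throttle like this only spaces out requests; it does not make any particular volume permissible under Yahoo's terms.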

Extraction strategy

SimpleYNews parses the quote/news page in layers:

  1. Embedded structured page state such as root.App.main
  2. JSON-LD news metadata
  3. DOM selectors as a final fallback
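
The layered order above can be sketched roughly as follows. This is a simplified illustration using only the standard library, not the package's actual implementation: the function names, the sample HTML, and the assumed `news` key inside the embedded state are all hypothetical.

```python
import json
import re
from html.parser import HTMLParser

# Tiny hand-written page used only for this illustration.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "NewsArticle", "headline": "Example headline", "url": "https://example.com/a"}
</script>
</head><body><a href="https://example.com/b">Fallback link</a></body></html>
"""


def from_embedded_state(html):
    """Layer 1: parse a `root.App.main = {...};` assignment, if present."""
    match = re.search(r"root\.App\.main\s*=\s*(\{.*?\});\s*$", html, re.S | re.M)
    if not match:
        return []
    try:
        state = json.loads(match.group(1))
    except json.JSONDecodeError:
        return []
    # ASSUMPTION: a top-level "news" list; the real state layout is not documented here.
    stories = state.get("news", []) if isinstance(state, dict) else []
    return [
        {"title": s["title"], "link": s["link"]}
        for s in stories
        if isinstance(s, dict) and "title" in s and "link" in s
    ]


def from_json_ld(html):
    """Layer 2: collect NewsArticle nodes from JSON-LD script blocks."""
    items = []
    pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>'
    for block in re.findall(pattern, html, re.S | re.I):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        for node in data if isinstance(data, list) else [data]:
            if isinstance(node, dict) and node.get("@type") == "NewsArticle":
                items.append({"title": node.get("headline"), "link": node.get("url")})
    return items


class _AnchorCollector(HTMLParser):
    """Layer 3 helper: record the text and href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                self.items.append({"title": title, "link": self._href})
            self._href = None


def from_dom(html):
    """Layer 3: fall back to plain anchors in the DOM."""
    parser = _AnchorCollector()
    parser.feed(html)
    return parser.items


def extract_news(html):
    """Return the result of the first layer that yields any items."""
    for layer in (from_embedded_state, from_json_ld, from_dom):
        items = layer(html)
        if items:
            return items
    return []


print(extract_news(SAMPLE_HTML))
```

With the sample page, the embedded-state layer finds nothing, so the JSON-LD layer supplies the result and the DOM fallback is never consulted. A real scraper would need far more selective DOM rules than "every anchor".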

For .DE tickers, the scraper tries de.finance.yahoo.com before the default US site. For other tickers, it may fall back to regional Yahoo Finance sites if the primary site returns no results.
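
The host ordering described above might look like the sketch below. It is an assumption-laden illustration rather than the package's code: `candidate_hosts`, `first_non_empty`, the `extra_fallbacks` parameter, and the use of `finance.yahoo.com` as the default host are hypothetical names and choices, and the fetcher is a stand-in that performs no network access.

```python
DEFAULT_HOST = "finance.yahoo.com"       # assumed default (US) site
GERMAN_HOST = "de.finance.yahoo.com"     # regional site named in the text


def candidate_hosts(ticker, extra_fallbacks=()):
    """Return the hosts to try for `ticker`, in order and without duplicates."""
    hosts = []
    if ticker.upper().endswith(".DE"):
        hosts.append(GERMAN_HOST)        # German listings: regional site first
    hosts.append(DEFAULT_HOST)
    hosts.extend(extra_fallbacks)        # optional regional fallbacks (hypothetical)
    return list(dict.fromkeys(hosts))    # de-duplicate while preserving order


def first_non_empty(ticker, fetch_news, extra_fallbacks=()):
    """Call `fetch_news(host, ticker)` per host until one returns items."""
    for host in candidate_hosts(ticker, extra_fallbacks):
        items = fetch_news(host, ticker)
        if items:
            return items
    return []


def fake_fetch(host, ticker):
    """Stand-in fetcher: pretends only the default host has news."""
    if host == DEFAULT_HOST:
        return [{"title": f"{ticker} story from {host}"}]
    return []


print(candidate_hosts("BMW.DE"))
print(first_non_empty("BMW.DE", fake_fetch))
```

Passing the fetcher in as a function keeps the ordering logic separate from HTTP handling, so it can be exercised without contacting any site.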

Installation

python -m pip install simpleynews

Quick start

from simpleynews import SimpleYNews

bmw = SimpleYNews.Ticker("BMW.DE")
for item in bmw.news:
    print(item["title"])
    print(item["link"])
    print(item["publisher"])

Returned shape

Each item in .news is a dictionary with these keys:

  • uuid
  • title
  • link
  • publisher
  • providerPublishTime
  • type
  • relatedTickers
  • thumbnail
  • summary
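
As a sketch of how these keys might be consumed, the example below turns one item into a flat row. The sample values are invented, and two points are assumptions rather than documented behavior: that `providerPublishTime` is a Unix timestamp in seconds (as in the yfinance-style format this shape mirrors) and that any key may be missing or `None`. The helper names `published_at` and `to_row` are hypothetical.

```python
from datetime import datetime, timezone

# Invented sample item using the documented keys; the values are placeholders.
sample_item = {
    "uuid": "00000000-0000-0000-0000-000000000000",
    "title": "Example headline",
    "link": "https://example.com/article",
    "publisher": "Example Publisher",
    "providerPublishTime": 1700000000,  # ASSUMPTION: Unix epoch seconds
    "type": "STORY",
    "relatedTickers": ["BMW.DE"],
    "thumbnail": None,
    "summary": "Example summary.",
}


def published_at(item):
    """Return the publish time as an aware UTC datetime, or None if absent."""
    timestamp = item.get("providerPublishTime")
    if timestamp is None:
        return None
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def to_row(item):
    """Flatten an item into strings, tolerating missing or None fields."""
    when = published_at(item)
    return {
        "published": when.isoformat() if when else "",
        "publisher": item.get("publisher") or "",
        "title": item.get("title") or "",
        "link": item.get("link") or "",
        "tickers": ",".join(item.get("relatedTickers") or []),
    }


print(to_row(sample_item))
```

Using `.get(...)` with defaults rather than direct indexing is a defensive choice here, since scraped metadata may be incomplete.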

Development

Use either pyenv or a plain venv, but keep Python versions aligned with pyproject.toml and .python-version.

Option A: pyenv + venv (recommended)

pyenv install 3.11 -s
pyenv local 3.11
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"

Option B: system Python + venv

python3.11 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"

Quality checks

PYLINTHOME=.pylint.d python -m pylint simpleynews tests
python -m isort --check-only simpleynews tests
python -m pytest
python -m build

Versioning and release

  • Single source of truth: simpleynews/__init__.py (__version__).
  • Packaging metadata reads the version dynamically from that file.
  • Keep PyPI credentials outside the repository.

Release checklist

# 1) bump version in simpleynews/__init__.py
# 2) run checks
PYLINTHOME=.pylint.d python -m pylint simpleynews tests
python -m isort --check-only simpleynews tests
python -m pytest

# 3) rebuild artifacts
rm -f dist/*
python -m build
python -m twine check dist/*

# 4) upload using environment variables
TWINE_USERNAME=__token__ \
TWINE_PASSWORD=pypi-REPLACE_WITH_PROJECT_TOKEN \
python -m twine upload dist/*

If the upload fails with a "File already exists" error, bump the patch version, rebuild the artifacts, and retry.

Disclaimer

Yahoo!, Yahoo Finance, and related marks are owned by Yahoo, Inc. Review Yahoo's terms before using this package or any data it returns:

These links are provided for reference and may change. Users should independently verify current terms.

No warranty of legality. The developers make no representation or warranty that operation of this tool is lawful in any jurisdiction. The tool is provided "as is" without warranty of any kind, including warranties of legality, accuracy, completeness, availability, or fitness for any purpose. Use is entirely at your own risk.

No warranty of accuracy. News metadata returned by this tool may be incomplete, delayed, or inaccurate. Do not use it as the basis for financial, investment, or trading decisions.

No legal advice. Nothing in this project — including its source code, documentation, and legal notices — constitutes legal advice. Users should consult qualified legal counsel in their jurisdiction before using this tool or relying on data obtained through it.

Jurisdictional variation. The legality of web scraping, automated data collection, and data reuse varies significantly across jurisdictions. Conduct that is permissible in one country may be prohibited, or even criminal, in another.

Indemnification. To the maximum extent permitted by applicable law, users agree to hold harmless and indemnify the developers and contributors of this project from any claims, damages, liabilities, or expenses arising from the user's use of this tool or data obtained through it. In jurisdictions where blanket indemnification clauses in gratuitous transactions are unenforceable (including under German Schenkungsrecht, BGB Section 516 et seq.), this clause applies only to the extent permitted by mandatory law.

By using SimpleYNews, you accept these terms and agree to comply with all applicable laws and the terms of any third-party services accessed through it. SimpleYNews must not be presented as an approved or authorized Yahoo integration.

License

Apache License 2.0. See LICENSE.

Download files

Source Distribution

simpleynews-0.3.3.tar.gz (25.1 kB view details)

Uploaded Source

Built Distribution

simpleynews-0.3.3-py3-none-any.whl (17.1 kB view details)

Uploaded Python 3

File details

Details for the file simpleynews-0.3.3.tar.gz.

File metadata

  • Download URL: simpleynews-0.3.3.tar.gz
  • Upload date:
  • Size: 25.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for simpleynews-0.3.3.tar.gz
Algorithm Hash digest
SHA256 7d6facf7574e6844bcd039d38dd38bd0711c3cd066c12fdf44e9df9dddff20b9
MD5 c9f536a0c9b2166ef0d93ea4d4658a87
BLAKE2b-256 cf46e5ce92aaf22e5b6e08c7f60cc630f350eeba7701bec0fb96f5c037578514

File details

Details for the file simpleynews-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: simpleynews-0.3.3-py3-none-any.whl
  • Upload date:
  • Size: 17.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for simpleynews-0.3.3-py3-none-any.whl
Algorithm Hash digest
SHA256 020d6353eb2f111401dac49afdae479fac5c3db9a63e77576db9e6e5854c1014
MD5 7faf02363ad1498da2d44d551c94551d
BLAKE2b-256 4496686f0ca04f4266fc460f56a4b6a4d4eb1b5bd3215b5340104bcd1fdde66c
