A Python NextJS data parser from HTML

Project description

NJSParser

A powerful parser and explorer for any website built with NextJS.

Parses flight data (from the self.__next_f.push scripts).
Parses next data from __NEXT_DATA__ script.
Parses build manifests.
Searches for build id.
Many other things ...

It relies only on lxml, orjson, and pydantic to guarantee fast and efficient data parsing and processing.

Installation:

pip install njsparser

Use

CLI

The CLI is available through 3 different commands:

njsp
njsparser
python3 -m njsparser.cli

It has a single function: displaying information about a website. For more information, pass the --help argument to the command.

Parsing `__next_f`.

The data you find in __next_f is called flight data, and contains data in React format. You can parse it easily with njsparser as follows.
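To get a feel for what the raw flight data looks like before njsparser turns it into objects, here is a minimal sketch that pulls the pushed strings out of a page using only the standard library. It is not how njsparser works internally; the function name `raw_flight_chunks`, the regular expression, and the sample HTML are my own assumptions for illustration, and it assumes each push argument happens to be valid JSON.

```python
import json
import re

# Assumption: each chunk lives in its own <script> tag that starts with
# self.__next_f.push( and ends with ). Other push forms are simply skipped.
PUSH_RE = re.compile(
    r"<script[^>]*>\s*self\.__next_f\.push\((.*?)\)\s*</script>",
    re.DOTALL,
)


def raw_flight_chunks(html: str) -> list:
    """Return the string part of every self.__next_f.push([...]) call."""
    chunks = []
    for match in PUSH_RE.finditer(html):
        # The argument is a JS array literal; here it is assumed to be valid JSON.
        payload = json.loads(match.group(1))
        # Keep only pushes that carry a string in second position.
        if len(payload) > 1 and isinstance(payload[1], str):
            chunks.append(payload[1])
    return chunks


# Hypothetical, hand-written page fragment (not taken from a real website).
sample_html = (
    '<script>self.__next_f.push([0])</script>'
    '<script>self.__next_f.push([1,"0:{\\"user\\":{\\"tagline\\":\\"hi\\"}}\\n"])</script>'
)

for chunk in raw_flight_chunks(sample_html):
    print(repr(chunk))
```

The strings you get back are still in the raw flight format, which is exactly the part njsparser parses for you.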

We will build a parser for a flight data example.

On the website you want to parse, make sure self.__next_f.push appears at the beginning of the script that contains the data you are looking for. Here I am searching for the description "I should really have a better hobby, but this is it..." in my page, and I can also see the self.__next_f.push in the same script.

Then I write this simple script to parse and dump the flight data of my website, so I can see which objects I am looking for:

import requests
import njsparser
import json

# Here I get my page's html
response = requests.get("https://mediux.pro/user/r3draid3r04").text
# Then I parse it with njsparser
fd = njsparser.BeautifulFD(response)
# Then I will write to json the content of the flight data
with open("fd.json", "w") as write:
    # I use the njsparser.default function to support the dump of the flight data objects.
    json.dump(fd, write, indent=4, default=njsparser.default)

In my dumped flight data, I search for the same string.
Then I go up to the closest "value" root above the string I found, and look at the value of "cls". Here it is "Data".
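If the dump is large, searching by hand gets tedious, so here is a small sketch that does the same lookup on the loaded JSON. Only the "value" and "cls" keys come from the dump described above; the helper names `find_string` and `enclosing_cls`, and the exact shape of the sample `dump`, are assumptions I made up for the example, so adapt them to what your own fd.json looks like.

```python
from typing import Any, Iterator, Optional


def find_string(node: Any, needle: str, path: tuple = ()) -> Iterator[tuple]:
    """Yield the key/index path to every string that contains `needle`."""
    if isinstance(node, str):
        if needle in node:
            yield path
    elif isinstance(node, dict):
        for key, child in node.items():
            yield from find_string(child, needle, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_string(child, needle, path + (index,))


def enclosing_cls(root: Any, path: tuple) -> Optional[str]:
    """Walk down `path` and return the "cls" of the deepest dict that has one."""
    found = None
    node = root
    for step in path:
        if isinstance(node, dict) and "cls" in node:
            found = node["cls"]
        node = node[step]
    return found


# Hypothetical dump: the nesting is invented, only "cls" and "value" are real keys.
dump = {
    "0": {
        "cls": "DataParent",
        "value": {
            "children": {
                "cls": "Data",
                "value": {
                    "user": {
                        "tagline": "I should really have a better hobby, but this is it..."
                    }
                },
            }
        },
    }
}

for path in find_string(dump, "better hobby"):
    print(path, enclosing_cls(dump, path))
```

With a real dump you would load it first with json.load(open("fd.json")) and pass the result instead of the sample `dump`.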

Now that I know the "cls" (class) of object my data is contained in, I can search for it in my BeautifulFD object:

import requests
import njsparser

# Here I get my page's html
response = requests.get("https://mediux.pro/user/r3draid3r04").text
# Then I parse it with njsparser
fd = njsparser.BeautifulFD(response)
# Then I iterate over the different `Data` objects in my flight data.
for data in fd.find_iter([njsparser.T.Data]):
    # I make sure that the content of my data is not None, and
    # check if the key `"user"` is in the data's content. If it is,
    # I stop searching by breaking out of the loop.
    if data.content is not None and "user" in data.content:
        break
else:
    # If I didn't find it, I raise an error
    raise ValueError("no Data object with a 'user' key was found")

# Now I have the data of my user
user = data.content["user"]
# And I can print the string I was searching for before
print(user["tagline"])

More information:

If your object is inside another object (e.g. "Data" in a "DataParent", or in a "DataContainer"), .find_iter will also find it recursively (unless you set recursive=False).
Make sure you use the correct attributes of the flight data classes when fetching their data. The class "Data" has a .content attribute. If you use .value, you will end up with the raw value and will have to parse it yourself. If you work with a "DataParent" object, instead of using .value (which gives you ["$", "$L16", None, {"children": ["$", "$L17", None, {"profile": {}}]}]), use .children (which gives you a "Data" object with a .content of {"profile": {}}). Check the types file to see which classes you're interested in, and their attributes.
You can also use .find on BeautifulFD to return only the first occurrence of your query, or None if not found.
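To show what "parse it yourself" means for a raw .value, here is a rough sketch that unwraps the example list from above by hand. It assumes the 4-item lists are shaped like ["$", reference, key, props], which is my reading of the example and not something njsparser documents here; `is_element` and `innermost_props` are hypothetical helpers, not part of the library. In practice, .children saves you from writing this.

```python
from typing import Any


def is_element(value: Any) -> bool:
    """True for a 4-item list shaped like ["$", reference, key, props]."""
    return isinstance(value, list) and len(value) == 4 and value[0] == "$"


def innermost_props(value: Any) -> Any:
    """Follow "children" through nested elements and return the last props."""
    props = value
    while is_element(props):
        # Fourth item of an element holds its props.
        props = props[3]
        # If the props wrap another element under "children", step into it.
        if isinstance(props, dict) and is_element(props.get("children")):
            props = props["children"]
    return props


# The raw value quoted in the note above.
raw_value = ["$", "$L16", None, {"children": ["$", "$L17", None, {"profile": {}}]}]

print(innermost_props(raw_value))
```

The result matches the .content that the note above says you get through .children on the "DataParent" object.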

Parsing `<script id='__NEXT_DATA__'>`

Just do:

import njsparser

html_text = ...
data = njsparser.get_next_data(html_text)

If the page contains a <script id='__NEXT_DATA__'> script, it returns the JSON-loaded data; otherwise it returns None.
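For the curious, the behaviour described above (loaded JSON, or None when the script is missing) can be sketched with the standard library alone. This is not njsparser's implementation, which relies on lxml and orjson; the names `_NextDataFinder` and `next_data_or_none` and the sample page are assumptions made for this example.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class _NextDataFinder(HTMLParser):
    """Collects the text of the <script id="__NEXT_DATA__"> tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.payload: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True
            self.payload = ""

    def handle_data(self, data):
        # Script content may arrive in several pieces, so concatenate them.
        if self._inside:
            self.payload += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def next_data_or_none(html_text: str) -> Optional[Any]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the script is absent."""
    finder = _NextDataFinder()
    finder.feed(html_text)
    finder.close()
    if finder.payload is None:
        return None
    return json.loads(finder.payload)


# Hypothetical page, written by hand for the example.
page = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"buildId": "abc", "props": {}}'
    "</script></body></html>"
)

print(next_data_or_none(page))
print(next_data_or_none("<html></html>"))
```

Either way, remember to check for None before indexing into the result.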

Project details

Release history

2.16 (this version): Feb 6, 2026
2.15: Feb 6, 2026
2.14: Oct 3, 2025
2.13: Jul 12, 2025
2.12: Jul 7, 2025
2.10: Jul 7, 2025
2.9: Jul 7, 2025
2.8: Jan 15, 2025
2.7.0: Jan 14, 2025
2.6.3: Dec 30, 2024
2.6.2: Dec 27, 2024
2.6: Dec 8, 2024
2.5: Dec 7, 2024
2.4: Dec 7, 2024
2.3: Dec 7, 2024
2.1.1: Dec 7, 2024
2.1: Dec 7, 2024
2.0.2: Dec 7, 2024
2.0.1: Dec 7, 2024
2.0: Dec 7, 2024

Download files

Source Distribution

njsparser-2.16.tar.gz (17.9 kB view details)

Uploaded Feb 6, 2026 Source

Built Distribution

njsparser-2.16-py3-none-any.whl (22.8 kB view details)

Uploaded Feb 6, 2026 Python 3

File details

Details for the file njsparser-2.16.tar.gz.

File metadata

Download URL: njsparser-2.16.tar.gz
Upload date: Feb 6, 2026
Size: 17.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for njsparser-2.16.tar.gz
Algorithm	Hash digest
SHA256	`045539311507e5fe45031e51587516d075c4ae6ed0580644e51171968e6ca497`
MD5	`64200df76ee9ada49a0d824b3bb35828`
BLAKE2b-256	`3b1973d1d1ac2b979eb1354a00015a4147b7bb2bb4f09eb1b509dc02c482cccc`

Provenance

The following attestation bundles were made for njsparser-2.16.tar.gz:

Publisher: publish.yml on novitae/njsparser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: njsparser-2.16.tar.gz
- Subject digest: 045539311507e5fe45031e51587516d075c4ae6ed0580644e51171968e6ca497
- Sigstore transparency entry: 923704669
- Sigstore integration time: Feb 6, 2026
Source repository:
- Permalink: novitae/njsparser@200cc70e2fd37adbcaff2552d961f9c0717e8a8c
- Branch / Tag: refs/tags/2.16
- Owner: https://github.com/novitae
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@200cc70e2fd37adbcaff2552d961f9c0717e8a8c
- Trigger Event: release

File details

Details for the file njsparser-2.16-py3-none-any.whl.

File metadata

Download URL: njsparser-2.16-py3-none-any.whl
Upload date: Feb 6, 2026
Size: 22.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for njsparser-2.16-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8d49b958705e9503f91fca40b694e20e1f6db18bd88c85de12b623d112358d4b`
MD5	`5d5896f42e5986d0f5e89dcd6d338ec8`
BLAKE2b-256	`11b4f62d7609ec654811e796669c0973fdbd33efa7da2ecdc842d410ae62aac5`

Provenance

The following attestation bundles were made for njsparser-2.16-py3-none-any.whl:

Publisher: publish.yml on novitae/njsparser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: njsparser-2.16-py3-none-any.whl
- Subject digest: 8d49b958705e9503f91fca40b694e20e1f6db18bd88c85de12b623d112358d4b
- Sigstore transparency entry: 923704730
- Sigstore integration time: Feb 6, 2026
Source repository:
- Permalink: novitae/njsparser@200cc70e2fd37adbcaff2552d961f9c0717e8a8c
- Branch / Tag: refs/tags/2.16
- Owner: https://github.com/novitae
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@200cc70e2fd37adbcaff2552d961f9c0717e8a8c
- Trigger Event: release
