Create a tree object from a LaTeX project

PyTexTree

Create a tree object from a LaTeX project. v0.1.1.

1. Features

This project provides a way to handle and analyse LaTeX projects more easily. Feature highlights:

  • Support opening LaTeX projects consisting of multiple files (using \include)
  • Extract several potentially useful statistics from the project
  • Create a traversable tree from the LaTeX project
    • Built upon the awesome NodeMixin of anytree
  • Export to a graph to support visualisation
    • A dict with edges and nodes
    • Creates edges between references within the project
    • Specifically export to .csv files supported by Gephi

2. Installation

Using pip:

$ pip install pytextree

3. Usage

The basic workflow with PyTexTree consists of two steps:

  1. Load the .tex file into a string
  2. Parse the string into a tree

NOTE:

For testing, you can use the provided .tex example.

3.1. Loading a LaTeX project into a tree

With pytextree imported as tt, use tt.open_tex_project() to load your LaTeX project into a string:

import pytextree as tt
txt = tt.open_tex_project('examples/lorem.tex')

Then you can simply parse the text into a tree:

tree = tt.parse_tex_to_tree(txt) # >>> <TNode [Root]: Root (0, 5502)>

The function returns the root node of the created tree. You can traverse the tree with each node's .children and .parent attributes.
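As a minimal sketch of such a traversal, the helpers below rely only on the .children and .parent attributes mentioned above. DemoNode is a hypothetical stand-in used so the example runs without a .tex file; it is not part of pytextree. The same walk() and depth() helpers should work on the root returned by parse_tex_to_tree(), assuming the root's .parent is None.

```python
class DemoNode:
    """Hypothetical stand-in for a tree node; NOT part of pytextree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def walk(node):
    """Yield the node and all of its descendants, depth first, via .children."""
    yield node
    for child in node.children:
        yield from walk(child)


def depth(node):
    """Count the steps from the node up to the root via .parent."""
    steps = 0
    while node.parent is not None:
        node = node.parent
        steps += 1
    return steps


# Small demo tree: Root -> S1 -> S1.S1, and Root -> S2
root = DemoNode("Root")
s1 = DemoNode("S1", parent=root)
s1_1 = DemoNode("S1.S1", parent=s1)
s2 = DemoNode("S2", parent=root)

names = [node.name for node in walk(root)]
print(names)
print(depth(s1_1))
```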

3.1.1. Tree node attributes

Each node describes a section or an environment in the LaTeX project and holds some information about the text it contains:

Attribute    Description                                       Example
.commands    List of LaTeX commands found in this node         ["\textbf", "\ref{fig:my_fig}"]
.comments    List of comments in this node                     ["% A comment"]
.word_count  Number of words, excluding comments and commands  527
.label       LaTeX label of the node, if one exists            "fig:my_graph"
.citations   Cited labels                                      ["Lamport1984", "Rossum1991"]
.texts       Text contents                                     ["This is a paragraph.", "And another."]
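These attributes can be combined with a tree traversal to collect project-wide statistics. The sketch below counts how often each label is cited across all nodes. It uses hypothetical stand-in nodes (SimpleNamespace objects) instead of real pytextree nodes, and it assumes that .citations lists only the citations found in the node's own text; if the attribute also covers descendants, the counts would be inflated.

```python
from collections import Counter
from types import SimpleNamespace


def iter_nodes(node):
    """Yield the node and all of its descendants via .children."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def count_citations(root):
    """Count citation labels over every node in the tree."""
    counts = Counter()
    for node in iter_nodes(root):
        counts.update(node.citations)
    return counts


# Hypothetical stand-in nodes carrying only the attributes used here.
intro = SimpleNamespace(citations=["Lamport1984"], children=[])
methods = SimpleNamespace(citations=["Lamport1984", "Rossum1991"], children=[])
demo_root = SimpleNamespace(citations=[], children=[intro, methods])

citation_counts = count_citations(demo_root)
print(citation_counts.most_common())
```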

3.2. Printing the tree

You can also pretty print the created tree for more information:

tree.pretty_print()

Output:

<TNode [Root]: Root (0, 5502)>
└── <TEnv [document]>: None (114, 5500)
   ├── <TNode [section]: S1 (144, 1129)>
   │   ├── <TNode [subsection]: S1.S1 (619, 869)>
   │   └── <TNode [subsection]: S1.S2 (870, 1129)>
   ├── <TNode [section]: S2 (1130, 2542)>
   │   └── <TNode [subsection]: S2.S1 (1460, 2542)>
   │       ├── <TEnv [itemize]>: list:mylist (1606, 1867)
   │       │   ├── <TEnv [itemize]>: None (1695, 1771)
   │       │   └── <TEnv [itemize]>: None (1776, 1853)
   │       ├── <TEnv [itemize]>: list:mylist (1869, 1959)
   │       ├── <TEnv [tabular]>: table:synonyms (2032, 2136)
   │       ├── <TNode [subsubsection]: S2.S1.S1 (2140, 2230)>
   │       └── <TNode [subsubsection]: S2.S1.S2 (2231, 2542)>
   ├── <TNode [section]: S3 (2543, 5485)>
   │   ├── <TNode [paragraph]: This is a paragraph (3417, 4368)>
   │   ├── <TNode [paragraph]: And have another paragraph (4369, 5299)>
   │   └── <TNode [subsection]: S3.S1  (5300, 5484)>
   └── <TEnv [appendices]>: None (5435, 5485)

3.3. Exporting

To visualise your project as a network (graph) in external software, you can export it as a dict containing nodes and edges.

graph = tree.to_graph()
print(graph['nodes'][0])
print(graph['edges'][0])

Output:

{
   'id': '$',
   'name': 'Root',
   'tag': 'Root',
   'texlabel': None,
   'word count': 0,
   'n comments': 0,
   'n commands': 5,
   'n references': 0,
   'n citations': 0,
   'value': 0,
   'label': '[Root]: Root',
   'group': -1,
   'title': 'words: 0'}
{
   'id': 'p1',
   'from': '$',
   'to': 'n0',
   'weight': 1,
   'type': 'undirected',
   'value': 1,
   'source': '$',
   'target': 'n0'
}
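The exported dict can be processed further without any graph library. As a rough sketch, the function below turns the edge list into an adjacency list using the 'from', 'to' and 'type' keys seen in the sample output. demo_graph is hand-written stand-in data trimmed to those keys (the name of node n0 is made up), not actual output of to_graph().

```python
from collections import defaultdict

# Stand-in data shaped like the sample output above; the 'n0' name is hypothetical.
demo_graph = {
    "nodes": [
        {"id": "$", "name": "Root"},
        {"id": "n0", "name": "Example node"},
    ],
    "edges": [
        {"id": "p1", "from": "$", "to": "n0", "type": "undirected"},
    ],
}


def to_adjacency(graph):
    """Map each node id to the ids of its neighbours."""
    adjacency = defaultdict(list)
    for edge in graph["edges"]:
        adjacency[edge["from"]].append(edge["to"])
        # Undirected edges are reachable from both ends.
        if edge.get("type") == "undirected":
            adjacency[edge["to"]].append(edge["from"])
    return dict(adjacency)


adjacency = to_adjacency(demo_graph)
print(adjacency)
```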

Alternatively, you can export to .csv files compatible with Gephi:

tree.to_gephi_csv()

This will create two files, one containing the nodes and one containing the edges.
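For illustration only, a node table and an edge table of this kind could be produced from a graph dict with the standard csv module, as sketched below. This is not how to_gephi_csv() is implemented, and the column names (Id, Label, Source, Target, Type, Weight) are an assumption based on what Gephi's spreadsheet import is commonly understood to recognise. The sketch writes to in-memory strings instead of files, and demo_graph is hand-written stand-in data.

```python
import csv
import io


def graph_to_csv_text(graph):
    """Return (nodes_csv, edges_csv) as strings built from a graph dict."""
    nodes_buf = io.StringIO()
    node_writer = csv.writer(nodes_buf, lineterminator="\n")
    node_writer.writerow(["Id", "Label"])  # assumed Gephi node columns
    for node in graph["nodes"]:
        node_writer.writerow([node["id"], node["label"]])

    edges_buf = io.StringIO()
    edge_writer = csv.writer(edges_buf, lineterminator="\n")
    edge_writer.writerow(["Source", "Target", "Type", "Weight"])  # assumed edge columns
    for edge in graph["edges"]:
        edge_writer.writerow(
            [edge["source"], edge["target"], edge["type"].capitalize(), edge["weight"]]
        )
    return nodes_buf.getvalue(), edges_buf.getvalue()


# Stand-in data using keys that appear in the sample output of to_graph().
demo_graph = {
    "nodes": [{"id": "$", "label": "[Root]: Root"}],
    "edges": [{"source": "$", "target": "n0", "type": "undirected", "weight": 1}],
}

nodes_csv, edges_csv = graph_to_csv_text(demo_graph)
print(nodes_csv)
print(edges_csv)
```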

4. Notes

Some limitations of the project:

  1. If you are loading a project with includes, make sure the name of the main file ends with main.tex
    • e.g. my_latex_project_main.tex
    • The included files should be in the same directory as the main file
  2. Compatibility with the required package versions is not checked
    • Earlier versions of the required packages might work as well
  3. TNode.pretty_print() does not work in the Windows console because of the non-ASCII tree-drawing characters it prints
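One possible workaround for the last limitation is to render the tree with plain ASCII characters. The sketch below is not part of pytextree; it only assumes that nodes expose .children, and it takes a label function so that repr can be used for real nodes. The demo uses hypothetical stand-in nodes.

```python
from types import SimpleNamespace


def ascii_lines(node, label=repr, prefix="", is_last=True, is_root=True):
    """Return the tree below node as a list of ASCII-only text lines."""
    if is_root:
        lines = [label(node)]
        child_prefix = ""
    else:
        lines = [prefix + ("+-- " if is_last else "|-- ") + label(node)]
        child_prefix = prefix + ("    " if is_last else "|   ")
    children = list(node.children)
    for index, child in enumerate(children):
        last_child = index == len(children) - 1
        lines.extend(ascii_lines(child, label, child_prefix, last_child, False))
    return lines


# Hypothetical stand-in nodes; real nodes would come from parse_tex_to_tree().
s1_1 = SimpleNamespace(name="S1.S1", children=[])
s1 = SimpleNamespace(name="S1", children=[s1_1])
s2 = SimpleNamespace(name="S2", children=[])
demo_root = SimpleNamespace(name="Root", children=[s1, s2])

lines = ascii_lines(demo_root, label=lambda n: n.name)
print("\n".join(lines))
```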

Download files

Source Distribution

pytextree-0.1.1.tar.gz (7.3 kB)

Uploaded Source

Built Distribution

pytextree-0.1.1-py3-none-any.whl (8.6 kB)

Uploaded Python 3

File details

Details for the file pytextree-0.1.1.tar.gz.

File metadata

  • Download URL: pytextree-0.1.1.tar.gz
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.0.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.4

File hashes

Hashes for pytextree-0.1.1.tar.gz
Algorithm Hash digest
SHA256 92c8f717c695e79c3a0b6b1ca97a06f82d67afa60845938e45c7af43d3c0ba25
MD5 399fe59f3f55012d4981287b3c4ce96f
BLAKE2b-256 b564abea99f592acc7d1480275110fc5d43a5b45c1e4f2db2f755df5de84c196

File details

Details for the file pytextree-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pytextree-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 8.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.0.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.4

File hashes

Hashes for pytextree-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8b06b31f6ec2ed3f68727dc987eae74bb31dc09a539b3560195acae219379962
MD5 ae5d2bb1fd7c9871a6c76192f6938a69
BLAKE2b-256 80340c1f86f4dc94a79d7d9fa7f113a78edf5da8a36ca5840e8f89702e7b3f6f
