Reads a python module and statically analyzes it.

textpy

Reads a Python module and statically analyzes it. It works well with Jupyter extensions in VS Code, and works best when the module files are formatted according to PEP 8.

Installation

$ pip install textpy

Requirements

lazyr>=0.0.16
pandas
Jinja2
black
hintwith>=0.1.3

NOTE: pandas>=1.4.0 is recommended. Lower versions of pandas are also supported, but some features of this package will be limited.

Quick Start

To demonstrate the usage of this module, we put a file named myfile.py under ./examples/ (you can find it in the repository, or create a new file of your own):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from typing import Optional


class MyBook:
    """
    A book that records a story.

    Parameters
    ----------
    story : str, optional
        Story to record, by default None.

    """

    def __init__(self, story: Optional[str] = None) -> None:
        if story is None:
            self.content = "This book is empty."
        self.content = self.content if story is None else story


def print_my_book(book: MyBook) -> None:
    """
    Print a book.

    Parameters
    ----------
    book : MyBook
        A book.

    """
    print(book.content)

Run the following code to find all occurrences of a pattern (for example, "MyBook") in myfile.py:

>>> import textpy as tx
>>> myfile = tx.module("./examples/myfile.py") # reads the python module

>>> myfile.findall("MyBook", styler=False)
examples/myfile.py:7: 'class <MyBook>:'
examples/myfile.py:24: 'def print_my_book(book: <MyBook>) -> None:'
examples/myfile.py:30: '    book : <MyBook>'
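
To make the output format above concrete, here is a minimal standard-library sketch of a line-by-line search that reports the line number and wraps each match in angle brackets. It is not textpy's implementation; the helper name find_in_text and the sample text are assumptions made for illustration only.

```python
import re
from typing import List, Tuple


def find_in_text(text: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (line_number, marked_line) for every line that matches.

    Hypothetical helper; it only imitates the report format shown above.
    """
    regex = re.compile(pattern)
    results = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            # Wrap each non-overlapping match in angle brackets.
            marked = regex.sub(lambda m: f"<{m.group()}>", line)
            results.append((lineno, marked))
    return results


sample = "class MyBook:\n    pass\n\ndef show(book: MyBook) -> None:\n    print(book)\n"
for lineno, marked in find_in_text(sample, "MyBook"):
    print(f"sample.py:{lineno}: {marked!r}")
# sample.py:1: 'class <MyBook>:'
# sample.py:4: 'def show(book: <MyBook>) -> None:'
```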

If you are using a Jupyter notebook in VS Code, you can run a cell like this:

>>> myfile.findall("MyBook")
source                      match
myfile.MyBook:7             class MyBook:
myfile.print_my_book():24   def print_my_book(book: MyBook) -> None:
myfile.print_my_book():30   book : MyBook

Note that in the Jupyter notebook case, the matched substrings are clickable and link to the locations where the pattern was found.

Examples

tx.module()

The previous demonstration introduced the core function tx.module(). The return type of tx.module() is a subclass of the abstract class PyText, which supports various text manipulation methods:

>>> isinstance(myfile, tx.PyText)
True

A Python module may consist of multiple files and folders rather than a single file. tx.module() supports such file hierarchies: depending on the path type, it returns either a PyDir or a PyFile, both of which are subclasses of PyText.

So if you have a Python package, you can pass the package directory path to tx.module() and work with it as before:

>>> pkg_dir = "examples/" # you can type any path here
>>> pattern = "" # you can type any regular expression here

>>> res = tx.module(pkg_dir).findall(pattern)
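
As a rough illustration of what searching a whole package involves, the sketch below walks a file or directory with pathlib and scans every .py file line by line. It uses only the standard library and is not how textpy is implemented; the function name iter_matches and the temporary package layout are assumptions.

```python
import re
import tempfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_matches(root: Path, pattern: str) -> Iterator[Tuple[Path, int, str]]:
    """Yield (relative_path, line_number, line) for matches under a file or directory.

    Hypothetical helper, not part of textpy.
    """
    regex = re.compile(pattern)
    # A single file is searched directly; a directory is searched recursively.
    files = [root] if root.is_file() else sorted(root.rglob("*.py"))
    base = root.parent if root.is_file() else root
    for path in files:
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                yield path.relative_to(base), lineno, line


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny throwaway package to search.
    pkg = Path(tmp) / "pkg"
    (pkg / "sub").mkdir(parents=True)
    (pkg / "__init__.py").write_text("from .sub import util\n", encoding="utf-8")
    (pkg / "sub" / "util.py").write_text("def util():\n    return 1\n", encoding="utf-8")
    found = [(p.as_posix(), n, s) for p, n, s in iter_matches(pkg, r"util")]

print(found)
```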

tx.PyText.findall()

As shown above, you can use .findall() to find all non-overlapping matches of a pattern in a Python module.

>>> myfile.findall("optional", styler=False)
examples/myfile.py:13: '    story : str, <optional>'
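
"Non-overlapping" is illustrated here with Python's standard re module, on the assumption that textpy follows the same convention: after a match is found, scanning resumes at the end of that match.

```python
import re

# The scan resumes after the end of each match, so "aaaa" holds two
# non-overlapping occurrences of "aa", not three.
spans = [m.span() for m in re.finditer("aa", "aaaa")]
print(spans)  # [(0, 2), (2, 4)]
```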

The optional argument styler= determines whether a pandas Styler object is used to beautify the representation. If you are running Python in a console, always set styler=False. You can also disable stylers globally through display_params, so that you don't need to repeat styler=False in each of the following examples:

>>> from textpy import display_params
>>> display_params.enable_styler = False

In addition, the .findall() method has some optional parameters to customize the matching pattern, including whole_word=, case_sensitive=, and regex=.

>>> myfile.findall("mybook", case_sensitive=False, regex=False, whole_word=True)
examples/myfile.py:7: 'class <MyBook>:'
examples/myfile.py:24: 'def print_my_book(book: <MyBook>) -> None:'
examples/myfile.py:30: '    book : <MyBook>'
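
One plausible way to read the three options is as adjustments to an ordinary regular expression. The sketch below maps them onto the standard re module; build_pattern is a hypothetical helper, and textpy's actual behavior may differ in details.

```python
import re


def build_pattern(
    pattern: str,
    whole_word: bool = False,
    case_sensitive: bool = True,
    regex: bool = True,
) -> "re.Pattern[str]":
    """Compile a search pattern from the three options (hypothetical helper)."""
    if not regex:
        # Treat the pattern as a literal string.
        pattern = re.escape(pattern)
    if whole_word:
        # Require word boundaries on both sides of the match.
        pattern = rf"\b(?:{pattern})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(pattern, flags)


line = "def print_my_book(book: MyBook) -> None:"

loose = build_pattern("mybook", whole_word=True, case_sensitive=False, regex=False)
print(loose.findall(line))  # ['MyBook']

strict = build_pattern("book", whole_word=True)
print(strict.findall(line))  # ['book']
```

In the second call, the "book" inside print_my_book is not reported because the underscore counts as a word character, so there is no word boundary before it.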

tx.PyText.replace()

Use .replace() to find all non-overlapping matches of some pattern, and replace them with another string:

>>> replacer = myfile.replace("book", "magazine")
>>> replacer
examples/myfile.py:9: '    A <book/magazine> that records a story.'
examples/myfile.py:20: '            self.content = "This <book/magazine> is empty."'
examples/myfile.py:24: 'def print_my_<book/magazine>(<book/magazine>: MyBook) -> None:'
examples/myfile.py:26: '    Print a <book/magazine>.'
examples/myfile.py:30: '    <book/magazine> : MyBook'
examples/myfile.py:31: '        A <book/magazine>.'
examples/myfile.py:34: '    print(<book/magazine>.content)'

At this point, the replacement has not yet been written to the files. Use .confirm() to apply the changes:

>>> replacer.confirm()
{'successful': ['examples/myfile.py'], 'failed': []}
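
The preview-then-confirm workflow can be pictured with a small standard-library sketch. The class PendingReplace below is hypothetical and far simpler than textpy's Replacer; it assumes a single file and a replacement that does not change the number of lines.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List


class PendingReplace:
    """Hypothetical two-phase replacer: preview first, write on confirm()."""

    def __init__(self, path: Path, pattern: str, repl: str) -> None:
        self.path = path
        self.old_text = path.read_text(encoding="utf-8")
        # The new text is computed in memory only.
        self.new_text = re.sub(pattern, repl, self.old_text)

    def preview(self) -> List[str]:
        """Return the changed lines; the file itself stays untouched."""
        pairs = zip(self.old_text.splitlines(), self.new_text.splitlines())
        return [f"{old!r} -> {new!r}" for old, new in pairs if old != new]

    def confirm(self) -> Dict[str, List[str]]:
        """Write the new text to disk and report the outcome."""
        try:
            self.path.write_text(self.new_text, encoding="utf-8")
        except OSError:
            return {"successful": [], "failed": [self.path.name]}
        return {"successful": [self.path.name], "failed": []}

    def rollback(self) -> None:
        """Restore the text that was read before the replacement."""
        self.path.write_text(self.old_text, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demo.py"
    target.write_text("book = 1\nprint(book)\n", encoding="utf-8")

    pending = PendingReplace(target, "book", "magazine")
    shown = pending.preview()
    unchanged = target.read_text(encoding="utf-8")  # still the original text
    report = pending.confirm()
    changed = target.read_text(encoding="utf-8")
    pending.rollback()
    restored = target.read_text(encoding="utf-8")

print(report)  # {'successful': ['demo.py'], 'failed': []}
```

Keeping the original text in memory is what makes a rollback possible after the file has been overwritten.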

tx.PyText.delete()

Use .delete() to find all non-overlapping matches of some pattern, and delete them:

>>> deleter = myfile.delete("book")
>>> deleter
examples/myfile.py:9: '    A <book> that records a story.'
examples/myfile.py:20: '            self.content = "This <book> is empty."'
examples/myfile.py:24: 'def print_my_<book>(<book>: MyBook) -> None:'
examples/myfile.py:26: '    Print a <book>.'
examples/myfile.py:30: '    <book> : MyBook'
examples/myfile.py:31: '        A <book>.'
examples/myfile.py:34: '    print(<book>.content)'

>>> deleter.confirm()
{'successful': ['examples/myfile.py'], 'failed': []}

See Also

GitHub repository

PyPI project

License

This project falls under the BSD 3-Clause License.

History

v0.1.25

  • Updated utils.re_extensions:
    • Important: utils.re_extensions has been extracted into an independent package named re_extensions (presently at v0.0.3), so future updates should be looked up at https://github.com/Chitaoji/re-extensions instead; textpy will stay in sync with it;
    • real_findall() now returns match objects instead of spans and groups;
    • smart_sub() accepts a new optional parameter called count=;
    • SmartPattern supports [] to indicate a Unicode (str) or bytes pattern (as re.Pattern does);
    • new regex operations smart_split(), smart_findall(), line_findall(), smart_subn(), and smart_fullmatch();
    • created a namespace Smart for all the smart operations;
    • bugfixes for rsplit(), lsplit(), and smart_sub().
  • Reduced the running cost of PyText.findall() by taking advantage of the new regex operation line_findall().

v0.1.24

  • New methods PyText.is_file() and PyText.is_dir() to find out whether the instance represents a file / directory.
  • New method PyText.check_format() for format checking.
  • Defined the comparison ordering methods __eq__(), __gt__(), and __ge__() for PyText. They compare two PyText objects by their absolute paths.
  • Updated utils.re_extensions:
    • new regex operations smart_search(), smart_match(), and smart_sub();
    • new string operation counted_strip();
    • new utility classes SmartPattern and SmartMatch;
    • new utility functions find_right_bracket() and find_left_bracket().
  • Replacer.to_styler() will no longer return a styler when pandas version < 1.4.0.

v0.1.23

  • New string operation utils.re_extensions.word_wrap().
  • Various improvements.

v0.1.22

  • The module-level function textpy() is going to be deprecated to avoid conflicts with the package name textpy. Please use module() instead.
  • New methods PyText.replace() and PyText.delete().
  • New class Replacer as the return type of PyText.replace(), with public methods .confirm(), .rollback(), etc.
  • Added a dunder method PyText.__truediv__() as an alternative to PyText.jumpto().
  • New subclass PyContent inheriting from PyText. A PyContent object stores a part of a file that is not storable by instances of other subclasses.

v0.1.21

  • Improved behavior of clickables.

v0.1.20

  • Fixed issues:
    • incorrectly displayed file paths in the output of TextPy.findall(styler=False);
    • expired file links in the output of TextPy.findall(styler=True, line_numbers=False).

v0.1.19

  • Various improvements.

v0.1.18

  • Updated LICENSE.

v0.1.17

  • Refactored README.

v0.1.16

  • Lazily imported pandas to reduce the time cost for importing.

v0.1.12

  • New optional parameters for TextPy.findall():
    • whole_word= : whether to match whole words only;
    • case_sensitive= : specifies case sensitivity.

v0.1.10

  • New optional parameter encoding= for textpy().

v0.1.9

  • Removed unnecessary dependencies.

v0.1.8

  • Bugfix under Windows system.

v0.1.5

  • Provided compatibility with pandas versions lower than 1.4.0.
  • Updated textpy():
    • Path object is now acceptable as the positional argument;
    • new optional parameter home= for specifying the home path.
  • More flexible presentation of output from TextPy.findall().

v0.1.4

  • Fixed a display issue of README on PyPI.

v0.1.3

  • Initial release.
