A Python equivalent of Java's StringTokenizer with some added functionality
StrTokenizer
A Python module that mimics the functionality of the Java StringTokenizer class. It splits a given string into tokens based on a specified delimiter and offers methods to iterate over the tokens, count them, and manipulate the tokenizer's state.
Installation
To install the StrTokenizer package globally, you can use pip. Here are the steps to install:
- Ensure you have pip installed on your system.
- Open your command line interface (CLI) and run:
pip install StrTokenizer
If you want to use it locally without installing, simply download or copy the tokenizer.py file and import it into your project.
Usage
Import the Module
If the package is installed via pip, import the class like this:
from StrTokenizer import StrTokenizer
If the module (tokenizer.py) is downloaded from GitHub, import it like this:
from tokenizer import StrTokenizer
Creating a StrTokenizer Object
To create an instance of StrTokenizer, provide the input string, the delimiter (optional, defaults to a space " "), and whether to return the delimiters as tokens (optional, defaults to False).
# Example with default delimiter (space)
tokenizer = StrTokenizer("This is a test string")
# Example with custom delimiter
tokenizer = StrTokenizer("This,is,a,test,string", ",")
# Example with custom delimiter and returning the delimiter as tokens
tokenizer = StrTokenizer("This,is,a,test,string", ",", return_delims=True)
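To preview what return_delims=True is expected to produce, the interleaving can be sketched with the standard library alone. This is a hypothetical illustration, assuming the option mirrors Java's StringTokenizer with returnDelims=true, where each delimiter appears in the stream as its own token:

```python
import re

# Hypothetical preview of return_delims=True (assumption: it behaves like
# Java's StringTokenizer with returnDelims=true). re.split with a
# capturing group keeps each delimiter in the result as its own element;
# empty strings are filtered out.
tokens = [t for t in re.split(r"(,)", "This,is,a,test,string") if t]
print(tokens)
# ['This', ',', 'is', ',', 'a', ',', 'test', ',', 'string']
```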
Methods
countTokens() -> int
Returns the total number of tokens in the string.
token_count = tokenizer.countTokens()
print("Number of tokens:", token_count)
countTokensLeft() -> int
Returns the number of tokens left to be iterated.
tokens_left = tokenizer.countTokensLeft()
print("Tokens left:", tokens_left)
hasMoreTokens() -> bool
Checks if there are more tokens to iterate over.
if tokenizer.hasMoreTokens():
    print("There are more tokens available.")
nextToken() -> str
Returns the next token. Raises an IndexError if no more tokens are available.
while tokenizer.hasMoreTokens():
    print(tokenizer.nextToken())
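Because nextToken() raises IndexError on exhaustion, callers that might overrun can guard with try/except. The sketch below uses a list-backed stand-in (next_token, a hypothetical helper, not the package's API) so it runs without the package installed:

```python
# Hypothetical illustration of the exhaustion behavior: a plain list
# stands in for the tokenizer, and indexing past the end raises
# IndexError exactly as nextToken() is documented to do.
tokens = ["only", "two"]
index = 0

def next_token():
    global index
    token = tokens[index]      # IndexError once the list is exhausted
    index += 1
    return token

collected = []
try:
    for _ in range(5):         # deliberately asks for more than exist
        collected.append(next_token())
except IndexError:
    collected.append("<exhausted>")
print(collected)
# ['only', 'two', '<exhausted>']
```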
rewind(steps: int = None) -> None
Resets the tokenizer's index either completely or by a specified number of steps:
- Without arguments: Resets the tokenizer back to the first token.
- With steps: Moves the tokenizer back by the given number of steps.
# Rewind completely
tokenizer.rewind()
# Rewind by 2 tokens
tokenizer.rewind(2)
Example
from tokenizer import StrTokenizer
# Create a tokenizer with a custom delimiter
tokenizer = StrTokenizer("apple,orange,banana,grape", ",")
# Get the number of tokens
print("Number of tokens:", tokenizer.countTokens())
# Iterate over the tokens
while tokenizer.hasMoreTokens():
    print("Token:", tokenizer.nextToken())
# Rewind the tokenizer and iterate again
tokenizer.rewind()
print("After rewinding:")
while tokenizer.hasMoreTokens():
    print("Token:", tokenizer.nextToken())
Output:
Number of tokens: 4
Token: apple
Token: orange
Token: banana
Token: grape
After rewinding:
Token: apple
Token: orange
Token: banana
Token: grape
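The behavior shown above can be summarized in a minimal, self-contained sketch. MiniTokenizer below is a hypothetical stand-in written for illustration; the package's actual implementation may differ:

```python
# Minimal sketch of the documented API (hypothetical; not the package's
# actual source). Tokens are precomputed by splitting on the delimiter,
# and an internal index tracks iteration state.
class MiniTokenizer:
    def __init__(self, inputstring, delimiter=" ", return_delims=False):
        parts = inputstring.split(delimiter)
        if return_delims:
            # Interleave delimiters between tokens, Java-style.
            self._tokens = []
            for i, part in enumerate(parts):
                if part:
                    self._tokens.append(part)
                if i < len(parts) - 1:
                    self._tokens.append(delimiter)
        else:
            self._tokens = [p for p in parts if p]
        self._index = 0

    def countTokens(self):
        return len(self._tokens)

    def countTokensLeft(self):
        return len(self._tokens) - self._index

    def hasMoreTokens(self):
        return self._index < len(self._tokens)

    def nextToken(self):
        if not self.hasMoreTokens():
            raise IndexError("no more tokens")
        token = self._tokens[self._index]
        self._index += 1
        return token

    def rewind(self, steps=None):
        # Full reset without arguments; otherwise step back, clamped at 0.
        if steps is None:
            self._index = 0
        else:
            self._index = max(0, self._index - steps)

t = MiniTokenizer("apple,orange,banana,grape", ",")
print(t.countTokens())                     # 4
print([t.nextToken() for _ in range(2)])   # ['apple', 'orange']
t.rewind(1)
print(t.nextToken())                       # orange
```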
Methods Overview
- __init__(self, inputstring: str, delimiter: str = " ", return_delims: bool = False): Initializes the StrTokenizer with the given string, delimiter, and whether to return delimiters as tokens.
- __create_token(self) -> None: Splits the input string into tokens based on the delimiter.
- countTokens(self) -> int: Returns the total number of tokens.
- countTokensLeft(self) -> int: Returns the number of tokens left for iteration.
- hasMoreTokens(self) -> bool: Checks if there are more tokens to be retrieved.
- nextToken(self) -> str: Returns the next available token, or raises an IndexError if no tokens are left.
- rewind(self, steps: int = None) -> None: Resets the tokenizer's index either completely or by a given number of steps.
You can install the StrTokenizer package from PyPI.
License
This project is open-source and available for modification or distribution.