textcase

A feature-complete Python text case conversion library.

Installation

Create and activate a virtual environment and then install textcase:

pip install textcase

Example

You can convert strings into a case using the textcase.convert function:

from textcase import case, convert

print(convert("ronnie james dio", case.SNAKE))     # ronnie_james_dio
print(convert("Ronnie_James_dio", case.CONSTANT))  # RONNIE_JAMES_DIO
print(convert("RONNIE_JAMES_DIO", case.KEBAB))     # ronnie-james-dio
print(convert("RONNIE-JAMES-DIO", case.CAMEL))     # ronnieJamesDio
print(convert("ronnie-james-dio", case.PASCAL))    # RonnieJamesDio
print(convert("RONNIE JAMES DIO", case.LOWER))     # ronnie james dio
print(convert("ronnie james dio", case.UPPER))     # RONNIE JAMES DIO
print(convert("ronnie-james-dio", case.TITLE))     # Ronnie James Dio
print(convert("ronnie james dio", case.SENTENCE))  # Ronnie james dio

By default, textcase.convert and textcase.converter.CaseConverter.convert split along a set of default word boundaries:

  • underscores _,
  • hyphens -,
  • spaces,
  • changes in capitalization from lowercase to uppercase aA,
  • adjacent digits and letters a1, 1a, A1, 1A,
  • and acronyms AAa (as in HTTPRequest).
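
The boundaries listed above can be approximated with a single regular expression. The following is a minimal standalone sketch, not textcase's actual implementation: the names _DEFAULT_BOUNDARY, split_words, and to_snake are hypothetical, and the pattern only handles ASCII letters and digits.

```python
import re

# Hypothetical approximation of the default boundaries (ASCII only).
_DEFAULT_BOUNDARY = re.compile(
    r"[_\- ]+"                    # underscores, hyphens, spaces (delimiter is consumed)
    r"|(?<=[a-z])(?=[A-Z])"       # lowercase followed by uppercase: aA
    r"|(?<=[A-Za-z])(?=[0-9])"    # letter followed by digit: a1, A1
    r"|(?<=[0-9])(?=[A-Za-z])"    # digit followed by letter: 1a, 1A
    r"|(?<=[A-Z])(?=[A-Z][a-z])"  # acronym followed by a word: AAa
)


def split_words(text: str) -> list[str]:
    """Split text at every boundary, dropping empty words."""
    return [word for word in _DEFAULT_BOUNDARY.split(text) if word]


def to_snake(text: str) -> str:
    """Lowercase each word and join the words with underscores."""
    return "_".join(word.lower() for word in split_words(text))


print(split_words("HTTPRequest"))    # ['HTTP', 'Request']
print(to_snake("ronnie james dio"))  # ronnie_james_dio
print(to_snake("E5150"))             # e_5150
```

Dropping empty words is what makes leading, trailing, and repeated delimiters harmless in this sketch.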

For more precision, you can specify boundaries to split based on the word boundaries of a particular case. For example, splitting from snake case will only use underscores as word boundaries:

from textcase import boundary, case, convert

print(convert("2020-04-16_my_cat_cali", case.TITLE))                          # 2020 04 16 My Cat Cali
print(convert("2020-04-16_my_cat_cali", case.TITLE, (boundary.UNDERSCORE,)))  # 2020-04-16 My Cat Cali
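
The effect of restricting the boundaries can be reproduced with plain string methods. This sketch assumes title case means capitalizing each word and joining with spaces; title_from_snake is a hypothetical helper, not part of textcase.

```python
def title_from_snake(text: str) -> str:
    """Split only on underscores, so hyphens stay inside their word."""
    words = [word for word in text.split("_") if word]
    return " ".join(word.capitalize() for word in words)


print(title_from_snake("2020-04-16_my_cat_cali"))  # 2020-04-16 My Cat Cali
```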

This library can detect acronyms in camel-like strings. It also ignores any leading, trailing, or duplicate delimiters:

from textcase import case, convert

print(convert("IOStream", case.SNAKE))             # io_stream
print(convert("myJSONParser", case.SNAKE))         # my_json_parser
print(convert("__weird--var _name-", case.SNAKE))  # weird_var_name

It also works with non-ASCII characters. However, no inferences about the language itself are made. For instance, the digraph ij in Dutch will not be capitalized as a unit, because it is represented as two distinct Unicode characters. However, æ would be capitalized:

from textcase import case, convert

print(convert("GranatÄpfel", case.KEBAB))    # granat-äpfel
print(convert("ПЕРСПЕКТИВА24", case.TITLE))  # ПЕРСПЕКТИВА24
print(convert("ὈΔΥΣΣΕΎΣ", case.LOWER))       # ὀδυσσεύς
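
The Dutch example can be observed with Python's built-in string methods alone, which operate on individual code points. This sketch uses only the standard library and assumes, without confirmation from the source, that textcase's patterns behave the same way.

```python
# The digraph "ij" is two code points, so only the "i" is capitalized.
print("ijsselmeer".capitalize())  # Ijsselmeer

# "æ" is a single code point with its own uppercase form.
print("ærø".capitalize())  # Ærø
```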

By default, letters followed by digits and vice versa are considered word boundaries. In addition, any special ASCII characters (besides _ and -) are left in place and do not act as word boundaries:

from textcase import case, convert

print(convert("E5150", case.SNAKE))              # e_5150
print(convert("10,000Days", case.SNAKE))         # 10,000_days
print(convert("Hello, world!", case.UPPER))      # HELLO, WORLD!
print(convert("ONE\\nTWO\\nTHREE", case.TITLE))  # One\n Two\n Three

You can also test what case a string is in:

from textcase import case, is_case

print(is_case("css-class-name", case.KEBAB))  # True
print(is_case("css-class-name", case.SNAKE))  # False
print(is_case("UPPER_CASE_VAR", case.SNAKE))  # False
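
One way to reason about such a check is as a round trip: a string is in a given case if converting it to that case leaves it unchanged. The sketch below applies that idea to lowercase delimited cases such as snake and kebab. It is an assumption about the behavior, not a description of how textcase.is_case is implemented, and the helper names are hypothetical.

```python
import re


def _words(text: str) -> list[str]:
    """Split on delimiters and lowercase-to-uppercase changes (simplified)."""
    parts = re.split(r"[_\- ]+|(?<=[a-z])(?=[A-Z])", text)
    return [part for part in parts if part]


def looks_like(text: str, delimiter: str) -> bool:
    """True if rebuilding the text in lowercase with the delimiter changes nothing."""
    return text == delimiter.join(word.lower() for word in _words(text))


print(looks_like("css-class-name", "-"))  # True
print(looks_like("css-class-name", "_"))  # False
print(looks_like("UPPER_CASE_VAR", "_"))  # False
```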

Boundary Specificity

It can be difficult to determine how to split a string into words. That is why this library provides the textcase.convert and textcase.converter.CaseConverter.convert functionality, but sometimes that isn't enough to meet a specific use case.

Say an identifier contains the word 2D, such as scale2D. Calling textcase.convert or textcase.converter.CaseConverter.convert on its own will not be enough to solve the problem. In this case, we can further specify which boundaries to split the string on, using instances of the textcase.boundary.Boundary class:

from textcase import boundary, case, convert

# Not quite what we want
print(convert("scale2D", case.SNAKE, case.CAMEL.boundaries))    # scale_2_d

# Write boundaries explicitly
print(convert("scale2D", case.SNAKE, (boundary.LOWER_DIGIT,)))  # scale_2d
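
Which boundaries are active decides where scale2D is cut. The regular expressions below are a standalone approximation of the relevant boundaries (ASCII only) with hypothetical names; they are not the library's Boundary objects.

```python
import re

# Hypothetical zero-width patterns standing in for individual boundaries.
LOWER_DIGIT_RE = r"(?<=[a-z])(?=[0-9])"  # a1
DIGIT_UPPER_RE = r"(?<=[0-9])(?=[A-Z])"  # 1A
LOWER_UPPER_RE = r"(?<=[a-z])(?=[A-Z])"  # aA


def snake(text: str, *boundaries: str) -> str:
    """Split at the given boundaries and join the words as snake case."""
    words = re.split("|".join(boundaries), text)
    return "_".join(word.lower() for word in words if word)


# With the digit-to-uppercase boundary active, "2D" is cut in two.
print(snake("scale2D", LOWER_UPPER_RE, LOWER_DIGIT_RE, DIGIT_UPPER_RE))  # scale_2_d

# With only the lowercase-to-digit boundary, "2D" stays together.
print(snake("scale2D", LOWER_DIGIT_RE))  # scale_2d
```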

Custom Boundaries

This library provides a number of constants for boundaries associated with common cases. But you can create your own boundary to split on other criteria:

from textcase import case, convert
from textcase.boundary import Boundary

# Not quite what we want
print(convert("coolers.revenge", case.TITLE))  # Coolers.revenge

# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)

print(convert("coolers.revenge", case.TITLE, (DOT,)))  # Coolers Revenge

# Define complex custom boundary
AT_LETTER = Boundary(
    satisfies=lambda text: (len(text) > 1 and text[0] == "@") and (text[1] == text[1].lower()),
    start=1,
    length=0,
)

print(convert("name@domain", case.TITLE, (AT_LETTER,)))  # Name@ Domain

To learn more about building a boundary from scratch, take a look at the textcase.boundary.Boundary class.
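
To make the roles of satisfies, start, and length more concrete, here is a standalone model of a boundary and a splitter. The semantics are inferred from the two examples above and are an assumption; SketchBoundary and split_on are hypothetical names, not textcase's classes or functions.

```python
from collections.abc import Callable, Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBoundary:
    """Hypothetical model of a boundary, not textcase's Boundary class."""

    satisfies: Callable[[str], bool]  # tested against the remaining text at each index
    start: int = 0   # offset from the match at which the current word ends
    length: int = 0  # number of characters dropped before the next word begins


def split_on(text: str, boundaries: Iterable[SketchBoundary]) -> list[str]:
    """Split text wherever one of the boundaries is satisfied."""
    boundaries = tuple(boundaries)
    words: list[str] = []
    word_start = 0
    for index in range(len(text)):
        if index < word_start:
            continue  # still inside characters consumed by the previous boundary
        for b in boundaries:
            if b.satisfies(text[index:]):
                words.append(text[word_start:index + b.start])
                word_start = index + b.start + b.length
                break
    words.append(text[word_start:])
    return [word for word in words if word]


DOT = SketchBoundary(satisfies=lambda text: text.startswith("."), length=1)
AT_LETTER = SketchBoundary(
    satisfies=lambda text: len(text) > 1 and text[0] == "@" and text[1] == text[1].lower(),
    start=1,
)

print(split_on("coolers.revenge", (DOT,)))    # ['coolers', 'revenge']
print(split_on("name@domain", (AT_LETTER,)))  # ['name@', 'domain']
```

In this model, length=1 drops the dot, while start=1 with length=0 keeps the @ at the end of the first word.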

Custom Case

Similar to textcase.boundary.Boundary, there is textcase.case.Case, which exposes the three components necessary for case conversion. This allows you to define a custom case that behaves appropriately in the textcase.convert and textcase.converter.CaseConverter.convert functions:

from textcase import convert
from textcase.boundary import Boundary
from textcase.case import Case
from textcase.pattern import lower

# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)

# Define custom case
DOT_CASE = Case(
    boundaries=(DOT,),
    pattern=lower,
    delimiter=".",
)

print(convert("Dot case var", DOT_CASE))  # dot.case.var
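
The three components can be pictured as a small record: a way to split, a pattern that mutates the words, and a delimiter that joins them. The following is a standalone sketch under that assumption; SketchCase and convert_sketch are hypothetical and simplify splitting to a plain function rather than a set of boundaries.

```python
from collections.abc import Callable
from typing import NamedTuple


class SketchCase(NamedTuple):
    """Hypothetical stand-in for a case definition."""

    split: Callable[[str], list[str]]          # plays the role of the boundaries
    pattern: Callable[[list[str]], list[str]]  # mutates the words
    delimiter: str                             # joins the words


def lower(words: list[str]) -> list[str]:
    """Convert all words to lowercase."""
    return [word.lower() for word in words]


def convert_sketch(text: str, source: SketchCase, target: SketchCase) -> str:
    """Split with the source case, then mutate and join with the target case."""
    return target.delimiter.join(target.pattern(source.split(text)))


SPACE_CASE = SketchCase(split=str.split, pattern=lower, delimiter=" ")
DOT_CASE = SketchCase(
    split=lambda text: [word for word in text.split(".") if word],
    pattern=lower,
    delimiter=".",
)

print(convert_sketch("Dot case var", SPACE_CASE, DOT_CASE))  # dot.case.var
print(convert_sketch("dot.case.var", DOT_CASE, SPACE_CASE))  # dot case var
```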

Because we defined boundary conditions, textcase.is_case also behaves as expected:

from textcase import is_case
from textcase.boundary import Boundary
from textcase.case import Case
from textcase.pattern import lower

# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)

# Define custom case
DOT_CASE = Case(
    boundaries=(DOT,),
    pattern=lower,
    delimiter=".",
)

print(is_case("dot.case.var", DOT_CASE))  # True
print(is_case("Dot case var", DOT_CASE))  # False

Case Converter Class

Case conversion takes place in two parts. The first splits an identifier into a series of words, and the second joins the words back together. Each of these steps is defined using the textcase.converter.CaseConverter.from_case and textcase.converter.CaseConverter.to_case methods respectively.

CaseConverter is a class that encapsulates the boundaries used for splitting and the pattern and delimiter for mutating and joining. The convert method will apply the boundaries, pattern, and delimiter appropriately. This lets you define the parameters for case conversion upfront:

from textcase import CaseConverter, case, pattern

converter = CaseConverter()
converter.pattern = pattern.camel
converter.delimiter = "_"

print(converter.convert("My Special Case"))  # my_Special_Case

converter.from_case(case.CAMEL)
converter.to_case(case.SNAKE)

print(converter.convert("mySpecialCase"))  # my_special_case

For more details on how strings are converted, see the docs for textcase.converter.CaseConverter.
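
The two-part process can be sketched as a small class that splits first and then mutates and joins. This is a simplified standalone illustration, not the CaseConverter implementation: SketchConverter, its default values, and its reduced set of boundaries are all assumptions.

```python
import re
from collections.abc import Callable


def lower(words: list[str]) -> list[str]:
    """Convert all words to lowercase."""
    return [word.lower() for word in words]


def camel(words: list[str]) -> list[str]:
    """Lowercase the first word and capitalize the remaining words."""
    return [w.lower() if i == 0 else w.capitalize() for i, w in enumerate(words)]


class SketchConverter:
    """Hypothetical two-step converter: split first, then mutate and join."""

    def __init__(self) -> None:
        # Simplified boundaries: delimiters and lowercase-to-uppercase changes.
        self.boundary = re.compile(r"[_\- ]+|(?<=[a-z])(?=[A-Z])")
        self.pattern: Callable[[list[str]], list[str]] = lower
        self.delimiter = ""

    def convert(self, text: str) -> str:
        words = [w for w in self.boundary.split(text) if w]  # step 1: split
        return self.delimiter.join(self.pattern(words))      # step 2: mutate and join


converter = SketchConverter()
converter.pattern = camel
converter.delimiter = "_"
print(converter.convert("My Special Case"))  # my_Special_Case

converter.pattern = lower
print(converter.convert("mySpecialCase"))  # my_special_case
```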

API

Modules

  • textcase.boundary: Conditions for splitting an identifier into words.
  • textcase.case: Case definitions for text transformation.
  • textcase.converter: Text case conversion between different case formats.
  • textcase.pattern: Functions for transforming a list of words.

Classes

  • textcase.boundary.Boundary: Represents a condition for splitting an identifier into words.
  • textcase.case.Case: Represents a text case format for transformation.
  • textcase.converter.CaseConverter: Represents a utility class for converting text between different case formats.

Constants

  • textcase.boundary.UNDERSCORE: Splits on _, consuming the character on segmentation.
  • textcase.boundary.HYPHEN: Splits on -, consuming the character on segmentation.
  • textcase.boundary.SPACE: Splits on space, consuming the character on segmentation.
  • textcase.boundary.LOWER_UPPER: Splits where a lowercase letter is followed by an uppercase letter.
  • textcase.boundary.UPPER_LOWER: Splits where an uppercase letter is followed by a lowercase letter. This is seldom used.
  • textcase.boundary.ACRONYM: Splits where two uppercase letters are followed by a lowercase letter, identifying acronyms.
  • textcase.boundary.LOWER_DIGIT: Splits where a lowercase letter is followed by a digit.
  • textcase.boundary.UPPER_DIGIT: Splits where an uppercase letter is followed by a digit.
  • textcase.boundary.DIGIT_LOWER: Splits where a digit is followed by a lowercase letter.
  • textcase.boundary.DIGIT_UPPER: Splits where a digit is followed by an uppercase letter.
  • textcase.boundary.DEFAULT_BOUNDARIES: Default boundaries used for splitting strings into words, including underscores, hyphens, spaces, and capitalization changes.
  • textcase.case.SNAKE: Snake case strings are delimited by underscores _ and are all lowercase.
  • textcase.case.CONSTANT: Constant case strings are delimited by underscores _ and are all uppercase.
  • textcase.case.KEBAB: Kebab case strings are delimited by hyphens - and are all lowercase.
  • textcase.case.CAMEL: Camel case strings are lowercase, but for every word except the first the first letter is capitalized.
  • textcase.case.PASCAL: Pascal case strings are lowercase, but for every word the first letter is capitalized.
  • textcase.case.LOWER: Lowercase strings are delimited by spaces and all characters are lowercase.
  • textcase.case.UPPER: Uppercase strings are delimited by spaces and all characters are uppercase.
  • textcase.case.TITLE: Title case strings are delimited by spaces. Only the leading character of each word is uppercase.
  • textcase.case.SENTENCE: Sentence case strings are delimited by spaces. Only the leading character of the first word is uppercase.

Functions

  • textcase.is_case: Check if the given text matches the specified case format.
  • textcase.convert: Convert the given text to the specified case format.
  • textcase.boundary.split: Split an identifier into a list of words using the provided boundaries.
  • textcase.boundary.get_boundaries: Identifies boundaries present in the given text.
  • textcase.pattern.lower: Convert all words to lowercase.
  • textcase.pattern.upper: Convert all words to uppercase.
  • textcase.pattern.capital: Capitalize the first letter of each word and make the rest lowercase.
  • textcase.pattern.camel: Convert the first word to lowercase and capitalize the remaining words.
  • textcase.pattern.sentence: Capitalize the first word and make the remaining words lowercase.
