Project Description

Quick example

>>> from textmodel import TextModel
>>> text = TextModel(u'Hello World')
>>> text2 = TextModel(u'!', fontsize=20)
>>> text.insert(11, text2)
>>> text.set_properties(6, 11, bgcolor='yellow')
>>> for i in range(1000):
...     text.append(TextModel("Line %i\n" % i))
>>> text.linelength(0) # length of first line
19
>>> text.index2position(100) # row, col of index 100
(12, 2)

Introduction

Word processors are usually believed to be heavy and slow applications. However, I think that it is possible to design a word processor which is lightweight and fast, so fast that it can even be implemented in a “slow” scripting language. Text model is meant to be a proof of concept (even though it is merely a text editor and not a full word processor).

Storing and editing text information is a problem with a long history in computer science. Known solutions include the gap buffer (used by Emacs), the piece table (used by MS Word) and the rope data structure. Text model instead uses an internal structure which I named the “texel tree” and which is probably a new approach to the problem. The goal was to find a data structure which stores text together with format information and is

  • fast (even when implemented in a scripting language)
  • efficient (in memory consumption)
  • hierarchic (so that texts can contain elements like tables which themselves contain text)

The texel tree consists of nodes which are called texels (text elements). Each texel can have a variable number of child texels (between 8 and 15), forming a highly branched tree, similar to a B-tree. Operations on the tree are performed in such a way that the tree is kept balanced, i.e. all branches have exactly the same depth. The texel tree is fast because it allows all text operations (insert, remove, copy, paste) in logarithmic time. It is efficient because it stores text at the level of strings rather than single characters, and it stores the styling in an economical way.
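
To illustrate why a balanced tree with cached lengths gives logarithmic access, here is a minimal sketch. It is not textmodel's code: the names Leaf, Group, char_at and build are hypothetical, and the builder only caps the number of children per group without enforcing the lower bound of 8 mentioned above.

```python
# Hypothetical sketch; these classes are not part of textmodel.

class Leaf:
    """Holds a run of text; its length is the number of characters."""
    def __init__(self, text):
        self.text = text
        self.length = len(text)


class Group:
    """Holds child nodes and caches the total length of its subtree."""
    def __init__(self, children):
        self.children = list(children)
        self.length = sum(child.length for child in self.children)


def char_at(node, i):
    """Return the character at index i by descending from node."""
    if not 0 <= i < node.length:
        raise IndexError(i)
    while isinstance(node, Group):
        for child in node.children:
            if i < child.length:
                node = child
                break
            i -= child.length  # skip this subtree without looking inside it
    return node.text[i]


def build(lines, fanout=8):
    """Build a balanced tree bottom-up with at most `fanout` children per group."""
    level = [Leaf(line) for line in lines]
    while len(level) > 1:
        level = [Group(level[k:k + fanout]) for k in range(0, len(level), fanout)]
    return level[0]


tree = build("Line %i\n" % i for i in range(1000))
print(tree.length, repr(char_at(tree, 5)))
```

Each step of the descent inspects at most `fanout` cached lengths, and the number of steps equals the depth of the tree, so the cost of a lookup grows with the logarithm of the number of leaves.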

Text model is an interface to the texel tree, hiding all the complexity of the recursive texel data structure. It is termed “text model” because in a model-view-controller scenario it would have the role of the “model”. A matching view / editor component is wxtextview. In combination they can be used as a text editor.

Speed

Note that textmodel is not yet optimized. By saying that the texel structure is fast, I mean that the time of operations grows only slowly with the length of the text. I would not be surprised if the times could be improved by a factor of 2 or more.

The following table shows how the time needed to insert a line grows with the length of the text. The text length is measured as the number of text nodes, where each text node holds one line of text, e.g. 50000 means a text with 50 thousand lines.

# lines    time (milliseconds)
      1    0.332514
      3    0.379985
      5    0.436915
     10    0.519033
     30    0.596213
     50    0.657198
    100    0.75822
    300    0.843198
    500    0.897312
   1000    0.998324
   3000    1.081806
   5000    1.136462
  10000    1.246638
  30000    1.356982
  50000    1.404089

As can be seen, the time grows only very little with the number of lines. Ideally, I would expect a logarithmic dependence on text length. This should hold in particular for the following operations:

  • inserting strings
  • inserting other trees (=paste)
  • copying text
  • removing text
  • calculating index positions from (row, col)-tuples and vice versa
  • counting lines

Moreover, the time for pasting and cutting text depends only weakly on the size of the text which is cut out or pasted in. Again, there should be a logarithmic dependence.
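
The expected logarithmic dependence can be checked against the insert timings listed in the table. The following sketch fits time = a + b * log10(lines) by ordinary least squares. The function name fit_log is hypothetical, and the fit is only an illustration, not part of textmodel.

```python
import math

# (number of lines, insert time in milliseconds), copied from the table
measurements = [
    (1, 0.332514), (3, 0.379985), (5, 0.436915),
    (10, 0.519033), (30, 0.596213), (50, 0.657198),
    (100, 0.75822), (300, 0.843198), (500, 0.897312),
    (1000, 0.998324), (3000, 1.081806), (5000, 1.136462),
    (10000, 1.246638), (30000, 1.356982), (50000, 1.404089),
]


def fit_log(points):
    """Least-squares fit of time = a + b * log10(n); returns (a, b)."""
    xs = [math.log10(n) for n, _ in points]
    ys = [t for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_log(measurements)
# Largest absolute difference between a measurement and the fitted line
worst = max(abs(t - (a + b * math.log10(n))) for n, t in measurements)
print("time ~ %.3f ms + %.3f ms per tenfold increase in lines" % (a, b))
print("largest deviation from the fit: %.3f ms" % worst)
```

The slope b is the extra time per tenfold increase in text length. A small deviation from the fitted line over the whole range is what a logarithmic dependence would look like.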

Implementation details

The texel tree consists of different kinds of texels: group texels, character texels, glyph texels and container texels.

Character texels hold strings of uniformly styled Unicode text. NewLines are a special case of character texels. Groups hold child elements. The following texel stores the words Hello world! with world! given a red background.

G[C('Hello'), C('world!', bgcolor='red')]

Each texel has a length, which corresponds to the number of contained characters. For example, the length of C('Hello') is 5 and the length of an empty group is zero.
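
The G and C notation and the length rule can be mimicked with two small classes. This is only a sketch under assumptions: the classes below are hypothetical stand-ins written for this example, not the texel classes shipped with textmodel, and the helper runs is likewise invented here.

```python
# Hypothetical stand-ins for the C and G texels described above.

class C:
    """Character texel: a string with one uniform style."""
    def __init__(self, text, **style):
        self.text = text
        self.style = style

    def __len__(self):
        return len(self.text)


class G:
    """Group texel: a sequence of child texels."""
    def __init__(self, children=()):
        self.children = list(children)

    def __len__(self):
        # The length of a group is the sum of the lengths of its children.
        return sum(len(child) for child in self.children)


def runs(texel):
    """Yield (text, style) pairs in document order."""
    if isinstance(texel, C):
        yield texel.text, texel.style
    else:
        for child in texel.children:
            yield from runs(child)


doc = G([C('Hello'), C('world!', bgcolor='red')])
print(len(C('Hello')), len(G()), len(doc))
for text, style in runs(doc):
    print(repr(text), style)
```

Because the style is stored once per character texel rather than once per character, a long uniformly styled string costs no more styling memory than a short one.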

There are also texels for new lines and tabs and a special mark for the end of text.

It is easy to extend text model by introducing new texels, e.g. tables and math formulas.

Each texel has a weights attribute. This attribute is one of the reasons for the high efficiency of the texel tree. It is a tuple of 3 integers and it facilitates fast navigation in the tree. The first entry gives the depth of the texel, which is needed internally, the second gives the number of characters in the texel and the third gives the number of line breaks in the texel. The latter is used extensively by methods such as nlines, linelength, lineend and index2position.
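
The following sketch shows how such a (depth, characters, line breaks) tuple could be maintained and used to find the row of a text index without scanning the text. It is an illustration under assumptions: the classes Text and Group and the function breaks_before are hypothetical and do not reproduce textmodel's internals.

```python
# Hypothetical sketch; names and layout are illustrative, not textmodel's code.

class Text:
    def __init__(self, text):
        self.text = text
        # (depth, number of characters, number of line breaks)
        self.weights = (0, len(text), text.count("\n"))


class Group:
    def __init__(self, children):
        self.children = list(children)
        depth = 1 + max((c.weights[0] for c in self.children), default=0)
        chars = sum(c.weights[1] for c in self.children)
        breaks = sum(c.weights[2] for c in self.children)
        self.weights = (depth, chars, breaks)


def breaks_before(node, i):
    """Count the line breaks among the first i characters below node."""
    if isinstance(node, Text):
        return node.text.count("\n", 0, i)
    total = 0
    for child in node.children:
        _, chars, breaks = child.weights
        if i <= chars:
            return total + breaks_before(child, i)
        total += breaks  # whole child lies before i: use its cached count
        i -= chars
    return total


# Same text as in the quick example: "Hello World!" followed by 1000 lines.
lines = [Text("Hello World!")] + [Text("Line %i\n" % i) for i in range(1000)]
root = Group(Group(lines[k:k + 10]) for k in range(0, len(lines), 10))

print(root.weights)              # (2, 8902, 1000)
print(breaks_before(root, 100))  # 12, the row reported by index2position(100)
```

The row of an index is the number of line breaks before it, so only one child per level has to be entered. All other children contribute their cached counts.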

Release History

  • 0.3.6 (this version)
  • 0.3.5
  • 0.3.4
  • 0.3.3
  • 0.3.2
  • 0.3.1
  • 0.3
  • 0.2.2
  • 0.2.1
  • 0.2
  • 0.1.1
  • 0.1
  • 0.0

Download Files

textmodel-0.3.6.tar.gz (27.0 kB), source distribution, uploaded Feb 5, 2016
