An OpenDocument to reStructuredText/Sphinx converter.
What is it?
Odt2sphinx converts OpenDocument Text .odt file(s) to one or several reStructuredText .rst files.
This is a fork of Christophe de Vienne's odt2sphinx.
Python 3 is required!
pip3 install metapensiero.odt2sphinx
usage: odt2sphinx [-h] [--debug] [--download-source-link] [--embedded-uris]
                  [--ignore-original-column-widths] [--encoding ENCODING]
                  [--test]
                  source [target]

ODT to RST

positional arguments:
  source                Source ODT file to be converted, or a directory
                        containing ODT files and corresponding .expected.rst
                        files in test mode
  target                Either destination directory, a single .rst target
                        filename or "-" for stdout

optional arguments:
  -h, --help            show this help message and exit
  --debug               Emit debug noise
  --download-source-link
                        Add a link to the ODT source file
  --embedded-uris       Emit embedded URIs, instead of anonymous refs
  --ignore-original-column-widths
                        Do not honor the widths of the columns in the
                        original document tables
  --encoding ENCODING   Output encoding, by default UTF-8
  --test                Run in test mode, comparing output with expected reST
                        to be found in “source.expected.rst”
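As an illustration only, the command line above could be driven from a Python script roughly as follows. The helper names `build_command` and `convert` are hypothetical and not part of odt2sphinx; only the option names come from the usage text, and the sketch assumes the `odt2sphinx` executable is on the PATH.

```python
import subprocess


def build_command(source, target=None, embedded_uris=False,
                  ignore_widths=False, encoding=None):
    """Assemble an odt2sphinx argument list from the documented options.

    Hypothetical helper: it only mirrors the flags shown in the usage text.
    """
    cmd = ["odt2sphinx"]
    if embedded_uris:
        cmd.append("--embedded-uris")
    if ignore_widths:
        cmd.append("--ignore-original-column-widths")
    if encoding is not None:
        cmd += ["--encoding", encoding]
    cmd.append(source)
    if target is not None:
        cmd.append(target)
    return cmd


def convert(source, target=None, **options):
    """Run the converter; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(source, target, **options), check=True)
```

Calling `convert("manual.odt", "manual.rst")` would then run the tool in monolithic mode, provided it is installed.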
There are four modes of operation:
- Sphinx, splitting the source in multiple files, one per chapter
- Monolithic single plain reST output
- Print to stdout
- Functional test
The first mode is selected by omitting the second positional argument or by giving it the name of a directory. The second is selected by specifying a file name with a .rst extension as the second positional argument, the third by specifying - as the target name, and the last by using the --test option.
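The selection rules above can be summarized in a few lines of Python. This is a sketch of the documented behaviour, not the tool's actual code: the function name `select_mode` and the mode labels are made up for illustration, and it assumes that any target other than "-" or a .rst file name is treated as a directory.

```python
def select_mode(target=None, test=False):
    """Return the operating mode implied by the command line arguments."""
    if test:
        # --test takes precedence: the source is compared with expected reST.
        return "functional test"
    if target == "-":
        return "stdout"
    if target is not None and target.endswith(".rst"):
        return "monolithic"
    # No target at all, or a directory name.
    return "sphinx"
```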
Multiple files mode
The files are generated in the target dir, which by default has the same name as the .odt file minus the extension.
At least one file, index.rst, will be written. Depending on the document content, additional rst files may be generated.
Images are extracted and put together in an “images” directory inside the target dir.
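To make the resulting layout concrete, the following sketch computes the paths described above. The function name `multi_file_layout` is hypothetical, and the sketch assumes the default target dir is created next to the source file, which the text does not state explicitly.

```python
from pathlib import Path


def multi_file_layout(source, target=None):
    """Return the paths that multiple files mode is described as using."""
    # Default target dir: the .odt file name minus its extension (assumed
    # here to live in the same directory as the source).
    target_dir = Path(target) if target is not None else Path(source).with_suffix("")
    return {
        "target_dir": target_dir,
        "index": target_dir / "index.rst",   # always written
        "images": target_dir / "images",     # extracted images
    }
```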
Monolithic output mode
All the output goes into the single rst file specified as the second positional argument.
Images are extracted and put together in an “images” directory inside the directory containing the output file.
Print to stdout mode
No files are created, even for images: all the output goes to stdout.
Functional test mode
This mode is used by the automatic tests: when the --test option is specified, the tool loads the expected result from a file with the same name as the source one but with the .odt suffix replaced by .expected.rst.
It will print out any discrepancy as a unified diff.
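The comparison described above can be approximated with the standard library's difflib. This is a minimal sketch, not the tool's own test code: `expected_path` and `report_discrepancies` are hypothetical names, and the converted text is assumed to be already available as a string.

```python
import difflib


def expected_path(source):
    """Replace the .odt suffix of the source name with .expected.rst."""
    if not source.endswith(".odt"):
        raise ValueError("expected a .odt source file")
    return source[:-len(".odt")] + ".expected.rst"


def report_discrepancies(expected_text, actual_text):
    """Return a unified diff of the two texts, or '' when they match."""
    diff = difflib.unified_diff(
        expected_text.splitlines(keepends=True),
        actual_text.splitlines(keepends=True),
        fromfile="expected",
        tofile="actual",
    )
    return "".join(diff)
```

An empty return value means the conversion matched the expected reST.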
The following rules are applied to particular styles when converting an .odt file. The style names are case-insensitive.
- Title
- Becomes the main document title (over- and underlined with =)
- Subtitle
- Becomes the document subtitle (over- and underlined with -)
- Title 1 … Title 6
- Becomes sub-chapter titles, underlined respectively with #, =, -, ~, + and `; in multiple files mode the source document is split on Title 1 sections and a reference to each resulting file is inserted in a toctree directive in the index.rst file
- “Warning” (or “Avertissement”)
- The chapter becomes the content of a .. warning directive
- “Tip” (or “Trucs & Astuces”)
- The chapter becomes the content of a .. tip directive
- “Note” or “Information”
- The chapter becomes the content of a .. note directive
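The style rules above can be illustrated with a small Python sketch. It is not the converter's implementation: the names `render_heading` and `render_admonition` are hypothetical, and the style names "Title" and "Subtitle" for the document title and subtitle are an assumption.

```python
import textwrap

# Underline characters for Title 1 ... Title 6, in the order listed above.
SECTION_CHARS = ["#", "=", "-", "~", "+", "`"]

# Case-insensitive style name -> reST admonition directive.
ADMONITIONS = {
    "warning": "warning", "avertissement": "warning",
    "tip": "tip", "trucs & astuces": "tip",
    "note": "note", "information": "note",
}


def render_heading(text, style):
    """Decorate a heading according to its paragraph style."""
    name = style.strip().lower()
    if name in ("title", "subtitle"):  # assumed style names
        line = ("=" if name == "title" else "-") * len(text)
        return f"{line}\n{text}\n{line}\n"
    if name.startswith("title "):
        level = int(name.split()[1])
        if not 1 <= level <= len(SECTION_CHARS):
            raise ValueError(f"unsupported title level: {level}")
        return f"{text}\n{SECTION_CHARS[level - 1] * len(text)}\n"
    raise ValueError(f"not a heading style: {style!r}")


def render_admonition(text, style):
    """Wrap the text in the directive mapped to the given style."""
    directive = ADMONITIONS[style.strip().lower()]
    return f".. {directive}::\n\n" + textwrap.indent(text, "   ") + "\n"
```

For instance, `render_heading("Intro", "Title 1")` underlines the text with `#`, while `render_admonition("Be careful", "Avertissement")` produces a `.. warning::` block.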
- Support also OpenOffice
- Fix corner case when a line break follows empty spaces
- Better recognition of WMF images
- Recognize fixed text also using the font pitch
- Optimize **bold** **words** as **bold words**
- Fix error when a table contains empty columns
- Recurse down document sections
- New option --ignore-original-column-widths, to produce tighter tables
- Eliminate font style from spans in Anchors, since the textual part of it is taken verbatim by docutils
- Respect original relative widths of table columns
- Fix compatibility with Python 3.4
- Fix rendering of tables with column spans greater than two
- Aggregate consecutive admonition directives of the same type
- Fix representation of list item containing a nested list
- Handle table of contents
- By default hyperlinks are rendered using anonymous refs, the new option --embedded-uris reverts to the old behaviour
- Eliminate excessive newlines from the output
- Aggregate consecutive similar elements into a single one
- Unbreak metafile conversion to PNG
- Convert also StarView Metafile images to PNG
- Fix issue with table rendering
- Center cell content of header rows
- Let the content of multi-row cells flow through the separator border
- Use LibreOffice to convert Windows Meta File images to PNG
- Restore handling of --download-source-link option
- Code overhaul, in particular the reST Writer has been rewritten from scratch and the Visitor
- reST generation is now done using a stack of objects, easier to understand and to extend
- honor the auto-numbered and nested list styles
- handle line breaks in paragraphs
- honor the title and subtitle of the document, using different decorations than those used for section titles
- respect the styling of the section titles
- support multi-rows header in tables
- handle subscript and superscript text styles
- New automatic tests, comparing the output with an expected result
- Print to stdout alternative mode
- Fix release version, removing the date tag
- Forked from https://bitbucket.org/cdevienne/odt2sphinx
- Drop support for Python 2
- Use Pillow instead of PIL
- Rewrap output text for enhanced readability
- Single monolithic alternative mode
- Fix filename generation by replacing any non-alphanumeric character (issue #3).
- Fix handling of non-styled lists.
- Fix the sdist archive on PyPI.
- Add support for numbered lists, hyperlinks, underlined text (translated to italic).
- Fix bold text support.
- Now supports Python 3
- Explicitly added PIL as a dependency (issue #2).
- Add “Information” to the styles mapping.
- Handle note, tip and warning styles in list items. This allows lists to be used inside a note, a tip or a warning.
- Now handle external images (issue #1).
- Improved the RstFile for use in third-party code: it is now possible to insert code and not just append it.
- Add a README file