extract_doc.py – a document extractor and formatter for embedded Textile

Copyright 2008, Mike Howard and Clove Technologies, Inc. All Rights Reserved.

Use is granted to any and all under the terms of the GNU Public License, version 2 – as it exists at the current time.

extract_doc.py is a very simple program which extracts all the plain text between marker tags [#doc-start and #doc-end] in one or more source code files and translates it into HTML.

Of course, the translation to HTML is a lot better if the text is formatted correctly using the Textile formatting language.

This version of extract_doc.py uses the Python implementation of Textile in textile-2.0.11. From the PKG-INFO of that distribution:

Metadata-Version: 1.0
Name: textile
Version: 2.0.11
Summary: This is Textile. A Humane Web Text Generator.
Home-page: http://dealmeida.net/projects/textile/
Author: Roberto A. F. De Almeida
Author-email: roberto@dealmeida.net
License: Freely Distributable
Download-URL: http://dom.eav.free.fr/textile-2.0.10.tar.gz
Description: Textile is a XHTML generator using a simple markup developed by Dean Allen. This is a Python port with support for code validation, itex to MathML translation, Python code coloring and much more.

Platform: any

Running extract_doc.py is simple:

This will create a subdirectory named doc and write one HTML file for each source file found – named <file>.html.
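
The naming rule above can be sketched as follows. This is only an illustration: the function name output_path is hypothetical, and it assumes that <file> means the base name of the source file with its extension kept, which this document does not confirm.

```python
import os


def output_path(source_file, out_dir="doc"):
    """Return the HTML path for one source file: <out_dir>/<file>.html."""
    # Assumption: only the base name is kept and ".html" is appended to it,
    # so "src/foo.py" maps to "doc/foo.py.html".
    return os.path.join(out_dir, os.path.basename(source_file) + ".html")
```

With these assumptions, output_path("src/foo.py") gives the path foo.py.html inside the doc subdirectory.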

Options are available to:

One small point: the -a option appends extensions to the current list, whereas the -e option replaces the current list.
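
The difference between -a and -e can be shown with a small sketch. It is not the program's actual option parser: the function apply_extension_options and the way options are passed in as (flag, extensions) pairs are assumptions made for the illustration, and the starting list used in the example is made up.

```python
def apply_extension_options(current, options):
    """Apply -a / -e options, in command-line order, to an extension list.

    `options` is a sequence of (flag, extensions) pairs, for example
    [("-a", [".rb"])]. This shape is hypothetical, chosen for the sketch.
    """
    result = list(current)  # work on a copy, leave the caller's list alone
    for flag, extensions in options:
        if flag == "-a":
            # -a appends to whatever the list holds at this point
            for ext in extensions:
                if ext not in result:
                    result.append(ext)
        elif flag == "-e":
            # -e throws the current list away and starts over
            result = list(extensions)
    return result
```

Starting from a list holding only ".py", the option -a with ".rb" yields both extensions, while -e with ".rb" leaves only ".rb".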

Another small point: for the ‘forgetful’ there are several synonyms for #doc-start and #doc-end. They are:

Last small point: #doc-start and #doc-end are not recognized if they don’t start in column 1.
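
The column 1 rule can be sketched like this. It is a minimal illustration, not the code in extract_doc.py: the function name extract_doc_blocks is hypothetical, the synonyms are not handled, and it assumes that a block left open at the end of the file is dropped.

```python
def extract_doc_blocks(lines, start="#doc-start", end="#doc-end"):
    """Collect the text found between start and end markers.

    A marker only counts when it begins in column 1, so an indented
    "#doc-end" is treated as ordinary text.
    """
    blocks = []
    current = None  # None means we are outside a doc block
    for line in lines:
        if line.startswith(start):
            current = []  # open a new block, the marker line is not kept
        elif line.startswith(end):
            if current is not None:
                blocks.append("".join(current))
            current = None
        elif current is not None:
            current.append(line)
    # Assumption: a block with no closing marker is silently discarded.
    return blocks
```

Each returned string is the raw text of one block, ready to be handed to Textile for translation to HTML.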

Hope this is useful. Mike Howard – http://www.clove.com