doc2txt is a script that converts an XML file using the XIST doc vocabulary (i.e. the ll.xist.ns.doc namespace module) into plain text (by using ll.xist.ns.html.astext).
doc2txt supports the following options:

-t, --title
    The title for the document
-w, --width
    The width of the formatted text output (default 72)

The input is read from stdin and printed to stdout.
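
The option interface described above can be sketched with Python's standard argparse module. This is only an illustration of how the two options and the width default behave, not the actual doc2txt source: the helper name build_parser is hypothetical, the title default of None is an assumption (the text above does not state one), and the conversion step itself is left out.

```python
import argparse


def build_parser():
    # Hypothetical helper mirroring the documented options;
    # not taken from the real doc2txt script.
    parser = argparse.ArgumentParser(
        prog="doc2txt",
        description="Convert an XIST doc XML file to plain text",
    )
    # Assumption: no title is used when the option is omitted.
    parser.add_argument("-t", "--title", default=None,
                        help="The title for the document")
    # The default of 72 columns comes from the option list above.
    parser.add_argument("-w", "--width", type=int, default=72,
                        help="The width of the formatted text output")
    return parser


if __name__ == "__main__":
    # With no arguments given, width falls back to 72.
    args = build_parser().parse_args()
    print(args.title, args.width)
```

With this sketch, the short form with an attached value (-w80) and the long form (--width 80) parse to the same integer width.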
Example
The following generates spam.txt from spam.xml, formatted to 80 columns:
$ doc2txt <spam.xml >spam.txt -w80