

DebianDoc-SGML Manual
Appendix B - Conversion to XML


The debiandoc-sgml package now offers conversion to the DocBook XML format, so that its users can migrate smoothly to the newer, well-maintained DocBook XML tool set.

This appendix was written by Osamu Aoki (GPL2) in 2005.


B.1 Basic conversion method

No particular preparation of the debiandoc-sgml SGML source is needed to use the conversion tool debiandoc2xml. See debiandoc2xml(1).

To make a single XML file from valid debiandoc-sgml source, issue the following commands:

     $ cd path/to/sgml-source-tree/
     $ debiandoc2xml -1 foo.sgml
     $ cd foo.xml/
     $ mv index.xml foo.xml

The generated XML file is named index.xml; the last command above renames it to foo.xml.


B.2 Advanced conversion methods

To keep the generated files manageable, you may want to split the output into one file per chapter and to preserve the external ENTITY definitions in a separate file. Both are easy to do.


B.2.1 Split XML file output

Run the debiandoc2xml command as in the example above, but without the -1 option. You get XML files whose names match the id values of the chapt tags. The top-level file is index.xml, which contains the file inclusion directives.
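
As a minimal sketch, the split conversion can be wrapped in a small shell function. The function name convert_split is hypothetical, and the sketch assumes that the output lands in a directory named after the source file (foo.xml/ for foo.sgml), as in the single-file example above.

```shell
# convert_split: hypothetical helper, not part of debiandoc-sgml.
# Runs the conversion without -1 and lists the generated files.
convert_split() {
  src=$1                     # e.g. foo.sgml
  outdir=${src%.sgml}.xml    # assumed output directory, e.g. foo.xml
  debiandoc2xml "$src" || return 1
  ls "$outdir"               # index.xml plus, presumably, one file per chapt id
}

# Usage, from the directory holding the source:
#   convert_split foo.sgml
```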


B.2.2 Preserving external ENTITY definitions

Some SGML sources keep common information in an external file so that it stays consistent across the documentation source. It is common practice to put these ENTITY definitions in a separate file named default.ent. It contains entries such as:

     <!ENTITY debianhome     "http://www.debian.org/">

You probably want to preserve these external definitions after the XML conversion. The following describes how to do this.

To simplify this conversion, first simplify default.ent by removing the definitions used for conditional switching, such as:

     <!ENTITY % q-ref   "IGNORE">

and

     <![%lang-fr;[
     <!ENTITY full-title "Guide de référence pour Debian">
     <!ENTITY p-debian-reference "debian-reference-fr">
     ]]>
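
One possible way to locate such constructs before removing them by hand is to search for parameter entities and marked sections. The sketch below runs on a hypothetical sample file built from the entries shown above; the search pattern is an assumption about what "conditional switching" lines look like, not an exhaustive test.

```shell
# Build a hypothetical sample default.ent in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/default.ent" <<'EOF'
<!ENTITY debianhome     "http://www.debian.org/">
<!ENTITY % q-ref   "IGNORE">
<![%lang-fr;[
<!ENTITY p-debian-reference "debian-reference-fr">
]]>
EOF

# Report lines containing "%", or opening or closing a marked section.
flagged=$(grep -n -E '%|<!\[|]]>' "$workdir/default.ent")
printf '%s\n' "$flagged"
```

Each reported line is a candidate for removal; the definitions enclosed by a marked section still need to be reviewed by eye.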

Then rewrite the default.ent file (assuming it is conventionally formatted) as follows:

     $ mv default.ent default.ent.orig
     $ egrep "<\![[:space:]]*ENTITY[[:space:]]+" <default.ent.orig | \
       perl -p -e \
          's/<\!\s*ENTITY\s+([-\w]+)\s.*$/<\!ENTITY $1 "@#@#@#$1#@#@#@">/' \
          > default.ent

This creates alternative entries whose values are reference markers, such as:

     <!ENTITY debianhome "@#@#@#debianhome#@#@#@">

Then use this alternative default.ent file for the XML conversion. (You may still need to edit it to restore required definitions that are missing, such as the lines containing "%" in the original default.ent file.)

In each XML file generated with this alternative default.ent file, recover the entity references by converting the markers back:

     $ for i in *.xml ; do \
       perl -p -i -e 's/@#@#@#([-\w]+)#@#@#@/\&$1;/g' "$i" ; \
       done

Finally, add a declaration that includes the original default.ent to the DOCTYPE header at the top of the foo.xml file:

     <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
         "/usr/share/sgml/docbook/dtd/xml/4.2/docbookx.dtd" [
     
     <!ENTITY % default  SYSTEM "default.ent">  %default;
     
     <!-- more lines here -->
     
     ]>

B.3 Testing generated XML file(s)

You can test the generated XML file with Emacs and psgml, or nsgmls:

     $ nsgmls -s /usr/share/sgml/declaration/xml.decl foo.xml



DebianDoc-SGML Manual

2021-01-16

Ardo van Rangelrooij mailto:ardo@debian.org
Ian Jackson mailto:ijackson@gnu.ai.mit.edu