SingleSourceDocs: DITA


Monday, 21 May 2012

Restructure DITA Plugin Released

Finally released my new FrameMaker plugin today! Restructure DITA allows you to quickly and easily change one DITA list type for another, as well as convert paragraphs to lists and restructure existing lists as paragraphs. There's a link on the home page.
I've also been busy rewriting the source code for the CleanImport plugin. It had been a few years since I'd worked on it and, looking at the code, I realised I couldn't figure it out! So I decided to simplify and harmonize the code and, of course, document it better. I haven't added any new features yet, but at least I feel like I could now if I wanted to.
So now back to other projects, including finally finishing my HTML5 export filter for the DITA Open Toolkit.

Thursday, 5 April 2012

Exporting DITA to HTML5

I've been somewhat surprised at the lack of any real discussion on this topic so far, apart from a few posts on the DITA-USERS forum about video codec compatibility and a good article by Don Day. The current explosive growth in the number of smart devices deployed will inevitably lead to a strong demand for technical documentation on mobile platforms.

PDF- and HTML-based documents as well as eBooks can, of course, already be used on smart devices by means of native viewers, mobile browsers and eBook readers. The PDF and HTML user experience is generally poor, involving a lot of tapping, pinching, scrolling and rotating to get content to display correctly. Although the eBook user experience is much better, especially on tablets, eBook readers are proprietary native applications and mobile platform owners have imposed all sorts of restrictions on content.

HTML5 promises to offer a superior user experience for mobile users. Mobile browsers already do a good job of automatically resizing and reformatting HTML5-based content to match devices' screen sizes and resolutions. HTML5 content can be rich, dynamic, and interactive, so it is well suited to eLearning applications, for example.

So I've decided to try to create an HTML5 plugin for DITA! I hope this plugin will provide support for:

HTML5's new semantic elements such as <nav>, <header>, and <footer>
Inline SVG graphics
MathML equations
Offline storage

The latest generation of browsers, including Firefox, Chrome, Safari, and WebKit-based mobile browsers, already support all these features. Internet Explorer is lagging behind as usual, but Microsoft has promised that IE10 will fully support the HTML5 standard. HTML5 is still evolving, so it looks like definitive support for things like metadata and the many JavaScript APIs will have to wait for a while yet. But the core functionality is remarkably stable and well-supported, so there's already enough to be getting on with...

Adding Inline SVG and MathML to DITA

One interesting feature of HTML5 is its ability to render inline SVG and MathML markup to display 2D graphics, syntax diagrams, and equations. For example:

<html>
<body>
<h1>My first SVG</h1>
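<svg xmlns="http://www.w3.org/2000/svg" version="1.1">
<!-- illustrative circle: the shape and its geometry are assumed -->
<circle cx="100" cy="50" r="40" stroke="black"
stroke-width="2" fill="red" />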
</svg>
</body>
</html>

This sort of markup works in most modern browsers. It's called inline SVG because the SVG tags are embedded directly within the HTML code, in contrast to external SVG in which an SVG file is referenced in exactly the same way as you would a GIF or JPEG:

<img src="images/myDiagram.svg" alt="An external SVG graphic"/>

My plan for the DITA to HTML5 plugin is to pass inline SVG or MathML markup directly through from DITA topics to HTML5. Unfortunately, getting inline SVG and MathML to work in DITA is not straightforward. In fact, I've just spent the last two days doing some specialization, the mysterious science of customizing the set of tags that authors can use in DITA, in order to get it to work. The reason that native SVG and MathML support has never been included in the DITA Open Toolkit seems to be that there simply hasn't been much demand for it (and it was difficult to display in older browsers). SVG is still the only vector graphics format supported by DITA and hopefully one day it'll be fully integrated into the DITA Open Toolkit.

My main sources of information about specialization have been Eliot Kimber's excellent DITA Configuration and Specialization tutorial and Introduction to DITA by Jennifer Linton and Kylene Bruski. Specialization can be used to modify DITA's original set of elements and attributes in several ways:

Removing a domain that you don't need (a domain is a related set of tags, for example, the User Interface domain), so that authors no longer see any of the domain's tags in the list of available elements.
Modifying the properties of particular tags, for example, so that <p> must contain plain text only and none of the inline formatting tags like <b> or <i> that are normally available.
Creating new attributes for existing tags.
Adding new custom domains.

In DITA parlance, these techniques are called, respectively, "Document Type Shell", "Topic Constraint", "Attribute Specialization", and "Element Domain Specialization". Conclusion: you don't have to be a geek to specialize, but it certainly helps! We're going to be doing Element Domain Specialization. To be honest, though, following Eliot's tutorial was a lot easier than I anticipated. Using oXygenXML, DITA Open Toolkit 1.5.3 and a few articles I found on Google, I got inline SVG and MathML working without too much trouble. I suspect I'll have more problems packaging it as a plugin so that others can use it, but that's for later. And there are still a lot of things I don't understand. For now, I'm going to switch to technical author mode to describe how to implement the specializations.

Preparing a Test Environment

Copy the entire {dita-ot-root}/dtd/technicalContent folder (where {dita-ot-root} is the root folder of your DITA Open Toolkit installation) to a temporary folder.
Create a new DITA concept topic and change the DOCTYPE line to point to the concept.dtd in your temporary folder, for example:
```
<!DOCTYPE concept SYSTEM "C:/temp/technicalContent/dtd/concept.dtd">
```
Note: If your editor adds a PUBLIC identifier as well as or instead of a SYSTEM identifier when it creates a new topic, I would recommend removing it, as a PUBLIC identifier takes precedence over the SYSTEM one and your topic will validate even if the SYSTEM identifier is wrong or a problem occurs in the specialization files.
Validate the topic to check that the DTDs in your working folder are being used.
Save the topic with a .dita or .xml extension to any folder.
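
The DOCTYPE change described above, including dropping any PUBLIC identifier, can also be scripted. What follows is only a minimal Python sketch: the function name `point_doctype_at` and the regular expression are assumptions made for illustration, and it only copes with a DOCTYPE declaration that has no internal subset.

```python
import re

# Matches a DOCTYPE declaration with an optional PUBLIC or SYSTEM identifier
# and no internal subset (no "[ ... ]" part). Assumed to be good enough for
# editor-generated DITA topics; it is not a general DOCTYPE parser.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+(\w+)[^\[>]*>')


def point_doctype_at(topic_xml: str, dtd_path: str) -> str:
    """Return topic_xml with its DOCTYPE replaced by a SYSTEM-only one."""

    def rewrite(match):
        root_name = match.group(1)  # e.g. "concept"
        return f'<!DOCTYPE {root_name} SYSTEM "{dtd_path}">'

    new_xml, count = DOCTYPE_RE.subn(rewrite, topic_xml, count=1)
    if count == 0:
        raise ValueError("no DOCTYPE declaration found")
    return new_xml


sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">\n'
    '<concept id="c1"><title>Test</title></concept>'
)
print(point_doctype_at(sample, "C:/temp/technicalContent/dtd/concept.dtd"))
```

The replacement is built in a function rather than a template string so that backslashes in a Windows path are not interpreted as regular-expression escapes.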

Adding Inline MathML support to DITA

Domain specialization requires you to create two files, a .mod (module) and a .ent (entity), then update the DTD to reference them. This example only shows the concept DTD, but you'd need to do it to the other topic types' DTDs too (there must be a way of doing it to the base DTD, ditabase.dtd, so that it works for all topic types, but I couldn't figure that out).

Copy and paste this .mod file (which I've taken from a specialization article I found) and save it in your temporary technicalContent/dtd folder as mathmlDomain2.mod.
Copy and paste this .ent file and save it in the technicalContent/dtd folder as mathmlDomain2.ent.
Edit the concept.dtd file in your temporary folder and make the following changes:
- Add these lines to the bottom of the DOMAIN ENTITY DECLARATIONS section:
```
<!ENTITY % math-d-dec SYSTEM "mathmlDomain2.ent">
%math-d-dec;
```
- In the DOMAIN EXTENSIONS section, add the lines:
```
<!ENTITY % foreign "foreign | %math-d-foreign;">
<!ENTITY % unknown "unknown | %math-d-unknown;">
```
- In the DOMAINS ATTRIBUTE OVERRIDE section, add the line:
```
&math-d-att;
```
- In the DOMAIN ELEMENT INTEGRATION section, add the following lines:
```
<!ENTITY % math-d-def SYSTEM "mathmlDomain2.mod">
%math-d-def;
```
Save the changes and validate your concept topic to check that you haven't messed things up.

That's it! Position the cursor within a <p> element in your concept topic and you should now see new elements like <equation> and <math> in the list of available elements. To test it on something meaningful, you can use the following sample code:

<math type="presentation">
   <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"
    display="block">
    <mml:semantics>
     <mml:mrow>
      <mml:mrow>
       <mml:mi mathvariant="bold">a</mml:mi>
       <mml:mo>=</mml:mo>
       <mml:mfrac>
        <mml:mrow>
         <mml:mi mathvariant="bold">F</mml:mi>
        </mml:mrow>
        <mml:mi>m</mml:mi>
       </mml:mfrac>
       <mml:mo>=</mml:mo>
       <mml:mfrac>
        <mml:mrow>
         <mml:mi>q</mml:mi>
         <mml:mo>[</mml:mo>
         <mml:mi mathvariant="bold">E</mml:mi>
         <mml:mo>+</mml:mo>
         <mml:mfenced>
          <mml:mrow>
           <mml:mi mathvariant="bold">v</mml:mi>
           <mml:mi>X</mml:mi>
           <mml:mi mathvariant="bold">B</mml:mi>
          </mml:mrow>
         </mml:mfenced>
         <mml:mo>]</mml:mo>
        </mml:mrow>
        <mml:mi>m</mml:mi>
       </mml:mfrac>
      </mml:mrow>
     </mml:mrow>
    </mml:semantics>
   </mml:math>
  </math>

Note: In my original post, I had wrapped <equation> tags around the above example. This was wrong. The equation element is meant to be used as the top-level element in a separate file and as a container for MathML markup. You would then include the markup in a topic using something like <xref type="eq" href="equation1.dita"/>. I have not been able to get this to work and it isn't even documented anywhere as far as I can tell.

Adding Inline SVG support

SVG integration follows the same basic procedure as MathML: create .mod (module) and .ent (entity) files, then update the DTD file.

Copy and paste this .mod file and save it in the technicalContent/dtd folder as svgDomain.mod.
Copy and paste this .ent file and save it in the technicalContent/dtd folder as svgDomain.ent.
If you don't already have it, do a Google search for the svg11.dtd file and copy it into the technicalContent/dtd folder.
Edit the concept.dtd file in your temporary folder and make the following changes:
- In the DOMAIN ENTITY DECLARATIONS section, add the lines:
```
<!ENTITY % svg-d-dec SYSTEM "svgDomain.ent">
%svg-d-dec;
```
- In the DOMAIN EXTENSIONS section, modify the line that you previously edited for MathML to:
```
<!ENTITY % foreign "foreign | %math-d-foreign; | %svg-d-foreign;">
```
- In the DOMAINS ATTRIBUTE OVERRIDE section, add the line:
```
&svg-d-att;
```
- In the DOMAIN ELEMENT INTEGRATION section, add the following lines:
```
<!ENTITY % svg-d-def SYSTEM "svgDomain.mod">
%svg-d-def;
```
Save the changes and validate your concept topic again to check that everything works.

That's it! Position the cursor within a <p> element in your concept topic and you should now see the new <svg> element in the list of available elements. To test it on something meaningful, you can use the following sample code:

<svg>
   <svg:svg xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
      <svg:ellipse cx="300" cy="150" rx="200" ry="80"
            style="fill:rgb(200,100,50);stroke:rgb(0,0,100);stroke-width:2"/>
   </svg:svg>
</svg>

When you've finished testing, remember to:

Repeat the procedure for the Task and Reference topic types.
Copy the contents of your temporary technicalContent folder to the {dita-ot-root}/dtd/technicalContent folder of your DITA Open Toolkit installation, replacing the original contents.
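
Because the same edits have to be repeated in several DTD files, it is easy to miss one. The Python sketch below checks a DTD's text for the declarations listed in the two procedures above. The function name `missing_domain_declarations` is hypothetical, and the check is a plain substring search: it does not verify that each line sits in the right section of the DTD.

```python
def missing_domain_declarations(dtd_text: str, prefix: str, base: str) -> list:
    """Return the expected snippets for one domain that dtd_text lacks.

    prefix -- entity prefix used in the procedures, "math-d" or "svg-d"
    base   -- file base name, "mathmlDomain2" or "svgDomain"
    """
    expected = [
        f'<!ENTITY % {prefix}-dec SYSTEM "{base}.ent">',
        f"%{prefix}-dec;",
        f"%{prefix}-foreign;",
        f"&{prefix}-att;",
        f'<!ENTITY % {prefix}-def SYSTEM "{base}.mod">',
        f"%{prefix}-def;",
    ]
    # Collapse runs of whitespace so that column-aligned DTD source still matches.
    normalised = " ".join(dtd_text.split())
    return [snippet for snippet in expected if snippet not in normalised]


# In-memory stand-in for a concept.dtd that has only the MathML edits applied.
dtd = """
<!ENTITY % math-d-dec   SYSTEM "mathmlDomain2.ent">
%math-d-dec;
<!ENTITY % foreign "foreign | %math-d-foreign;">
<!ENTITY % unknown "unknown | %math-d-unknown;">
&math-d-att;
<!ENTITY % math-d-def   SYSTEM "mathmlDomain2.mod">
%math-d-def;
"""
print(missing_domain_declarations(dtd, "math-d", "mathmlDomain2"))
print(missing_domain_declarations(dtd, "svg-d", "svgDomain"))
```

To use it for real, read the text of concept.dtd, task.dtd and reference.dtd from the temporary folder and run both checks on each before copying anything back.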

Conclusions and Next Steps

Just to prove it does work, here's a screenshot from oXygenXML showing a bit of inline SVG and MathML:

The next steps are:

To package this as a plugin so that anyone can add it to their DITA Open Toolkit
To update my DITA to HTML5 transformation so that the inline MathML and SVG appear in my HTML5 topics.
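
To make the second step more concrete, here is a rough Python sketch of the "pass-through" idea: each DITA wrapper element (<svg> or <math>) is replaced by the namespaced markup it contains, with the svg: or mml: prefix dropped so that the result can sit inline in an HTML5 page. This is not the plugin's transformation (which would be written in XSLT for the DITA Open Toolkit); the function names are hypothetical, and writing `xmlns` as an ordinary attribute is a shortcut that happens to serialise correctly with ElementTree.

```python
import xml.etree.ElementTree as ET

# DITA wrapper element -> namespace of the foreign markup it contains
FOREIGN = {
    "svg": "http://www.w3.org/2000/svg",
    "math": "http://www.w3.org/1998/Math/MathML",
}


def strip_namespace(elem, ns):
    """Turn {ns}tag into tag throughout a subtree and declare ns on its root."""
    marker = "{" + ns + "}"
    for node in elem.iter():
        if node.tag.startswith(marker):
            node.tag = node.tag[len(marker):]
    elem.set("xmlns", ns)  # shortcut: emitted as a plain attribute


def unwrap_foreign(root):
    """Replace each DITA <svg>/<math> wrapper by its namespaced child."""
    targets = []
    # First pass: find the wrappers without modifying the tree.
    for parent in root.iter():
        for index, child in enumerate(parent):
            ns = FOREIGN.get(child.tag)
            if ns and len(child) and child[0].tag.startswith("{" + ns + "}"):
                targets.append((parent, index, child, ns))
    # Second pass: swap each wrapper for its content, keeping trailing text.
    for parent, index, wrapper, ns in targets:
        inner = wrapper[0]
        inner.tail = wrapper.tail
        strip_namespace(inner, ns)
        parent[index] = inner
    return root


topic = ET.fromstring(
    '<p>Before <svg><svg:svg xmlns:svg="http://www.w3.org/2000/svg">'
    '<svg:ellipse cx="300" cy="150" rx="200" ry="80"/>'
    "</svg:svg></svg> after.</p>"
)
print(ET.tostring(unwrap_foreign(topic), encoding="unicode"))
```

Wrappers are collected before anything is changed because, once the prefix is stripped, the inner <svg> element has the same name as the DITA wrapper and would otherwise be matched a second time.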

Mapping DITA Elements to HTML5

I've been wondering how best to map DITA elements to HTML5's new semantic tags.

<header> and <footer>

These two tags are straightforward. I'll wrap them around the contents of any custom HTML specified with the arg.hdr and arg.ftr DITA parameters.

<article>

The HTML5 specification describes an article as being "a self-contained composition in a document, page, application, or site that is, in principle, independently distributable or reusable". That seems to closely match the DITA concept of a topic. So there are two possibilities:

Don't use <article> at all in the generated HTML5, except if topics are being chunked together in a single physical file.
Start the body of each topic with <article>, for example: <body> <article> ... </article> </body>

<section>

In HTML5, a section is used to logically subdivide a document or article.

This clearly corresponds to a DITA section, which "represents an organizational division in a topic".

<nav>

The HTML5 spec says that nav is "a section of a page that links to other pages or to parts within the page". The purpose of introducing such tags to HTML5 is to indicate to search engines that they don't need to index the content in them, so speeding up searches. Although it doesn't quite match because the links are not internal, I think this is a good match for the related links section of a DITA topic.

<aside>

This new HTML element represents "a section of a page that consists of content that is tangentially related to the content around the aside element, and which could be considered separate from that content". It's plainly important that content in the aside is distinctly styled: for example, as a right-aligned sidebar in printed material, or as a floating box with a distinctive background color or border in a web page. So perhaps the best match in DITA terms is the abstract element.

<hgroup>

This element is supposed to group consecutive headers together, for example:

<hgroup>
<h1>Main Title</h1>
<h2>Secondary Title</h2>
</hgroup>

Stacked headings with no intervening text are considered bad practice in technical documents, and DITA's DTDs reinforce that by not allowing them. The only time it could happen in DITA is when the titlealts element is used to provide an alternative header (for example, one that appears in search results or in a table of contents). But only one of the titles appears in the document at any one time. So I'm inclined not to use hgroup at all.
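
The tentative mapping discussed in this post can be summarised in a small Python sketch. The table only records the choices floated above (with <article> following the second of the two possibilities), the function name `to_html5` is hypothetical, and matching on element names is a simplification: a real DITA Open Toolkit transformation would match on the class attribute (for example, values containing " topic/section ") so that specialized elements are handled too.

```python
import xml.etree.ElementTree as ET

# DITA element -> HTML5 element, following the tentative choices above.
DITA_TO_HTML5 = {
    "topic": "article",      # a topic is self-contained, like an article
    "concept": "article",
    "task": "article",
    "reference": "article",
    "section": "section",
    "related-links": "nav",
    "abstract": "aside",     # "perhaps the best match"
}


def to_html5(elem):
    """Rename mapped DITA elements in place; leave everything else untouched."""
    for node in elem.iter():
        node.tag = DITA_TO_HTML5.get(node.tag, node.tag)
    return elem


topic = ET.fromstring(
    '<concept id="c1"><title>T</title>'
    "<abstract>Summary</abstract>"
    "<conbody><section><p>x</p></section></conbody>"
    '<related-links><link href="a.dita"/></related-links>'
    "</concept>"
)
print(ET.tostring(to_html5(topic), encoding="unicode"))
```

Elements without an entry, such as <title> or <conbody>, pass through unchanged here; <header> and <footer> are not in the table because they come from the arg.hdr and arg.ftr parameters rather than from topic content.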

Friday, 18 December 2009

Adding a chapter to a DITABOOK

There are many ways to add a new chapter to a DITABOOK. Not all work...

When the chapter's content is in a single topic file, it is straightforward:
<chapter navtitle="Introduction" href="cIntroduction.xml"/>

A chapter may have an introductory paragraph or two before the first level one heading (typically "This chapter describes..."). In this case, just use the previous example and add topicref tags pointing to the chapter's sections. The topicrefs must be child elements of the chapter tag, for example:
<chapter href="TaskHelp/cIntroduction.xml">
<topicref navtitle="Updating a Module" href="FieldHelp/rModule_Update.xml"/>
</chapter>

If the chapter's content is more complex, you might prefer to put the chapter's content in a DITAMAP and reference this DITAMAP in the chapter tag:
<chapter href="chapter1.ditamap" format="ditamap"/>

There are some pitfalls too:

If you do reference a DITAMAP, it must only contain a single top-level topicref tag. Otherwise, all the top-level topicref tags appear in the generated PDF at the chapter level but are not numbered as chapters.

You must have an href attribute on the chapter tag, otherwise the topicrefs' titles appear at the chapter level and there's no chapter title. So neither of the following examples works:
<chapter>
<topicref ...>
</chapter>
<chapter navtitle="Introduction">
<topicref ...>
</chapter>
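
Both pitfalls are easy to check for before running a build. Here is a minimal Python sketch, assuming the maps are available as strings; the function names and the "Setup" chapter in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def chapters_without_href(bookmap_xml: str) -> list:
    """List the chapter elements that lack an href attribute."""
    root = ET.fromstring(bookmap_xml)
    problems = []
    for position, chapter in enumerate(root.iter("chapter"), start=1):
        if not chapter.get("href"):
            label = chapter.get("navtitle") or "(no navtitle)"
            problems.append(f"chapter {position}: {label}")
    return problems


def top_level_topicrefs(ditamap_xml: str) -> int:
    """Count topicref elements that are direct children of the map root."""
    return len(ET.fromstring(ditamap_xml).findall("topicref"))


bookmap = (
    "<bookmap>"
    '<chapter navtitle="Introduction" href="cIntroduction.xml"/>'
    '<chapter navtitle="Setup"><topicref href="a.xml"/></chapter>'
    '<chapter><topicref href="b.xml"/></chapter>'
    "</bookmap>"
)
print(chapters_without_href(bookmap))

chapter_map = '<map><topicref href="a.xml"><topicref href="b.xml"/></topicref></map>'
print(top_level_topicrefs(chapter_map))
```

A map referenced from a chapter tag should give a count of exactly one from `top_level_topicrefs`.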

Visio and SVG still don't work together

Unfortunately, Microsoft don't seem interested in fixing the bug in Visio's SVG export filter that leaves arrowheads off lines. The bug survived the SP2 and SP3 updates for Visio, and apparently is even present in Visio 2007!
There are several workarounds:

Don't use arrowheads in diagrams!
Fix the SVG file before using it.
Use a different tool altogether.

To draw arrows in Visio, just draw one long and two short lines (or a line and a filled triangle), then group them together. Not pretty, but it works.

I found a small XSLT script that fixes the problem. Because SVG files are XML, you can run the script against the SVG, then save the resulting XML as a new SVG file. Here's the script:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns="http://www.w3.org/2000/svg"
    exclude-result-prefixes="svg">
  <xsl:output method="xml" indent="no"/>

  <!-- Rebuild each marker with overflow="visible" (the arrowhead fix) -->
  <xsl:template match="svg:marker">
    <xsl:element name="marker" xmlns="http://www.w3.org/2000/svg">
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="overflow">visible</xsl:attribute>
      <xsl:copy-of select="node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

I use oXygen to perform the transformation using the XSLT Debugger.
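
If you don't have an XSLT processor to hand, the same fix can be approximated with Python's standard library. This is a sketch of an alternative technique, not the script above: it sets overflow="visible" on every SVG marker element directly. The function name `fix_markers` is hypothetical, and ElementTree will rename the prefixes of any non-SVG namespaces in the file (such as Visio's own extension elements) to ns0, ns1 and so on, which is still valid XML but changes the look of the source.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialise SVG elements without a prefix, as in the original file.
ET.register_namespace("", SVG_NS)


def fix_markers(svg_text: str) -> str:
    """Return the SVG source with overflow="visible" set on every marker."""
    root = ET.fromstring(svg_text)
    for marker in root.iter("{" + SVG_NS + "}marker"):
        marker.set("overflow", "visible")
    return ET.tostring(root, encoding="unicode")


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<defs><marker id="m1"><path d="M0 0"/></marker></defs>'
    '<line x1="0" y1="0" x2="10" y2="0"/>'
    "</svg>"
)
print(fix_markers(svg))
```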

As far as alternative tools are concerned, I'm testing an Open Source tool called Inkscape. First impressions are that its interface is a bit quirky, but it seems perfectly adequate.

Tuesday, 25 November 2008

Smart quotes with FrameMaker and DITA-FMx

(FrameMaker 8 with DITA-FMx v1.00.26, DITA 1.0)

If you have Smart Quotes turned on in FrameMaker, beware that double quotes (") typed in FrameMaker are by default converted to curly opening (“) and closing (”) quotes in the generated XHTML. Only by turning off Smart Quotes in FrameMaker can I get standard straight quotes (") in the XHTML. Of course, I then get straight quotes in my PDF files as well.

The solution is to wrap the text in a <q> (quote) element in DITA.

This has several advantages:
1. The correct opening and closing smart quotes are inserted automatically in FrameMaker.
2. The generated XHTML gets straight quotes.
3. The generated PDF gets smart quotes.
4. When the language attribute is set, the quotes get translated to localized versions, for example « and » in French.

Monday, 20 August 2007

Using a DITAVAL file with FM+DITA

For some reason, the conditional processing parameter dita.input.valfile is not recognized by the DITA Open Toolkit when you're using the FM+DITA plugin.
There's a workaround: add the value to the ditafm.ini file in the ditafm folder of your root FrameMaker installation.
The procedure is quite straightforward:

Define the attributes in your topic files that you wish to make conditional (for example, set audience to "internal", product to "XMS Version 1" or platform to "Unix").

Open the ditafm.ini file in a text editor. Find the AntCommand line (in the [BuildFile] section) and add -Ddita.input.valfile="path_to_ditaval_file". For example:
```
AntCommand=ant -Ddita.input.valfile="C:/WORK/VMXMSRepository/en-GB/XMS Help System/Common/common.ditaval"
```
Create a .ditaval file in the specified location. The content is described in the DITA OT documentation. For example:
```
<?xml version="1.0"?>
<val>
<prop att="product" val="m3" action="exclude" />
</val>
```
Restart FrameMaker to take the changes to the ditafm.ini file into account.
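
To see what the sample .ditaval file does to a topic, here is a simplified Python sketch of exclude-only filtering. It is an illustration, not the Toolkit's implementation: the function names are hypothetical, only action="exclude" is handled, and the "m4" value in the sample topic is made up. It follows the DITA rule that an element is dropped only when every value in one of its conditional attributes is excluded.

```python
import xml.etree.ElementTree as ET


def excluded_values(ditaval_xml: str) -> dict:
    """Map attribute name -> set of values whose action is "exclude"."""
    excluded = {}
    for prop in ET.fromstring(ditaval_xml).iter("prop"):
        if prop.get("action") == "exclude":
            excluded.setdefault(prop.get("att"), set()).add(prop.get("val"))
    return excluded


def is_excluded(elem, excluded: dict) -> bool:
    """True when every value of some conditional attribute is excluded."""
    for att, values in excluded.items():
        present = (elem.get(att) or "").split()  # values are space-separated
        if present and all(value in values for value in present):
            return True
    return False


def apply_ditaval(root, excluded: dict):
    """Remove excluded elements from the tree in place."""
    for parent in list(root.iter()):
        for child in list(parent):
            if is_excluded(child, excluded):
                parent.remove(child)
    return root


ditaval = '<val><prop att="product" val="m3" action="exclude" /></val>'
topic = ET.fromstring(
    "<concept>"
    '<p product="m3">Only for m3</p>'
    '<p product="m3 m4">For m3 and m4</p>'
    "<p>For everyone</p>"
    "</concept>"
)
apply_ditaval(topic, excluded_values(ditaval))
print([p.text for p in topic.iter("p")])
```

Because attribute values are split on spaces, a value such as "XMS Version 1" counts as three separate values in this sketch.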

Note that this approach has one serious disadvantage: the ditafm.ini file controls all conditional processing, so you can't have several different .ditaval files active simultaneously. Let's hope they address this problem soon.

It's also not yet clear how this works in FrameMaker 8.0, as the aforementioned [BuildFile] section is missing from the ditafm.ini file supplied with FrameMaker 8.0 altogether (btw, it's now in the fminit/ditafm folder).
