Migrating Various Document Formats to and from DITA
When organizations decide to use DITA for structuring, developing, managing, or publishing content, they usually already have content written in other formats and need to convert it to DITA. There are a variety of possibilities for a conversion to DITA, depending on the original format of the content.
Migration from other formats to DITA is rarely perfect and manual changes may need to be made to the converted content, but the methods described below should help you find the best approach for your particular case.
Migrating Microsoft Office and Other Similar Types of Documents to DITA
There are various possibilities for migrating content from Microsoft Office® (and other Office-type formats) to DITA. For details, see Migrating MS Office Documents to DITA.
Migrating DocBook Content to DITA
The Oxygen Batch Documents Converter add-on can be used for migrating one or more DocBook documents to DITA.
The provided DocBook to DITA conversion contains an option named Create DITA maps from DocBook documents containing multiple sections. When this option is selected, all sections from your DocBook document will be separated into individual DITA topics and referenced in a DITA map.
Migrating Google Docs to DITA
- Copy the content from Google Docs and paste it in an open DITA topic in Author mode. The Smart Paste functionality will attempt to convert the content to DITA.
- Save the Google document as OpenDocument Format (ODF), then open it in the free LibreOffice application and save it as DocBook. Next, open the DocBook document in Oxygen XML Editor and run the built-in transformation scenario called DocBook to DITA.
- If you want to convert multiple Google documents at once, save the documents as HTML, then use Oxygen's Batch Documents Converter add-on to convert the documents to DITA.
In all cases, you may need to make some manual adjustments in the resulting documents for elements that couldn't be mapped.
Migrating Markdown Content to DITA
- The DITA Open Toolkit publishing engine bundled with Oxygen XML Editor allows you to reference Markdown files directly in a DITA map and either publish them directly or export the Markdown files to DITA one by one. For details, see Working with Markdown Documents in DITA.
- If you want to convert multiple Markdown files at once, you can use Oxygen's Batch Documents Converter add-on to convert the documents to DITA.
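As an illustration of the first option, the following minimal sketch generates a DITA map that references a set of Markdown files directly. It assumes the Markdown support in DITA Open Toolkit, where a topicref with format="markdown" points to a Markdown file. The function name build_map, the map title, and the file names are hypothetical and only serve the example.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Standard public identifier for a DITA map.
DOCTYPE = '<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">'


def build_map(markdown_files, title="Migrated Markdown content"):
    """Return the source of a DITA map that references each Markdown file."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title
    for md in markdown_files:
        # format="markdown" marks the target as a Markdown file
        # rather than a DITA topic.
        ET.SubElement(root, "topicref",
                      href=Path(md).as_posix(), format="markdown")
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    body = ET.tostring(root, encoding="unicode")
    return f'<?xml version="1.0" encoding="UTF-8"?>\n{DOCTYPE}\n{body}\n'


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(build_map(["intro.md", "guides/install.md"]))
```

The resulting map can be saved with a .ditamap extension and opened in Oxygen XML Editor, where the referenced Markdown files can then be published or exported to DITA as described above.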
Migrating HTML Content to DITA
There are several possibilities to convert HTML content to DITA:
- Copy the HTML content and paste it in an open DITA topic in Author mode. The Smart Paste functionality will attempt to convert the content to DITA.
- Convert the HTML file to XHTML. Then, open the XHTML file and use one of the XHTML to DITA transformation scenarios to convert the content to DITA.
- If you want to convert multiple HTML files at once, you can use Oxygen's Batch Documents Converter add-on to convert the documents to DITA.
Migrating Unstructured FrameMaker to DITA
There is a blog post that details various possibilities for converting Unstructured FrameMaker content to DITA: Migrating Unstructured FrameMaker to DITA.
Migrating MadCap Content to DITA
There is an open-source project that contains a stylesheet that attempts to convert a MadCap Flare project to DITA XML, along with instructions on how to use it. As an alternative, some recent MadCap versions seem to have facilities to export content directly to DITA.
Migrating Confluence to DITA
To migrate Confluence content to DITA, first export the content to HTML. To do this, log in to your Confluence account, navigate to the specific space that you want to export, and choose to export it as HTML. You can then use Oxygen's Batch Documents Converter add-on, selecting the Confluence to DITA action, to convert the exported index.html file into a DITA map with topics.
Migrating LaTeX to DITA
You can use a third-party application (such as Pandoc) to convert LaTeX content to Word or HTML. Then, you can use Oxygen's Batch Documents Converter add-on to convert it to DITA XML.
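The Pandoc step could be scripted along the following lines. This is a minimal sketch that assumes Pandoc is installed and available on the PATH. The helper names pandoc_command and convert, and the file name manual.tex, are hypothetical. The command-line options used (--from, --to, --standalone, --output) are standard Pandoc options.

```python
import shutil
import subprocess
from pathlib import Path

# Output file extension for each supported Pandoc target format.
EXTENSIONS = {"html": ".html", "docx": ".docx"}


def pandoc_command(source, target_format="html"):
    """Build the Pandoc argument list for converting a LaTeX file."""
    source = Path(source)
    output = source.with_suffix(EXTENSIONS[target_format])
    return [
        "pandoc",
        "--from", "latex",
        "--to", target_format,
        "--standalone",          # produce a complete document, not a fragment
        "--output", str(output),
        str(source),
    ]


def convert(source, target_format="html"):
    """Run Pandoc on the given file; raises if Pandoc is missing or fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(source, target_format), check=True)


if __name__ == "__main__":
    # Only prints the command; call convert("manual.tex") to run it.
    print(" ".join(pandoc_command("manual.tex")))
```

The resulting HTML or Word file can then be passed to the Batch Documents Converter add-on. Complex LaTeX constructs (custom macros, for example) may not survive the conversion, so the output should be reviewed before converting it to DITA.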
Migrating Other Formats to DITA
You may find third-party applications (such as Pandoc) that can convert your content to HTML or to an XML format such as DocBook. Once you have HTML or DocBook content, you can convert it to DITA using one of the methods described above.
Migrating from DITA to Confluence and Other Formats
There are various methods for converting DITA content to Confluence and other formats (such as Microsoft Word or HTML). For details and ideas about some of these methods, see the DITA to Confluence blog post.
Resources
- Webinar: Integrating Various Document Formats (OpenAPI, Word, Markdown, HTML, Excel) into DITA Documentation
- Webinar: Working with DITA in Oxygen - Migrating to DITA and Refactoring
- Video: Integrating REST-API Content into DITA Documentation in Oxygen
- Blog post: Migrating MS Word to DITA Using the Batch Documents Converter