Hyphenation

Hyphenation specifies how words should be hyphenated when text wraps across multiple lines.

The transformation plugin uses the capabilities of the PDF Chemistry processor to perform hyphenation.

Hyphenation Dictionaries

The Oxygen DITA-OT CSS-based PDF Publishing plugin provides built-in hyphenation patterns for the following languages:
Code     Language
da       Danish
de       German
de_CH    German (Switzerland)
en       English
en-GB    English (Great Britain)
es       Spanish
fr       French
it       Italian
nb       Norwegian Bokmål
nl       Dutch
pt       Portuguese
ro       Romanian
ru       Russian
sv       Swedish
th       Thai

The license terms for the built-in hyphenation patterns are listed in the XML files in the [CHEMISTRY_INSTALL_DIR]/config/hyph folder. Most of them comply with the LaTeX distribution policy.

Installing New Hyphenation Dictionaries

The Oxygen DITA-OT CSS-based PDF Publishing plugin uses the TeX hyphenation dictionaries converted to XML by the OFFO project: https://sourceforge.net/projects/offo/.

The .xml files allow you to access the licensing terms and you can use them as a starting point to create customized dictionaries (see How to Alter a Hyphenation Dictionary).

The .hyp files are the compiled dictionaries that the Oxygen DITA-OT CSS-based PDF Publishing plugin actually uses.

The hyphenation dictionaries are located in: [OPE_INSTALL_DIR]/plugins/com.oxygenxml.pdf.css/lib/oxygen-pdf-chemistry/config/hyph.

One simple way to add more dictionaries:
  1. Download and extract the offo-hyphenation-compiled.zip file. This file is a bundle of many dictionary files.
  2. Copy the fop-hyph.jar file to the [DITA_OT_DIR]/plugins/com.oxygenxml.pdf.css/lib/oxygen-pdf-chemistry/lib directory.
  3. Alternatively, if you need only a single dictionary, place its .hyp or .xml file in the [DITA_OT_DIR]/plugins/com.oxygenxml.pdf.css/lib/oxygen-pdf-chemistry/config/hyph directory.

How to Alter a Hyphenation Dictionary

The hyphenation dictionaries are stored as XML files in the [OPE_INSTALL_DIR]/plugins/com.oxygenxml.pdf.css/lib/oxygen-pdf-chemistry/config/hyph directory.

You can copy the dictionaries you need to change to another directory, then use the -hyph-dir parameter to refer to them in your transformation.

Each file is named with the language code and has the following structure:

<hyphenation-info>
  <hyphen-min before="2" after="3"/>
  <exceptions>
o-mni-bus
...
  </exceptions>
  <patterns>
préémi3nent.
proémi3nent.
surémi3nent.
....
  </patterns>
</hyphenation-info>

To change the behavior of the hyphenation, you can modify either the patterns or the exceptions sections:
exceptions
Contains the list of words that are not processed using the patterns, each on a single line. Each of the words should indicate the hyphenation points using the hyphen ("-") character. If a word does not contain this character, it will not be hyphenated.

For example, o-mni-bus will match the omnibus word and will indicate two possible hyphenation points.

Note: Compound words (for example, e-mail) cannot be controlled by exception words.
patterns
Contains the list of patterns, each on a single line. A pattern is a word fragment, not a word. The numbers in a pattern indicate how desirable a hyphen is at that position.

For example, tran3s2act indicates that the possible hyphenation points are "tran-s-act", and that the preferred point is the first one because it has the higher score of 3.

How to Enable Hyphenation for Entire Map

To enable hyphenation for your entire map:

  1. Make sure you set an @xml:lang attribute on the root of your map, or set the default.language parameter in the transformation.
  2. In your customization CSS, add:
    :root {
        hyphens: auto;
    }
  3. To exclude certain elements from hyphenation, use hyphens: none. The following example excludes the <keyword> elements:
    *[class ~= "topic/keyword"] {
        hyphens: none;
    }
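
As a minimal sketch of how these rules can be combined in one customization CSS, the following fragment enables hyphenation for the whole map and then switches it off for titles and inline code. The choice of <title> and <codeph> is only an assumption for illustration; topic/title and pr-d/codeph are the standard DITA class tokens for those elements.

```css
/* Enable hyphenation everywhere (a language must be set on the map). */
:root {
    hyphens: auto;
}

/* Assumption for illustration: titles and inline code should never be split. */
*[class ~= "topic/title"],
*[class ~= "pr-d/codeph"] {
    hyphens: none;
}
```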

How to Enable/Disable Hyphenation for an Element

  1. Make sure you set an @xml:lang attribute on the root of your map, or set the default.language parameter in the transformation.
  2. You have two options to control hyphenation inside an XML element:
    CSS Approach
    Use the hyphens property.
    For example, if you want to enable hyphenation in codeblocks:
    *[class~="pr-d/codeblock"] {
        hyphens: auto;
    }
    If you want to disable hyphenation inside tables:
    *[class~="topic/table"] {
        hyphens: none;
    }
    Attribute Approach
    Set the @outputclass attribute to hyphens (to enable hyphenation) or no-hyphens (to disable it).
    For example, if you want to enable hyphenation in codeblocks:
    <codeblock outputclass="hyphens">
      ...
    </codeblock>
    If you want to disable hyphenation inside tables:
    <table outputclass="no-hyphens" ...>
      ...
    </table>
    Note: The default built-in CSS enables hyphenation for tables:
    *[class ~= "topic/table"] {
        hyphens: auto;
    }
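
Because the built-in CSS already enables hyphenation for tables, a narrower rule is sometimes enough. The sketch below assumes standard DITA table markup with a <thead> element: it keeps hyphenation in the table body but turns it off in header rows. It relies on hyphens being an inherited property, so the cells inside the header pick up the value.

```css
/* Header rows only: the built-in rule still applies to the rest of the table. */
*[class ~= "topic/thead"] {
    hyphens: none;
}
```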

How to Force or Avoid Line Breaks at Hyphens

It is possible to force or avoid line breaks inside words that contain hyphens (U+2010). This can be useful, for example, inside tables that contain product references, if you want each reference to stay on a single line (or to be split across multiple lines). To achieve this, use the -oxy-break-line-at-hyphens property.

The accepted values are:
auto
Words are hyphenated automatically according to an algorithm that is driven by a hyphenation dictionary. This can lead to line breaks at hyphens.
avoid
Words are still hyphenated automatically, but no line break occurs at hyphens.
always
Words are still hyphenated automatically, but a line break is forced at each hyphen.

Example:

Suppose you have a products table like this:
<table>
  <row>
    <cell>Product-1233-55-88</cell>
    <cell>120</cell>
  </row>
  <row>
    <cell>Product-1244-66-99</cell>
    <cell>112</cell>
  </row>
</table>
and the following rule in a CSS stylesheet:
table {
  -oxy-break-line-at-hyphens: avoid;
}

In the output, each product reference is displayed on a single line. Conversely, setting the property value to always forces a break after each hyphen.
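
For DITA content, the same property can be scoped with a class selector. The following sketch assumes a hypothetical outputclass value, product-refs, set on the tables that hold product references. Only those tables break at every hyphen; other tables are not affected by this rule.

```css
/* "product-refs" is a hypothetical outputclass value used only for this example. */
*[class ~= "topic/table"][outputclass ~= "product-refs"] {
    -oxy-break-line-at-hyphens: always;
}
```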