Localization
Choosing the Fonts
An important step in publishing a document in multiple scripts (languages) is choosing a font that covers the full set of characters for each script (language).
A single font rarely covers all of them, so you can specify a series of font families (using the CSS font-family property) that are used as fallbacks. If a word cannot be rendered using the first specified font, the processor tries the next font family, and so on. The following example uses some common fonts available in Windows to make a CSS stylesheet capable of displaying a large number of languages, from European to East Asian ones:
* {
font-family: Calibri, SimSun, "Malgun Gothic";
}
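As a variation on the rule above, the following sketch picks a different fallback order depending on the content language. It assumes that the content language is marked with the lang or xml:lang attribute and that the listed fonts are installed. The generic sans-serif keyword at the end is an addition, not part of the original example.

```css
/* Default chain from the example above, with a generic family
   added as a last resort. */
* {
    font-family: Calibri, SimSun, "Malgun Gothic", sans-serif;
}

/* Assumption: Japanese content is marked with lang="ja" and
   MS Mincho is installed. Listing it before SimSun means Japanese
   text is tried against a Japanese font first. */
:lang(ja) {
    font-family: Calibri, "MS Mincho", SimSun, "Malgun Gothic", sans-serif;
}
```

The second rule wins for Japanese content because a :lang() selector is more specific than the universal selector.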
The fallback fonts are: Serif, Times, Times New Roman, DejaVu Serif, Symbol, SimSun, MingLiU, MS Mincho, Batang, Vijaya, Noto Serif CJK SC, Noto Serif CJK JP, Noto Serif CJK KR, Arial Unicode MS. This combination covers a wide character range, but the final result depends on which fonts from this list are available on your OS. If a word contains a character that is not found in the current font, the fallback font list is iterated until a font that supports all the characters in the word is found.
Support for Right-to-Left Languages
The Unicode BIDI algorithm is applied automatically. For the best results in HTML, make sure you mark the right-to-left content, or the left-to-right content embedded in right-to-left, with proper direction attributes:
<p dir='rtl'>SOME ARABIC TEXT <span dir='ltr'>Some Latin words.</span>.</p>
For arbitrary XML, use the unicode-bidi and direction CSS properties:
<para dir='right-to-left'>SOME ARABIC TEXT.</para>
CSS:
para[dir='right-to-left'] {
direction: rtl;
unicode-bidi: embed;
}
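The rule above handles only the right-to-left value. The following sketch shows two similar rules, assuming a hypothetical 'left-to-right' value for fragments embedded in right-to-left content and a hypothetical textdir attribute used by some other vocabulary:

```css
/* Hypothetical value 'left-to-right' on the same dir attribute. */
para[dir='left-to-right'] {
    direction: ltr;
    unicode-bidi: embed;
}

/* Hypothetical attribute name: any element with textdir='rtl'. */
*[textdir='rtl'] {
    direction: rtl;
    unicode-bidi: embed;
}
```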
The @dir attribute is recognized in any XML vocabulary, with the same semantics as in HTML. The accepted values are: ltr, rtl, auto. If you use other attribute names or other values for this attribute, you should add CSS rules similar to the one above.
For elements in a right-to-left context, Oxygen PDF Chemistry automatically switches the left borders, paddings, and margins with the ones from the right. This keeps your CSS as simple as possible.
In the following example, the <p> element has a left border.
p {
border-left: 1pt solid orange;
}
Suppose it is placed in a <div> with the default ltr direction. The orange line is painted on its left border. But if it is placed in a <div> with the rtl direction, the orange line is painted on its right border because that is where the text begins.
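The same switch applies to paddings and margins. In the following sketch (the padding and margin values are illustrative), the rule is written for the left side only:

```css
p {
    border-left: 1pt solid orange;
    padding-left: 6pt;   /* space between the orange line and the text */
    margin-left: 12pt;   /* indent of the whole paragraph */
}
/* Per the behavior described above, in an rtl context these three
   declarations are applied to the right side instead, so no
   separate rule for rtl content is needed. */
```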
Changing Labels Depending on Language
When developing a CSS that will apply to output localized for multiple languages, you should consider changing the static text (generated by the CSS) depending on the language.
Use the xml:lang or lang attributes to specify the content language, ideally on the root element. The value must be specified using a language identifier (such as "en", "en-US", "en-CA", "fr", "fr-CA").
Consider a case where all the chapter titles are prefixed with the word "Chapter", followed by the chapter counter. Depending on the language of the XML/HTML document, you need this word to change to "Kapitel" for German, or to "Chapitre" for French.
<div class='chp'>
<h2>Introduction</h2>
...
</div>
The CSS can start with a default rule that is used when the content language is not one of the expected ones (in this example, the default label is in English):
div.chp > h2:before{
content: "Chapter " counter(chp);
}
Next, write rules for each of the specific languages:
div.chp > h2:lang(de):before{
content: "Kapitel " counter(chp);
}
div.chp > h2:lang(fr):before{
content: "Chapitre " counter(chp);
}
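The rules above read the chp counter but do not show where it is incremented. A minimal sketch using standard CSS counters follows, assuming that each chapter is a div.chp element and that numbering starts once at the root element:

```css
/* Assumption: the counter is initialized once, on the root element. */
:root {
    counter-reset: chp;
}

/* Each chapter container advances the counter, so the h2:before
   rule inside it sees the updated value. */
div.chp {
    counter-increment: chp;
}
```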
To make maintenance easier, you can separate the strings from the counter value by using one of the advanced features of Oxygen PDF Chemistry (the :before and :after pseudo-elements with multiple levels). So you could write the default rule as:
div.chp > h2:before(2){
content: "Chapter ";
}
div.chp > h2:before(1){
content: counter(chp);
}
Now, the more specific rules are simpler:
div.chp > h2:lang(de):before(2){
content: "Kapitel ";
}
div.chp > h2:lang(fr):before(2){
content: "Chapitre ";
}
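With the label and the counter separated, supporting another language takes one short rule per language. The following sketch adds Spanish and Italian; the choice of languages is illustrative. In standard CSS, :lang(fr) also matches identifiers that start with "fr-", such as "fr-CA", so a regional rule is needed only when its label differs.

```css
/* Spanish label; the level 1 rule that prints the counter is reused. */
div.chp > h2:lang(es):before(2){
    content: "Capítulo ";
}

/* Italian label. */
div.chp > h2:lang(it):before(2){
    content: "Capitolo ";
}
```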