Configuring the Search Engine Optimization
A DITA Map WebHelp transformation scenario produces a sitemap.xml file that is used by search engines to aid crawling and indexing mechanisms. A sitemap lists all pages of a WebHelp system and allows web admins to provide additional information about each page, such as the date it was last updated, change frequency, and importance of each page in relation to other pages in your WebHelp deployment.
Important: If the webhelp.sitemap.base.url parameter is specified, the loc element contains the value of this parameter plus the relative path to the page. If the webhelp.sitemap.base.url parameter is not specified, the loc element contains only the relative path of the page.
You can also set these additional parameters:
- webhelp.sitemap.change.frequency - Specifies how frequently the WebHelp pages are likely to change (accepted values are: always, hourly, daily, weekly, monthly, yearly, and never).
- webhelp.sitemap.priority - Specifies the priority of each page (a value ranging from 0.0 to 1.0).
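The rule for the loc value can be sketched in Python. This is an illustration only, not the transformation's actual code: the helper name build_loc is hypothetical, and joining the two parts with a single slash is an assumption about how the base URL and the relative path are combined.

```python
from typing import Optional


def build_loc(relative_path: str, base_url: Optional[str] = None) -> str:
    """Return the text of a <loc> element for one page.

    Hypothetical helper: the exact joining rule used by the WebHelp
    transformation is an assumption here (a single "/" between the parts).
    """
    if not base_url:
        # webhelp.sitemap.base.url not set: only the relative path is written.
        return relative_path
    # webhelp.sitemap.base.url set: base URL plus the relative path to the page.
    return base_url.rstrip("/") + "/" + relative_path.lstrip("/")


print(build_loc("topics/introduction.html", "http://www.example.com"))
print(build_loc("topics/introduction.html"))
```

The second call shows why setting the base URL matters: without it, the entry is only a relative path, which search engines cannot resolve on their own.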
The structure of the sitemap.xml file looks like this:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/topics/introduction.html</loc>
    <lastmod>2014-10-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.example.com/topics/care.html#care</loc>
    <lastmod>2014-10-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  . . .
</urlset>
Each page has a <url> element structure containing additional information, such as:
- loc - The URL of the page. This URL must begin with the protocol (such as http), if required by your web server. It is constructed from the value of the webhelp.sitemap.base.url parameter from the transformation scenario and the relative path to the page (collected from the href attribute of a topicref element in the DITA map).
  Note: The value must have fewer than 2,048 characters.
- lastmod (optional) - The date when the page was last modified. The date format is YYYY-MM-DD hh:mm:ss.
- changefreq (optional) - Indicates how frequently the page is likely to change. This value provides general information to assist search engines, but may not correlate exactly to how often they crawl the page. Valid values are: always, hourly, daily, weekly, monthly, yearly, and never. The first time the sitemap.xml file is generated, the value is set based upon the value of the webhelp.sitemap.change.frequency parameter in the DITA WebHelp transformation scenario. You can change the value in each url element by editing the sitemap.xml file.
  Note: The value always should be used to describe documents that change each time they are accessed. The value never should be used to describe archived URLs.
- priority (optional) - The priority of this page relative to other pages on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites. It only lets the search engines know which pages you deem most important for the crawlers. The first time the sitemap.xml file is generated, the value is set based upon the value of the webhelp.sitemap.priority parameter in the DITA WebHelp transformation scenario. You can change the value in each url element by editing the sitemap.xml file.
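If you edit these values by hand, a small check can catch values outside the constraints listed above. The following Python sketch is not part of the WebHelp transformation; the function name check_url_entry is hypothetical, and it checks only the loc length, the changefreq keyword, and the priority range.

```python
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
MAX_LOC_LENGTH = 2048  # loc must have fewer than 2,048 characters


def check_url_entry(loc, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty if none)."""
    problems = []
    if len(loc) >= MAX_LOC_LENGTH:
        problems.append("loc must have fewer than 2,048 characters")
    # changefreq and priority are optional, so None means "not present".
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"invalid changefreq: {changefreq!r}")
    if priority is not None:
        try:
            value = float(priority)
        except ValueError:
            problems.append(f"priority is not a number: {priority!r}")
        else:
            if not 0.0 <= value <= 1.0:
                problems.append(f"priority out of range 0.0 to 1.0: {priority!r}")
    return problems


print(check_url_entry("http://www.example.com/topics/introduction.html", "weekly", "0.5"))
print(check_url_entry("http://www.example.com/topics/care.html", "sometimes", "1.5"))
```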
Creating and Editing the sitemap.xml File
Follow these steps to produce a sitemap.xml file for your WebHelp
system, which can then be edited to fine-tune search engine optimization:
- Edit the transformation scenario you currently use for obtaining your WebHelp output. This opens the Edit DITA Scenario dialog box.
- Open the Parameters tab and set a value for the following parameters:
  - webhelp.sitemap.base.url - The URL of the location where your WebHelp system is deployed.
    Note: This parameter is required for the Oxygen XML Editor Eclipse plugin to generate the sitemap.xml file.
  - webhelp.sitemap.change.frequency - How frequently the WebHelp pages are likely to change (accepted values are: always, hourly, daily, weekly, monthly, yearly, and never).
  - webhelp.sitemap.priority - The priority of each page (a value ranging from 0.0 to 1.0).
- Run the transformation scenario.
- Look for the sitemap.xml file in the transformation's output folder. Edit the file to fine-tune the parameters of each page, according to your needs.
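The final editing step can also be scripted instead of done by hand. The sketch below uses Python's standard xml.etree.ElementTree module to change the changefreq and priority of selected pages. The function name set_page_values and the idea of matching pages by the end of their loc value are assumptions made for illustration, and the example runs on an in-memory copy of the structure shown earlier rather than on a real output folder.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the default namespace unprefixed when the document is serialized.
ET.register_namespace("", SITEMAP_NS)


def set_page_values(root, loc_suffix, changefreq=None, priority=None):
    """Update changefreq/priority of every <url> whose <loc> ends with loc_suffix.

    Returns the number of <url> elements that matched.
    """
    ns = {"sm": SITEMAP_NS}
    changed = 0
    for url in root.findall("sm:url", ns):
        loc = url.find("sm:loc", ns)
        if loc is None or not (loc.text or "").strip().endswith(loc_suffix):
            continue
        for tag, value in (("changefreq", changefreq), ("priority", priority)):
            if value is None:
                continue
            child = url.find(f"sm:{tag}", ns)
            if child is None:
                # The element is optional, so create it if it is missing.
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
            child.text = value
        changed += 1
    return changed


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/topics/introduction.html</loc>
    <lastmod>2014-10-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>"""
root = ET.fromstring(sample)
set_page_values(root, "topics/introduction.html", changefreq="monthly", priority="0.8")
print(ET.tostring(root, encoding="unicode"))
```

For a real file, ET.parse can load sitemap.xml from the output folder and the tree's write method can save it back. Note that running the transformation again may regenerate the file, so keep such a script (or a copy of your edits) if you need to reapply them.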