Internationalized index in XSL

Jirka Kosek jirka at kosek.cz
Thu Nov 27 13:39:29 CET 2003


Hi,

the automatic indexing available in the DocBook XSL stylesheets works 
quite well for English, but it lacks several features needed for other 
languages.

From my own experience, the following functionality is missing:

- grouping accented letters such as e, é, ë into the same group under 
the letter "e"

- treating special letters (e.g. "ch") as a single character and placing 
them in the correct position (e.g. between "h" and "i")

The attached customization layer solves both of these issues. To try 
it, import the original fo/docbook.xsl into your customization layer and 
then include the attached file. It was tested with Saxon, but it should 
work in any processor that implements the EXSLT function extension.
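
A minimal sketch of such a customization layer is shown here. The path 
to fo/docbook.xsl and the location of autoidx-ng.xsl are assumptions, so 
adjust both to match your installation.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Stock DocBook FO stylesheet; this path is a placeholder -->
  <xsl:import href="/path/to/docbook-xsl/fo/docbook.xsl"/>

  <!-- The attached file, assumed to sit next to this layer -->
  <xsl:include href="autoidx-ng.xsl"/>

</xsl:stylesheet>
```

XSLT requires xsl:import to come before any other top-level element, 
which is why the include follows the import.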

The current settings are suitable for Czech, but you can easily modify 
them by editing the content of the l:letters element:

<l:letters lang="cs">
   ...
   <l i="1">A</l>
   <l i="1">a</l>
   <l i="1">Á</l>
   <l i="1">á</l>
   <l i="2">B</l>
   <l i="2">b</l>
   ...
   <l i="10">H</l>
   <l i="10">h</l>
   <l i="11">Ch</l>
   <l i="11">ch</l>
   <l i="11">cH</l>
   <l i="11">CH</l>
   <l i="12">I</l>
   <l i="12">i</l>
   ...
</l:letters>

This snippet means that the letters A, a, Á and á are placed in the same 
group in the index (they share the same i attribute) and that this group 
is sorted before "b", which has i=2. The later elements specify that "ch" 
(i=11) is placed between "h" (i=10) and "i" (i=12).
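
As an illustration of the same mechanism for another language, the 
fragment below sketches how e, é and ë could be grouped under a single 
letter. The language code "xx" and the index value 5 are hypothetical 
placeholders; the real value depends on where "e" falls in the 
alphabet being described.

```xml
<l:letters lang="xx">
   ...
   <!-- i="5" is a made-up value; all six entries share it,
        so they end up in one index group -->
   <l i="5">E</l>
   <l i="5">e</l>
   <l i="5">É</l>
   <l i="5">é</l>
   <l i="5">Ë</l>
   <l i="5">ë</l>
   ...
</l:letters>
```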

I would appreciate it if you could:

- test the solution and give me feedback
- send me the <l:letters lang="xx"> table for the languages you use
- let me know if this solution cannot handle your language properly, and 
tell me what must be improved to fix it

After gathering your feedback I will add this file to the standard 
DocBook XSL stylesheets.

					Jirka

-- 
-----------------------------------------------------------------
   Jirka Kosek  	
   e-mail: jirka at kosek.cz
   http://www.kosek.cz

-------------- next part --------------
A non-text attachment was scrubbed...
Name: autoidx-ng.xsl
Type: text/xml
Size: 6390 bytes
Desc: [no description available]
URL: <http://www.linux.cz/pipermail/docbook/attachments/20031127/408cc57f/attachment.xml>