a question about the hyphenator

Mauro Trevisan
Hi, I would like to ask whether there is any documentation on the
hyphenation algorithm, particularly on non-standard hyphenation and,
mainly, on the NEXTLEVEL keyword.
I don't understand how it is used. Does it divide the patterns into two
groups, where the first is used to hyphenate non-compound words and the
second to hyphenate compound words? How can I know whether a word is a
compound?
Thank you

Re: a question about the hyphenator

jonathon-3


On 04/09/2018 06:46 AM, Mauro Trevisan wrote:
> Hi, I would like to ask whether there is any documentation on the
> hyphenation algorithm, particularly on non-standard hyphenation and,
> mainly, on the NEXTLEVEL keyword.

a) I haven't looked at the code;
b) This is from memory, based on material from back when OOo 2.x was in
development.

As such, this might not reflect current practice, and may be outright
wrong.

There are two hyphenation algorithms in OOo. One of them was created by
László Németh. (Hoping I have the correct diacritic marks.) The other one
was a slightly modified form of the hyphenation program used by TeX.

The TeX-derived hyphenation program is for "normal languages", where
everything is regular and compound words can be easily parsed and
broken apart without losing meaning.
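
For readers who have not seen the TeX approach, here is a minimal Python
sketch of pattern-based hyphenation (Liang's method, which TeX uses). It is
not the code from OOo or from any hyphenation library. The function names
and the TOY_PATTERNS list are made up for illustration, and the patterns
are a tiny hand-picked set, not a real dictionary. Digits in a pattern are
weights for the gap at that position; for each gap the highest weight wins,
and an odd result allows a break.

```python
# Hypothetical toy pattern set, just enough for the word "hyphenation".
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na",
                "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and gap weights."""
    letters = "".join(ch for ch in pattern if not ch.isdigit())
    weights = [0] * (len(letters) + 1)  # one weight per gap around the letters
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)      # weight of the gap before letter `pos`
        else:
            pos += 1
    return letters, weights


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Return the pieces of `word` between the allowed break points."""
    padded = "." + word.lower() + "."   # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)    # scores[i] is the gap before padded[i]
    for letters, weights in (parse_pattern(p) for p in patterns):
        start = padded.find(letters)
        while start != -1:              # apply the pattern at every match
            for offset, weight in enumerate(weights):
                index = start + offset
                scores[index] = max(scores[index], weight)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    # Keep at least left_min letters before and right_min letters after a break.
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:      # odd weight: a break is allowed here
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation", TOY_PATTERNS)))  # hy-phen-ation
```

Even weights forbid a break and odd weights allow one, so a longer, more
specific pattern can override a shorter, more general one simply by
carrying a higher number.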

The program László wrote is for non-normal languages. These are languages
that use a number of suffixes, prefixes, interfixes, and other things
that more or less ensure that breaking the word can, and usually will,
alter the meaning of the word, if not the sentence. It is not uncommon
for a break in the wrong place in one of these words to change the
meaning of the entire paragraph. Hungarian is the best-known European
language with this type of behaviour.

His program also works for languages in which words can be compounded
with other words to create new words. Break these words in the wrong
place, and one can easily overlook the misplaced hyphen. Even breaking
them in the right place can result in the hyphen being overlooked.

IIRC, László wrote at least one paper, and gave one or two presentations
on hyphenation in OOo.

jonathon
