Unicode in MySpell

Unicode in MySpell

Bugzilla from artavazdm@gmail.com
Hi all!

I'm working on the OpenOffice.org 2.0 localization for Armenian.
We at the Open Source Armenia team already localized version 1.1.0 on
01.09.2003. At that time there was no solution for Unicode spell checking.
The issue is that there's no 8-bit encoding for Armenian, and only Windows
XP supports the Armenian language in Unicode.

To resolve this encoding problem, I've created a "pseudo" 8-bit encoding.
The overall algorithm is very similar to Hunspell's, but in my case the
MySpell class performs these steps:
1. Encoding detection.
2. If Armenian, convert the UTF-8 text to ARMSCII-8 (pseudo 8-bit encoding).
3. If the word is incorrect, build a suggestion list from the dictionary
   (8-bit encoding).
4. Return the suggestion list converted from ARMSCII-8 to UTF-8.
So I've added:
1. An ARMSCII-8 to UTF-8 converter
2. A UTF-8 to ARMSCII-8 converter
3. A different "special_chars" in the "cleanword" method.
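
The conversion steps above can be sketched roughly as follows. This is only
an illustration in Python, not the code from the patch: the byte table is a
hypothetical packing of the Armenian letters and is NOT the real ARMSCII-8
layout, and the names to_pseudo8, from_pseudo8 and check, as well as the toy
suggestion rule, are assumptions made for this example. Encoding detection
(step 1) is left out.

```python
# Hypothetical pseudo 8-bit table (NOT the real ARMSCII-8 layout):
# Armenian capitals U+0531..U+0556 and small letters U+0561..U+0586
# are packed into consecutive bytes starting at 0x80.
ARMENIAN = ([chr(c) for c in range(0x0531, 0x0557)] +
            [chr(c) for c in range(0x0561, 0x0587)])
TO_BYTE = {ch: 0x80 + i for i, ch in enumerate(ARMENIAN)}
FROM_BYTE = {b: ch for ch, b in TO_BYTE.items()}


def to_pseudo8(word):
    """Step 2: map every character to one byte (ASCII passes through)."""
    out = bytearray()
    for ch in word:
        if ch in TO_BYTE:
            out.append(TO_BYTE[ch])
        elif ord(ch) < 0x80:
            out.append(ord(ch))
        else:
            raise ValueError("no 8-bit slot for %r" % ch)
    return bytes(out)


def from_pseudo8(data):
    """Step 4: map the bytes back to Unicode text."""
    return "".join(FROM_BYTE.get(b, chr(b)) for b in data)


def check(word, dictionary):
    """Return [] for a known word, else suggestions as Unicode strings."""
    encoded = to_pseudo8(word)
    if encoded in dictionary:
        return []
    # Step 3 (toy rule): same length and exactly one differing byte.
    near = [w for w in dictionary
            if len(w) == len(encoded)
            and sum(a != b for a, b in zip(w, encoded)) == 1]
    return sorted(from_pseudo8(w) for w in near)


# The dictionary itself is stored in the 8-bit form.
DICTIONARY = {
    to_pseudo8("\u0570\u0561\u0575"),  # "hay"
    to_pseudo8("\u0570\u0561\u0581"),  # "hac"
}
```

The point of the design is that the dictionary lookup and the suggestion
search only ever see one byte per letter, so an 8-bit-only checker can be
reused unchanged between the two converters.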

I would therefore like to ask you to consider this option, and would very
much like to get feedback on it from the OpenOffice.org community.

You can see the sources at:
http://hy.openoffice.org/source/browse/hy/src/2.0.0/lingucomponent/source/spellcheck/myspell/


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Unicode in MySpell

nemeth-2
Hi Artavazd,

You can use your patch for Armenian OOo 2.0, but using Hunspell
(essentially an extended MySpell) is a general solution for encoding problems.

Hunspell integration is targeted at OOo 2.0.2 (end of February 2006),
but you can integrate Hunspell into the OOo 2.0 source from the CWS `hunspell':

$ cvs -d:pserver:[hidden email]:/cvs login
password: anoncvs
$ cvs -z3 -d:pserver:[hidden email]:/cvs co -r cws_src680_hunspell lingucomponent config_office scp2

Quoting Artavazd Mertarjyan <[hidden email]>:

> Hi all!
>
> I'm working on OpenOffice.org 2.0 localization in Armenian.
> We at the Open Source Armenia team have already localized version 1.1.0 at
> 01.09.2003. At that time there was no solution for Unicode spell checker.
> The issue is that there's no 8-bit encoding for Armenian, and only Windows
> XP supports Armenian language in Unicode.
>
> To resolve this encoding problem, I've created a "pseudo" 8-bit encoding.
> The whole algorithm of the solution is almost similar to HunSpell, but in my
> case class MySpell makes steps like these:

Aspell has a similar solution. It is good enough for most languages, but
there are some languages with more than 255 characters.
For professional/scientific spell checking there is also a need to combine
multiple character encodings (for example, when using foreign
geographical, personal and other proper names). This is difficult for
agglutinative languages, because these languages can combine the different
characters in one word (foreign stems + native affixes): in Hungarian, "about
Ångström" is "Ångströmről", a word with Latin-1 (Å) and Latin-2
(ő) characters.
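
The mixed-encoding problem can be checked with a few lines of Python. This
is just a small demonstration using Python's built-in codecs; the helper
name can_encode is made up for the example.

```python
WORD = "\u00c5ngstr\u00f6mr\u0151l"  # the Hungarian word quoted above


def can_encode(text, codec):
    """Return True if every character of text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Latin-1 has no "o with double acute", Latin-2 has no "A with ring",
# so only a Unicode encoding can hold the whole word.
for codec in ("iso-8859-1", "iso-8859-2", "utf-8"):
    print(codec, can_encode(WORD, codec))
```

Neither 8-bit codec accepts the word, while UTF-8 does, which is why a
single 8-bit dictionary encoding is not enough for such words.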

Hunspell handles real 16-bit encoding. The Nepali and Hungarian OOo 2.0
builds use Hunspell with Unicode (UTF-8) Nepali and Hungarian dictionaries.

> 1. Encoding detection.
> 2. If Armenian, convert UTF-8 text to ARMSCII-8(pseudo 8-bit encoding).
> 3. If incorrect, make suggestion list from dictionary (8-bit encoding).
> 4. Return suggestion list converted from ARMSCII-8 to UTF-8.
> So I've added
> 1. ARMSCII-8 to UTF-8 converter
> 2. UTF-8 to ARMSCII-8 converter
> 3. A different "special_chars" in "cleanword" method.

special_chars in clean_word() has been removed from the MySpell source.
The correct tokenization now comes from OOo's breakiterator.
If the default tokenization is bad for Armenian, you need a breakiterator
patch. (See i18npool/source/breakiterator/ and its data/ subdirectory.)

Best regards,

Laci






Re: Unicode in MySpell

Simon Brouwer
Hi Nemeth,

[hidden email] wrote:

>Hi Artavazd,
>
>You can use your patch for Armenian OOo 2.0, but using Hunspell
>(really extended MySpell) is a general solution for encoding problems.
>
>Hunspell integration is targeted to OOo 2.0.2 (end of february 2006),
>  
>
Does that mean we have to modify the format of the existing Myspell
dictionaries?

Or is it possible to use different spell checkers side by side? For
example, if there is more than one language in a document, one language
might be checked using Hunspell and another using MySpell.

>The right tokenization comes from the OOo's breakiterator.
>If the default tokenization is bad for Armenian, you need a Breakiterator
>patch. (See i18npool/source/breakiterator/ and its data/ subdirectory).
>  
>
Will the changed behaviour of the breakiterator apply to all the
languages in the document, or can it be switched depending on the language?

For Dutch spell checking, it would be preferable if the breakiterator
could be instructed not to break on hyphens, because the new Dutch
spelling introduces words that include a hyphen, of which not all parts
are valid words on their own (example: "arbeidsre-integratie", in which
"arbeidsre" is not a Dutch word).

--
Vriendelijke groet,
Simon Brouwer.

### nl.openoffice.org ###



Re: Unicode in MySpell

Дмитрий Габинский
Fri, 23 Dec 2005 10:07:34 +0100, Simon Brouwer <[hidden email]>
wrote:

> Does that mean we have to modify the format of the existing Myspell
> dictionaries?

I can answer that before Laci has to. I use Hunspell (with MySpell
switched off) with the existing MySpell dictionaries. No need to
modify them. But you do have to modify them (well, the affix files)
if you want to use the full power of Hunspell. However, while developing
a Belarusian spellcheck module, I still haven't taken such steps (since I,
regretfully, don't have enough time to dig deeply into Hunspell's new
features).

> Or is it possible to use different spell checkers, e.g. if there is more
> than one language in a document,
> one language might be checked using Hunspell and another using Myspell.

Hunspell and MySpell can be used simultaneously, but what's the
reason?

Best regards,

Dmitri Gabinski
 
   
   

Re: Unicode in MySpell

Simon Brouwer
Hi Dmitri,

Dmitri Gabinski wrote:

> Fri, 23 Dec 2005 10:07:34 +0100, Simon Brouwer <[hidden email]>
> wrote:
>
>> Does that mean we have to modify the format of the existing Myspell
>> dictionaries?
>
>
> I can answer that even before Laci would bother. I do use Hunspell (I
> switch off MySpell) with the existing MySpell dictionaries. No need to
> modify. But you do have to modify them (well, to modify affix files)
> if you want to use the full power of HunSpell. However, I — developing
> a Belarusian spellcheck module — still don't take such steps (since I,
> regretfully, don't have enough time to dig deeply into Hunspell's new
> features).

OK, great!

The Dutch Myspell dictionary already contains the NOSPLITSUGS option,
which was also supported in Myspell in OOo 1.1.4/5 but unfortunately not
in OOo 2.0. It's good to know that it will work again with the
integration of Hunspell.

I am very excited about the compounding feature in Hunspell; this is
something often requested by Dutch users. We will definitely use it!

>
>> Or is it possible to use different spell checkers, e.g. if there is more
>> than one language in a document,
>> one language might be checked using Hunspell and another using Myspell.
>
>
> Hunspell and MySpell can be used simultaneously, but what's the reason?

Given your explanation, I agree there is none.
Thanks!

--
Vriendelijke groet,
Simon Brouwer.

### nl.openoffice.org ###


Re: Unicode in MySpell

nemeth-2
In reply to this post by Simon Brouwer
Quoting Simon Brouwer <[hidden email]>:

> Hi Nemeth,
>
> [hidden email] wrote:
>
> >Hi Artavazd,
> >
> >You can use your patch for Armenian OOo 2.0, but using Hunspell
> >(really extended MySpell) is a general solution for encoding problems.
> >
> >Hunspell integration is targeted to OOo 2.0.2 (end of february 2006),
> >
> >
> Does that mean we have to modify the format of the existing Myspell
> dictionaries?

Hi Simon,

No, Hunspell is backward compatible with MySpell. Dmitri, thanks for the
answer! Hunspell supports NOSPLITSUGS. I strongly believe Hunspell can
help with handling Dutch compound words. (By the way, I have a little
Christmas surprise for Dutch users of OOo. I hope I can post it at the
weekend. :)

>
> Or is it possible to use different spell checkers, e.g. if there is more
> than one language in a document,
> one language might be checked using Hunspell and another using Myspell.

Björn Jacke has suggested a dictionary.lst syntax to differentiate
MySpell and Hunspell dictionaries (because the German Hunspell dictionary
uses new features of Hunspell, and it doesn't work well with MySpell).
But new versions of Hunspell could also bring new features, so I think we
only need a policy for downloadable OOo dictionaries. It is enough that
DictOOo always supports the spell checker version of the latest stable
version of OOo. (Localised versions of OOo can contain newer spell checking
dictionaries with a newer Hunspell or other spell checkers.)

>
> >The right tokenization comes from the OOo's breakiterator.
> >If the default tokenization is bad for Armenian, you need a Breakiterator
> >patch. (See i18npool/source/breakiterator/ and its data/ subdirectory).
> >
> >
> Will the different behaviour of the breakiterator be effective on all
> the languages in the document, or
> can it also be switched depending on the language?

I have suggested language-specific breakiterator patches, like
the Catalan, Hungarian etc. dict_word patches in the
i18npool/source/breakiterator/data directory.

>
> For Dutch spell checking, it would be preferable if the break iterator
> could be instructed not to break
> on hyphens, because the new Dutch spelling introduces are Dutch words
> that include a hyphen, of
> which not all parts are also valid words (example:
> "arbeidsre-integratie", in which "arbeidsre" is not a Dutch word).

Hungarian is similar. See i18npool/source/breakiterator/data/dict_word_hu
(the new version of dict_word_hu also includes the en dash as a word
character).
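
The effect of treating the hyphen as a word character can be illustrated
with a small Python sketch. It does not use OOo's breakiterator or the
dict_word_hu rule format; the two regular expressions, the function names
and the tiny word list are assumptions made only for this example.

```python
import re

# Default-like rule: a word is a run of letters/digits, so hyphens split it.
SPLIT_ON_HYPHEN = re.compile(r"\w+")
# Language-specific rule: a hyphen or en dash may join parts of one word.
KEEP_HYPHEN = re.compile(r"\w+(?:[-\u2013]\w+)*")


def tokenize(text, pattern):
    """Return the words that the given rule finds in text."""
    return pattern.findall(text)


def unknown_words(text, pattern, dictionary):
    """Return the tokens a spell checker would flag as misspelled."""
    return [t for t in tokenize(text, pattern) if t not in dictionary]


# Hypothetical two-entry word list for the Dutch example.
DUTCH = {"arbeidsre-integratie", "integratie"}

print(unknown_words("arbeidsre-integratie", SPLIT_ON_HYPHEN, DUTCH))
print(unknown_words("arbeidsre-integratie", KEEP_HYPHEN, DUTCH))
```

With the first rule the checker is handed "arbeidsre" on its own and flags
it; with the second rule the whole hyphenated word reaches the dictionary
lookup and is accepted.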

Best regards,

Laci


Re: Unicode in MySpell

Kevin B. Hendricks
Hi,

I would like MySpell to go completely away (and soon).  I simply do  
not have the time to maintain it properly and Hunspell can do  
everything that MySpell does and much more.

So IMHO, we should remove MySpell completely from the source tree  
when Hunspell is integrated.

That will remove the duplication and prevent confusion.  Then we  
would have NO Myspell vs HunSpell issues in dictionary.lst

Kevin



RE: Unicode in MySpell

Bugzilla from artavazdm@gmail.com
In reply to this post by nemeth-2

Hi All!

Thanks for the detailed answers!
I agree that Hunspell is better than MySpell, and I'm going to localize it
for Armenian too.

In the "hu" project CVS (2.0.1) the Hungarian language isn't defined as UTF-8.
Does that mean you are not using UTF-8 for Hungarian, or do you have another
solution?
If you are using UTF-8 now, have you compared the two solutions to see which
is faster?

I have some doubts about the performance of Hunspell's UTF-8 spell checker.
Maybe for Armenian it would be better to use the same algorithm in
Hunspell?




Re: Unicode in MySpell

Bjoern JACKE-3
In reply to this post by Kevin B. Hendricks
Hi!

On 2005-12-23 at 08:55 -0500 Kevin B. Hendricks sent off:
>So IMHO, we should remove MySpell completely from the source tree  
>when Hunspell is integrated.
>
>That will remove the duplication and prevent confusion.  Then we  
>would have NO Myspell vs HunSpell issues in dictionary.lst

There will be confusion between dict downloaders who use older OOo versions
(supporting only MySpell) and dict downloaders who use newer OOo versions
(using Hunspell). Once dictionaries use Hunspell-specific features and
there is no separation of MySpell and Hunspell dicts, users of MySpell-only
OOo versions will not be happy anymore, because they will download
dictionaries that are broken for MySpell usage ;-).

Spellcheck engines other than MySpell/Hunspell which require other
dictionaries will also require a dictionary.lst change so that things do
not get mixed up completely. For example, how should an Arabic spellcheck
engine's dictionary entries be distinguished from an Arabic Hunspell
dictionary? I hope a dictionary.lst change will come before
Hunspell actually becomes the default.

Cheers
Bjoern


Re: Unicode in MySpell

Kevin B. Hendricks
Hi,

The whole "dictionary.lst" mechanism is part of the spell checking
component (i.e. the MySpell spell checker).  So another spell checking
component need not and should not use the concept of "dictionary.lst",
since that was developed for the MySpell and MyThes components.  It is
being adopted for Hunspell since it is replacing MySpell.  So an
Arabic or commercial or some other spell checker should not use the
MySpell dictionary.lst but should instead use some other method to
allow users to install and update their dictionaries.

Also since old MySpell dictionaries will work just fine with  
Hunspell,  and since dictionary authors that choose to use HunSpell  
specific features can simply name their dictionaries something else  
(en_US_hunspell for example), this really should not be an issue.
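
As a rough illustration of that naming approach, here is a small Python
sketch that reads dictionary.lst-style entries. It assumes the usual
four-field, space-separated layout (entry type DICT, HYPH or THES, then
language, country and base file name); the sample file names other than
en_US_hunspell are made up, and the parser is not the OOo code.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "kind lang country basename")


def parse_dictionary_lst(text):
    """Parse dictionary.lst-style text into a list of Entry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 4 or fields[0] not in ("DICT", "HYPH", "THES"):
            continue  # skip anything this sketch does not understand
        entries.append(Entry(*fields))
    return entries


# Illustrative content only; the file names are assumptions.
SAMPLE = """\
# spelling, hyphenation and thesaurus entries share one file
DICT en US en_US
DICT en US en_US_hunspell
HYPH de DE hyph_de_DE
THES en US th_en_US
"""

spelling = [e.basename for e in parse_dictionary_lst(SAMPLE)
            if e.kind == "DICT"]
print(spelling)
```

In this layout nothing but the base name tells the two en_US spelling
entries apart, so the file format itself carries no information about
which engine a dictionary needs.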

Again, the best solution is to get rid of Myspell as fast as possible  
(IMHO).  Users of old OOo versions can simply install a properly  
compiled hunspell component if they want to use hunspell dictionaries.

My 2 cents,

Kevin






RE: Unicode in MySpell

nemeth-2
In reply to this post by Bugzilla from artavazdm@gmail.com
Quoting Artavazd Mertarjyan <[hidden email]>:

>
> Hi All!
>
> Thanks for detailed answers!
> I agree that HunSpell is better then MySpell and I'm going to localize it
> for Armenian too.
>
> In "hu" project CVS (2.0.1) the Hungarian language isn't defined as UTF-8.
> Does that mean you are not using UTF-8 for Hungarian or you have another one
> solution?

Hi,

You can find the source of the Hungarian OOo 2.0.1 build on our build server:

http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/

See the hu_HU_u8.aff and hu_HU_u8.dic files in the

http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/OOo_2.0.1_src_hu_additional.tar.gz

file, and in the builds.

> If you are using UTF-8 now, have you compare these two solutions, which is
> faster?
>
> I've some doubt in score of HunSpell's UTF-8 text spell checker.
> May be for Armenian it will be better to use the same algorithm in the
> HunSpell?

Unicode encoding has a little overhead.

Using a UTF-8 dictionary is 10-20 percent slower on Hungarian texts (it checks
80,000-90,000 words/s instead of 100,000 words/s on my machine).
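
The percentages follow directly from the quoted throughput figures, as this
short Python calculation shows (the function name is made up for the
example):

```python
def slowdown_percent(baseline, measured):
    """Relative drop in throughput, in percent of the baseline."""
    return (baseline - measured) * 100 / baseline


print(slowdown_percent(100_000, 90_000))  # 10.0
print(slowdown_percent(100_000, 80_000))  # 20.0
```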

But I think UTF-8 Armenian spell checking will be faster than _8-bit_
Hungarian spell checking, because the bottleneck is the complexity of the
morphology (the affix description) and the compound word support.
Hungarian uses double suffix stripping plus compounding, and it is fast
enough with UTF-8 encoding, too.

We need the best spell checking and other Lingucomponent support for Armenian,
too. Please write (and file issues in Issuezilla) about problems with
Armenian OOo: for example, the need for an Armenian breakiterator patch,
Armenian hyphenation with the special Armenian hyphen character, etc. UTF-8
encoding has already been set in the Thesaurus component of OOo 2.0.1, thanks
to the report of the Nepali developers.

Best regards,

Laci


Re: Unicode in MySpell

Bjoern JACKE-3
In reply to this post by Kevin B. Hendricks
On 2005-12-23 at 21:07 -0500 Kevin B. Hendricks sent off:
>The whole "dictionary.lst" mechanism is part of the spell checking  
>component (ie. the MySpell spell checker).  So another spell checking  
>component need not and should not use the concept of "dictionary.lst"  
>since that was developed for MySpell and MyThes components.  It is  
>being adopted for HunSpell since it is replacing MySpell.  So an  
>arabic or commerical or some other spell checker should not use  
>MySpell dictionary.lst but should instead use some other method to  
>allow users to install and update their dictionaries.

I think reality looks different: from the start, dictionary.lst
contained hyphenation and thesaurus dictionaries. That dictionary.lst
is a MySpell-only thing is really new to me. Apart from that, a file
with a very generic name like dictionary.lst in such an exposed place
would be expected by everyone to be a generic config file,
not a MySpell-specific one.

Cheers
Bjoern


RE: Unicode in MySpell

Bugzilla from artavazdm@gmail.com
In reply to this post by nemeth-2

Hi Nemeth,

Thank you for the answers!
Your experience is very interesting and important for me, and thanks for
the valuable information!

I think the complexity of Armenian morphology is at least equal to that of
Hungarian. The affix descriptions which we used in OOo 1.1 are very typical,
and our linguists are now working on them.
Armenian hyphenation is based on a few grammatical rules which have many
exceptions.

We want to finish the translation as soon as possible and concentrate all
our resources on the other aspects of localization.

I'm not sure that we'll be able to finish work on the hyphenation module and
thesaurus in the current version, but we are going to add them in the next
versions.

Best regards!




Re: Unicode in MySpell

Kevin B. Hendricks
In reply to this post by Bjoern JACKE-3
Hi,

I designed it that way.  That code was written and put in place by
me to support the components I either wrote (MySpell and MyThes) or
inherited (hyphenator).  The reason you are confused is that there
really has never been another spellchecker, thesaurus or hyphenator
for OOo until now.  I chose to use a dictionary.lst, with its install
location and its format, so that OOo had something that worked at
all.  Now, Hunspell can take that code over since it will replace
MySpell as the "official" spellchecker of OOo, but other non-official
components really should and will figure out how to let their own
components know what the users have installed and actually want to use.

My 2 cents,

Kevin




