Hyphen should be forwarded to spellchecker

Per Eriksson-2
Hello,

I would like to know the status for 64400. What is the current status?

--
:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.
Best Regards
Per Eriksson
Lead Swedish Native Lang Project
OpenOffice.org Community
Phone: +46 70 560 10 33
Email: [hidden email]
Web: http://sv.openoffice.org/


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Hyphen should be forwarded to spellchecker

Olivier R.-2
Per Eriksson wrote:

> I would like to know the status for 64400. What is the current status?

I would also like to know which solution will be applied.

That would be useful for preparing dictionaries accordingly.

Can we already assume that if a word with a hyphen is not recognized, each
part of the word will be checked separately?

Regards,
Olivier R.

Re: Hyphen should be forwarded to spellchecker

thomas.lange

Hi all,

Olivier R. wrote:
> Per Eriksson a écrit :
>
> > I would like to know the status for 64400. What is the current status?
>
> I would like to know also what solution will be applied.
>
> It would useful to prepare dictionaries accordingly.
>  

The plan is for hyphens to become part of the word as recognized
by the breakiterator. The exact types of hyphens and the languages to which
this applies (probably all western languages at least, maybe all
languages) are still up for discussion. From my point of view, I currently
see no reason why the same behavior should not be applied to all languages.

As to when the change will happen:
It should be fixed for OOo 3.2, but details are not yet known.

> Can we already assume that if a word with hyphen is not recognized, each
> parts of the word will be checked separately?
>  
That needs to be discussed with László, since it will probably need
to be handled by Hunspell.
However, as mentioned in the issue, one problem is that the current
spell check API cannot mark only parts of the checked text as wrong.
And even if the spell check API could do that, the applications
are not yet prepared for it.
See my comment from Wed Sep 3 11:31:15 +0000 2008 in the issue,
especially item c).

Of course, to see whether that approach works at all, we need the
hyphen to be part of the word as soon as possible. Only then can we
see in detail what additional problems arise and how to handle them.

In the long run, the optimal solution would be for this problem to be
handled by a grammar checker that is also a spell checker and can thus
make use of the grammar checking API, which allows proper handling of
this problem (marking only part of the text and providing suggestions
only for that part).
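
The limitation described above can be pictured with a small sketch. It assumes a hypothetical in-memory word list (DICTIONARY) and made-up helper names; it is not the OOo spell check API and not Hunspell's behaviour, only an illustration of the difference between flagging a whole hyphenated word and flagging just the unknown part.

```python
# Hypothetical word list; a real checker would consult a dictionary file.
DICTIONARY = {"spell", "checker"}


def check_whole_word(word, dictionary=DICTIONARY):
    """Whole-word verdict: all that a word-level API can report."""
    return word.lower() in dictionary


def misspelled_parts(word, dictionary=DICTIONARY):
    """Return (start, end) offsets of hyphen-separated parts not in the dictionary.

    This is the finer-grained result that marking only part of the text needs.
    """
    spans = []
    start = 0
    for part in word.split("-"):
        end = start + len(part)
        if part and part.lower() not in dictionary:
            spans.append((start, end))
        start = end + 1  # skip the hyphen itself
    return spans
```

For "spell-chekcer", the whole-word check can only report that the word is wrong, while misspelled_parts returns [(6, 13)], the offsets of "chekcer" alone.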


Regards,
Thomas



Re: Hyphen should be forwarded to spellchecker

Ruud Baars-2
Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi all,
>
> Olivier R. wrote:
>  
>> Per Eriksson a écrit :
>>
>>    
>>> I would like to know the status for 64400. What is the current status?
>>>      
>> I would like to know also what solution will be applied.
>>
>> It would useful to prepare dictionaries accordingly.
>>  
>>    
>
> It is planned that hyphens should become part of the word as recognized
> by the breakiterator. The ecxact types of hyphens and languages where to
> apply this (probably all western languages at least, maybe all
> languages) is still up for discussion. From my point of view I currently
> see no reason why the same behavior should not be applied to all languages.
>  
It would be best to have a design with language-dependent
word iteration.
Which characters can and cannot be part of a word depends on the
language, plus the programmed behaviour of the components below that.
For Dutch, the - and the ' (under certain circumstances) are word characters.

A general construct, like what SRX does for sentences, could be used for
isolating the words in a sentence too.
This construct would preferably be applicable to all programs using
spellchecking. Grammar checking partly has the same issues when finding the tokens.
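
As a rough sketch of language-dependent word iteration: the table below maps a locale to the extra characters allowed inside a word. The locale code, the EXTRA_WORD_CHARS table, and the rule that such characters count only between letters are illustrative assumptions, not the actual rules of the OOo breakiterator or of SRX.

```python
import re

# Hypothetical per-locale table of characters allowed inside a word.
EXTRA_WORD_CHARS = {
    "nl": "-'",  # Dutch: hyphen and (in some positions) apostrophe
}

LETTERS = r"[^\W\d_]+"  # one or more Unicode letters


def words(text, locale):
    """Split text into words, keeping locale-specific characters between letters."""
    extra = re.escape(EXTRA_WORD_CHARS.get(locale, ""))
    if extra:
        pattern = rf"{LETTERS}(?:[{extra}]{LETTERS})*"
    else:
        pattern = LETTERS
    return re.findall(pattern, text)
```

With this table, "zee-egel" and "auto's" each stay one word for "nl", while a locale without an entry sees "zee" and "egel" as two words.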
> As to when the change will happen:
> It should be fixed for OOo 3.2 but details are not yet known.
>
>  
That would be perfect! We could prepare the word list to recognize
incorrectly used dashes ...

>> Can we already assume that if a word with hyphen is not recognized, each
>> parts of the word will be checked separately?
>>  
>>    
> That needs to be discussed with László since that will probably needed
> to be taken care of by hunspell.
> However as mentioned in the issue, a problem will be that the current
> spell check API can not mark parts of the text to be checked as wrong.
> Also even if it could be done in the spell check API the applications
> are not yet prepared for it.
> See my comment from /Wed Sep 3 11:31:15 +0000 2008 in the issue,
> especially in c).
>
> Of course in order to see if that approach works at all we need to have
> the hyphen to be part of the word as soon as possible. Only then we can
> see in detail what additional problems will arise and how to handle them.
> /
> The long run optimal solution would be that this problem is handled by a
> grammar checker that is also a spell checker and thus can make use of
> the grammar checking API that allows for proper handling of this problem
> (mark only part of the text and provide suggestions only for that part).
>  
My idea! Let's integrate 'automated proofreading' !

>
> Regards,
> Thomas

Spellchecking and OOo setting

Ruud Baars-2
In reply to this post by Olivier R.-2
In OOo there is a setting that makes the spellchecker skip words
written completely in UPPERCASE.

Maybe the intention is to ignore the large number of 3-letter company names.

Nevertheless, it also prevents the spellchecker from correcting the
misspelled DVD and CD into dvd and cd, which is the official
spelling for Dutch. We were so happy to have found Laszlo's 'keepcase'...

Is it possible to change this option from the spelling extension? Where
could I find info on that?

Hope you can help.

Ruud Baars
OpenTaal

Re: Hyphen should be forwarded to spellchecker

thomas.lange
In reply to this post by Per Eriksson-2


Hi Ruud,


> Thomas Lange - Sun Germany - ham02 - Hamburg schreef:
>> Hi all,
>>
>> Olivier R. wrote:
>>  
>>> Per Eriksson a écrit :
>>>
>>>    
>>>> I would like to know the status for 64400. What is the current status?
>>>>      
>>> I would like to know also what solution will be applied.
>>>
>>> It would useful to prepare dictionaries accordingly.
>>>  
>>>    
>>
>> It is planned that hyphens should become part of the word as recognized
>> by the breakiterator. The ecxact types of hyphens and languages where to
>> apply this (probably all western languages at least, maybe all
>> languages) is still up for discussion. From my point of view I currently
>> see no reason why the same behavior should not be applied to all languages.
>>  
> It would be best to have a design that has a language dependend
> word-iteration.
> Which chars can and cannot be part of a word is rather depending on the
> language, plus the programmed behaviour of the components below that.
> For Dutch, the -, the ' (under certain circumstances) are word chars.

The breakiterator is locale-specific and thus can be customized for each
language. Still, the question remains whether the change should be
applied only to the languages listed in the issue or by default to all
languages, since probably all are going to benefit from it.


Thomas


Re: Spellchecking and OOo setting

thomas.lange
In reply to this post by Per Eriksson-2

Hi Ruud Baars,

> In OOo there is a setting that makes OOo ignore spellchecking for words,
> written completely in UPPERCASE.
>
> Maybe the intention is to ignore the amount of 3-letter company names.
>
> Nevertheless, it also prevents spellchecking correcting the misspelled
> DVD and CD to be corrected into dvd and cd, which is the official
> spelling for Dutch. We were so happy to have found Laszlo's 'keepcase'...
>
> Is it possible to change this option from the spelling extension? Where
> could I find info on that?

It is possible to change that setting from an extension. Unfortunately,
the setting applies to all languages, so it is probably not a
very good idea to change a global setting from within a
language-specific extension.

As for the property:
the configuration node path is "org.openoffice.Office.Linguistic"; there,
in the group "SpellChecking", you can find the property named
"IsSpellUpperCase".

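For illustration, an extension would typically carry such a setting in a configuration data (.xcu) file. The small generator below is only a sketch: the function names are made up, and the value "true" (do spell-check uppercase words) is an assumption about what a Dutch dictionary extension would want. Because the setting is global, it would affect all languages.

```python
import xml.etree.ElementTree as ET

# Configuration data for the node path and property named in this thread.
XCU_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="SpellChecking">
    <prop oor:name="IsSpellUpperCase" oor:type="xs:boolean">
      <value>{value}</value>
    </prop>
  </node>
</oor:component-data>
"""


def linguistic_xcu(spell_upper_case):
    """Return .xcu text that sets SpellChecking/IsSpellUpperCase."""
    return XCU_TEMPLATE.format(value="true" if spell_upper_case else "false")


def read_back(xcu_text):
    """Parse the generated text and return the stored value (sanity check)."""
    root = ET.fromstring(xcu_text.encode("utf-8"))
    return root.find("./node/prop/value").text
```

Writing the result of linguistic_xcu(True) to a file and registering that file in the extension is left out here; how the extension is packaged is outside this sketch.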

Thomas


Re: Hyphen should be forwarded to spellchecker

Carlos Menezes
In reply to this post by thomas.lange
Compound words in Portuguese can have a hyphen connecting their parts,
for example "vice-presidente".

Regards,

Carlos Menezes
CoGrOO Project


2009/6/8 Thomas Lange - Sun Germany - ham02 - Hamburg <[hidden email]>

>
>
> Hi Ruud,
>
>
> > Thomas Lange - Sun Germany - ham02 - Hamburg schreef:
> >> Hi all,
> >>
> >> Olivier R. wrote:
> >>
> >>> Per Eriksson a écrit :
> >>>
> >>>
> >>>> I would like to know the status for 64400. What is the current status?
> >>>>
> >>> I would like to know also what solution will be applied.
> >>>
> >>> It would useful to prepare dictionaries accordingly.
> >>>
> >>>
> >>
> >> It is planned that hyphens should become part of the word as recognized
> >> by the breakiterator. The ecxact types of hyphens and languages where to
> >> apply this (probably all western languages at least, maybe all
> >> languages) is still up for discussion. From my point of view I currently
> >> see no reason why the same behavior should not be applied to all
> languages.
> >>
> > It would be best to have a design that has a language dependend
> > word-iteration.
> > Which chars can and cannot be part of a word is rather depending on the
> > language, plus the programmed behaviour of the components below that.
> > For Dutch, the -, the ' (under certain circumstances) are word chars.
>
> The breakiterator is locale specific and thus can be cusomized for each
> language. Still the question remains if that change should only be
> applied to every language listed in the issue or by default to all
> language since probably all are going to benfit from it.
>
>
> Thomas

Re: Hyphen should be forwarded to spellchecker

Ruud Baars-2
In reply to this post by Olivier R.-2
Olivier R. wrote:

> Per Eriksson a écrit :
>
>> I would like to know the status for 64400. What is the current status?
>
> I would like to know also what solution will be applied.
>
> It would useful to prepare dictionaries accordingly.
>
> Can we already assume that if a word with hyphen is not recognized,
> each parts of the word will be checked separately?
I hope that is not what OOo will do! That would open the door to all
kinds of erroneous use of a - in Dutch words.
I would like only the correct words to be accepted.

>
> Regards,
> Olivier R.

Re: Hyphen should be forwarded to spellchecker

Javier SOLA
In reply to this post by thomas.lange
Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi Ruud,
>
>
>  
>> Thomas Lange - Sun Germany - ham02 - Hamburg schreef:
>>    
>>> Hi all,
>>>
>>> Olivier R. wrote:
>>>  
>>>      
>>>> Per Eriksson a écrit :
>>>>
>>>>    
>>>>        
>>>>> I would like to know the status for 64400. What is the current status?
>>>>>      
>>>>>          
>>>> I would like to know also what solution will be applied.
>>>>
>>>> It would useful to prepare dictionaries accordingly.
>>>>  
>>>>    
>>>>        
>>> It is planned that hyphens should become part of the word as recognized
>>> by the breakiterator. The ecxact types of hyphens and languages where to
>>> apply this (probably all western languages at least, maybe all
>>> languages) is still up for discussion. From my point of view I currently
>>> see no reason why the same behavior should not be applied to all languages.
>>>  
>>>      
>> It would be best to have a design that has a language dependend
>> word-iteration.
>> Which chars can and cannot be part of a word is rather depending on the
>> language, plus the programmed behaviour of the components below that.
>> For Dutch, the -, the ' (under certain circumstances) are word chars.
>>    
>
> The breakiterator is locale specific and thus can be cusomized for each
> language. Still the question remains if that change should only be
> applied to every language listed in the issue or by default to all
> language since probably all are going to benfit from it.
>  
If I understand correctly, this should not be applied to all languages.
Spanish -when used correctly- attaches hyphens to normal words as I just
did in this sentence, unlike English - which separates the hyphens from
the words - as I just did. Microsoft Word used to break Spanish
spell-checking because of this. This cannot be changed for all
languages, only specifically for those that ask for it; otherwise many
spellcheckers might break.

Javier

Re: Hyphen should be forwarded to spellchecker

thomas.lange

Hi all,

Javier SOLA wrote:

> ...
> > The breakiterator is locale specific and thus can be cusomized for each
> > language. Still the question remains if that change should only be
> > applied to every language listed in the issue or by default to all
> > language since probably all are going to benfit from it.
> >  
> If I understand correcly, this should not be applied to all languages.
> Spanish -when used correctly- attaches hyphens to normal words as I just
> did in this sentence, unlike English - which separates the hyphens from
> the words - as I just did. Microsoft word used to break spanish
> spell-checking because of this. This cannot be changed for all
> languages, only specifically for those who ask for it, otherwise many
> spellcheckers might break.
>  

The pre- and postfix hyphen thingie will only get applied to languages
on request. It was planned this way from the start. Since, up until this
morning, this was only specifically requested for Swedish in issue 64400,
it will be applied to German and Swedish only for now. I will make the
change for Swedish right away, and then the CWS should be ready for QA today.
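
A minimal sketch of what "applied only on request" could look like: leading and trailing hyphens stay attached to the word only for locales in an opt-in set. The set name HYPHEN_LOCALES, the locale codes, the regular expressions, and the choice to keep inner hyphens for every locale are assumptions for illustration, not the actual breakiterator change in the CWS.

```python
import re

# Locales for which leading/trailing hyphens are kept as part of the word.
HYPHEN_LOCALES = {"de", "sv"}

_LETTERS = r"[^\W\d_]+"  # one or more Unicode letters
_DEFAULT = re.compile(rf"{_LETTERS}(?:-{_LETTERS})*")
_WITH_EDGE_HYPHENS = re.compile(rf"-?{_LETTERS}(?:-{_LETTERS})*-?")


def words_for_spellcheck(text, locale):
    """Return the words that would be handed to the spell checker."""
    pattern = _WITH_EDGE_HYPHENS if locale in HYPHEN_LOCALES else _DEFAULT
    return pattern.findall(text)
```

With "Ein- und Ausgang", the "de" locale yields "Ein-" with its hyphen, while a locale outside the set yields plain "Ein". A free-standing dash surrounded by spaces is never treated as a word in either mode.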

Thomas



Re: Hyphen should be forwarded to spellchecker

Per Eriksson-2
Hi Thomas,

Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi all,
>
> Javier SOLA wrote:
>  
>> ...
>>    
>>> The breakiterator is locale specific and thus can be cusomized for each
>>> language. Still the question remains if that change should only be
>>> applied to every language listed in the issue or by default to all
>>> language since probably all are going to benfit from it.
>>>  
>>>      
>> If I understand correcly, this should not be applied to all languages.
>> Spanish -when used correctly- attaches hyphens to normal words as I just
>> did in this sentence, unlike English - which separates the hyphens from
>> the words - as I just did. Microsoft word used to break spanish
>> spell-checking because of this. This cannot be changed for all
>> languages, only specifically for those who ask for it, otherwise many
>> spellcheckers might break.
>>  
>>    
>
> The pre- and postfix hyphen thingie will only get applied to languages
> if request. It was planned this way from the start. Since up until this
> morning in issue 64400 this was only specifically requested for Swedish
> it will be applied for German an Swedish only right now. I will make the
> change for Swedish right away and then the CWS should be ready for QA today.
>  

Thanks for your great work. The Swedish community has a very active
dictionary team, and they would like to have a CWS build for Win32 or
an RPM for testing. They are very competent and would make great use of a
CWS build. Is this possible from your side?

Best
Per

Re: Hyphen should be forwarded to spellchecker

Per Eriksson-2
In reply to this post by thomas.lange
Hi Thomas

Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi all,
>
> Javier SOLA wrote:
>  
>> ...
>>    
>>> The breakiterator is locale specific and thus can be cusomized for each
>>> language. Still the question remains if that change should only be
>>> applied to every language listed in the issue or by default to all
>>> language since probably all are going to benfit from it.
>>>  
>>>      
>> If I understand correcly, this should not be applied to all languages.
>> Spanish -when used correctly- attaches hyphens to normal words as I just
>> did in this sentence, unlike English - which separates the hyphens from
>> the words - as I just did. Microsoft word used to break spanish
>> spell-checking because of this. This cannot be changed for all
>> languages, only specifically for those who ask for it, otherwise many
>> spellcheckers might break.
>>  
>>    
>
> The pre- and postfix hyphen thingie will only get applied to languages
> if request. It was planned this way from the start. Since up until this
> morning in issue 64400 this was only specifically requested for Swedish
> it will be applied for German an Swedish only right now. I will make the
> change for Swedish right away and then the CWS should be ready for QA today.
>  

Our dictionary people wonder the following:

Will OOo read the Hunspell WORDCHARS directive, and not forward the
hyphens if the following is mentioned in the dictionary?

WORDCHARS -

Or will OOo 3.2 always forward the hyphen at the beginning and at the
end of words?

Best
Per


Re: Hyphen should be forwarded to spellchecker

thomas.lange
In reply to this post by Per Eriksson-2
Hi Per,

> > The pre- and postfix hyphen thingie will only get applied to languages
> > if request. It was planned this way from the start. Since up until this
> > morning in issue 64400 this was only specifically requested for Swedish
> > it will be applied for German an Swedish only right now. I will make the
> > change for Swedish right away and then the CWS should be ready for QA today.
> >  
>
> Thanks for your great work. The Swedish community has a very active
> dictionary team, and they would like to have a CWS build for Win32 or
> RPM for testing. They are very competent and would have a great use of a
> CWS build. Is this possible from your side?
>  

A Linux test-installation (*.tar.gz) can now be found here:
http://ooo.services.openoffice.org/pub/OpenOffice.org/cws/upload/tl70/

Thomas



Re: Hyphen should be forwarded to spellchecker

thomas.lange
In reply to this post by Per Eriksson-2

Hi Per,

> Our dictionary people wonder the following:
>
> Will OOo read the Hunspell WORDCHARS directive, and not forward the
> hyphens if the following is mentioned in the dictionary?
>
> WORDCHARS -
>
> Or will OOo 3.2 always forward the hyphen in the beginning and at the
> end of words?
>  

The OOo breakiterator does not know about dictionaries. Thus it will
now always include the hyphens as part of the word.

(The only dictionaries that are implicitly known to the breakiterator
are the hyphenation dictionaries. This is because the layout uses the
breakiterator to determine line-break positions, and for that the
breakiterator uses the regular hyphenation component.
But even for this task, the word boundaries are obtained without
looking at any dictionary content.)

Thomas



Re: Hyphen should be forwarded to spellchecker

Per Eriksson-2
Hello Thomas,

Thomas Lange - Sun Germany - ham02 - Hamburg wrote:
> The OOo breakiterator does not know about dictionaries. Thus it will
> always include the hyphens now as part of the word.
>
> (The only dictionaries that are implicitly known by the breakiterator
> are the hyphenation dictionaries. This is because the breakiterator is
> used by the layout to determine the position for line-breaking. And to
> do that it uses the regular hyphenation component to get this done.
> But still, even for this task the word boundaries are obtained without
> looking at any dictionary content.

So the breakiterator will not read the WORDCHARS directive from the
affix file? It will always be on?

Best
Per

Re: Hyphen should be forwarded to spellchecker

Simon Brouwer
In reply to this post by thomas.lange
Hi Thomas,

Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi all,
>
> Javier SOLA wrote:
>  
>> ...
>>    
>>> The breakiterator is locale specific and thus can be cusomized for each
>>> language. Still the question remains if that change should only be
>>> applied to every language listed in the issue or by default to all
>>> language since probably all are going to benfit from it.
>>>  
>>>      
>> If I understand correcly, this should not be applied to all languages.
>> Spanish -when used correctly- attaches hyphens to normal words as I just
>> did in this sentence, unlike English - which separates the hyphens from
>> the words - as I just did. Microsoft word used to break spanish
>> spell-checking because of this. This cannot be changed for all
>> languages, only specifically for those who ask for it, otherwise many
>> spellcheckers might break.
>>  
>>    
>
> The pre- and postfix hyphen thingie will only get applied to languages
> if request. It was planned this way from the start. Since up until this
> morning in issue 64400 this was only specifically requested for Swedish
> it will be applied for German an Swedish only right now. I will make the
> change for Swedish right away and then the CWS should be ready for QA today.
>
> Thomas
>  
Sorry for my tardiness, but could you apply it to Dutch as well? I
wanted to check on the OpenTaal mailing list first whether there were any
objections (there were none). I have re-opened the issue and added the
request there as well.

--
Vriendelijke groet,
Simon Brouwer.

| http://nl.openoffice.org | http://www.opentaal.org |


Re: Hyphen should be forwarded to spellchecker

Simon Brouwer
In reply to this post by thomas.lange
Hi Thomas,

Thomas Lange - Sun Germany - ham02 - Hamburg wrote:

> Hi Per,
>
>  
>> Our dictionary people wonder the following:
>>
>> Will OOo read the Hunspell WORDCHARS directive, and not forward the
>> hyphens if the following is mentioned in the dictionary?
>>
>> WORDCHARS -
>>
>> Or will OOo 3.2 always forward the hyphen in the beginning and at the
>> end of words?
>>  
>>    
>
> The OOo breakiterator does not know about dictionaries. Thus it will
> always include the hyphens now as part of the word.
>
> (The only dictionaries that are implicitly known by the breakiterator
> are the hyphenation dictionaries. This is because the breakiterator is
> used by the layout to determine the position for line-breaking. And to
> do that it uses the regular hyphenation component to get this done.
> But still, even for this task the word boundaries are obtained without
> looking at any dictionary content.)
>  
Would it be possible to add a function in the break iterator to switch
the feature on or off? Then Hunspell could use this function when it
parses the affix file.

At least for Dutch, changes in the dictionary and possibly support in
Hunspell have to be added to use this feature. It would be useful if a
switch in the affix file could enable or disable it so that spell
checking with the existing dictionary (before we complete these changes)
would work as before.

By the way, does Hunspell accept hyphen characters as part of a prefix
or suffix?
e.g. if we use these affix rules:

PFX X 0 - .

SFX Y 0 - .
SFX Z 0 s- .
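
To make the question concrete, here is a sketch of the word forms these three rules would describe if Hunspell accepts them. It reads the rule lines in the layout used above (type, flag, stripping, affix, condition) and applies them to a stem. The helper names and the sample stem are hypothetical, only the "0" (strip nothing) and "." (always applies) cases are handled, and Hunspell itself is not involved, so this says nothing about whether Hunspell accepts hyphens in affixes.

```python
RULES = """\
PFX X 0 - .
SFX Y 0 - .
SFX Z 0 s- .
"""


def parse_rules(text):
    """Parse 'TYPE FLAG STRIP AFFIX CONDITION' lines into tuples."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] in ("PFX", "SFX"):
            rules.append(tuple(fields))
    return rules


def apply_rules(stem, flags, rules):
    """Return the forms produced for a stem carrying the given flags."""
    forms = []
    for kind, flag, strip, affix, cond in rules:
        if flag not in flags or strip != "0" or cond != ".":
            continue  # only the simple cases from the example are handled
        forms.append(affix + stem if kind == "PFX" else stem + affix)
    return forms
```

For a stem "water" carrying all three flags, the sketch produces "-water", "water-" and "waters-".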

--
Vriendelijke groet,
Simon Brouwer.

| http://nl.openoffice.org | http://www.opentaal.org |


Re: Hyphen should be forwarded to spellchecker

thomas.lange
In reply to this post by Per Eriksson-2
Hi Per,

>
>
> Thomas Lange - Sun Germany - ham02 - Hamburg skrev:
>> The OOo breakiterator does not know about dictionaries. Thus it will
>> always include the hyphens now as part of the word.
>>
>> (The only dictionaries that are implicitly known by the breakiterator
>> are the hyphenation dictionaries. This is because the breakiterator is
>> used by the layout to determine the position for line-breaking. And to
>> do that it uses the regular hyphenation component to get this done.
>> But still, even for this task the word boundaries are obtained without
>> looking at any dictionary content.
>
> So the breakiterator will not read the affix file read the WORDCHARS
> directive? It will always be on?
Correct. The breakiterator never did that. The new behavior is only
about hyphens/dashes.
Is there a problem with that?

Thomas




