pt-BR

pt-BR

ge-7
Eduardo,

I understood the affixes from here:
http://lingucomponent.openoffice.org/affix.readme

The hunspell secondary affix flag is documented here:
http://tkltrans.sourceforge.net/tklspell/secflag.htm

Good Luck!
Eleonora.


Hi, when I started writing in Writer I realized that our Brazilian Portuguese (pt-BR) dictionary isn't good. So this week I have been trying to understand how it works so that we can improve it.

So far, I have read the "How-to create a Spelling Dictionary" section, compiled the MySpell standalone program, and tested it.

I understood that we have the affix file (.aff) and the vocabulary file (.dic). But I haven't understood how the affix file really works. I know that it is different for each language.

On the site of the Portuguese dictionary I saw that it's from 1999! (as you can see here: http://www.ime.usp.br/~ueda/br.ispell/summary.html)

Asking around, someone showed me a bigger one (http://www.coc.ufrj.br/~douglas/dict/pt_BR.dic.bz2).


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: pt-BR

Eduardo Santana-2
It's a little difficult to understand.

I had read the affix.readme before. But it doesn't say
anything about the PFX or REP flags. Should we use them
in our aff file?

Also, the .aff file has no comments! Can we add comments
to it?

Isn't there better documentation somewhere?

Eduardo

Re: pt-BR

Simon Brouwer
Hi Eduardo,

Eduardo Santana wrote:

>It's a little difficult to understand.
>
>I had read the affix.readme before. But it doesn't say
>anything about the PFX or REP flags. Should we use them
>in our aff file?

The PFX flag is similar to the SFX flag, except that it describes parts
that can be added to the front of a word.
In Portuguese, examples of prefixes would be "re", "de", "a", "con",
"pro", "contra" (you can probably think of more and better ones).

The REP command is useful because the normal suggestion mechanism can
only handle relatively simple spelling errors (one letter wrong, one
letter missing, one letter too many, two letters swapped).
Using the REP command you can instruct the suggestion mechanism to also
try replacing a specified character sequence with another. Typical
candidates are letter combinations that are often misspelled because
they sound similar, or whole words that are often misspelled.
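
As a rough illustration of the idea (a simplified sketch, not how MySpell actually implements it), the replacement step could look like the following Python. The function name, the replacement table and the tiny dictionary are all made up for this example:

```python
def rep_suggestions(word, rep_table, dictionary):
    """Return dictionary words reachable by applying one REP replacement."""
    found = []
    for wrong, right in rep_table:
        start = word.find(wrong)
        while start != -1:
            # Replace this single occurrence and check the result.
            candidate = word[:start] + right + word[start + len(wrong):]
            if candidate in dictionary and candidate not in found:
                found.append(candidate)
            start = word.find(wrong, start + 1)
    return found


# Hypothetical data, for illustration only.
REP_TABLE = [("z", "s"), ("x", "ch"), ("ch", "x")]
DICTIONARY = {"casa", "chave", "mesa"}

print(rep_suggestions("caza", REP_TABLE, DICTIONARY))  # ['casa']
print(rep_suggestions("xave", REP_TABLE, DICTIONARY))  # ['chave']
```

Note that "xave" differs from "chave" by more than one simple letter edit, which is exactly the kind of error the REP table is meant to catch.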

>Also, the .aff file has no comments! Can we add comments
>to it?
>
>Isn't there better documentation somewhere?

Some more explanation:

The reason to use prefixes and suffixes is that in most languages,
including Portuguese, many words are derived from a base form in a
regular way. For example: andar, ando, andei, anda, andam, andamos,
andando, ...
Instead of listing each of these forms, an affix-compressed dictionary
needs to list only the base form, together with flags that indicate
which other forms are recognized. You can see how this is more compact.

The affix file specifies the rules for making these other forms, and is
typically based on observations about the grammar of the language in
question. The more commonly a prefix or suffix is used in a language,
the more efficiency is gained by including it in the affix file. So
there is no "right" or "wrong" set of affixes; it is just that a
cleverly chosen set of affixes compresses the dictionary file more
efficiently.
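
To make the mechanism concrete, here is a small Python sketch of how one flagged entry expands into its word forms. It is only an illustration under assumptions: the flags "V" and "R", the rule tuples and the function name are invented here, and real .aff files use their own line-based syntax. The sketch also ignores combining prefixes with suffixes.

```python
import re

# Hypothetical suffix rules for an assumed flag "V" (regular -ar verbs).
# Each rule: (characters to strip from the end, characters to add, condition).
SUFFIX_RULES = {
    "V": [
        ("ar", "o", "ar$"),
        ("ar", "ei", "ar$"),
        ("ar", "a", "ar$"),
        ("ar", "am", "ar$"),
        ("ar", "amos", "ar$"),
        ("ar", "ando", "ar$"),
    ],
}
# Hypothetical prefix rule for an assumed flag "R": put "re" in front.
PREFIX_RULES = {"R": [("", "re", "^.")]}


def expand(entry):
    """Expand a .dic-style entry such as 'andar/V' into all its word forms."""
    stem, _, flags = entry.partition("/")
    forms = [stem]
    for flag in flags:
        for strip, add, cond in SUFFIX_RULES.get(flag, []):
            if re.search(cond, stem):
                forms.append(stem[: len(stem) - len(strip)] + add)
        for strip, add, cond in PREFIX_RULES.get(flag, []):
            if re.search(cond, stem):
                forms.append(add + stem[len(strip):])
    return forms


print(expand("andar/V"))
# ['andar', 'ando', 'andei', 'anda', 'andam', 'andamos', 'andando']
print(expand("fazer/R"))
# ['fazer', 'refazer']
```

One dictionary line plus six shared rules stands in for seven words, and the same six rules serve every other regular -ar verb in the word list.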

You can automatically generate the affix-compressed dictionary from a
plain list of words that contains *all* the word forms, plus an affix
file, using the "munch" utility that is included in MySpell. The reverse
is the "unmunch" utility, which you can use to get the full plain word
list back.
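
The round trip can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the algorithm the real munch and unmunch tools use: the flag "V", the rule table and the function names are hypothetical, and a stem is only flagged when every form the flag generates is present in the word list.

```python
# Hypothetical rules for an assumed flag "V": (strip, add) pairs for -ar verbs.
RULES = {"V": [("ar", "o"), ("ar", "a"), ("ar", "am"), ("ar", "ando")]}


def derive(stem, flag):
    """Return the forms a flag generates, or [] if a rule does not fit the stem."""
    forms = []
    for strip, add in RULES[flag]:
        if not stem.endswith(strip):
            return []
        forms.append(stem[: len(stem) - len(strip)] + add)
    return forms


def munch(words):
    """Compress a full word list into 'stem/flag' entries plus leftover words."""
    word_set = set(words)
    remaining = set(words)
    entries = []
    for word in sorted(word_set):
        for flag in RULES:
            forms = derive(word, flag)
            # Only flag the stem if every generated form is a known word.
            if forms and all(form in word_set for form in forms):
                entries.append(word + "/" + flag)
                remaining -= set(forms) | {word}
                break
    return sorted(entries + list(remaining))


def unmunch(entries):
    """Expand 'stem/flag' entries back into the full plain word list."""
    words = set()
    for entry in entries:
        stem, _, flags = entry.partition("/")
        words.add(stem)
        for flag in flags:
            words.update(derive(stem, flag))
    return sorted(words)


# Hypothetical word list, for illustration only.
WORDS = ["andar", "ando", "anda", "andam", "andando", "casa", "falar", "falo"]

print(munch(WORDS))           # ['andar/V', 'casa', 'falar', 'falo']
print(unmunch(munch(WORDS)))  # the original eight words, sorted
```

"falar" stays unflagged here because "fala", "falam" and "falando" are missing from the list, so flagging it would make the speller accept words the list never contained.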

I hope this makes it a bit clearer. Don't hesitate to ask if you have
further questions!

Sincerely,

Simon Brouwer.

>>> nl.openoffice.org <<<


Re: pt-BR

Дмитрий Габинский
In reply to this post by Eduardo Santana-2
> I had read the affix.readme before. But it doesn't say
> anything about the PFX or REP flags. Should we use them
> in our aff file?

It's really up to you. My guess is that you should probably
focus on suffixes first because:
a) to the best of my knowledge, Portuguese has a relatively limited
set of prefixes, so you can simply add prefixed words as separate
entries;
b) Portuguese spelling is more or less phonetic, so there are no
typical mistakes. The only case I can think of: in Brazilian
Portuguese, as far as I know, the letter � is used in quite a few
cases where Iberian Portuguese uses �, so you can add the following:
REP 1
REP � �
to "Brazilianize" the spelling.

> Also, the .aff file has no comments! Can we add comments
> to it?

Yes, use a # to start a comment:
# This is a comment

I advise putting comments on separate lines; in my experience,
comments on the same line as suffix entries lead to bugs and
crashes.

> Isn't there better documentation somewhere?

I haven't seen anything better, or indeed anything else. Try studying
the affix files for other languages.

Sincerely,

Dmitri Gabinski
   
   
   
Re: pt-BR

Duarte M. Costa
Sorry for interrupting your discussion.
I would like to ask a side question. I am not familiar with the
"mechanics" of OOo, that is, how the technical matters work. Would one
of you be so kind as to tell me where I can find the meaning of the
different abbreviations I keep seeing: for instance, PFX (prefix?),
SFX (suffix?), REP, .aff file, etc.?
Thank you.

Duarte M.  Costa
[hidden email]

