[lingu-dev] Support for Tajik

[lingu-dev] Support for Tajik

emnej
Hi,

A colleague of mine has a word list for Tajik with at least 8,000 words.
He wants to make that list available for use within a word processor and
wasn't very successful with MS Word. That's why he asked me about
OpenOffice.org.

From the information that we found in the archives, it seemed to us
that MySpell wouldn't support all the Tajik characters, but that
Hunspell does. I downloaded and installed Hunspell under Windows XP and
got it to work with Uzbek, so that looks promising.

We now have two questions:

1. What would be the best way to change the word list into a format that
can be used by Hunspell?

2. Currently, Tajik (only 6 million native speakers) is not a language
that is selectable in the list of default languages. Once we have the
dictionary, would it be possible to add it?

I understand that these questions are pretty basic, but we've never had
to make a dictionary for an Office suite before so I hope that you can
help us.

All the best,

Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [lingu-dev] Support for Tajik

Michel Weimerskirch
Hi

> 1. What would be the best way to change the word list into a format that
> can be used by Hunspell?
Hunspell can use the same word lists you created for MySpell, so I suggest you
read this to see how to create those lists:
http://lingucomponent.openoffice.org/dictionary.html
Hunspell also has some extra features, which are described here:
http://sourceforge.net/docman/display_doc.php?docid=29374&group_id=143754

> 2. Currently, Tajik (only 6 million native speakers) is not a language
> that is selectable in the list of default languages. Once we have the
> dictionary, would it be possible to add it?
I had the same problem for "Lëtzebuergesch" (Luxembourgish). I filed an issue
and it is currently being worked on:
http://qa.openoffice.org/issues/show_bug.cgi?id=54818
Thus, I suggest you do the same and file an issue for Tajik.

I really hope your project succeeds!

Michel Weimerskirch


Re: [lingu-dev] Support for Tajik

Дмитрий Габинский
In reply to this post by emnej
> A colleague of me has a word list for Tajik with at least 8,000
>words.
?
> 1. What would be the best way to change the word list into a format
>that can be used by Hunspell?

If I understand you correctly, you have a word list as a plain-text
file. I'm not sure what is "best", but the quickest solution is as
follows:
1) Save your word list as a text file in UTF-8 with a .dic extension
(say, tojik.dic), one word per line. Hunspell expects the first line of
the file to hold the (approximate) number of words.
2) Create an affix file with the same name and an .aff extension (for
the given example, this should be tojik.aff). To begin with, it needs
only one line:

SET UTF-8

3) Then copy your files to the folder with the spellcheck modules and
modify the dictionary.lst file. Since Tajik is not currently supported,
you'll have to register your module under another language. For example,
you can replace German (Germany), if you don't need that one:

DICT de DE tojik

It does not matter much which language you replace, but you'll have
to set the language of your text to match.

So, here it is.
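
The three steps above could be scripted roughly as follows. This is only
a sketch: the function names dic_text and write_dictionary, the default
name "tojik", and the input file name in the usage note are made up for
illustration. It assumes the source list is UTF-8 text with one word per
line, and it writes the word count as the first line of the .dic file,
which Hunspell expects.

```python
from pathlib import Path


def dic_text(lines):
    """Build .dic content: word count on line 1, then one word per line."""
    # Strip whitespace, drop empty lines, remove duplicates (order is kept).
    words = list(dict.fromkeys(w.strip() for w in lines if w.strip()))
    return "\n".join([str(len(words)), *words]) + "\n"


def write_dictionary(wordlist_path, out_dir, name="tojik"):
    """Write <name>.dic and a minimal <name>.aff; return a dictionary.lst line."""
    # "utf-8-sig" reads UTF-8 with or without a leading byte-order mark.
    lines = Path(wordlist_path).read_text(encoding="utf-8-sig").splitlines()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / f"{name}.dic").write_text(dic_text(lines), encoding="utf-8")
    # Minimal affix file: only the encoding declaration.
    (out / f"{name}.aff").write_text("SET UTF-8\n", encoding="utf-8")
    # The line to put in dictionary.lst, borrowing the German slot as above.
    return f"DICT de DE {name}"
```

For example, write_dictionary("words.txt", "out") would create
out/tojik.dic and out/tojik.aff and return the line "DICT de DE tojik".
Words that contain a slash would need extra care, because Hunspell reads
"/" in a .dic file as the start of the affix flags.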

An advanced way is to create a true affix file. As far as I
understand, Tajik, as an Iranian language, should have a grammar more
or less similar to European languages, so there should be no problem.
You can start with instructions at
http://lingucomponent.openoffice.org/dictionary.html, and this group
is also ready to help.
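
To give an idea of what a "true" affix file buys you, here is a small
sketch of how a single Hunspell-style suffix rule expands a stem. The
helper apply_sfx is hypothetical and covers only the basic rule form
"SFX flag stripping suffix condition"; it is not Hunspell's own code. The
Tajik plural ending used in the sample is an assumption for illustration
only and should be checked by someone who knows the language.

```python
import re


def apply_sfx(word, strip, add, condition="."):
    """Apply one simplified SFX rule; return None if the rule does not apply."""
    # In affix files "0" stands for "nothing to strip" / "nothing to add".
    strip = "" if strip == "0" else strip
    add = "" if add == "0" else add
    # The condition is a small pattern that must match the end of the word.
    if not re.search(f"(?:{condition})$", word):
        return None
    if strip and not word.endswith(strip):
        return None
    stem = word[: len(word) - len(strip)] if strip else word
    return stem + add


# Affix file with one rule, and a one-word .dic that uses it via flag A.
AFF_TEXT = "SET UTF-8\nSFX A Y 1\nSFX A 0 ҳо .\n"
DIC_TEXT = "1\nкитоб/A\n"

print(apply_sfx("китоб", "0", "ҳо"))            # stem plus plural ending
print(apply_sfx("try", "y", "ies", "[^aeiou]y"))  # English-style rule
```

With rules like this, one flagged entry in the .dic file covers several
word forms, so the list stays much shorter than a full-form list.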

> 2. Currently, Tajik (only 6 million native speakers)

Only? Well, many of the languages currently supported have fewer.

> is not a
>language that is selectable in the list of default languages. Once we
>have the dictionary, would it be possible to add it?

This is not only about an entry in the spellcheck list, but also about
the many data items needed to build a locale. I'm not very competent in
this matter; hopefully someone else will comment further.

Best regards,

Dmitri Gabinski
   
   
   