Machine readable list of dictionaries?


F Wolff-2
Hello everybody

Is there a machine readable list of Hunspell / OOo dictionaries
available somewhere?  I realise we have .oxt files now in OOo. Even a
list of the .oxt files available would be very useful. Currently it
seems a human must download these manually, and there is no predictable
URL for the fr_FR dictionary, for example.

The best currently available seem to be these pages:
http://extensions.services.openoffice.org/dictionary?cid=926386
http://wiki.services.openoffice.org/wiki/Dictionaries


Can anybody give some pointers?  Would this be useful to more people?

Friedel


--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/pseudolocalisation-podebug-3-interview-rail-aliev


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Machine readable list of dictionaries?

Jancs
Quoting F Wolff <[hidden email]>:

> Is there a machine readable list of Hunspell / OOo dictionaries
> available somewhere?  I realise we have .oxt files now in OOo. Even a
> list of the .oxt files available would be very useful. Currently it
> seems a human must download these manually, and there is no predictable
> URL for the fr_FR dictionary, for example.
>
> The best currently available seem to be these pages:
> http://extensions.services.openoffice.org/dictionary?cid=926386

I do not understand the problem. The link you mention works well if
the autoupdate option is left enabled during installation. It works
under both Windows and Linux.

The link changes for a reason: development continues and new
releases are added to the repository.

If other software uses the Hunspell engine but has left out the
autoupdate option (Memoq, for example, although I am not sure), then
yes, the only option is to download the .oxt manually, unzip it, and
place the necessary files where they should reside.
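
A minimal sketch of that manual step, assuming only that an .oxt file is an
ordinary zip archive with the Hunspell .aff and .dic files somewhere inside
it. The function name and the flat destination layout are hypothetical;
where the files "should reside" depends on the application.

```python
import zipfile
from pathlib import Path


def extract_hunspell_files(oxt_path, dest_dir):
    """Copy every .aff and .dic file out of an .oxt archive into dest_dir.

    Hypothetical helper: it flattens any directories inside the archive and
    does not tell spelling dictionaries apart from hyphenation patterns,
    which can also end in .dic.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(oxt_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".aff", ".dic")):
                # Keep only the file name, dropping the path inside the archive.
                target = dest / Path(name).name
                target.write_bytes(archive.read(name))
                extracted.append(target)
    return sorted(extracted)
```

A call such as extract_hunspell_files("dict-br.oxt", "dicts") would leave the
.aff and .dic files directly under "dicts"; both names are placeholders.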

Janis

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.




Re: Machine readable list of dictionaries?

F Wolff-2
On Tue, 2009-10-20 at 10:37 +0300, Jancs wrote:

> Quoting F Wolff <[hidden email]>:
>
> > Is there a machine readable list of Hunspell / OOo dictionaries
> > available somewhere?  I realise we have .oxt files now in OOo. Even a
> > list of the .oxt files available would be very useful. Currently it
> > seems a human must download these manually, and there is no predictable
> > URL for the fr_FR dictionary, for example.
> >
> > The best currently available seem to be these pages:
> > http://extensions.services.openoffice.org/dictionary?cid=926386
>
> I do not understand the problem. The link you mention works well if
> the autoupdate option is left enabled during installation. It works
> under both Windows and Linux.
>
> The link changes for a reason: development continues and new
> releases are added to the repository.
>
> If other software uses the Hunspell engine but has left out the
> autoupdate option (Memoq, for example, although I am not sure), then
> yes, the only option is to download the .oxt manually, unzip it, and
> place the necessary files where they should reside.
>
> Janis


Hello Janis

The link works great for a human with a web browser, but not for
anything else.  I would like to be able to reliably download the Breton
spell checker (for example) by just knowing that the language code for
it is br.  At the moment a human has to browse the web and read and
click and do things.  I'm looking for a way to automatically discover
which dictionaries are available, and hopefully have them at a reliable
and predictable URL.  OpenOffice.org is the main repository for FOSS
spell checkers (or at least the Hunspell ones), yet there is no way for
software to help users to easily find them, for example.

Since writing my first email, I found this:
http://ftp.osuosl.org/pub/openoffice/contrib/dictionaries/available.lst
but I don't know if that is up to date or maintained in any way.  Does
anybody know?  Also, this is on a mirror. Is there a canonical URL that
will automatically redirect to mirrors as appropriate?

Friedel



Re: Machine readable list of dictionaries?

Daniel Naber-9
On Tuesday 20 October 2009, F Wolff wrote:

> Since writing my first email, I found this:
> http://ftp.osuosl.org/pub/openoffice/contrib/dictionaries/available.lst
> but I don't know if that is up to date or maintained in any way.  Does
> anybody know?

That file is neither up to date nor maintained. In fact, the
entire "dictionaries" directory on the FTP server is outdated.

Regards
 Daniel

--
http://www.danielnaber.de


Re: Machine readable list of dictionaries?

F Wolff-2
In reply to this post by F Wolff-2
On Tue, 2009-10-20 at 22:58 +0300, Jancs wrote:

> Quoting F Wolff <[hidden email]>:
>
> > The link works great for a human with a web browser, but not for
> > anything else.  I would like to be able to reliably download the Breton
> > spell checker (for example) by just knowing that the language code for
> > it is br.  At the moment a human has to browse the web and read and
> > click and do things.  I'm looking for a way to automatically discover
> > which dictionaries are available, and hopefully have them at a reliable
> > and predictable URL.  OpenOffice.org is the main repository for FOSS
> > spell checkers (or at least the Hunspell ones), yet there is no way for
> > software to help users to easily find them, for example.
>
> ;) I still do not follow. If someone chooses to use OO, it can be
> assumed that this person is sitting in front of a computer running
> some kind of OS with a graphical user interface.
>
> If that is the case, anyone who finds scrolling the page a challenge
> can use the convenient search box: type, for example, Breton, and get
> the link pointing to it. After that, click, click, and voila!
>
> What kind of automatic dictionary retrieval and installation
> application do you plan?
>
> It is always possible to mirror the repository with various
> applications; wget and httrack come to mind as cross-platform tools.

I'm talking about applications other than OOo that want to offer spell
checking capabilities to the user.  Many applications now use the OOo
spell checkers, and refer to the OOo site for people to download, but it
would be much more pleasant if software could simply try to get the
appropriate spell checker automatically for the user. Let's think of a
web-based translation QA system that wants a spell checker to provide
summaries. The admin might not want to launch browsers, might not have
OOo, or the software might want to do it entirely unattended (on demand,
for example).

Even for end-user software, we must keep in mind that the extensions
site is not available in all the languages that users might need to
understand what they are doing. A user might not even know to click on
"Get it!" :-)

If I remember correctly, the old DictOOo macro provided a list for the
user to choose from (inside the macro), which is much more convenient
than reading a web page. I guess the list I found earlier in the
thread, which Daniel says is unmaintained, provided the information
for that.

Friedel


Re: Machine readable list of dictionaries?

Jancs
Quoting F Wolff <[hidden email]>:

> I'm talking about applications other than OOo that want to offer spell
> checking capabilities to the user.  Many applications now use the OOo
> spell checkers, and refer to the OOo site for people to download, but it
> would be much more pleasant if software could simply try to get the
> appropriate spell checker automatically for the user. Let's think of a

Then it is up to the programmer to implement a mechanism similar to
the OO Extension Manager. I think it should not be extremely hard to
work out how the extension updater finds the right (up-to-date)
releases of the dictionaries.

> Even for end-user software, we must keep in mind that the extensions
> site is not available in all the languages that users might need to
> understand what they are doing. A user might not even know to click on
> "Get it!" :-)

I think it is worth noting that even OO is not available in all the
languages for which it offers spell-checking libraries.

> If I remember correctly, the old DictOOo macro provided a list for the
> user to choose from (inside the macro), which is much more convenient
> than reading a web page. I guess the list I found earlier in the
> thread, which Daniel says is unmaintained, provided the information
> for that.

Yes, that is so, and I do not think there will be much support for
changing the production cycle, as the current one finally seems good
enough.

Janis

P.S. I could be wrong.


Re: Machine readable list of dictionaries?

F Wolff-2
On Wed, 2009-10-21 at 11:21 +0300, Jancs wrote:

> Quoting F Wolff <[hidden email]>:
>
> > I'm talking about applications other than OOo that want to offer spell
> > checking capabilities to the user.  Many applications now use the OOo
> > spell checkers, and refer to the OOo site for people to download, but it
> > would be much more pleasant if software could simply try to get the
> > appropriate spell checker automatically for the user. Let's think of a
>
> Then it is up to the programmer to implement a mechanism similar to
> the OO Extension Manager. I think it should not be extremely hard to
> work out how the extension updater finds the right (up-to-date)
> releases of the dictionaries.

No, I honestly don't believe it needs to be.  Surely there is a huge
advantage in knowing reliably where the file is.  The "Get It" button
already knows it, so the information is already captured somewhere.  So
it would be nice to simply have a URL such as
http://lingucomponent.openoffice.org/hunspell/dictionary.lst
where any software can get a list of the available dictionaries and
download one without user interaction.
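
A minimal sketch of what a client could do with such a list. Everything
about the format is an assumption: it borrows the whitespace-separated
"DICT <lang> <country> <basename>" lines of the local dictionary.lst file
that OOo 2.x kept beside its installed dictionaries, which the URL above
does not promise. The function names and the example base URL are
hypothetical as well.

```python
from urllib.parse import urljoin


def parse_dictionary_list(text):
    """Return {locale: basename} for the DICT lines of a dictionary.lst-style
    listing. The line layout is assumed, not a published format."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split()
        if len(fields) != 4 or fields[0] != "DICT":
            continue  # skip blank lines, HYPH/THES entries, malformed lines
        _, lang, country, basename = fields
        entries[f"{lang}_{country}"] = basename
    return entries


def locales_for_language(entries, lang):
    """List the locales on offer for a bare language code such as 'br'."""
    return sorted(loc for loc in entries if loc.split("_")[0] == lang)


def dictionary_urls(entries, locale, base_url):
    """Build the .aff and .dic URLs for a locale under a hypothetical base URL."""
    basename = entries[locale]
    return [urljoin(base_url, basename + ext) for ext in (".aff", ".dic")]
```

With a list like this, knowing only the language code "br" would be enough:
locales_for_language() finds the matching locale and dictionary_urls() gives
the files to fetch, with no browser involved.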


> > Even for end-user software, we must keep in mind that the extensions
> > site is not available in all the languages that users might need to
> > understand what they are doing. A user might not even know to click on
> > "Get it!" :-)
>
> I think it is worth noting that even OO is not available in all the
> languages for which it offers spell-checking libraries.

I think it is available in something like 80 languages.
Our translation application, Virtaal, is much less well known than
OpenOffice.org, and it is available in more than 20 languages. I was not
able to get the extensions site in anything except English.


> > If I remember correctly, the old DictOOo macro provided a list for the
> > user to choose from (inside the macro), which is much more convenient
> > than reading a web page. I guess the list I found earlier in the
> > thread, which Daniel says is unmaintained, provided the information
> > for that.
>
> Yes, that is so, and I do not think there will be much support for
> changing the production cycle, as the current one finally seems good
> enough.

I'm not commenting on the production cycle of OpenOffice.org.  I'm
trying to make a case for distribution and advancement of the
dictionaries that are already produced and made available through this
project.  I think this is a way to make it more useful and more freely
available.  Yes, OpenOffice.org won't benefit directly, but it will
benefit indirectly, since it will provide the spell checkers to more
people, which should result in more users and contributors.

Friedel


Re: Machine readable list of dictionaries?

Olivier R.-2
Hello,

On 22/10/2009 13:32, F Wolff wrote:

>> Then it is up to the programmer to implement a mechanism similar to
>> the OO Extension Manager. I think it should not be extremely hard to
>> work out how the extension updater finds the right (up-to-date)
>> releases of the dictionaries.
>
> No, I honestly don't believe it needs to be.  Surely there is a huge
> advantage in knowing reliably where the file is.  The "Get It" button
> already knows it, so the information is already captured somewhere.  So
> it would be nice to simply have a URL such as
> http://lingucomponent.openoffice.org/hunspell/dictionary.lst
> where any software can get a list of the available dictionaries and
> download one without user interaction.

It’s probably more relevant to ask on the mailing-list
<[hidden email]> than here. ;)

http://extensions.openoffice.org/

Regards,
--
Olivier R.

** E-mail dedicated to mailing-lists.                             **
** Messages from anywhere else are _automatically_ erased.        **


Re: Machine readable list of dictionaries?

Andrea Pescetti
In reply to this post by F Wolff-2
On 22/10/2009 F Wolff wrote:
> Our translation application, Virtaal, is much less well known than
> OpenOffice.org, and it is available in more than 20 languages. I was not
> able to get the extensions site in anything except English.

Well, just log in at http://extensions.services.openoffice.org/ and you
can choose the language from a drop-down list. There is some caching, so
you will still see English text for a while, but after that you see a
localized interface. At least, Italian is there.

Unfortunately, extension descriptions are still English-only, so a very
big problem remains. But the Drupal site infrastructure does support
localization.

Regards,
  Andrea Pescetti - Italian N-L Project Lead.

