How can I get properly 8-bit encoded morphological dictionaries?
The ones I downloaded from
(morphdb_hu.aff, morphdb_hu.dic) are obviously not in an 8-bit encoding.

Can I convert them to the proper form? If yes, how?

I tried:
en@anonymous:~/program/humorph$ cat morphdb_hu.aff | iconv -f latin2 -t utf-8 > morphdb_hu.aff.u8
en@anonymous:~/program/humorph$ cat morphdb_hu.dic | iconv -f latin2 -t utf-8 > morphdb_hu.dic.u8

In the *.aff.u8 file I replaced
SET ISO8859-2 with SET UTF-8.
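
For reference, the two steps above (re-encoding and changing the SET line) can be sketched as one shell function. This is only a sketch: it assumes the original files really are ISO-8859-2, and `convert_to_utf8` is a hypothetical helper name, not part of any of the tools mentioned here.

```shell
# Hypothetical helper: re-encode a Latin-2 dictionary pair to UTF-8
# and update the SET line of the .aff file to match.
# Assumes "$1.aff" and "$1.dic" exist and are ISO-8859-2 encoded.
convert_to_utf8() {
    base="$1"   # e.g. morphdb_hu

    # Re-encode the affix file, then rewrite its encoding declaration.
    iconv -f ISO-8859-2 -t UTF-8 "$base.aff" \
        | sed 's/^SET ISO8859-2/SET UTF-8/' > "$base.aff.u8"

    # The dictionary file has no SET line, so re-encoding is enough.
    iconv -f ISO-8859-2 -t UTF-8 "$base.dic" > "$base.dic.u8"
}
```

Calling `convert_to_utf8 morphdb_hu` would produce the same morphdb_hu.aff.u8 and morphdb_hu.dic.u8 files as the commands I ran by hand.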

The result is still no good:

en@anonymous:~/program/humorph$ echo program | chmorph *hu.aff.u8 *hu.dic.u8 /dev/stdin NOM ACC

en@anonymous:~/program/humorph$ echo program | chmorph *hu.aff.u8 *hu.dic.u8 /dev/stdin NOM POSS

en@anonymous:~/program/humorph$ echo program asztalt |./analyze *hu.aff.u8 *hu.dic.u8 /dev/stdin
generate(program, asztalt) = NO DATA

