Dictionaries - usage, upstream, updates

Vojtěch Zeisek-2
Hi,
there is recent work on a new Czech dictionary, which should have a better
vocabulary and should be better at handling complex Czech grammar. The work is
still far from complete, but so far it's promising. Some info (mostly in
Czech, sorry) is at https://gitlab.com/strepon/czech-cc0-dictionaries/ and
http://ceskeslovniky.cz/about.html and https://github.com/l10ncz/
We discussed this at https://www.linuxdays.cz/2019/en/ last weekend (BTW,
(open)SUSE had the best presentation there) and there were a couple of
questions we were unable to answer, so I seek advice here. :-)
The dictionary above is now available as a Hunspell add-on for LibreOffice.
This format is also used by Firefox, Thunderbird and more. But what about KDE,
GNOME, other DEs, Qt (non-KDE) apps? Others? What do they use? In other words,
what is the purpose of having distribution packages for aspell, ispell,
hunspell, myspell? What is the relationship among them? What is used for what?
Regarding hunspell/myspell, what is the upstream for the dictionaries? We were
unable to find where the Czech dictionary came from. :-)
One day we'd like to merge the new dictionary with the existing one, but that
requires much more work first. Anyway, it'd be good to have a distribution
package available for testing. We wonder whether it's possible to have e.g.
myspell-cs_CZ and myspell-cs_CZ_EXPERIMENTAL installed side by side and somehow
switch between them. Or, more generally, how is the packaging of dictionaries
organized, both technically and in terms of getting the data?
I'm looking forward to any pointers on this topic.
V.

--
Vojtěch Zeisek

Komunita openSUSE GNU/Linuxu
Community of the openSUSE GNU/Linux

https://www.opensuse.org/
https://trapa.cz/

Re: Dictionaries - usage, upstream, updates

Tomas Chvatal-2
Vojtěch Zeisek wrote on Tue, 08 Oct 2019 at 09:31 +0200:

> Hi,
> there is recent work on a new Czech dictionary, which should have a better
> vocabulary and should be better at handling complex Czech grammar. The work
> is still far from complete, but so far it's promising. Some info (mostly in
> Czech, sorry) is at https://gitlab.com/strepon/czech-cc0-dictionaries/ and
> http://ceskeslovniky.cz/about.html and https://github.com/l10ncz/
> We discussed this at https://www.linuxdays.cz/2019/en/ last weekend (BTW,
> (open)SUSE had the best presentation there) and there were a couple of
> questions we were unable to answer, so I seek advice here. :-)
> The dictionary above is now available as a Hunspell add-on for LibreOffice.
> This format is also used by Firefox, Thunderbird and more. But what about
> KDE, GNOME, other DEs, Qt (non-KDE) apps? Others? What do they use? In other
> words, what is the purpose of having distribution packages for aspell,
> ispell, hunspell, myspell? What is the relationship among them? What is
> used for what?
> Regarding hunspell/myspell, what is the upstream for the dictionaries? We
> were unable to find where the Czech dictionary came from. :-)
> One day we'd like to merge the new dictionary with the existing one, but
> that requires much more work first. Anyway, it'd be good to have a
> distribution package available for testing. We wonder whether it's possible
> to have e.g. myspell-cs_CZ and myspell-cs_CZ_EXPERIMENTAL installed side by
> side and somehow switch between them. Or, more generally, how is the
> packaging of dictionaries organized, both technically and in terms of
> getting the data?
> I'm looking forward to any pointers on this topic.
> V.

Hello Vojtech,

I really wanted to join this chat, but sadly, as an organizer, I didn't get
the time to do much besides handling the conference itself.

The first thing to know is that we set up the dictionaries for all languages
to be auto-generated from the libreoffice-dictionaries repository.
The previous approach was individual projects and packages, and it was a
pain in the butt to keep them up to date and working well.

So I would first recommend taking that project and integrating it into
LO [1]; we will then inherit it automatically.

For interim testing you can name the package e.g. 'hunspell-new-cs_CZ'
and have it provide the symbols of the other Czech dictionary's name and
at the same time conflict with it. That lets users choose which
dictionary to install.
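
To make the provide-and-conflict idea concrete, here is a toy model in
Python. It is only a sketch of the logic, not how rpm or zypper actually
resolve dependencies. The Package class and both helper functions are
hypothetical, and using 'myspell-cs_CZ' as the provided symbol is an
assumption based on the package name mentioned earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    """Toy stand-in for a distribution package (not a real RPM API)."""
    name: str
    provides: frozenset = field(default_factory=frozenset)
    conflicts: frozenset = field(default_factory=frozenset)

    def capabilities(self) -> frozenset:
        # A package offers its own name plus any extra provided symbols.
        return self.provides | {self.name}


def satisfies(pkg: Package, requirement: str) -> bool:
    """True if pkg can stand in for whatever requires `requirement`."""
    return requirement in pkg.capabilities()


def can_coexist(a: Package, b: Package) -> bool:
    """True if neither package conflicts with a capability of the other."""
    return not (a.conflicts & b.capabilities()
                or b.conflicts & a.capabilities())


# Assumed names: the current dictionary and the experimental replacement.
current = Package("myspell-cs_CZ")
experimental = Package(
    "hunspell-new-cs_CZ",
    provides=frozenset({"myspell-cs_CZ"}),   # drop-in for the old name
    conflicts=frozenset({"myspell-cs_CZ"}),  # but never installed alongside it
)

print(satisfies(experimental, "myspell-cs_CZ"))  # other packages still resolve
print(can_coexist(current, experimental))        # user must pick one
```

In this model anything that requires the old dictionary name is still
satisfied by the experimental package, while the conflict forces exactly one
of the two to be installed at a time.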

But I can't stress enough how much pain it was to keep separate
packages for the various languages, so really do try to integrate it
into LibreOffice.

Cheers

Tom

[1] https://cgit.freedesktop.org/libreoffice/dictionaries/
