Query Lexemes in the Query Service
Léa Lacroix
2018-10-16 08:12:59 UTC
Hello all,

I’m very happy to announce that another important feature for Lexicographical
Data <https://www.wikidata.org/wiki/Wikidata:Lexicographical_data> has been
deployed: the ability to *query Lexemes in the Query Service*.

Here are a few examples:

- List of the longest words in English
<https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fl%20%3Fword%20%3Flen%20WHERE%20%7B%0A%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ1860%20%3B%20wikibase%3Alemma%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%20UNION%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ1860%20%3B%20ontolex%3AlexicalForm%2Fontolex%3Arepresentation%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%0A%7D%20%0Aorder%20by%20DESC%28%3Flen%29%20%0ALIMIT%2020>,
or in German
<https://query.wikidata.org/#SELECT%20DISTINCT%20%3Fl%20%3Fword%20%3Flen%20WHERE%20%7B%0A%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ188%20%3B%20wikibase%3Alemma%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%20UNION%20%7B%0A%20%20%20%3Fl%20a%20ontolex%3ALexicalEntry%20%3B%20dct%3Alanguage%20wd%3AQ188%20%3B%20ontolex%3AlexicalForm%2Fontolex%3Arepresentation%20%3Fword%0A%20%20%20BIND%28strlen%28%3Fword%29%20as%20%3Flen%29%20%20%0A%20%20%7D%0A%7D%20%0Aorder%20by%20DESC%28%3Flen%29%20%0ALIMIT%2020>
- Graph of all Lexemes
<https://query.wikidata.org/#%23defaultView%3AGraph%0ASELECT%20%3Flexeme%20%3FlexemeLabel%20%3Ftarget%20%3FtargetLabel%20WHERE%20%7B%0A%20%20%3Flexeme%20wdt%3AP5191%20%3Ftarget%3B%20wikibase%3Alemma%20%3FlexemeLabel.%0A%20%20%3Ftarget%20wdt%3AP5191*%20wd%3AL2087%3B%20wikibase%3Alemma%20%3FtargetLabel.%0A%7D>
derived from *wódr̥ (L2087) <https://www.wikidata.org/wiki/Lexeme:L2087>
- Grammatical genders most frequently used in lexicographical data in
Wikidata
<https://query.wikidata.org/#%23%20most%20common%20grammatical%20genders%0ASELECT%20%3Fgender%20%3FgenderLabel%20%3Fcount%20WITH%20%7B%0A%20%20SELECT%20%3Fgender%20%28COUNT%28%3Flexeme%29%20AS%20%3Fcount%29%20WHERE%20%7B%0A%20%20%20%20%3Flexeme%20a%20ontolex%3ALexicalEntry%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP5185%20%3Fgender.%0A%20%20%7D%0A%20%20GROUP%20BY%20%3Fgender%0A%7D%20AS%20%25results%20WHERE%20%7B%0A%20%20INCLUDE%20%25results.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D%0AORDER%20BY%20DESC%28%3Fcount%29>
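The links above are SPARQL queries percent-encoded after the `#` of the Query Service URL. As a minimal sketch of that, the "longest words" query can be written out in readable form and turned back into a link. The query text is decoded from the links above and lightly tidied (keywords uppercased, explicit `.` separators added); the helper names `longest_words_query` and `query_service_link` are made up for this illustration, and `Q1860` / `Q188` are the English and German language items used in those links.

```python
from urllib.parse import quote, unquote

QUERY_SERVICE_UI = "https://query.wikidata.org/"


def longest_words_query(language_qid, limit=20):
    """Return the 'longest words' SPARQL query for one language item.

    Hypothetical helper: the query body follows the English and German
    examples linked in the announcement, with only the language item
    and the limit made configurable.
    """
    return f"""\
SELECT DISTINCT ?l ?word ?len WHERE {{
  {{
    # Lemmas of Lexemes in the given language
    ?l a ontolex:LexicalEntry ; dct:language wd:{language_qid} ; wikibase:lemma ?word .
    BIND(STRLEN(?word) AS ?len)
  }} UNION {{
    # Representations of the Forms of those Lexemes
    ?l a ontolex:LexicalEntry ; dct:language wd:{language_qid} ;
       ontolex:lexicalForm/ontolex:representation ?word .
    BIND(STRLEN(?word) AS ?len)
  }}
}}
ORDER BY DESC(?len)
LIMIT {limit}"""


def query_service_link(query):
    """Percent-encode a query and append it to the UI URL after '#'."""
    return QUERY_SERVICE_UI + "#" + quote(query, safe="")


english_link = query_service_link(longest_words_query("Q1860"))
german_link = query_service_link(longest_words_query("Q188"))
print(english_link)
print(german_link)
```

Going the other way, `unquote(link.split("#", 1)[1])` recovers the query text from any of the links in this message, which is handy for reading them without opening the Query Service.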

The queries are based on the RDF mapping, which you can find here
<https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping>. Feel
free to help improve the documentation, so that people can understand how to
build queries on Lexemes.
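For anyone who wants to run such queries from a script rather than the web interface, here is a rough sketch using only the Python standard library. It is an illustration, not an official client: the function names and the User-Agent string are placeholders, and the query is a simplified version of the grammatical gender example above (the label service and the named subquery are left out). Like the linked queries, it relies on the prefixes that the Query Service predefines.

```python
import json
import urllib.parse
import urllib.request

# SPARQL endpoint of the Wikidata Query Service
ENDPOINT = "https://query.wikidata.org/sparql"

# Simplified from the "grammatical genders" example: count Lexemes per
# value of P5185 (grammatical gender), most frequent first.
GENDER_QUERY = """\
SELECT ?gender (COUNT(?lexeme) AS ?count) WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          wdt:P5185 ?gender .
}
GROUP BY ?gender
ORDER BY DESC(?count)
"""


def build_request(query, user_agent="lexeme-query-sketch/0.1 (example)"):
    """Build a GET request for the endpoint without sending it.

    The user_agent default is a placeholder; replace it with something
    that identifies your own tool.
    """
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": user_agent,
        },
    )


def run_query(query):
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        data = json.load(response)
    return data["results"]["bindings"]
```

Calling `run_query(GENDER_QUERY)` should return one binding per gender, each with `gender` and `count` entries whose `value` field holds the item URI and the number of Lexemes.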

Many thanks to Tpt <https://www.wikidata.org/wiki/User:Tpt>, who did a
large part of the work by mapping Lexemes to RDF, and to Smalyshev
(WMF) <https://www.wikidata.org/wiki/User:Smalyshev_(WMF)>, who made the RDF
dumps available and integrated them into the Query Service.

Feel free to play with it, bring some of these query ideas
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Ideas_of_queries>
to life, and let us know if you find any issues or bugs. You can file them
as subtasks of this task <https://phabricator.wikimedia.org/T193645> on
Phabricator.

If you have questions about Lexicographical Data in general, feel free to
write on the talk page of the project
<https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data>. If you
have specific questions about the integration into the Query Service, you can
also ping Stas on-wiki or on IRC.

Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.