Discussion:
[Wikidata] Wikibase as a decentralized perspective for Wikidata
Baptiste de Coulon (le lieu imaginaire)
2018-11-27 17:29:35 UTC
Permalink
Hello,

In the pre-conference of SWIB18 [1], Stacy Allison-Cassin and Dan Scott
led a great workshop yesterday on "Wikibase: configure, customize,
and collaborate".

Among other things, the panel discussion showed strong interest in
a decentralized way of using Wikidata through a network of Wikibase instances.

To implement it, we identified the following needs:

* The Wikibase instance on Docker has to be updated to the current version of
the software.
* A user community has to be built and remain in close contact
with the development team.
* Well-performing import and export scripts between Wikidata and Wikibase
have to be written.
* Connecting properties have to be developed so that the instances
can interoperate.

Does the Wikidata community agree with this proposal? And the development
team?

Is it necessary to open a second mailing list dedicated to Wikibase?

Where is the best place to discuss all these things?

Best Regards

Baptiste

[1] http://swib.org/swib18/index.html

For le lieu imaginaire

Baptiste de Coulon
information management consultant

le lieu imaginaire
rue des Oeillets 14
2502 Bienne
Switzerland

+41 78 636 32 17
***@lelieuimaginaire.ch
https://lelieuimaginaire.ch
Yuri Astrakhan
2018-11-28 16:32:18 UTC
Permalink
I would add another very important aspect - query prefixes - to build some
cohesion within the Wikibase community.

Currently, WDQS hardcodes prefixes like "wd:" and "wdt:" to be based on the
"conceptUri" parameter. This means that any Wikibase installation that
has its own data would still use the well-recognized wd* style prefixes, but
they would not mean the same thing as they do for Wikidata, causing confusion.
This is especially important because in most cases, people will want to use
federated queries to join data from their own Wikibase instances with the
Wikidata one.

My project - sophox.org (OpenStreetMap data and metadata) - has set up an
additional set of prefixes that mirror the wd* ones -- osmd, osmdt, ...,
but users still have to override the default meaning of wd: to point back to
Wikidata, otherwise they cannot meaningfully use Wikidata federation.
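
To make the override described above concrete, here is a minimal Python sketch that builds a federated query with every prefix declared explicitly, so that none of them depends on the server's defaults. The Wikidata URIs and the query.wikidata.org endpoint are the public ones; the "loc"/"loct" prefixes, the wikibase.example.org URIs and the idea that a local property stores the Wikidata entity URI are assumptions made up for illustration, not part of any real installation.

```python
# Public Wikidata namespaces, declared explicitly so "wd:" always means Wikidata.
WIKIDATA_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}

# Hypothetical local Wikibase namespaces (illustration only).
LOCAL_PREFIXES = {
    "loc": "https://wikibase.example.org/entity/",
    "loct": "https://wikibase.example.org/prop/direct/",
}


def prefix_header(*prefix_maps):
    """Render PREFIX lines for every mapping, in the order given."""
    lines = []
    for prefix_map in prefix_maps:
        for name, uri in prefix_map.items():
            lines.append(f"PREFIX {name}: <{uri}>")
    return "\n".join(lines)


def federated_query(local_property, wikidata_property):
    """Join local items to Wikidata through a local property.

    Assumes the local property holds the Wikidata entity URI as its value.
    """
    body = f"""
SELECT ?localItem ?wikidataItem ?value WHERE {{
  ?localItem loct:{local_property} ?wikidataItem .
  SERVICE <https://query.wikidata.org/sparql> {{
    ?wikidataItem wdt:{wikidata_property} ?value .
  }}
}}"""
    return prefix_header(WIKIDATA_PREFIXES, LOCAL_PREFIXES) + body


if __name__ == "__main__":
    # P1 is a made-up local property; P31 is "instance of" on Wikidata.
    print(federated_query("P1", "P31"))
```

Spelling out the prefixes in the query is the workaround users have to apply by hand today; the point raised in the message is that the query service should not force this.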

_______________________________________________
Wikidata mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata
James Heald
2018-11-28 18:15:09 UTC
Permalink
It should also be made possible for the local wikibase to use local
prefixes other than 'P' and 'Q' for its own local properties and items,
otherwise it makes things needlessly confusing -- but currently I think
this is not possible.

-- James
Yuri Astrakhan
2018-11-28 18:30:13 UTC
Permalink
James, this would be possible the moment the Wikibase team accepts this as a
requirement. This is not a technical issue, it's a philosophical one.
I have written a patch that allows wikis to customize it very easily, but
alas, there has been no progress. Feel free to chime in.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/455480
Daniel Kinzler
2018-11-29 01:02:45 UTC
Permalink
Post by James Heald
It should also be made possible for the local wikibase to use local prefixes
other than 'P' and 'Q' for its own local properties and items, otherwise it
makes things needlessly confusing -- but currently I think this is not possible.
I think the opposite is the case: ending up with a zoo of prefixes, with items
being called A73834 and F0924095 and Q98985 and W094509, would be very
confusing. The current approach is the same one that RDF and XML
use: add a kind of namespace identifier in front of "foreign" identifiers. So
you would have Q437643 for "local" items, xy:Q8743 for items from xy,
foo:Q873287 for items from foo, etc. This is how foreign IDs are currently
implemented in Wikibase.
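
The namespace-style scheme described above can be sketched as a small parser. This is a simplified illustration in Python, not Wikibase's actual parsing code: the regular expression, the EntityId tuple and the rule that a missing prefix means "local" are assumptions based only on the examples in this message.

```python
import re
from typing import NamedTuple, Optional

# Optional repository prefix ("xy:"), one type letter, then a positive number.
ENTITY_ID = re.compile(
    r"^(?:(?P<repo>[A-Za-z][A-Za-z0-9]*):)?(?P<type>[A-Z])(?P<number>[1-9][0-9]*)$"
)


class EntityId(NamedTuple):
    repo: Optional[str]  # None stands for the local repository
    type_letter: str
    number: int


def parse_entity_id(text: str) -> EntityId:
    """Split an ID such as 'xy:Q8743' into repository, type letter and number."""
    match = ENTITY_ID.match(text)
    if match is None:
        raise ValueError(f"not an entity ID: {text!r}")
    return EntityId(match["repo"], match["type"], int(match["number"]))


if __name__ == "__main__":
    for raw in ("Q437643", "xy:Q8743", "foo:Q873287"):
        print(raw, "->", parse_entity_id(raw))
```

With this layout the type letter stays the same everywhere and only the part before the colon says where an entity lives.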
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Yuri Astrakhan
2018-11-29 02:14:13 UTC
Permalink
Daniel, it is not so clear-cut. Most users will not be exposed to a
"zoo". Case in point - OpenStreetMap. In OSM, the entire user base of
tens of thousands of people knows the meaning of Q123. The "Q" prefix has a
strong identity in itself. Anyone will instantly say - yes, it's a
Wikidata identifier attached to the majority of important OSM objects. So
whenever someone sees an object with the tag "wikidata=Q123" or
"brand:wikidata=Q123" or even "species:wikidata=Q123" they know that there
is a WD item describing this object, or the brand of this object (e.g. a
McDonald's store), or the tree species.
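
As an illustration of how tools typically consume these tags, here is a minimal Python sketch that collects the Q-numbers from the "wikidata" and "*:wikidata" keys of an OSM object. The function name, the "subject" role label and the sample tags are made up for this example; splitting on ";" assumes the usual OSM convention for multiple values.

```python
import re

QID = re.compile(r"^Q[1-9][0-9]*$")


def wikidata_references(tags):
    """Return {role: [Q-ids]} for the 'wikidata' and '*:wikidata' keys."""
    found = {}
    for key, value in tags.items():
        if key == "wikidata":
            role = "subject"  # the object itself
        elif key.endswith(":wikidata"):
            role = key[: -len(":wikidata")]  # e.g. "brand", "species"
        else:
            continue
        # Multiple values are assumed to be separated by ';'.
        ids = [part.strip() for part in value.split(";")]
        found[role] = [i for i in ids if QID.match(i)]
    return found


if __name__ == "__main__":
    tags = {"amenity": "fast_food", "wikidata": "Q123", "brand:wikidata": "Q456"}
    print(wikidata_references(tags))
```

Note that such tools only ever see the string "Q123": nothing in the value itself says which Wikibase it came from, which is the usability concern raised here.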

As Lydia said, Wikidata is a huge tree in a forest, overshadowing all other
trees. It is totally ok for both OSM and some genetics storage to use
the same prefix - there will be no confusion between the users of the two.
Yet both of them are likely to reference Wikidata itself. Keeping "Q" as
primarily a Wikidata identifier will help the users. That's why I call this
a philosophical debate - on one hand, there is a very real usability problem.
On the other, there is a philosophical dilemma - the best approach in a
hypothetical world.

Now that we also have Wikibase on the OSM wiki, all of the metadata about those
tags is also stored under Q numbers. So the "wikidata" key itself is Q827
[1]. Now let's say at some point we decide to store an item's "class" in the
OSM wiki, e.g. "item_class=Q123". How often do you think users will
mistake this Q123 for a Wikidata ID rather than an OSM wiki ID? This is almost
certain to cause confusion, especially among novice users, without
actually benefiting anyone except the philosophical "everything must be a
prefix". Note that unlike MediaWiki, there are hundreds of different tools
in OSM, and they do not share anything except key-value pairs. So it would
not be possible to make the same "smart" interface for each of them.
People will have to use Q123 as a string.

Lastly, up until this morning, the Query Service hardcoded wd:, wdt:, and
other prefixes to always mean "current wiki" (conceptUri), which obviously
was very confusing -- wd:Q123 had a different meaning depending on where you
ran it, and if you used a federated query with Wikidata itself, you had to
hardcode a new prefix into your query to revert the meaning of wd: back to
Wikidata's. Luckily, it wasn't too hard to fix, and I hope the patch will be
merged soon [2].

[1] https://wiki.openstreetmap.org/wiki/Item:Q827
[2] https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/476398
Federico Leva (Nemo)
2018-11-29 05:51:01 UTC
Permalink
Post by Yuri Astrakhan
The "Q" prefix has a strong identity in itself. Anyone will instantly
say - yes, it's a Wikidata identifier
But that's because most people only know one Wikibase installation, not
the other way around.

Federico
Yuri Astrakhan
2018-11-29 06:17:12 UTC
Permalink
Post by Federico Leva (Nemo)
The "Q" prefix has a strong identity in itself. Anyone will instantly
say - yes, it's a Wikidata identifier
But that's because most people only know one Wikibase installation, not
the other way around.
Of course! More specifically, at OSM those are the only Q-numbers people are
aware of. All other ID systems do not have nearly the same level of
recognition. It would be silly to wait for government agencies to switch
to Q-numbers too, right? Or to wait for 5-10 years until (and IF!) Q
numbers become more common at other projects that are large enough to
become well known, and use that potential future as a justification not to
use a much more convenient system for the next 10 years. The cost of those
10 years of "wait and see" is significant user confusion.
Imre Samu
2018-11-29 16:21:45 UTC
Permalink
Post by Yuri Astrakhan
More specifically, at OSM that's the only Q-numbers people are aware of.
I would like to share my use case ( sorry if it is sometimes off-topic )

I am:
- a member of Wikimédia Magyarország Egyesület (Wikimedia Hungary)
- an OSM meetup organizer
- in my mind: 'Q' == Wikidata ; 'Q' == Quality ( but these are
false associations )
- experienced in working with data warehousing / relational databases

For me, the Q/P prefix is like https://en.wikipedia.org/wiki/Hungarian_notation

"Hungarian notation aims to remedy this by providing the programmer with
explicit knowledge of each variable's data type."
but now I am not sure:
- What is the real meaning of the Q/P prefix -> Wikidata or Wikibase?

I am involved in some open geodata projects.
#1. adding Wikidata ID concordances to Natural Earth ( this is my work )
https://www.naturalearthdata.com/blog/miscellaneous/natural-earth-v4-1-0-release-notes/
#2. adding Wikidata ID concordances to https://whosonfirst.org/ ( Who's On
First is a gazetteer of places. )
#3. OSM

At first I tried SPARQL and the Wikidata Query Service.
My experience:
- more and more data ( like: Q486972, human settlement ) -> more
timeouts ( in my complex geo queries )
( a lot of farms were imported in the Netherlands area, so I have to limit the
search radius; ... )
- the data changes all the time, so it is hard to write and validate complex
program code.
After a few months, I learned that for heavy data users the Wikidata
Query Service is sometimes not ideal ( but it is good for light queries! )

So now I load the "Wikidata JSON dump" into a Postgres/PostGIS database
and write complex code in SQL.
My code is very complex ( jaro_winkler distance, geo distance,
detecting Cebuano imports, ranking multiple candidates for matching ).
And finally I can control the performance of the system ( no timeouts )
and I have reproducible results.
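
For readers who have not worked with the dump: it is one large JSON array with one entity per line, so it can be streamed line by line. The Python sketch below shows that idea only; the actual loader used here is not shown in this message, and the row layout (id, label, raw JSON) is an assumption that mirrors the wd_id, wd_label and data columns appearing in the SQL further down.

```python
import io
import json


def iter_dump_entities(lines):
    """Yield one entity dict per line of a Wikidata-style JSON dump."""
    for line in lines:
        line = line.strip().rstrip(",")  # entity lines end with a comma
        if line in ("[", "]", ""):
            continue  # skip the array brackets and blank lines
        yield json.loads(line)


def to_row(entity, lang="en"):
    """Build an (id, label, raw JSON) row; the label may be missing."""
    label = entity.get("labels", {}).get(lang, {}).get("value")
    return (entity["id"], label, json.dumps(entity))


if __name__ == "__main__":
    # Tiny stand-in for the real dump file (sample data is made up).
    sample = io.StringIO(
        '[\n'
        '{"id": "Q1", "labels": {"en": {"language": "en", "value": "first"}}},\n'
        '{"id": "Q2", "labels": {}}\n'
        ']\n'
    )
    for row in (to_row(e) for e in iter_dump_entities(sample)):
        print(row[0], row[1])
```

Rows of this shape could then be bulk-loaded into a table with a JSON column and queried with SQL helper functions.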

For example, here is a simple SQL query of mine - you can see a lot of P/Q codes
inside, and as you can expect, I now know a lot of Q/P codes by heart!

-- P625 = coordinate location, P518 = applies to part (used as a qualifier)
select
    wd_id
    ,wd_label
    ,get_wdcqv_globecoordinate(data,'P625','P518','Q1233637') as river_mouth
    ,get_wdcqv_globecoordinate(data,'P625','P518','Q7376362') as river_source
from wd.wdx
where wd_id='Q626';
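
The helper get_wdcqv_globecoordinate is the author's own function and its definition is not shown. Judging only from its name and arguments, it probably returns the coordinate of a claim whose qualifier has a given value; under that assumption, the same lookup against the entity JSON could be sketched in Python as follows. The function name and the sample coordinates are made up; the claims/mainsnak/qualifiers layout is that of the Wikidata JSON format.

```python
def coordinate_with_qualifier(entity, claim_property, qualifier_property, qualifier_item):
    """Return (lat, lon) of the first matching claim, or None.

    A claim matches when it carries qualifier_property with the item
    qualifier_item as its value.
    """
    for statement in entity.get("claims", {}).get(claim_property, []):
        qualifiers = statement.get("qualifiers", {}).get(qualifier_property, [])
        qualifier_ids = {
            q["datavalue"]["value"]["id"]
            for q in qualifiers
            if q.get("snaktype") == "value"
        }
        if qualifier_item not in qualifier_ids:
            continue
        mainsnak = statement["mainsnak"]
        if mainsnak.get("snaktype") != "value":
            continue  # "unknown value" / "no value" claims carry no coordinate
        value = mainsnak["datavalue"]["value"]
        return (value["latitude"], value["longitude"])
    return None


# Made-up sample entity; the coordinates are placeholders, not real data.
SAMPLE = {
    "id": "Q626",
    "claims": {
        "P625": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "datavalue": {"value": {"latitude": 1.0, "longitude": 2.0}},
                },
                "qualifiers": {
                    "P518": [
                        {"snaktype": "value", "datavalue": {"value": {"id": "Q1233637"}}}
                    ]
                },
            }
        ]
    },
}

if __name__ == "__main__":
    print(coordinate_with_qualifier(SAMPLE, "P625", "P518", "Q1233637"))
```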


And now the "Natural Earth" tables look like this ( relational database ):
+-------------+------------+-----------+
| name        | wikidataid | iata_code |
+-------------+------------+-----------+
| Birsa Munda | Q598231    | IXR       |
| Barnaul     | Q1858312   | BAX       |
| Bareilly    | Q2788745   |           |
+-------------+------------+-----------+

this is my current workflow.

But my real nightmare will start if other databases start using the Q/P
prefix,
for example, if other airport-related databases start using Wikibase with Q
codes:
- http://ourairports.com/ ;
- https://www.flightradar24.com/data/airports
- https://www.airnav.com/airports/

Then every airport would have at least 4 different Q codes!
And in the future, I would have to check for errors in a spreadsheet like this
( and sometimes I don't see the header ):
+-------------+------------+-----------+-------------+-----------+--------+
| name        | wikidataid | iata_code | ourairports | flightR24 | AirNav |
+-------------+------------+-----------+-------------+-----------+--------+
| Birsa Munda | Q598231    | IXR       | Q325324     | Q973      | Q1     |
| Barnaul     | Q1858312   | BAX       | Q42         | Q1        | Q8312  |
| Bareilly    | Q2788745   |           | Q1          | Q31       | Q45    |
+-------------+------------+-----------+-------------+-----------+--------+

Q1 - everywhere - with different meanings

And what if some users want to add the new airport IDs back to
Wikidata ( linking databases )? Why not?
So in the future, if I check https://www.wikidata.org/wiki/Q598231
I will see a lot of different Q codes:
Ourairports Q325324
FlightR24 Q973
AirNav Q1

And sometimes it is very hard to communicate to new contributors that
Q1(AirNav) =/= Q1(Wikidata)

If I see any database/spreadsheet
- and I see a Q code - my current expectation is that this is a Wikidata
code. :)
Just check: https://github.com/search?q=Q28+hungary&type=Code

So my current opinion:
- please don't use Q/P prefixes in any new/other databases!

For me, unlearning a lot of Q/P values is hard,
so as I gain more and more experience with the Wikidata data model, I want
less and less to use any other Wikibase systems with similar Q/P prefixes.


My other pain point is the "Wikidata JSON dump"; a little more
information would be a big help for me:

for detecting the data quality of items:
- last modification DateTime
- last modification user type ( anonym_user, new_user, experienced_user,
bot )
- edit counts by user type, for example: { anonym_user=2 , new_user=0 ,
experienced_user=0, bot=15 }
info about the Wikidata life cycle:
- Wikidata redirections / deletions ( now: only in the .ttl files )


I know - I am not a typical user ... and my problems are not a priority yet.

imho:

Integrating Wikidata IDs into other databases has already started ( OSM,
Natural Earth, Who's On First , ... )
and some guidelines/support are needed for these cases - before it is too late.
Probably the current practice ( OSM, Natural Earth, Who's On First , ...
) is not optimal.
A few months ago I learned an extremely painful lesson:
https://phabricator.wikimedia.org/T202676#4533486
quote>>>

- "Q" does not mean "wikidata.org". It means "item"
and is used by all Wikibase installations so far.
- "Retroactively "reserving" the letter "Q" to be exclusively used by
wikidata.org can't work. It was never meant to be
like this, and there is no mechanism for this."
- "Q" only means "wikidata.org" to users who know
about wikidata.org. These users should not have a
problem understanding that the moment an OSM Wikibase installation exists,
"osm:Q1" refers to this installation.

<<<quote

So now I am totally confused.

Probably my current practice is a "bad practice"? :(
And should the "Natural Earth" Wikidata integration add a "wd:" prefix
everywhere?
But maybe it is too late to change.
+-------------+-------------+-----------+
| name        | wikidataid  | iata_code |
+-------------+-------------+-----------+
| Birsa Munda | wd:Q598231  | IXR       |
| Barnaul     | wd:Q1858312 | BAX       |
| Bareilly    | wd:Q2788745 |           |
+-------------+-------------+-----------+
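
If such a migration were attempted, the conversion itself would be mechanical. Below is a minimal Python sketch, assuming each column is known to come from one source; the function name and the column-to-prefix mapping are made up, and the "ourairports" Q code is the hypothetical one from the table earlier in this message.

```python
import re

BARE_ID = re.compile(r"^[A-Z][1-9][0-9]*$")


def qualify(identifier, source):
    """Turn a bare 'Q598231' into 'wd:Q598231'; leave other values untouched."""
    if BARE_ID.match(identifier):
        return f"{source}:{identifier}"
    return identifier  # already prefixed, empty, or not an entity ID


def qualify_row(row, column_sources):
    """Prefix every ID column of a row with the source it belongs to."""
    return {
        column: qualify(value, column_sources[column]) if column in column_sources else value
        for column, value in row.items()
    }


if __name__ == "__main__":
    row = {"name": "Birsa Munda", "wikidataid": "Q598231", "ourairports": "Q325324"}
    sources = {"wikidataid": "wd", "ourairports": "ourairports"}
    print(qualify_row(row, sources))
```

The harder part is not the conversion but every downstream consumer that already expects bare Q-numbers.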


this is my retrospective, thank you for reading.


best,
Imre
Daniel Kinzler
2018-11-29 17:41:58 UTC
Permalink
Post by Imre Samu
- What is the real meaning of Q/P prefix -> Wikidata or Wikibase?
The intention was:

P and Q indicate the *type* of the entity ("P" = "Property", "Q" = "Item" (for
arcane reasons), "L" = Lexeme, "F" = Form, "S" = Sense, "M" = MediaInfo). As you
can tell, we'd quickly run out of letters and cause confusion if this became
configurable.

Using prefixes to indicate where the entity comes from is indeed useful and is
already part of the model. The prefix for Wikidata is "wd:", so "wd:Q12345" is
an item from Wikidata. The prefix can be omitted for local entities, so Q12345
is an item on the local repo (or the default repo of a wikibase client).
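
The two roles described above, a letter for the entity type and a prefix for the origin, can be put side by side in a short Python sketch. The type table is taken from this message; the local base URI (wikibase.example.org) and the describe function are assumptions for illustration only.

```python
# Type letters as listed in this message.
ENTITY_TYPES = {
    "P": "Property",
    "Q": "Item",
    "L": "Lexeme",
    "F": "Form",
    "S": "Sense",
    "M": "MediaInfo",
}

# Repository prefix -> base URI. None stands for "no prefix", i.e. local.
REPOSITORIES = {
    "wd": "http://www.wikidata.org/entity/",
    None: "https://wikibase.example.org/entity/",  # hypothetical local repo
}


def describe(entity_id):
    """Return (entity type, full URI) for IDs like 'wd:Q12345' or 'Q12345'."""
    repo, _, local_id = entity_id.rpartition(":")
    base = REPOSITORIES[repo or None]  # empty prefix means local
    return ENTITY_TYPES[local_id[0]], base + local_id


if __name__ == "__main__":
    print(describe("wd:Q12345"))
    print(describe("Q12345"))
```

Both calls report an "Item"; only the URI differs, which is the distinction between type and origin drawn in this message.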
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Yuri Astrakhan
2018-11-29 18:40:46 UTC
Permalink
Daniel,
Post by Daniel Kinzler
P and Q indicate the *type* of the entity ("P" = "Property", "Q" = "Item" (for
arcane reasons), "L" = Lexeme, "F" = Form, "S" = Sense, "M" = MediaInfo). As you
can tell, we'd quickly run out of letters and cause confusion if this became
configurable.
I don't think this would cause confusion, because the lexicographical
project is really a separate project that just happens to reside on the
same Wikidata domain. Essentially you did internally what we are asking for
on other sites - you mixed two projects, and kept them distinct by using
different prefixes. If at some point you decide to add some new area of
data, e.g. biological, you could add new prefixes for that too, but that
would also be a "separate" project.

Most other sites that link to Wikidata only care about one of those
projects. E.g. OSM would have very little interest in lexical data, so it
is OK if the "L" prefix is used both in OSM and in WD, because it won't be as
confusing to the users as reusing the Q.
Post by Daniel Kinzler
The prefix can be omitted for local entities, so Q12345
is an item on the local repo (or the default repo of a wikibase client).
I think that was a big mistake -- the "(or the default repo of a wikibase
client)" -- because wd implies Wikidata, not Wikibase, so it dilutes the
meaning of "wd:". See my other email on how I fixed it.
Daniel Kinzler
2018-11-29 21:36:44 UTC
Permalink
On 29.11.18 at 10:40, Yuri Astrakhan wrote:
Post by Yuri Astrakhan
If at some point you decide to add some new area of data, e.g. biological, you could
add new prefixes for that too, but that would also be a "separate" project.
The Q, P, L, M, etc. are used to identify the *type* of entity. They are not for
keeping projects separate. That was never their purpose. Wikibase uses prefixes
for that, but they are placed *before* the letter that indicates the type.
Post by Yuri Astrakhan
Post by Daniel Kinzler
The prefix can be omitted for local entities, so Q12345
is an item on the local repo (or the default repo of a wikibase client).
I think that was a big mistake -- the "(or the default repo of a wikibase
client)" -- because wd implies Wikidata, not Wikibase, so it dilutes the
meaning of "wd:". See my other email on how I fixed it.
I'm confused - yes, wd: should ALWAYS imply Wikidata. Your Wikibase instance
would have its own prefix (that can be omitted for local use), e.g. "osm:".

For the record, I'm just voicing my opinion here, and telling you what the
original intention was. I'm no longer working on Wikidata or Wikibase, and I
can't make any decisions on any of this.
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Stas Malyshev
2018-11-29 22:01:31 UTC
Permalink
Hi!
Post by Yuri Astrakhan
I don't think this would cause a confusion, because the lexicographical
project is really a separate project that just happens to reside on the
same Wikidata domain. Essentially you did internally what we are asking
No, the difference here is that L items are not the same as Q items -
e.g. L items do not have sitelinks, and do have lemmas and senses. The data
structure is different. If you use a different data structure than Q items
- i.e., no labels, descriptions, sitelinks, etc. - then you should use a
different letter. But if it's the same structure, just for a different
domain - then it should be Q.
Post by Yuri Astrakhan
Most other sites that link to Wikidata only care about just one of those
projects. E.g. OSM would have very little interest in lexical data, so
it is OK if "L" prefix would be used in OSM and in WD because it won't
be as confusing to the users as reusing the Q.
No, that would be confusing. If OSM wants its own data type, because the Q item
does not fit - e.g. OSM doesn't want descriptions and sitelinks - then
it should use a separate letter, like MediaInfo uses M. But using L
would not be smart, since then this data would not integrate well with
lexicographical data.
--
Stas Malyshev
***@wikimedia.org
Olaf Simons
2018-11-29 07:53:01 UTC
Permalink
What is more problematic than the p/q business:

If I run a SPARQL search at our endpoint - such as this one:

https://database.factgrid.de/query/#SELECT%20%3FIlluminatenorden%20%3FIlluminatenordenLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FIlluminatenorden%20wdt%3AP91%20wd%3AQ10677.%0A%7D

I will receive answers in the form of

wd:q25

but they do not link to wd, Wikidata, but into our database https://database.factgrid.de/entity/Q25.

The same problem exists in the other direction: if our users have never seen a SPARQL search in their lives (and that's 100% of them) and they now click on the sample queries, they will get Wikidata sample queries which do not work on our database, just as our P and Q numbers do not match.
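
The query in the link above is hard to read in its URL-encoded form. A few lines of Python, using only the standard library, decode the part after the "#" into plain SPARQL; the URL is copied verbatim from this message.

```python
from urllib.parse import unquote, urlsplit

# URL copied from the message; the query is carried in the fragment.
URL = (
    "https://database.factgrid.de/query/#SELECT%20%3FIlluminatenorden%20"
    "%3FIlluminatenordenLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel"
    "%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D"
    "%2Cen%22.%20%7D%0A%20%20%3FIlluminatenorden%20wdt%3AP91%20wd%3AQ10677.%0A%7D"
)


def decode_query(url):
    """Return the SPARQL text stored in the URL fragment."""
    return unquote(urlsplit(url).fragment)


if __name__ == "__main__":
    print(decode_query(URL))
```

The decoded text contains the pattern "?Illuminatenorden wdt:P91 wd:Q10677." with no PREFIX declarations, so the meaning of wdt: and wd: is left entirely to the query service that runs it.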

Olaf
Dr. Olaf Simons
Forschungszentrum Gotha der Universität Erfurt
Schloss Friedenstein, Pagenhaus
99867 Gotha

Büro: +49-361-737-1722
Mobil: +49-179-5196880

Privat: Hauptmarkt 17b/ 99867 Gotha
Andra Waagmeester
2018-11-29 08:44:49 UTC
Permalink
I fully agree. I would rather see the scarce development resources focused
on fixing this than on the p/q business, as you nicely call it. Tbh, I really
don't see an issue with multiple p's and q's across different Wikibases. That
is what prefixes are for: to distinguish between different resources.
Examples of identical identifier (literal) schemes shared between multiple
resources are abundant (e.g. PubMed and NCBI gene). It really is a matter
of getting used to it, or am I missing something?
Post by Olaf Simons
https://database.factgrid.de/query/#SELECT%20%3FIlluminatenorden%20%3FIlluminatenordenLabel%20WHERE%20%7B%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%3FIlluminatenorden%20wdt%3AP91%20wd%3AQ10677.%0A%7D
I will receive answers in the form of
wd:q25
but they do not lenk to wd, wikidata, but into our database
https://database.factgrid.de/entity/Q25.
The same problem in the other direction: If our users have never seen a
SPARQL search in their lives (and that's 100%) and if they now click at
sample queries - they will qet Wikidata sample queries which do not work on
our database - just as our P and Q numbers do not match.
Olaf
Post by Daniel Kinzler
Post by James Heald
It should also be made possible for the local wikibase to use local
prefixes
Post by Daniel Kinzler
Post by James Heald
other than 'P' and 'Q' for its own local properties and items,
otherwise it
Post by Daniel Kinzler
Post by James Heald
makes things needlessly confusing -- but currently I think this is not
possible.
Post by Daniel Kinzler
I think the opposite is the case: ending up with a zoo of prefixes, with
items
Post by Daniel Kinzler
being called A73834 and F0924095 and Q98985 and W094509, would be very
confusing. The current approach is to to use the same approach that RDF
and XML
Post by Daniel Kinzler
use: add a kind of namespace identifier in front of "foreign"
identifiers. So
Post by Daniel Kinzler
you would have Q437643 for "local" items, xy:Q8743 for items from xy,
foo:Q873287 for items from foo, etc. This is how foreign IDs are
currently
Post by Daniel Kinzler
implemented in Wikibase.
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
_______________________________________________
Wikidata mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata
Dr. Olaf Simons
Forschungszentrum Gotha der Universität Erfurt
Schloss Friedenstein, Pagenhaus
99867 Gotha
Büro: +49-361-737-1722
Mobil: +49-179-5196880
Privat: Hauptmarkt 17b/ 99867 Gotha
Lydia Pintscher
2018-11-29 09:00:57 UTC
Permalink
I fully agree. I'd rather see the scarce development resources focused on fixing this than on the p/q business, as you nicely call it. Tbh, I really don't see an issue with multiple p's and q's over different Wikibases. That is what prefixes are for: to distinguish between different resources. Examples of identical identifier (literal) schemes between multiple resources are abundant (e.g. PubMed and NCBI gene). It really is a matter of getting used to it, or am I missing something?
Are we talking about https://phabricator.wikimedia.org/T194180? I'm
happy to push that into one of the next sprints if so.


Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
Yuri Astrakhan
2018-11-29 17:53:52 UTC
Permalink
Olaf, Andra, Lydia,

On Thu, Nov 29, 2018 at 4:01 AM Lydia Pintscher <
Post by Lydia Pintscher
Are we talking about https://phabricator.wikimedia.org/T194180? I'm
happy to push that into one of the next sprints if so.
I think the patch I submitted yesterday fixes this issue on the server side, without
touching the frontend -- all you need to do is set the prefixes.conf file
to point "wd:" to the original Wikidata prefixes, set conceptUri to your
schema, and add your own prefixes. Here's an example of an OSM prefixes
configuration. Instead of "wd:" I used "osmd:". Similarly, I replaced the "w" in
all other prefixes with "osm", and added "osm" where there was no "w":

OSM prefixes.conf:
https://github.com/Sophox/wikidata-query-rdf/blob/master/dist/src/script/prefixes.conf
Patch: https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/476398
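To illustrate the idea (this is not the prefixes.conf format itself), here is a minimal Python sketch of such a prefix table: "wd:" and "wdt:" keep their Wikidata meaning while local data gets its own prefixes. The two osm* namespace URIs and the function names are assumptions made for the example.

```python
# Prefix table in the spirit of the setup described above.
PREFIXES = {
    # Well-known Wikidata namespaces.
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    # Assumed local namespaces; use your own instance's concept URI here.
    "osmd": "https://wiki.openstreetmap.org/entity/",
    "osmdt": "https://wiki.openstreetmap.org/prop/direct/",
}


def expand(curie: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as 'wd:Q25' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


def shorten(uri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full URI back into a prefixed name, if a namespace matches."""
    # Try longer namespaces first so the most specific one wins.
    for prefix, ns in sorted(prefixes.items(), key=lambda kv: -len(kv[1])):
        if uri.startswith(ns):
            return f"{prefix}:{uri[len(ns):]}"
    return uri  # no known namespace: leave the URI untouched


print(expand("wd:Q25"))
print(shorten("https://wiki.openstreetmap.org/entity/Q2"))
```

With distinct prefixes per namespace, the same local ID (for example Q2) stays unambiguous in query results, because each prefix maps to exactly one base URI.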
Daniel Kinzler
2018-11-29 18:03:37 UTC
Permalink
Post by Lydia Pintscher
I fully agree. I'd rather see the scarce development resources focused on fixing this than on the p/q business, as you nicely call it. Tbh, I really don't see an issue with multiple p's and q's over different Wikibases. That is what prefixes are for: to distinguish between different resources. Examples of identical identifier (literal) schemes between multiple resources are abundant (e.g. PubMed and NCBI gene). It really is a matter of getting used to it, or am I missing something?
Are we talking about https://phabricator.wikimedia.org/T194180? I'm
happy to push that into one of the next sprints if so.
This doesn't fix the hard-coded prefix in the RDF output generated by Wikibase.
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Yuri Astrakhan
2018-11-29 18:24:05 UTC
Permalink
Post by Daniel Kinzler
This doesn't fix the hard-coded prefix in the RDF output generated by Wikibase.
See my previous email - my patch fixes that too. Here's an example query
http://tinyurl.com/yav76uof in Sophox -- it calls out to Wikidata to get a
list of large cities (using wd: and wdt: prefixes), then it matches them
with OSM objects (using data from the custom OSM importer), and also adds
the metadata item stored in OSM Wiki (osmd prefix). All result links are
clickable.

And yes, I had to add OSM prefixes to the GUI too so that it wouldn't show
them as long URIs.
Daniel Kinzler
2018-11-29 15:47:05 UTC
Permalink
Post by Olaf Simons
I will receive answers in the form of
wd:q25
but they do not link to wd (Wikidata), but into our database https://database.factgrid.de/entity/Q25.
Right, that prefix should not be "wd" for your own query service. I'm afraid
that's currently hard coded in the RdfVocabulary class. That should indeed be
fixed.
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Erik Paulson
2018-12-02 01:28:10 UTC
Permalink
How do these external identifiers work, and how do I get something into one
of these namespaces? (I apologize if I have missed them in the
documentation)

If I stand up my own wikibase with the Docker containers and create an item
for the Mayor of Madison, WI - let's say that creates Q2 in my local
wikibase - it will be accessible via http://localhost:8181/wiki/Item:Q2

Is there some way I could create an item in my local wikibase that would
have a URL of http://localhost:8181/wiki/Item:wd:Q16107138, representing
Paul Soglin back in Wikidata but stored in my local wikibase, that I
can reference from other properties in my wikibase? If I create my own entry
Q3 for Madison, it'd be nice to be able to point 'head of government' to
wd:Q16107138 and use it in my local wikibase and mediawiki
instance, so that if I have a page for Madison as well as a wikibase entry in
the local install, I can dereference it in the wiki markup.

Or are those namespace identifiers like wd: (and xy: or foo: or whatever
namespace) only in the WDQS for making calls out to SERVICE bits in SPARQL,
plus whatever the WDQS exporter generates for local RDF?

(Also my apologies if wikibase doesn't work like this at all and I've so
badly interpreted what Daniel is saying that I'm about to throw the whole
conversation in a dead-end direction)

Thanks!
Post by Yuri Astrakhan
Post by James Heald
It should also be made possible for the local wikibase to use local prefixes
other than 'P' and 'Q' for its own local properties and items, otherwise it
makes things needlessly confusing -- but currently I think this is not possible.
I think the opposite is the case: ending up with a zoo of prefixes, with items
being called A73834 and F0924095 and Q98985 and W094509, would be very
confusing. The current approach is to use the same approach that RDF and XML
use: add a kind of namespace identifier in front of "foreign" identifiers. So
you would have Q437643 for "local" items, xy:Q8743 for items from xy,
foo:Q873287 for items from foo, etc. This is how foreign IDs are currently
implemented in Wikibase.
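As an illustration of the naming scheme described above, here is a minimal Python sketch that splits an ID into a repository name and a local ID. It is not Wikibase's actual parser; the EntityRef type, the function name, and the "one letter plus number" pattern are assumptions made for the example.

```python
import re
from typing import NamedTuple


class EntityRef(NamedTuple):
    repository: str  # "" means the local repository
    local_id: str    # e.g. "Q8743"


# Assumed shape of a local ID: one uppercase letter followed by a number.
_LOCAL_ID = re.compile(r"^[A-Z][1-9][0-9]*$")


def parse_entity_id(serialization: str) -> EntityRef:
    """Split 'xy:Q8743' into ('xy', 'Q8743'); 'Q437643' stays local."""
    repository, _, local_id = serialization.rpartition(":")
    if not _LOCAL_ID.match(local_id):
        raise ValueError(f"not a valid entity ID: {serialization!r}")
    return EntityRef(repository, local_id)


for text in ("Q437643", "xy:Q8743", "foo:Q873287"):
    print(text, "->", parse_entity_id(text))
```

The point of the scheme is that the local part keeps the familiar P/Q form everywhere, and only the repository name in front tells foreign items apart from local ones.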
--
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation
Lydia Pintscher
2018-11-28 21:36:46 UTC
Permalink
Hi Baptiste,

On Wed, Nov 28, 2018 at 2:25 PM Baptiste de Coulon (le lieu
Post by Baptiste de Coulon (le lieu imaginaire)
Hello,
In the pre-conference of SWIB18 [1], Stacy Allison-Cassin and Dan Scott led a great workshop yesterday on "Wikibase: configure, customize, and collaborate".
Among other things, the panel discussion showed great interest in a decentralized way of using Wikidata through a network of Wikibase instances.
* The Wikibase instance on Docker has to be updated to the current version of the software.
* A user community has to be built and remain in close contact with the development team.
* Efficient import and export scripts between Wikidata and Wikibase have to be completed.
* Connecting properties have to be developed so that the instances can interoperate.
Does the Wikidata community agree with this proposal? And the development team?
The dev team is committed to a strategy that is about building an
ecosystem around Wikidata. This means making Wikibase more usable and
useful outside Wikimedia. We have put words and work into this and we
will continue to do so. It matters to me that Wikidata is not a single
oasis in a big desert but a big tree in a flourishing jungle. We want
much more data to be open, machine-readable and accessible but it
doesn't have to all be and shouldn't all have to be on Wikidata.
Post by Baptiste de Coulon (le lieu imaginaire)
Is it necessary to open a second mailing list dedicated to Wikibase?
There is one already for the Wikibase user group at
https://lists.wikimedia.org/mailman/listinfo/wikibaseug that we are
using.
Post by Baptiste de Coulon (le lieu imaginaire)
Where is the best place to discuss all these things?
That mailing list is fine :)


Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.