Markus Kroetzsch
2018-11-25 14:51:15 UTC
Hi,
I am puzzled by the behaviour of a SPARQL query. Maybe there is an error
with BlazeGraph here, but hopefully I am just overlooking something.
The query is as follows: http://tinyurl.com/y95jpmhq
SELECT ?item ?birthdate ?spouse
WHERE
{
  { ?item wdt:P569 ?birthdate
    FILTER (year(?birthdate)>1900)
    ?item wdt:P26 []
  } OPTIONAL {
    ?item wdt:P26 ?spouse
    FILTER (year(?birthdate) = 1947)
  }
  # FILTER (year(?birthdate) = 1947) ## For testing: works correctly
} LIMIT 1000
What this should do: "Select married people born after 1900, and,
optionally, also select their spouses, but only for people born in
1947." What BlazeGraph does is: "Select married people born after 1900;
never select any spouses, even if the person is born in 1947".
The 1000 results should include rows for people born in 1947, so you can
see that ?spouse is unbound for them. The commented-out filter at the
bottom can be used instead of the inner filter to verify that the
condition has no typos and really matches some of the items.
It seems that BlazeGraph gets the scope of ?birthdate wrong here: it
appears to process the whole query inside out, applying the FILTER to the
optional pattern alone (where ?birthdate is not bound) and then using the
(empty) result in a binary LeftJoin operation. According to the spec,
LeftJoin in the SPARQL algebra is a ternary operator that applies the
FILTER to the Join of both sides to determine if we have an optional
match or not:
* See "Definition: LeftJoin" in Section 18.5 of the spec [1].
Filters within optional patterns become the third parameter in the
LeftJoin operation when translating queries as in my example:
* See example "{ ?s :p1 ?v1 OPTIONAL {?s :p2 ?v2 FILTER(?v1<3) } }" in
Section 18.2.3 of the spec [1].
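To make the difference concrete, here is a minimal Python sketch of the
two evaluation orders. It is only an illustration under simplifying
assumptions: solutions are plain dicts, ?birthdate is reduced to a year,
the data and the helper names (compatible, left_join, born_1947) are made
up, and the "inside out" variant is my guess at what BlazeGraph does, not
its actual code.

```python
# Toy model of SPARQL's ternary LeftJoin(left, right, expr) versus an
# "inside out" evaluation that filters the OPTIONAL part on its own.
# All names and data here are hypothetical and for illustration only.

def compatible(m1, m2):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())

def left_join(left, right, expr):
    """LeftJoin in the sense of Section 18.5: expr is tested on the merged
    solution; a left solution with no passing match is kept unextended."""
    out = []
    for m1 in left:
        merged = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        kept = [m for m in merged if expr(m)]
        out.extend(kept if kept else [m1])
    return out

def born_1947(m):
    """Filter condition; an unbound ?birthdate makes the filter fail."""
    return m.get("birthdate") == 1947

# Left side: married people with birth years. Right side: spouse triples.
left = [{"item": "a", "birthdate": 1947}, {"item": "b", "birthdate": 1950}]
right = [{"item": "a", "spouse": "x"}, {"item": "b", "spouse": "y"}]

# Per the spec: the filter is the third argument of LeftJoin.
by_spec = left_join(left, right, born_1947)

# Inside out: filter the optional part first, where ?birthdate is unbound,
# then do a plain LeftJoin with the (empty) result.
inside_out = left_join(left, [m for m in right if born_1947(m)], lambda m: True)
```

With this toy data, by_spec binds ?spouse only for the item born in 1947,
while inside_out returns the left side unchanged, which matches the
behaviour I observe.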
Is my interpretation correct or did I overlook something? Is this a
known problem?
Cheers,
Markus
[1] https://www.w3.org/TR/sparql11-query
--
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Center for Advancing Electronics Dresden (cfaed)
Faculty of Computer Science
TU Dresden
+49 351 463 38486
https://kbs.inf.tu-dresden.de/