[Developers] multiple create unless_exists in one query

Alec Flett alecf at metaweb.com
Thu Dec 20 17:58:11 UTC 2007


Arthur van Hoff wrote:
> It is fair to say that there ought to be an easy and obvious way to
> handle multiple inserts (in the same query) of the same topic.
>
>   
This is actually a pretty significant request - I can see how it would 
make your life a lot easier if MQL could just "do the right thing", but 
ultimately I'm not sure it's clear what the right thing is in the 
general case, because it's not clear which part of the query MQL should 
resolve first.

Even if we did have some very specific, deterministic rules about the 
order of resolution, those rules would probably have to be quite complex 
and it would be much harder for a developer to wrap their brain around 
those rules. Kurt was able to explain the current rules in one sentence 
("all verifying reads ...are done before any writes are done") and 
that's a huge win for the usability of MQL.
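
To make that one-sentence rule concrete, here is a rough Python sketch of a
store that does all of its verifying reads before any writes. It is a toy
model only, not how MQL is implemented: ToyStore, its write() method and the
"/toy/N" ids are made up for illustration, and matching is reduced to type
and name. The foo2 clause is the one from the example quoted further down.

```python
import itertools


class ToyStore:
    """In-memory stand-in for a graph store; hypothetical, not the MQL service."""

    def __init__(self):
        self.objects = []
        self._next_id = itertools.count(1)

    def _find(self, clause):
        # Verifying read: match on type and name only (a simplification).
        for obj in self.objects:
            if obj["type"] == clause["type"] and obj["name"] == clause["name"]:
                return obj
        return None

    def write(self, clauses):
        # Phase 1: every verifying read sees the store as it was before this write.
        found = [self._find(clause) for clause in clauses]
        # Phase 2: only now are the missing objects created.
        results = []
        for clause, existing in zip(clauses, found):
            if existing is None:
                existing = {"id": "/toy/%d" % next(self._next_id),
                            "type": clause["type"], "name": clause["name"]}
                self.objects.append(existing)
                results.append({"create": "created", "id": existing["id"]})
            else:
                results.append({"create": "existed", "id": existing["id"]})
        return results


foo2 = {"type": "/user/avh/default_domain/foo", "name": "foo2"}

one_query = ToyStore()
both = one_query.write([foo2, foo2])      # two identical clauses, one query

two_queries = ToyStore()
first = two_queries.write([foo2])         # the same clause, sent twice
second = two_queries.write([foo2])
```

In this toy, the single query with two identical clauses creates two objects,
because neither read can see the other clause's pending write. Sending the
clause as two separate queries reports "created" and then "existed" for the
same id, which is the behavior described in this thread.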


Here's an in-depth look at it. Given the write query:

{"id": null,
 "name": "Fred",
 "type": "/people/person",
 "create": "unless_exists",
 "children": {"id": null,
              "name": "Fred's Kid",
              "create": "unless_exists",
              "parents": {"id": null,
                          "name": "Fred",
                          "create": "unless_exists"}}}

There is some ambiguity here. The first reference to "Fred" is 
looking for a person "Fred" with a potential child of "Fred's Kid". The 
2nd reference to "Fred" is also looking for a person with a potential 
child of "Fred's Kid". In both cases the parent/child link is actually a 
constraint on the unless_exists.

So the real question becomes, do we resolve the first Fred, then assume 
he has been created, then Fred's Kid, then Fred again? How does MQL 
recognize that these are the 'same' person? Really, it would be because 
they have the same constraints, and you'd have a sort of circular 
dependency.

Let's say we resolve these in a depth-first traversal of the query. When 
we resolve the 2nd Fred, do we have to also account for the fact that 
Fred's Kid is now attached to the Fred that we just created, so that 
that constraint is met? What if "Fred" existed, but "Fred's Kid" did 
not? Would the 2nd reference to Fred match or not match the existing 
"Fred"?

And what if MQL did a breadth-first search resolution in a wider query 
instead of depth-first, with redundant references to the same uncreated 
object? That could result in a different set of ambiguities about how 
the unless_exists are resolved.

Or, do we decide that we're going to try to resolve all /people/person's 
named "Fred" first? In that case, you'd probably get today's behavior 
since no Fred is attached to any Fred's Kid, and neither /people/person 
would match. But what if by some luck we tried to resolve "Fred's Kid" 
first, and then "Fred" - would all the constraints match?
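
The order dependence can be sketched in a few lines of Python. This is purely
a thought experiment, not anything MQL does: it assumes a hypothetical
resolver that handles one clause at a time, writes each result immediately,
and attaches a parent/child link as soon as both ends of it are resolved. The
clause labels (outer_fred, kid, inner_fred) are made up to name the three
clauses of the Fred query above.

```python
def count_freds(order):
    """Resolve the three clauses one by one in `order`; return how many Freds exist."""
    # Clause label -> (object name, name of the neighbour it must be linked to)
    clauses = {
        "outer_fred": ("Fred", "Fred's Kid"),
        "kid": ("Fred's Kid", "Fred"),
        "inner_fred": ("Fred", "Fred's Kid"),
    }
    # Parent/child links implied by the query: (parent clause, child clause)
    edges = [("outer_fred", "kid"), ("inner_fred", "kid")]

    names = {}      # object id -> name
    links = set()   # (parent id, child id)
    resolved = {}   # clause label -> object id

    def neighbour_names(obj_id):
        found = set()
        for parent, child in links:
            if parent == obj_id:
                found.add(names[child])
            if child == obj_id:
                found.add(names[parent])
        return found

    for label in order:
        name, required = clauses[label]
        # unless_exists: reuse an object only if name AND link constraint match.
        match = next((i for i, n in names.items()
                      if n == name and required in neighbour_names(i)), None)
        if match is None:
            match = len(names) + 1
            names[match] = name
        resolved[label] = match
        # Assumed rule: write a link as soon as both of its ends are known.
        for parent, child in edges:
            if parent in resolved and child in resolved:
                links.add((resolved[parent], resolved[child]))

    return sum(1 for n in names.values() if n == "Fred")


depth_first = count_freds(["outer_fred", "kid", "inner_fred"])
freds_first = count_freds(["outer_fred", "inner_fred", "kid"])
kid_first = count_freds(["kid", "outer_fred", "inner_fred"])
```

Under these assumptions, starting from an empty store, the depth-first and
kid-first orders end up with one Fred, while resolving both Freds first ends
up with two. Same query, different graph, depending only on an ordering the
developer never wrote down.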

Hopefully this illustrates that the developer would need a more 
in-depth knowledge of MQL's resolution ordering than one does today, 
and someone would still be posting here, just with a different confusion 
about which clauses get resolved first. With today's behavior, there isn't 
much to understand: all clauses get resolved "simultaneously" and then 
the write is done.

I think this is one of those areas where, on a query-by-query basis, a 
human can easily decide what makes the most sense, but from MQL's 
perspective, there is a lot more ambiguity here.

Alec
 
> This is not perfect, but it will work, and should not affect
> performance significantly.
>
> 								Kurt :-)
>
>  
>   
>> Below is an even simpler example that illustrates the problem using
>> "create":"unless_exists" at the top level. It creates multiple foo2s if
>> no foo2 existed. Perhaps this can be fixed so that both cases produce a
>> single foo2, which seems the most natural outcome.
>>
>> "query": [{
>>     "type": "/user/avh/default_domain/foo",
>>     "name": "foo2",
>>     "create":"unless_exists",
>>     "id":null
>> },{
>>     "type": "/user/avh/default_domain/foo",
>>     "name": "foo2",
>>     "create":"unless_exists",
>>     "id":null
>> }]
>>     
>
> Just do this query twice:
>
> {
>   "query" : {
>     "create" : "unless_exists",
>     "id" : null,
>     "name" : "foo002",
>     "type" : "/user/avh/default_domain/foo"
>   }
> }
>
> 1st RESULT:
> {
>   "code" : "/api/status/ok",
>   "result" : {
>     "create" : "created",
>     "id" : "/guid/9202a8c04000641f8000000006e601c3",
>     "name" : "foo002",
>     "type" : "/user/avh/default_domain/foo"
>   }
> }
>
> 2nd RESULT:
>
> {
>   "code" : "/api/status/ok",
>   "result" : {
>     "create" : "existed",
>     "id" : "/guid/9202a8c04000641f8000000006e601c3",
>     "name" : "foo002",
>     "type" : "/user/avh/default_domain/foo"
>   }
> }
>
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers


