[Data-modeling] Recent changes, especially with regard to bases

Kirrily Robert kirrily at metaweb.com
Sat Nov 1 00:34:27 UTC 2008


I just posted a blog post on blog.freebase.com that explains some of  
the changes which have occurred with our latest release, especially in  
regard to bases.

The following is an email that Al Marks sent to an internal mailing  
list, and which he gave me permission to repost to these external  
lists. I think it gives a good technical overview of what's going on.

K.


[Al's words below]

I've learned some technical things this week that seem worth sharing,
especially with the data team, related to the latest awesome
freebase.com upgrade. These changes were designed to make the site
easier for consumers visiting it, but they can be a bit
unintuitive to those of us coming from the data side.

Saved views have replaced types as the primary mechanism for  
organizing data in our UI. The tiles shown on the homepage for bases  
and commons domains are views, not types, although each base is
pre-populated with views for all non-CVT types. Before this release, views
were a minor UI feature; now "everything is a view".

Views are great in that they discourage schema denormalization when  
building bases by encouraging users to make use of existing types and  
properties. Instead of creating a type "Yale Person", Robert can
design a view using commons properties (Person->Education->Yale), save  
it to his Yale base, and feature it prominently. In this way, bases  
are as much about highlighting windows into existing data as they are  
about building new data models.

A view is a node of type /freebase/query. Every view has a
/freebase/query_hints/related_type property, which links to the type
that the view is filtering. Optionally, a view also has a
/common/document/content link to a JSON document, whose contents are
injected into the MQL query. Views use an annotated subset of MQL;
clauses the parser doesn't understand are thrown out, so building
views manually is unlikely to work.
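
To make the shape of a view a little more concrete, here is a minimal sketch in Python of an MQL-style read query that would look up the saved views filtering a given type. It uses only the type and property ids named above; the helper name views_for_type and the example type id /people/person are assumptions for illustration, and the query is only built and printed here, not sent to any service.

```python
import json


def views_for_type(related_type_id):
    """Build an MQL-style read query for views that filter the given type.

    Hypothetical helper: only the ids /freebase/query and
    /freebase/query_hints/related_type come from the description above.
    """
    return [{
        "type": "/freebase/query",
        "id": None,    # null asks MQL to fill in the value
        "name": None,
        "/freebase/query_hints/related_type": {"id": related_type_id},
    }]


# Example type id, chosen only for illustration.
query = views_for_type("/people/person")
print(json.dumps(query, indent=2))
```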

A big revelation to me was that a side effect of saving a view in a  
base is that __everything in the view will be immediately typed with  
the "base topic" for that base__ (unless there are more than 1000
items, or the user has reached their 10K primitive daily write
limit). A base topic is the type on the right of the base's
/freebase/domain_profile/base_type property, which by default also
gets the key /base/foo/topic.
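
The two exceptions above can be read as a simple guard. The following is a rough sketch of that rule in Python, not the actual implementation: the function names are hypothetical, and it assumes that "reached the limit" means the user's writes for the day are already at or above 10,000.

```python
MAX_VIEW_ITEMS = 1000        # from the text: no auto-typing above 1000 items
DAILY_WRITE_LIMIT = 10_000   # from the text: 10K primitive writes per day


def base_topic_key(base_key):
    """Default key of a base's topic type, e.g. 'foo' -> '/base/foo/topic'."""
    return "/base/%s/topic" % base_key


def will_auto_type(item_count, writes_used_today):
    """Return True if saving the view would type its items with the base topic.

    Assumption: the write limit counts as reached when
    writes_used_today >= DAILY_WRITE_LIMIT.
    """
    if item_count > MAX_VIEW_ITEMS:
        return False
    if writes_used_today >= DAILY_WRITE_LIMIT:
        return False
    return True


print(base_topic_key("foo"), will_auto_type(250, 0))
```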

The justification for the base topic type is to support the "show me
everything in this base" query. Contrary to what I had argued
previously, this is not the same as the "show me everything of a type
in this base" query, because some topics will appear __only in a
base's view__ without having any of the base's types (like the Yale
person example).
The two places this query shows up are: 1. the default "All topics"  
view on base homepages, and 2. in reverse form on any topic page  
(under the heading "User-created Views"). The latter of these seems to  
be the strongest UI requirement that drove the model -- we need it to  
be easy for people to find relevant bases when they're browsing topic  
pages.
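
As a sketch of what the two forms of this query might look like, the Python below builds them as MQL-style structures. The base key "foo" comes from the /base/foo/topic example above; the helper names are hypothetical, and treating /freebase/domain_profile as the type of a base and using /type/type/instance to reach a type's instances are assumptions that have not been checked against the live service.

```python
def everything_in_base(base_key):
    """'Show me everything in this base': all instances of the base topic type."""
    return [{
        "type": "/base/%s/topic" % base_key,
        "id": None,
        "name": None,
    }]


def bases_containing(topic_id):
    """Reverse form: bases whose base topic type has this topic as an instance.

    Assumptions: bases carry the type /freebase/domain_profile, and
    /type/type/instance links a type to its instances.
    """
    return [{
        "type": "/freebase/domain_profile",
        "id": None,
        "/freebase/domain_profile/base_type": {
            "id": None,
            "/type/type/instance": {"id": topic_id},
        },
    }]


forward = everything_in_base("foo")
reverse = bases_containing("/en/some_topic")  # placeholder topic id
```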

Obviously, data added after the creation of a view will not cause  
topics to join a base (unless it's done with the "Add more" button of  
the view). To do that, we'd need to build a process that regularly  
cursors through saved views and adds missing topics to the  
corresponding base.
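
Such a process could be sketched roughly as follows. This is only an illustration of the idea, not an existing tool: all four parameters are hypothetical callables or collections that a real job would have to back with actual reads and writes, and the demo at the end uses in-memory stand-ins.

```python
def sync_views_to_base(views, run_view, base_members, add_to_base):
    """Add topics that a saved view returns but its base does not yet contain.

    views:        iterable of (view_id, base_key) pairs
    run_view:     view_id -> iterable of topic ids currently in the view
    base_members: base_key -> collection of topic ids already in the base
    add_to_base:  (base_key, topic_id) -> None, types the topic into the base
    All four are hypothetical hooks supplied by the caller.
    """
    added = []
    for view_id, base_key in views:
        members = set(base_members(base_key))  # local copy, safe to update
        for topic_id in run_view(view_id):
            if topic_id not in members:
                add_to_base(base_key, topic_id)
                members.add(topic_id)
                added.append((base_key, topic_id))
    return added


# In-memory demo with made-up ids.
saved_views = [("view1", "yale")]
view_results = {"view1": ["/en/a", "/en/b"]}
members = {"yale": {"/en/a"}}

added = sync_views_to_base(
    saved_views,
    lambda view_id: view_results[view_id],
    lambda base_key: members[base_key],
    lambda base_key, topic_id: members[base_key].add(topic_id),
)
print(added)
```

Running this on a schedule would also have to respect the 1000-item and daily write limits mentioned earlier, which the sketch leaves out.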

Hope this is helpful to some others too.

Al

-- 
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com
http://freebase.com/





