From jeff at metaweb.com Tue Sep 1 19:26:17 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Tue, 1 Sep 2009 12:26:17 -0700
Subject: [Data-modeling] English Words
In-Reply-To: <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
References: <4A8C25F5.1090602@metaweb.com> <3700175F-00E7-4959-9936-7ABB2D2079CA@metaweb.com> <2D2A19AF-A394-4AD9-B0C2-9A7C4CE48531@metaweb.com> <4A8DCF26.3080909@metaweb.com> <29E22A11C3EF4FB18087D813260C74AB@p4> <4A9474F8.5080003@metaweb.com> <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
Message-ID:

OK, let's see if we can agree on what aspects of synsets to model
initially, so we can go ahead with a load. (Jamie tells me that he was
experimenting with WordNet a while back and already has some loading
scripts.)

I think Iain's Synonym Set is a good place to start:

I think the existing properties for hypernym and hyponym should
definitely be included. I'd also vote for a synonym property. WordNet
doesn't seem to have antonyms, so I'm not sure whether we want to keep
that property for now or not (if we do, the expected type should be
synset). As long as we're doing this, holonyms and meronyms would
presumably be simple as well (holonym="is a part of", meronym="has
parts"), although probably less important.

Lexical category is probably necessary for now, but I think we should
hold off on Morphology.

Any other thoughts?

Jeff

> -----Original Message-----
> From: data-modeling-bounces at freebase.com
> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff Prucher
> Sent: Thursday, August 27, 2009 11:45 AM
> To: 'Freebase data modeling mailing list'
> Subject: Re: [Data-modeling] English Words
>
> > -----Original Message-----
> > From: data-modeling-bounces at freebase.com
> > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Scott Meyer
> > comment in more depth when I have more time to do so.>
> >
> > Anyway... If you buy all this, then it seems reasonable to load
> > Wordnet (synsets) first, then add pronunciation later.
>
> I agree -- let's do the synsets first (skipping the issue of
> symbols entirely for now). (Not that we shouldn't continue
> the other discussions -- we can do it in parallel!)
>
> Jeff
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling

From spatial.db at gmail.com Tue Sep 1 21:30:59 2009
From: spatial.db at gmail.com (Ed Laurent)
Date: Tue, 1 Sep 2009 17:30:59 -0400
Subject: [Data-modeling] English Words
In-Reply-To:
References: <2D2A19AF-A394-4AD9-B0C2-9A7C4CE48531@metaweb.com> <4A8DCF26.3080909@metaweb.com> <29E22A11C3EF4FB18087D813260C74AB@p4> <4A9474F8.5080003@metaweb.com> <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
Message-ID:

I still think that a homonym(s) property is relevant here, but maybe
only homographs (AKA homoglyphs) for now. Homograph(s) (words with the
same spelling but different meanings) should be really easy to populate
after the initial load (e.g., IF synset AND name = name THEN
homograph). This process might also help identify some merge
candidates.

-Ed
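Ed's rule of thumb (IF synset AND name = name THEN homograph) could be sketched roughly as below. This is only an illustration in Python under stated assumptions: the `Synset` record, its fields, the `homograph_candidates` helper, and the sample ids are all hypothetical stand-ins, not Freebase schema or WordNet loader output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """Hypothetical stand-in for a loaded synset topic."""
    id: str
    name: str
    lexical_category: str


def homograph_candidates(synsets):
    """Group synset ids by exact name; a name shared by 2+ synsets is a candidate."""
    by_name = defaultdict(list)
    for synset in synsets:
        by_name[synset.name].append(synset.id)
    # Keep only names that occur on more than one synset.
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


# Made-up sample data for illustration only.
sample = [
    Synset("s1", "rat", "noun"),
    Synset("s2", "rat", "verb"),
    Synset("s3", "cleft", "noun"),
]
print(homograph_candidates(sample))
```

A shared name only flags candidates; whether two same-named synsets are really homographs (or merge candidates) would still need review.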
From jeff at metaweb.com Tue Sep 1 22:30:15 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Tue, 1 Sep 2009 15:30:15 -0700
Subject: [Data-modeling] English Words
In-Reply-To:
References: <2D2A19AF-A394-4AD9-B0C2-9A7C4CE48531@metaweb.com> <4A8DCF26.3080909@metaweb.com> <29E22A11C3EF4FB18087D813260C74AB@p4> <4A9474F8.5080003@metaweb.com> <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
Message-ID: <491A72EBA026475C9CC92D2B9A4F949F@p4>

I think homograph has to be part of the larger /common/symbol
discussion -- I don't think it really applies to synsets. For example,
the five noun synsets for "rat" are not homographs of each other;
rather, the noun "rat" and the verb "rat" are homographs (I assume,
anyway -- the two are derivationally related, so I don't really know if
that counts -- the examples in dictionary definitions are always
unrelated words).

It also might require a model for inflected forms: the past tense of
the verb "to cleave" is "cleft", which is a homograph of the noun
"cleft", but we haven't even broached that subject (in this discussion,
anyway: Iain made a model a while back).

Jeff

From kirrily at metaweb.com Tue Sep 1 23:18:15 2009
From: kirrily at metaweb.com (Kirrily Robert)
Date: Tue, 1 Sep 2009 16:18:15 -0700
Subject: [Data-modeling] List of backwards compatible operations on schemas
In-Reply-To:
References:
Message-ID: <2629A6A1-9ABF-49E0-8806-5651A1CB277F@metaweb.com>

On Aug 30, 2009, at 9:24 AM, Paul Mackay wrote:

> Would it be possible to document a list of operations that can be made
> to a schema once a Base has been populated with a reasonable amount
> of data? It would help to know what can be done once data is present
> and what would require more complex data migration steps.
>
> Or if a list like this already exists does anyone know of a link?

http://www.freebase.com/view/guid/9202a8c04000641f800000000b75f213
speaks mostly about the Commons, but if you want to maintain schema
compatibility in your bases, then the same guidelines will apply!

K.
-- 
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com
http://freebase.com/

From jeff at metaweb.com Tue Sep 1 23:56:15 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Tue, 1 Sep 2009 16:56:15 -0700
Subject: [Data-modeling] English Words
In-Reply-To:
References: <4A8C25F5.1090602@metaweb.com> <3700175F-00E7-4959-9936-7ABB2D2079CA@metaweb.com> <2D2A19AF-A394-4AD9-B0C2-9A7C4CE48531@metaweb.com> <4A8DCF26.3080909@metaweb.com> <29E22A11C3EF4FB18087D813260C74AB@p4> <4A9474F8.5080003@metaweb.com> <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
Message-ID:

One additional thing I think we'll want is a property for "designatum"
-- the topic that the synset signifies -- which would expect
/common/topic.

Jeff

From spencerkelly86 at gmail.com Wed Sep 2 02:24:49 2009
From: spencerkelly86 at gmail.com (Spencer Kelly)
Date: Tue, 1 Sep 2009 22:24:49 -0400
Subject: [Data-modeling] English Words
In-Reply-To:
References: <4A8DCF26.3080909@metaweb.com> <29E22A11C3EF4FB18087D813260C74AB@p4> <4A9474F8.5080003@metaweb.com> <8D3E5DA7811B47168A1BA6C50F4180AB@amd>
Message-ID:

> Any other thoughts?

+1

> One additional thing I think we'll want is a property for "designatum" --

denotatum? simulacrum? referendum? referee? *;)

> the topic that the synset signifies -- which would expect /common/topic.

...or /common/property?
the wordnet to schema links are gonna be bigtime.
From zenkat at metaweb.com Wed Sep 2 23:49:52 2009
From: zenkat at metaweb.com (Brian Karlak)
Date: Wed, 2 Sep 2009 16:49:52 -0700
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com>
Message-ID: <430AE877-586B-47DC-B200-2B5057A5B580@metaweb.com>

Hello All --

The test migration for the weak key proposal has been run on
http://www.sandbox-freebase.com/ . For example:

  http://www.sandbox-freebase.com/edit/topic/weak/isbn/9780006153481/best

now points to the 1978 edition of "The Mauritius Command". If you
follow the ISBN link:

  http://www.sandbox-freebase.com/edit/topic/weak/isbn/9780006153481

you'll find that there are two editions of this book that share the
same ISBN.

Please give it a look and let us know what you think.

Note that for the test, we have maintained the old ISBN property as
"ISBN (Old Property)", but this will be hidden when we do the full
migration on the main graph.

Also, note that the new /book/book_edition/isbn property only shows up
in the "edit" pages -- the new topic pages still only show the old
property. This will be fixed before we deploy to the production server.

Brian

On Aug 21, 2009, at 4:34 PM, Reilly Hayes wrote:

> Hello All --
>
> One of the challenges of loading books is dealing with ISBNs. Both
> the ISO and Wikipedia claim that they are unique identifiers for
> book editions. Because of this, we'd really like ISBNs to act as
> keys within Freebase. Ideally, we'd like to have an /isbn/
> namespace so that people can externally reference book editions in
> Freebase with an ISBN-based URI.
>
> However, experience in the field shows that ISBNs aren't guaranteed
> to be unique. Publishers can and do reuse ISBNs. Sometimes they
> are reused for a completely different book. More commonly, they are
> reused for the same book but with differences in format or binding.
> This is still a small subset of cases, but it is common enough that
> we can't ignore or skip these cases. But Freebase keys can point to
> one and only one Freebase topic. Once a value wants to point to two
> or more topics, it can no longer be used as a key.
>
> So, we're left with a paradox. ISBNs should act like keys, allowing
> external users to reference Freebase entities by ISBNs -- but ISBNs
> can't be keys, since we can't guarantee uniqueness. And note that
> ISBNs are not the only identifiers that have this problem: UPC codes
> are also notoriously reused. Freebase needs some way to deal with
> these "weak keys" that somehow solves all of these constraints in a
> general way. Specifically, a "weak key" should:
>
> - Provide a consistent pattern that can be used across all weak keys
> - Provide a mechanism to pretend the key is strong by returning a
>   single "best" item
> - Clearly demarcate that the semantics in the keyspace are different
>   from "normal" keys
> - Allow identification of all entities that share the weak key
>
> We've spent quite a bit of time over the last few months discussing
> ways to resolve this conundrum, and we think we've finally come up
> with an acceptable solution that we'd like to get your feedback on.
> The basic idea is that ISBNs should point to their own dedicated
> nodes of type /book/isbn. Then, instead of having
> /book/book_edition/isbn be a /type/rawstring value, it will instead
> be a property link to the /book/isbn node.
>
> - A root-level namespace ("/weak/") will be created that holds all
>   namespaces with the weak key nature.
> - Keys in the weak namespace point to weak key containers. For
>   example, "/weak/isbn/9780670063260" will point to the "container
>   node" for that ISBN.
> - Weak key containers for ISBN will be typed as /book/isbn.
> - The /book/book_edition/isbn13 property will be created, pointing
>   to nodes with an expected type of /book/isbn.
> - Add a property to the key value type reversing the property from
>   the target type (/book/isbn/items). (Note that, because of
>   permissioning, it is essential that the master property be FROM
>   /book/book_edition TO /book/isbn.)
> - Containers will be named with the ISBN (for client display
>   purposes). For example, container node "/weak/isbn/9780670063260"
>   will be named "9780670063260".
> - The container node is cotyped as a namespace, containing the single
>   key "best" that points to the object that is "best". For example,
>   "/weak/isbn/9780670063260/best" would resolve to
>   http://www.freebase.com/edit/topic/guid/9202a8c04000641f80000000099fe6b6
> - Gardening tasks will be created that will look for /book/isbn nodes
>   that don't fit these rules, and create all necessary links so that
>   the rules are fulfilled.
>
> We've thought through all the consequences of this proposal, and
> we're fairly certain that this proposal gives us the desired
> behavior, without too many adverse side effects. We can go into the
> details in follow-up emails if you're interested.
>
> Please let us know your thoughts. We'd like to implement this
> proposal (along with ISBN13 normalization, remember that?) before
> the end of the month.
>
> -r
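The container scheme in the quoted proposal might be pictured roughly like this. It is a toy in-memory sketch in Python, not MQL or any Freebase API: the `WeakKeyContainer` class, the `link_edition` and `resolve` helpers, the edition ids, and the "first linked edition becomes best" rule are all assumptions made for illustration; the proposal itself does not say how "best" is chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WeakKeyContainer:
    """Hypothetical stand-in for a /book/isbn container node."""
    name: str                                       # named with the ISBN for display
    items: List[str] = field(default_factory=list)  # editions sharing this ISBN
    best: Optional[str] = None                      # target of the "best" key


# Toy model of the /weak/isbn namespace: ISBN -> container.
weak_isbn: Dict[str, WeakKeyContainer] = {}


def link_edition(isbn: str, edition_id: str) -> WeakKeyContainer:
    """Link an edition to its ISBN container, creating the container if needed."""
    container = weak_isbn.setdefault(isbn, WeakKeyContainer(name=isbn))
    if edition_id not in container.items:
        container.items.append(edition_id)
    if container.best is None:  # assumption: first linked edition becomes "best"
        container.best = edition_id
    return container


def resolve(path: str):
    """Resolve /weak/isbn/<isbn> to its container, or .../best to one edition id."""
    parts = path.strip("/").split("/")
    if parts[:2] != ["weak", "isbn"] or len(parts) not in (3, 4):
        raise KeyError(path)
    container = weak_isbn[parts[2]]
    if len(parts) == 3:
        return container
    if parts[3] != "best":
        raise KeyError(path)
    return container.best


# Two made-up editions sharing one ISBN.
link_edition("9780006153481", "edition_a")
link_edition("9780006153481", "edition_b")
print(resolve("/weak/isbn/9780006153481/best"))
print(resolve("/weak/isbn/9780006153481").items)
```

The point of the sketch is the shape: the key itself always resolves to exactly one container, the container lists every edition sharing the ISBN, and the nested "best" key lets a client pretend the ISBN is a strong key.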
From tfmorris at gmail.com Thu Sep 3 00:02:08 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Wed, 2 Sep 2009 20:02:08 -0400
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <6A0B4A70-0DE8-4C33-BA33-D223E8F73123@metaweb.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <8A5DE5FA-6BE3-4CCF-B0B2-ED1D98139526@twinql.com> <6A0B4A70-0DE8-4C33-BA33-D223E8F73123@metaweb.com>
Message-ID:

On Mon, Aug 31, 2009 at 6:50 PM, Brian Karlak wrote:

> With wikipedia keys -- and other "strong" keys -- the second
> constraint doesn't exist. Unlike an ISBN, a wikipedia key doesn't
> need to show up as a topic property. It's OK if we pick one topic as
> the "best" topic to keep the keys on since the other topics don't need
> to show them.

I guess I'd missed the fact that there was a strict uniqueness
requirement on Freebase topics with Wikipedia keys. What happens when
there's a Wikipedia article that discusses two (or more) different
concepts in a single article and it gets split on the Freebase side?

It's one thing if it's "Laurel and Hardy" and there's justification for
three topics, but if someone wants to split Syzygy into eight different
Freebase topics, do we really need to make it nine, where the ninth is
a fake "Syzygy as conflated by Wikipedia" which exists only to hold the
unique WP key? Wouldn't it be better to be able to navigate directly
from each Freebase topic to the corresponding "best" WP article?

Tom

From philip-freebase at shadowmagic.org.uk Thu Sep 3 14:34:06 2009
From: philip-freebase at shadowmagic.org.uk (Philip Kendall)
Date: Thu, 3 Sep 2009 15:34:06 +0100
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <430AE877-586B-47DC-B200-2B5057A5B580@metaweb.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <430AE877-586B-47DC-B200-2B5057A5B580@metaweb.com>
Message-ID: <20090903143405.GO13273@sphinx.mythic-beasts.com>

On Wed, Sep 02, 2009 at 04:49:52PM -0700, Brian Karlak wrote:
>
> Please give it a look and let us know what you think.

I don't see any documentation explaining the system in use here - I
hope it won't be deployed on production without some being produced.
It's a complicated system and is just going to confuse people unless
there's some good documentation written.

> Note that for the test, we have maintained old ISBN property as "ISBN (Old
> Property)", but this will be hidden when we do the full migration on
> the main graph.

I still think that recording what is actually written on the book is a
valuable thing to do. Ariel's said something about eMQL, but I don't
see why that's needed -- just keep the old ISBN property around as
"what is actually written on the book".

Cheers,

Phil

-- 
Philip Kendall
http://www.shadowmagic.org.uk/

From stefano at metaweb.com Thu Sep 3 16:19:02 2009
From: stefano at metaweb.com (Stefano Mazzocchi)
Date: Thu, 03 Sep 2009 09:19:02 -0700
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com>
Message-ID: <4A9FEC76.9050500@metaweb.com>

Brian Karlak wrote:
> On Aug 22, 2009, at 10:43 AM, Stefano Mazzocchi wrote:
>
>> I understand that 'weak' is a very precise definition of what this
>> identification scheme is, and can apply to others just as well, but
>> maybe there is a word we can use that has the same meaning but doesn't
>> inspire a "our identifiers are better than yours" undertone to the
>> casual observer.
>
> Well, it's important to note that the "/weak/" namespace is an
> indication of how the keys are used in Freebase. It's not a property
> of the keys themselves.
>
> For instance, ISBNs are a strong key for booksellers. An ISBN
> uniquely identifies a single book that is available for sale. A
> bookseller's supply chain software will use ISBNs as strong keys, just
> as logistics companies will use UPC as strong keys.
>
> Freebase, however, is trying to use these keys in ways that they were
> not designed for. Specifically, we're trying to track the historic
> usage of ISBN & UPC codes on items that were available for purchase in
> the past. We're also trying to track book editions with more
> granularity than a bookseller by tracking binding format, cover art,
> price, printing run, and the like.
>
> In other words, it's only because the Freebase definition of what a
> "book edition" is differs from a bookseller's that we're forced to use
> the key in a "weak" manner.
>
> Because of this, I believe it's perfectly appropriate to use the
> "weak" namespace to manage these keys in Freebase. It's not a
> reflection on the keys themselves -- just how they are used in
> Freebase. If anyone happens to see the namespace in Freebase (they
> are rather hidden, after all), it's a simple matter to explain that
> the limitation is ours, not theirs.

I continue to remain against the use of the word 'weak' in this
proposal.

It might appear to be perfectly appropriate if you spend time to read
docs and evaluate nuances, like we're doing in this thread (meaning, I
don't disagree with your arguments on why it's a reasonable term).

I just don't think many will get to that point: they will come across
"weak" as part of a key and think "arrogant". And that would be the end
of their interest in Freebase.

Having the ability to disambiguate data with keys that collide given a
big enough context should make people want to work with us more, not
less. If not, why are we doing it?

Sure, keys are hidden from most users' view, but those who will get to
see them (think 'view source') and care are more likely to be the
people we want to attract, not repulse.

-- 
Stefano Mazzocchi                              Application Catalyst
Metaweb Technologies, Inc.
stefano at metaweb.com
-------------------------------------------------------------------

From stefano at metaweb.com Thu Sep 3 18:02:07 2009
From: stefano at metaweb.com (Stefano Mazzocchi)
Date: Thu, 03 Sep 2009 11:02:07 -0700
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <4A9FEC76.9050500@metaweb.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com>
Message-ID: <4AA0049F.9090501@metaweb.com>

Stefano Mazzocchi wrote:
> I continue to remain against the use of the word 'weak' in this proposal.

What do you guys think of "soft" instead of "weak"?

What I like about it is:

 1) it's reasonably self-explanatory
 2) it doesn't have automatic negative connotations
 3) it's short
 4) it's curious enough to get people's attention but neutral enough to
    avoid triggering negative emotions

Thoughts?

-- 
Stefano Mazzocchi                              Application Catalyst
Metaweb Technologies, Inc.
stefano at metaweb.com ------------------------------------------------------------------- From spatial.db at gmail.com Thu Sep 3 18:23:14 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Thu, 3 Sep 2009 14:23:14 -0400 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: <4AA0049F.9090501@metaweb.com> References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com> <4AA0049F.9090501@metaweb.com> Message-ID: +1. "Soft" still has a negative connotations but less so than "Weak", and it's short, simple, and appropriate. -Ed On Thu, Sep 3, 2009 at 2:02 PM, Stefano Mazzocchi wrote: > Stefano Mazzocchi wrote: > > Brian Karlak wrote: > >> On Aug 22, 2009, at 10:43 AM, Stefano Mazzocchi wrote: > >> > >>> I understand that 'weak' is a very precise definition of what this > >>> identification scheme is, and can apply to others just as well, but > >>> maybe there is a word we can use that has the same meaning but doesn't > >>> inspire a "our identifiers are better than yours" undertone to the > >>> casual observer. > >> Well, it's important to note that the "/weak/" namespace is an > >> indication of how the keys are used in Freebase. It's not a property > >> of the keys themselves. > >> > >> For instance, ISBNs are a strong key for booksellers. An ISBN > >> uniquely identifies a single book that is available for sale. A > >> bookseller's supply chain software will use ISBNs as strong keys, just > >> as logistics companies will use UPC as strong keys. > >> > >> Freebase, however, is trying to use these keys in ways that they were > >> not designed for. Specifically, we're trying to track the historic > >> usage of ISBN & UPC codes on items that were available for purchase in > >> the past. We're also trying to track book editions with more > >> granularity than a bookseller by tracking binding format, cover art, > >> price, printing run, and the like. 
> >> > >> In other words, it's only because the Freebase definition of what a > >> "book edition" is differs from a booksellers that we're forced to use > >> the key in a "weak" manner. > >> > >> Because of this, I believe it's perfectly appropriate to use the > >> "weak" namespace to manage these keys in Freebase. It's not a > >> reflection on the keys themselves -- just how they are used in > >> Freebase. If anyone happens to see the namespace in Freebase (they > >> are rather hidden, after all), it's a simple matter to explain that > >> the limitation is ours, not theirs. > > > > I continue to remain against the use of the word 'weak' in this proposal. > > > > It might appear to be perfectly appropriate if you spend time to read > > docs and evaluate nuances, like we're doing in this thread (meaning, I > > don't disagree with your arguments on why it's a reasonable term). > > > > I just don't think many will get to that point: they will come across > > "weak" as part of a key and think "arrogant". And that would be the end > > of their interest in Freebase. > > > > Having the ability to disambiguate data with keys that collide given a > > big enough context should make people want to work with us more, not > > less. If not, why are we doing it? > > > > Sure, keys are hidden from most users view, but those who will get to > > see them (think 'view source') and care, they are more likely to be the > > people we want to attract not repulse. > > What do you guys think of "soft" instead of "weak"? > > What I like about it is: > > 1) it's reasonably self-explanatory > 2) it doesn't have automatic negative connotations > 3) it's short > 4) it's curious enough to get people's attention but neutral enough to > avoid triggering negative emotions > > Thoughts? > > -- > Stefano Mazzocchi Application Catalyst > Metaweb Technologies, Inc. 
stefano at metaweb.com > ------------------------------------------------------------------- > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling >
From kirrily at metaweb.com Thu Sep 3 18:37:57 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Thu, 3 Sep 2009 11:37:57 -0700 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com> <4AA0049F.9090501@metaweb.com> Message-ID: <3D969C41-51B9-48DE-A3B7-678BBB7285A9@metaweb.com> +1 -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/
From zenkat at metaweb.com Thu Sep 3 18:45:40 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Thu, 3 Sep 2009 11:45:40 -0700 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: <3D969C41-51B9-48DE-A3B7-678BBB7285A9@metaweb.com> References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com> <4AA0049F.9090501@metaweb.com> <3D969C41-51B9-48DE-A3B7-678BBB7285A9@metaweb.com> Message-ID: It sounds like we may have consensus. We'll go with "/soft/" unless there are any objections ...
From tfmorris at gmail.com Thu Sep 3 21:41:50 2009 From: tfmorris at gmail.com (Tom Morris) Date: Thu, 3 Sep 2009 17:41:50 -0400 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com> <4AA0049F.9090501@metaweb.com> <3D969C41-51B9-48DE-A3B7-678BBB7285A9@metaweb.com> Message-ID: Fuzzy? Or to be even more approachable, cuddly? :-)
From spencerkelly86 at gmail.com Thu Sep 3 21:52:02 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Thu, 3 Sep 2009 17:52:02 -0400 Subject: [Data-modeling] Conference schema Message-ID: hi data modelers, looking for some input with fixing the conference <http://www.freebase.com/view/conferences> commons. a conference (like Ontario linux fest) connects to its events (like Linuxfest '97) through an 'instance of' property. but this should be done with 'event'/'recurring_event', right? the only dilemma is that these instances inherit properties from 'conference series' (conference type, subject, etc). so ...delegation? i don't know what's right. the other thing is I'd like to be able to connect a conference geographically. Ontario linux fest has a 'scope' of Ontario. But looking at the types that cotype recurring event, it seems they all have a geographical scope, and if they don't, the geographical scope is 'international'. Maybe this is a good property to have on recurring event and not conferences.
i'd like to promote public speaking event to cotype unless there's dispute - it's much cleaner than what currently exists. as well, there's discussion on how best to model conference sponsorship. yuck
From vishal at metaweb.com Thu Sep 3 22:12:08 2009 From: vishal at metaweb.com (Vishal Talwar) Date: Thu, 3 Sep 2009 15:12:08 -0700 (PDT) Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: Message-ID: <2095333532.59821252015928388.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> How about "shared"?
From bryan.cheung
at metaweb.com (Bryan Cheung) Date: Thu, 3 Sep 2009 15:37:03 -0700 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: <20090903143405.GO13273@sphinx.mythic-beasts.com> References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <430AE877-586B-47DC-B200-2B5057A5B580@metaweb.com> <20090903143405.GO13273@sphinx.mythic-beasts.com> Message-ID: > I don't see any documentation explaining the system in use here - I > hope > it won't be deployed on production without some being produced. It's a > complicated system and is just going to confuse people unless there's > some good documentation written. Operations is currently setting up a Freebase wiki - we'll post the documentation up there. The properties currently have documentation, and type documentation will definitely be added. On Sep 3, 2009, at 7:34 AM, Philip Kendall wrote: > On Wed, Sep 02, 2009 at 04:49:52PM -0700, Brian Karlak wrote: >> >> Please give it a look and let us know what you think. > > I don't see any documentation explaining the system in use here - I > hope > it won't be deployed on production without some being produced. It's a > complicated system and is just going to confuse people unless there's > some good documentation written. > >> Note that for the test, we have maintained old ISBN property as >> "ISBN (Old >> Property)", but this will be hidden when we do the full migration on >> the main graph. > > I still think that recording what is actually written on the book is a > valuable thing to do. Ariel's said something about eMQL, but I don't > see > why that's needed -- just keep the old ISBN property around as "what > is > actually written on the book". > > Cheers, > > Phil > > -- > Philip Kendall > http://www.shadowmagic.org.uk/
From bryan.cheung at metaweb.com Thu Sep 3 23:28:05 2009 From: bryan.cheung at metaweb.com (Bryan Cheung) Date: Thu, 3 Sep 2009 16:28:05 -0700 (PDT) Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: <2095333532.59821252015928388.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <386718842.60261252020485364.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> +1 ----- Original Message ----- From: Vishal Talwar To: Freebase data modeling mailing list Sent: Thu, 3 Sep 2009 15:12:08 -0700 (PDT) Subject: Re: [Data-modeling] The Curse of the ISBN How about "shared"?
_______________________________________________ Data-modeling mailing list
Data-modeling at freebase.com http://lists.freebase.com/mailman/listinfo/data-modeling From zenkat at metaweb.com Fri Sep 4 20:50:29 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Fri, 4 Sep 2009 13:50:29 -0700 Subject: [Data-modeling] The Curse of the ISBN In-Reply-To: References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <7EADEC1C-578C-4027-9953-D31AF8B79092@metaweb.com> <4A9FEC76.9050500@metaweb.com> <4AA0049F.9090501@metaweb.com> <3D969C41-51B9-48DE-A3B7-678BBB7285A9@metaweb.com> Message-ID: <9856597A-47E5-4D51-A417-11B876A2F1E7@metaweb.com> It sounds like we're happy with "soft" -- I haven't seen significant support for shared, fuzzy, cuddly, or furry. This will be implemented on the production graph sometimes before Tuesday evening. Happy Labor Day, all! (those of you in the US, at least ;-) Brian On Sep 3, 2009, at 11:45 AM, Brian Karlak wrote: > It sounds like we may have consensus. We'll go with "/soft/" unless > there are any objections ... > > On Sep 3, 2009, at 11:37 AM, Kirrily Robert wrote: > >> +1 >> >> On Sep 3, 2009, at 11:23 AM, Ed Laurent wrote: >> >>> +1. "Soft" still has a negative connotations but less so than >>> "Weak", and it's short, simple, and appropriate. >>> >>> -Ed >>> >>> On Thu, Sep 3, 2009 at 2:02 PM, Stefano Mazzocchi >> > wrote: >>> Stefano Mazzocchi wrote: >>> > Brian Karlak wrote: >>> >> On Aug 22, 2009, at 10:43 AM, Stefano Mazzocchi wrote: >>> >> >>> >>> I understand that 'weak' is a very precise definition of what >>> this >>> >>> identification scheme is, and can apply to others just as >>> well, but >>> >>> maybe there is a word we can use that has the same meaning but >>> doesn't >>> >>> inspire a "our identifiers are better than yours" undertone to >>> the >>> >>> casual observer. >>> >> Well, it's important to note that the "/weak/" namespace is an >>> >> indication of how the keys are used in Freebase. It's not a >>> property >>> >> of the keys themselves. 
>>> >> >>> >> For instance, ISBNs are a strong key for booksellers. An ISBN >>> >> uniquely identifies a single book that is available for sale. A >>> >> bookseller's supply chain software will use ISBNs as strong >>> keys, just >>> >> as logistics companies will use UPC as strong keys. >>> >> >>> >> Freebase, however, is trying to use these keys in ways that >>> they were >>> >> not designed for. Specifically, we're trying to track the >>> historic >>> >> usage of ISBN & UPC codes on items that were available for >>> purchase in >>> >> the past. We're also trying to track book editions with more >>> >> granularity than a bookseller by tracking binding format, cover >>> art, >>> >> price, printing run, and the like. >>> >> >>> >> In other words, it's only because the Freebase definition of >>> what a >>> >> "book edition" is differs from a booksellers that we're forced >>> to use >>> >> the key in a "weak" manner. >>> >> >>> >> Because of this, I believe it's perfectly appropriate to use the >>> >> "weak" namespace to manage these keys in Freebase. It's not a >>> >> reflection on the keys themselves -- just how they are used in >>> >> Freebase. If anyone happens to see the namespace in Freebase >>> (they >>> >> are rather hidden, after all), it's a simple matter to explain >>> that >>> >> the limitation is ours, not theirs. >>> > >>> > I continue to remain against the use of the word 'weak' in this >>> proposal. >>> > >>> > It might appear to be perfectly appropriate if you spend time to >>> read >>> > docs and evaluate nuances, like we're doing in this thread >>> (meaning, I >>> > don't disagree with your arguments on why it's a reasonable term). >>> > >>> > I just don't think many will get to that point: they will come >>> across >>> > "weak" as part of a key and think "arrogant". And that would be >>> the end >>> > of their interest in Freebase. 
>>> > >>> > Having the ability to disambiguate data with keys that collide >>> given a >>> > big enough context should make people want to work with us more, >>> not >>> > less. If not, why are we doing it? >>> > >>> > Sure, keys are hidden from most users view, but those who will >>> get to >>> > see them (think 'view source') and care, they are more likely to >>> be the >>> > people we want to attract not repulse. >>> >>> What do you guys think of "soft" instead of "weak"? >>> >>> What I like about it is: >>> >>> 1) it's reasonably self-explanatory >>> 2) it doesn't have automatic negative connotations >>> 3) it's short >>> 4) it's curious enough to get people's attention but neutral >>> enough to >>> avoid triggering negative emotions >>> >>> Thoughts? >>> >>> -- >>> Stefano Mazzocchi Application Catalyst >>> Metaweb Technologies, Inc. stefano at metaweb.com >>> ------------------------------------------------------------------- >>> >>> _______________________________________________ >>> Data-modeling mailing list >>> Data-modeling at freebase.com >>> http://lists.freebase.com/mailman/listinfo/data-modeling >>> >>> _______________________________________________ >>> Data-modeling mailing list >>> Data-modeling at freebase.com >>> http://lists.freebase.com/mailman/listinfo/data-modeling >> >> -- >> Kirrily Robert >> Freebase Community Director >> kirrily at metaweb.com >> http://freebase.com/ >> >> >> >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090904/cce18068/attachment.htm

From zenkat at metaweb.com Fri Sep 4 20:57:11 2009
From: zenkat at metaweb.com (Brian Karlak)
Date: Fri, 4 Sep 2009 13:57:11 -0700
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To:
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <4A902E46.5000205@metaweb.com> <8A5DE5FA-6BE3-4CCF-B0B2-ED1D98139526@twinql.com> <6A0B4A70-0DE8-4C33-BA33-D223E8F73123@metaweb.com>
Message-ID: <8C351BF8-58C7-458E-B80C-9C872D61F5C6@metaweb.com>

On Sep 2, 2009, at 5:02 PM, Tom Morris wrote:

> I guess I'd missed the fact that there was a strict uniqueness
> requirements on Freebase topics with Wikipedia keys. What happens
> when there's a Wikipedia article that discusses two (or more)
> different concepts in a single article and it gets split on the
> Freebase side.

This happens all of the time. The Wikipedia key goes to the "best" matching topic -- usually the one that should get the blurb.

Consider a musical artist and her discography. A single Wikipedia article will discuss both the artist and her albums. It would be a mess if we tried to somehow attach a single Wikipedia key to all of them, especially when it came to the blurbs. The Wikipedia key really belongs on a single topic -- the musical artist.

> It's one thing of it's "Laurel and Hardy" and there's justification
> for three topics, but if some wants to split Syzygy into eight
> different Freebase topics, do we really need to make it nine where the
> ninth is a fake "Szygy as conflated by Wikipedia" which exists only to
> hold the unique WP key?

No, that's not what we're doing. One Syzygy-derived topic will get the Wikipedia key. The others will not have Wikipedia keys.

Brian

From zenkat at metaweb.com Fri Sep 4 21:09:29 2009
From: zenkat at metaweb.com (Brian Karlak)
Date: Fri, 4 Sep 2009 14:09:29 -0700
Subject: [Data-modeling] The Curse of the ISBN
In-Reply-To: <20090903143405.GO13273@sphinx.mythic-beasts.com>
References: <61B633FC-96B7-4A7E-BE72-B16B0E5DD8A1@metaweb.com> <430AE877-586B-47DC-B200-2B5057A5B580@metaweb.com> <20090903143405.GO13273@sphinx.mythic-beasts.com>
Message-ID: <8E29B754-889E-484E-9FE6-17F430C5C459@metaweb.com>

On Sep 3, 2009, at 7:34 AM, Philip Kendall wrote:

>> Note that for the test, we have maintained old ISBN property as
>> "ISBN (Old Property)", but this will be hidden when we do the full
>> migration on the main graph.
>
> I still think that recording what is actually written on the book is a
> valuable thing to do. Ariel's said something about eMQL, but I don't see
> why that's needed -- just keep the old ISBN property around as "what is
> actually written on the book".

I fear that if we try to keep ISBN information in two places in the graph, they will inevitably get out of sync. Then we'll be facing the "two clocks" problem -- there won't be any way to determine which is correct.

Also, it seems that many of the sources we're planning to use for book data don't give any guarantees about "what is actually on the book", which means we'd have no way to automatically set this property during mass data loads. Since 99%+ of our books have come from automated loads, it's unlikely this property would ever be populated.

The best we can do, I think, is to provide the raw untrammeled source data along with our fields. We're currently speccing out a web service to provide the "raw data" used to populate our topics. This will contain the ISBN as recorded in whatever source system we're drawing from ...

Brian

From jeff at metaweb.com Thu Sep 10 20:34:05 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Thu, 10 Sep 2009 13:34:05 -0700
Subject: [Data-modeling] Refactoring Physical Geography a bit
Message-ID: <07ABED2AA0C540DDA0D41CFF062CB771@amd>

Cross-posted from Developers List:

Based on this discussion (), I'd like to refactor some types in the Physical Geography domain.

1. The type /geography/geographical_feature is currently a bucket holding different kinds of features (tarn, isthmus, cave, etc.). I'd like to change the semantics of this type so that it instead holds actual features (Lake Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be added to this type ("Class" or something similar) which will connect to a new type "Geographical Feature Type". This new type will have as instances the same sorts of topics that Geographical Feature currently does. (In fact, all will simply be migrated to the new type.)

2. These types, which have no properties, will be deleted, since the is-a relationship with the kind of feature it is will be handled by a property instead:

/geography/desert
/geography/bay
/geography/strait
/geography/wetland

The JIRA task is here:
https://bugs.freebase.com/browse/DA-916

Will this affect anyone's code?

Jeff

From tfmorris at gmail.com Fri Sep 11 02:35:57 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Thu, 10 Sep 2009 22:35:57 -0400
Subject: [Data-modeling] U.S. Laws
Message-ID:

So someone just posted to a thread that I tried, unsuccessfully, to revive back in April. Toby made the initial request back at the beginning of 2008 (almost 2 years ago).

http://www.freebase.com/discuss/threads/guid/9202a8c04000641f8000000006fc25bb

What would it take to move this forward? Surely, laws are important to someone. (plus, I've noticed, as Toby did, that it would allow typing of lots of untyped topics).

Tom

From philip-freebase at shadowmagic.org.uk Fri Sep 11 12:42:29 2009
From: philip-freebase at shadowmagic.org.uk (Philip Kendall)
Date: Fri, 11 Sep 2009 13:42:29 +0100
Subject: [Data-modeling] Schema Visualisation app
Message-ID: <20090911124229.GK13273@sphinx.mythic-beasts.com>

Some of you may have already seen Kirrily's tweet, but if not... a little app which people may be interested in:

http://schemaviz.freebaseapps.com/

It should give you a visualisation of any domain you're interested in, showing how the types in that domain are linked by properties.

* Nodes represent types, edges represent properties.
* Reverse properties are shown in parentheses after the master property name.
* Grey edges represent hidden properties
* Blue nodes are types outside the domain being viewed, but which are linked to from the domain (the edge will be dashed if the master property is from the type outside the domain to that inside the domain)
* Red nodes represent undocumented types
* All nodes are clickable to view that type

Please let me know of any bugs / feature requests. Thanks should obviously go to everyone at Metaweb for this, in particular the Acre developers and Alexander (for his Genealogy app which was the inspiration for this).

Cheers,

Phil

--
Philip Kendall http://www.shadowmagic.org.uk/

From stefano at metaweb.com Fri Sep 11 16:29:17 2009
From: stefano at metaweb.com (Stefano Mazzocchi)
Date: Fri, 11 Sep 2009 09:29:17 -0700
Subject: [Data-modeling] Schema Visualisation app
In-Reply-To: <20090911124229.GK13273@sphinx.mythic-beasts.com>
References: <20090911124229.GK13273@sphinx.mythic-beasts.com>
Message-ID: <4AAA7ADD.3000102@metaweb.com>

Philip Kendall wrote:
> Some of you may have already seen Kirrily's tweet, but if not... a little
> app while people may be interested in:
>
> http://schemaviz.freebaseapps.com/
>
> It should give you a visualisation of any domain you're interested in,
> showing how the types in that domain are linked by properties.
> > * Nodes represent types, edges represent properties. > * Reverse properties are shown in parentheses after the master property > name. > * Grey edges represent hidden properties > * Blue nodes are types outside the domain being viewed, but which are > linked to from the domain (the edge will be dashed if the master > property is from the type outside the domain to that inside the > domain) > * Red nodes represent undocumented types > * All nodes are clickable to view that type > > Please let me know of any bugs / feature requests. Thanks should > obviously go to everyone at Metaweb for this, in particular the Acre > developers and Alexander (for his Genealogy app which was the > inspiration for this). Phil, goes without saying, outstanding job. -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. stefano at metaweb.com ------------------------------------------------------------------- From jason at metaweb.com Fri Sep 11 17:20:17 2009 From: jason at metaweb.com (Jason Douglas) Date: Fri, 11 Sep 2009 10:20:17 -0700 Subject: [Data-modeling] Schema Visualisation app In-Reply-To: <4AAA7ADD.3000102@metaweb.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> <4AAA7ADD.3000102@metaweb.com> Message-ID: On Sep 11, 2009, at 9:29 AM, Stefano Mazzocchi wrote: > Philip Kendall wrote: >> Some of you may have already seen Kirrily's tweet, but if not... a >> little >> app while people may be interested in: >> >> http://schemaviz.freebaseapps.com/ >> >> It should give you a visualisation of any domain you're interested >> in, >> showing how the types in that domain are linked by properties. >> >> * Nodes represent types, edges represent properties. >> * Reverse properties are shown in parentheses after the master >> property >> name. 
>> * Grey edges represent hidden properties >> * Blue nodes are types outside the domain being viewed, but which are >> linked to from the domain (the edge will be dashed if the master >> property is from the type outside the domain to that inside the >> domain) >> * Red nodes represent undocumented types >> * All nodes are clickable to view that type >> >> Please let me know of any bugs / feature requests. Thanks should >> obviously go to everyone at Metaweb for this, in particular the Acre >> developers and Alexander (for his Genealogy app which was the >> inspiration for this). > > Phil, goes without saying, outstanding job. Indeed... it's that rare breed that is fun demo-ware and yet, actually useful. It's found a spot in my bookmarks toolbar already. :-) -jason From spatial.db at gmail.com Fri Sep 11 18:16:01 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Fri, 11 Sep 2009 14:16:01 -0400 Subject: [Data-modeling] Refactoring Physical Geography a bit In-Reply-To: <07ABED2AA0C540DDA0D41CFF062CB771@amd> References: <07ABED2AA0C540DDA0D41CFF062CB771@amd> Message-ID: I've been doing this a bit already with my "Geographical feature category" type that includes my "Code category" type. While some of their properties are not for the commons, others may be, so it would be nice if you would consider/migrate them with the refactoring if possible. http://www.freebase.com/type/schema/base/landcover/geographical_feature_category?domain=%2Fbase%2Flandcover http://www.freebase.com/type/schema/base/landcover/code_category?domain=%2Fbase%2Flandcover -Ed On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher wrote: > Cross-posted from Developers List: > > Based on this discussion > (), > I'd > like refactor some types in the Physical Geography domain. > > 1. The type /geography/geographical_feature is currently a bucket holding > different kinds of features (tarn, isthmus, cave, etc.). 
I'd like to change > the semantics of this type so that it instead holds actual features (Lake > Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be > added to this type ("Class" or something similar) which will connect to a > new type "Geographical Feature Type". This new type will have as instances > the same sorts of topics that Geographical Feature currently does. (In > fact, all will simply be migrated to the new type.) > > 2. These types, which have no properties, will be deleted, since the is-a > relationship with the kind of feature it is will be handled by a property > instead: > > /geography/desert > /geography/bay > /geography/strait > /geography/wetland > > The JIRA task is here: > https://bugs.freebase.com/browse/DA-916 > > Will this affect anyone's code? > > Jeff > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/175facf7/attachment.htm From iainsproat at gmail.com Fri Sep 11 18:21:48 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Fri, 11 Sep 2009 22:21:48 +0400 Subject: [Data-modeling] Refactoring Physical Geography a bit In-Reply-To: References: <07ABED2AA0C540DDA0D41CFF062CB771@amd> Message-ID: +1 but I prefer the name "Geographical feature category" for the proposed new type - it's best avoiding using the 'type' word in the name of a /type/type. Iain On Fri, Sep 11, 2009 at 10:16 PM, Ed Laurent wrote: > I've been doing this a bit already with my "Geographical feature category" > type that includes my "Code category" type. While some of their properties > are not for the commons, others may be, so it would be nice if you would > consider/migrate them with the refactoring if possible. 
>
> http://www.freebase.com/type/schema/base/landcover/geographical_feature_category?domain=%2Fbase%2Flandcover
>
> http://www.freebase.com/type/schema/base/landcover/code_category?domain=%2Fbase%2Flandcover
>
> -Ed
>
>
> On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher wrote:
>>
>> Cross-posted from Developers List:
>>
>> Based on this discussion
>> (),
>> I'd
>> like refactor some types in the Physical Geography domain.
>>
>> 1. The type /geography/geographical_feature is currently a bucket holding
>> different kinds of features (tarn, isthmus, cave, etc.). I'd like to
>> change
>> the semantics of this type so that it instead holds actual features (Lake
>> Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be
>> added to this type ("Class" or something similar) which will connect to a
>> new type "Geographical Feature Type". This new type will have as instances
>> the same sorts of topics that Geographical Feature currently does. (In
>> fact, all will simply be migrated to the new type.)
>>
>> 2. These types, which have no properties, will be deleted, since the is-a
>> relationship with the kind of feature it is will be handled by a property
>> instead:
>>
>> /geography/desert
>> /geography/bay
>> /geography/strait
>> /geography/wetland
>>
>> The JIRA task is here:
>> https://bugs.freebase.com/browse/DA-916
>>
>> Will this affect anyone's code?
>>
>> Jeff
>>
>> _______________________________________________
>> Data-modeling mailing list
>> Data-modeling at freebase.com
>> http://lists.freebase.com/mailman/listinfo/data-modeling
>
>
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
>
>

From spatial.db at gmail.com Fri Sep 11 18:49:58 2009
From: spatial.db at gmail.com (Ed Laurent)
Date: Fri, 11 Sep 2009 14:49:58 -0400
Subject: [Data-modeling] Scripts for moving types, properties, and data
Message-ID:

Hi everyone,

As my types have evolved and are getting used, I'm finding that I would like to do some refactoring to separate general and specific types (specific types would include the general types). I would also like to move some of the schema to different bases so that they are organized a little differently because the UI and several new tools rely so heavily on base membership of schema. I know that there are probably a few simple MQL scripts that I can run to do this but don't want to invest a lot of time in learning the ins and outs of MQL to write them. Could someone be so kind as to provide me with some annotated code for the following uses (assuming of course that I am an administrator of all the relevant bases):

1) Move a property and its data from one type to another type
2) Move a type from one base to another base
3) Move all data from a property of a type to an existing property of another type

Maybe these scripts are already documented somewhere? If not, it would be great to start a hub for these kinds of general use modeling scripts with a tiny(!) bit of documentation on where to paste them, what parts to modify, and how to run. Such a hub would be very helpful to those of us without mad programming skillz who learn better by doing than by manual.

Thanks!
-Ed
-------------- next part --------------
An HTML attachment was scrubbed...
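A rough sketch of what request 3 above (moving all data from a property of one type to an existing property of another type) might look like. It is Python that only builds MQL queries as JSON and does not send them anywhere. It assumes the property values are topics (values with ids, not literals) and relies on MQL's "connect": "insert" and "connect": "delete" write directives. The function names, property ids, and topic ids are all made up for illustration, so treat this as a starting point to check against the MQL write documentation, not as a tested migration script.

```python
import json


def build_read_query(type_id, old_prop):
    """MQL read query: every topic of type_id together with the ids of its
    old_prop values. None becomes JSON null, which MQL reads as "fill this in"."""
    return [{"id": None, "type": type_id, old_prop: [{"id": None}]}]


def build_move_writes(read_results, old_prop, new_prop):
    """Turn read results into two lists of MQL write queries.

    inserts: connect each value to new_prop on the same topic.
    deletes: disconnect each value from old_prop.
    They are kept separate so the deletes can be sent only after the
    inserts have been confirmed.
    """
    inserts, deletes = [], []
    for topic in read_results:
        for value in topic.get(old_prop, []):
            inserts.append(
                {"id": topic["id"],
                 new_prop: {"id": value["id"], "connect": "insert"}})
            deletes.append(
                {"id": topic["id"],
                 old_prop: {"id": value["id"], "connect": "delete"}})
    return inserts, deletes


if __name__ == "__main__":
    # Hypothetical property ids and a hand-written stand-in for a read result.
    OLD = "/base/example/old_type/category"
    NEW = "/base/example/new_type/category"
    sample_results = [
        {"id": "/en/example_topic", OLD: [{"id": "/en/example_value"}]},
    ]
    print(json.dumps(build_read_query("/base/example/old_type", OLD), indent=2))
    inserts, deletes = build_move_writes(sample_results, OLD, NEW)
    print(json.dumps(inserts, indent=2))
    print(json.dumps(deletes, indent=2))
```

The sketch does not add the new type to the migrated topics and does not page through large result sets; a real run would need both, plus a check that every insert succeeded before any delete is sent.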
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/c10f3457/attachment.htm From jeff at metaweb.com Fri Sep 11 19:15:09 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Fri, 11 Sep 2009 12:15:09 -0700 Subject: [Data-modeling] Refactoring Physical Geography a bit In-Reply-To: References: <07ABED2AA0C540DDA0D41CFF062CB771@amd> Message-ID: Good point. Jeff > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Iain Sproat > Sent: Friday, September 11, 2009 11:22 AM > To: Freebase data modeling mailing list > Subject: Re: [Data-modeling] Refactoring Physical Geography a bit > > +1 > but I prefer the name "Geographical feature category" for the > proposed new type - it's best avoiding using the 'type' word > in the name of a /type/type. > > Iain > > On Fri, Sep 11, 2009 at 10:16 PM, Ed Laurent > wrote: > > I've been doing this a bit already with my "Geographical > feature category" > > type that includes my "Code category" type. While some of their > > properties are not for the commons, others may be, so it > would be nice > > if you would consider/migrate them with the refactoring if possible. > > > > > http://www.freebase.com/type/schema/base/landcover/geographical_featur > > e_category?domain=%2Fbase%2Flandcover > > > > > http://www.freebase.com/type/schema/base/landcover/code_category?domai > > n=%2Fbase%2Flandcover > > > > -Ed > > > > > > On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher > wrote: > >> > >> Cross-posted from Developers List: > >> > >> Based on this discussion > >> > ( >> >), > >> I'd > >> like refactor some types in the Physical Geography domain. > >> > >> 1. The type /geography/geographical_feature is currently a bucket > >> holding different kinds of features (tarn, isthmus, cave, > etc.). I'd > >> like to change the semantics of this type so that it instead holds > >> actual features (Lake Geneva, Isthmus of Panama, Mammoth > Cave, etc.). ? 
> >> A new property will be added to this type ("Class" or something
> >> similar) which will connect to a new type "Geographical Feature
> >> Type". This new type will have as instances the same sorts of topics
> >> that Geographical Feature currently does. (In fact, all will simply
> >> be migrated to the new type.)
> >>
> >> 2. These types, which have no properties, will be deleted, since the
> >> is-a relationship with the kind of feature it is will be handled by a
> >> property
> >> instead:
> >>
> >> /geography/desert
> >> /geography/bay
> >> /geography/strait
> >> /geography/wetland
> >>
> >> The JIRA task is here:
> >> https://bugs.freebase.com/browse/DA-916
> >>
> >> Will this affect anyone's code?
> >>
> >> Jeff
> >>
> >> _______________________________________________
> >> Data-modeling mailing list
> >> Data-modeling at freebase.com
> >> http://lists.freebase.com/mailman/listinfo/data-modeling
> >
> >
> > _______________________________________________
> > Data-modeling mailing list
> > Data-modeling at freebase.com
> > http://lists.freebase.com/mailman/listinfo/data-modeling
> >
> >
> _______________________________________________
> Data-modeling mailing list
> Data-modeling at freebase.com
> http://lists.freebase.com/mailman/listinfo/data-modeling
>

From jeff at metaweb.com Fri Sep 11 19:18:01 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Fri, 11 Sep 2009 12:18:01 -0700
Subject: [Data-modeling] Refactoring Physical Geography a bit
In-Reply-To:
References: <07ABED2AA0C540DDA0D41CFF062CB771@amd>
Message-ID: <167FF8678DA84C4B965DF1DA9327800B@p4>

Thanks for the pointers. I like the idea of the phylogeny pattern on your Code Category type, which should definitely be replicated on the new Geographical Feature Category type. (Code Category is much more general than Feature Category, so we can't just use the properties wholesale.)
It looks like your Geographical Feature Category type would have the exact same instances of the commons one, so I'll add those to the migration task, and the new type should probably be made an included type of your Feature Category type (which we can just do as part of the migration, if you like). I don't think any of the properties work for the commons, though. Jeff _____ From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Ed Laurent Sent: Friday, September 11, 2009 11:16 AM To: Freebase data modeling mailing list Subject: Re: [Data-modeling] Refactoring Physical Geography a bit I've been doing this a bit already with my "Geographical feature category" type that includes my "Code category" type. While some of their properties are not for the commons, others may be, so it would be nice if you would consider/migrate them with the refactoring if possible. http://www.freebase.com/type/schema/base/landcover/geographical_feature_cate gory?domain=%2Fbase%2Flandcover http://www.freebase.com/type/schema/base/landcover/code_category?domain=%2Fb ase%2Flandcover -Ed On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher wrote: Cross-posted from Developers List: Based on this discussion (), I'd like refactor some types in the Physical Geography domain. 1. The type /geography/geographical_feature is currently a bucket holding different kinds of features (tarn, isthmus, cave, etc.). I'd like to change the semantics of this type so that it instead holds actual features (Lake Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be added to this type ("Class" or something similar) which will connect to a new type "Geographical Feature Type". This new type will have as instances the same sorts of topics that Geographical Feature currently does. (In fact, all will simply be migrated to the new type.) 2. 
These types, which have no properties, will be deleted, since the is-a relationship with the kind of feature it is will be handled by a property instead: /geography/desert /geography/bay /geography/strait /geography/wetland The JIRA task is here: https://bugs.freebase.com/browse/DA-916 Will this affect anyone's code? Jeff _______________________________________________ Data-modeling mailing list Data-modeling at freebase.com http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/a6c70171/attachment.htm From spatial.db at gmail.com Fri Sep 11 19:31:44 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Fri, 11 Sep 2009 15:31:44 -0400 Subject: [Data-modeling] Scripts for moving types, properties, and data In-Reply-To: References: Message-ID: Some more ideas for scripts to be included in a modelers hub: Increasingly, common types are being modeled that duplicate some types and properties that I have already created. This causes a couple kinds of problems: a) data that should be linked through the same type are split into two or more types, b) I am forced to move what is sometimes a lot of data to a type with little data, and c) data are no longer associated with my base if I move them to the other types. For these reasons, I prefer to delegate the property to the common type but keep my type with the delegated property to hold other properties that are not part of the common type. Making this change is impossible through the UI without exporting the data, deleting the property, adding the new delegated property, and importing the exported data. Migrating the data in my property to that of the common type and including the common type in my type is sometimes, but not always, another option. Thus, some other needed scripts: 4) Change the expected type of a property to delegate a property from another type. 
5) Add all included types of a type to all topics of that type. -Ed On Fri, Sep 11, 2009 at 2:49 PM, Ed Laurent wrote: > Hi everyone, > > As my types have evolved and are getting used, I'm finding that I would > like to do some refactoring to separate general and specific types (specific > types would include the general types). I would also like to move some of > the schema to different bases so that they are organized a little > differently because the UI and several new tools rely so heavily on base > membership of schema. I know that there are probably a few simple MQL > scripts that I can run to do this but don't want to invest a lot of time in > learning the ins and outs of MQL to write them. Could someone be so kind as > to provide me with some annotated code for the following uses (assuming of > course that I am an administrator of all the relevant bases): > > 1) Move a property and its data from one type to another type > 2) Move a type from one base to another base > 3) Move all data from a property of a type to an existing property of > another type > > Maybe these scripts are already documented somewhere? If not, it would be > great to start a hub for these kinds of general use modeling scripts with a > tiny(!) bit of documentation on where to paste them, what parts to modify, > and how to run. Such a hub would be very helpful to those of us without mad > programming skillz who learn better by doing than by manual. > > Thanks! > -Ed > -------------- next part -------------- An HTML attachment was scrubbed... 
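For request 5 (adding all included types of a type to the topics that already have that type), a minimal sketch could look like the following. It is Python that only builds MQL write queries and does not submit them. It assumes the list of topic ids and the list of included type ids have already been fetched by a separate read, and it uses the "connect": "insert" directive on the "type" property. The function name and every id shown are hypothetical.

```python
import json


def build_included_type_writes(topic_ids, included_type_ids):
    """One MQL write query per (topic, included type) pair.

    Each query asks for the included type to be connected to the topic.
    Whether re-asserting a type the topic already has is harmless should
    be confirmed on a test topic first; that is an assumption here.
    """
    return [
        {"id": topic_id, "type": {"id": type_id, "connect": "insert"}}
        for topic_id in topic_ids
        for type_id in included_type_ids
    ]


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    topics = ["/en/example_topic_1", "/en/example_topic_2"]
    included = ["/base/example/general_category"]
    print(json.dumps(build_included_type_writes(topics, included), indent=2))
```

Keeping the query construction separate from submission makes it possible to review the generated writes, or try a handful by hand, before applying them to a whole base.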
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/c0136d05/attachment.htm From spatial.db at gmail.com Fri Sep 11 20:03:54 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Fri, 11 Sep 2009 16:03:54 -0400 Subject: [Data-modeling] Refactoring Physical Geography a bit In-Reply-To: <167FF8678DA84C4B965DF1DA9327800B@p4> References: <07ABED2AA0C540DDA0D41CFF062CB771@amd> <167FF8678DA84C4B965DF1DA9327800B@p4> Message-ID: Yes, please do include the inclusion as part of the migration. I guess I need to rename my type... This discussion brings up an interesting duplication of phylogeny patterns. In this case, "Code category" could always contain the proposed "Geographical Feature Category" phylogeny but not vice versa. Is there a way to sync the two, to funnel the data from a more specific type into a more general one when the general one may be too general to be an included type? -Ed On Fri, Sep 11, 2009 at 3:18 PM, Jeff Prucher wrote: > Thanks for the pointers. I like the idea of the phylogeny pattern on > your Code Category type, which should definitely be replicated on the new > Geographical Feature Category type. (Code Category is much more general than > Feature Category, so we can't just use the propeties wholesale.) > > It looks like your Geographical Feature Category type would have the exact > same instances of the commons one, so I'll add those to the migration task, > and the new type should probably be made an included type of your Feature > Category type (which we can just do as part of the migration, if you like). > I don't think any of the properties work for the commons, though. 
> > Jeff > > ------------------------------ > *From:* data-modeling-bounces at freebase.com [mailto: > data-modeling-bounces at freebase.com] *On Behalf Of *Ed Laurent > *Sent:* Friday, September 11, 2009 11:16 AM > *To:* Freebase data modeling mailing list > *Subject:* Re: [Data-modeling] Refactoring Physical Geography a bit > > I've been doing this a bit already with my "Geographical feature category" > type that includes my "Code category" type. While some of their properties > are not for the commons, others may be, so it would be nice if you would > consider/migrate them with the refactoring if possible. > > > http://www.freebase.com/type/schema/base/landcover/geographical_feature_category?domain=%2Fbase%2Flandcover > > > http://www.freebase.com/type/schema/base/landcover/code_category?domain=%2Fbase%2Flandcover > > -Ed > > > On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher wrote: > >> Cross-posted from Developers List: >> >> Based on this discussion >> (), >> I'd >> like refactor some types in the Physical Geography domain. >> >> 1. The type /geography/geographical_feature is currently a bucket holding >> different kinds of features (tarn, isthmus, cave, etc.). I'd like to >> change >> the semantics of this type so that it instead holds actual features (Lake >> Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be >> added to this type ("Class" or something similar) which will connect to a >> new type "Geographical Feature Type". This new type will have as instances >> the same sorts of topics that Geographical Feature currently does. (In >> fact, all will simply be migrated to the new type.) >> >> 2. 
These types, which have no properties, will be deleted, since the is-a >> relationship with the kind of feature it is will be handled by a property >> instead: >> >> /geography/desert >> /geography/bay >> /geography/strait >> /geography/wetland >> >> The JIRA task is here: >> https://bugs.freebase.com/browse/DA-916 >> >> Will this affect anyone's code? >> >> Jeff >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling >> > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/5613c407/attachment-0001.htm From jeff at metaweb.com Fri Sep 11 20:31:57 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Fri, 11 Sep 2009 13:31:57 -0700 Subject: [Data-modeling] Refactoring Physical Geography a bit In-Reply-To: References: <07ABED2AA0C540DDA0D41CFF062CB771@amd><167FF8678DA84C4B965DF1DA9327800B@p4> Message-ID: There's not a schematic way to do it. Someone would have to create a gardening process to poll changes to the specific type and apply them to the general one. I'm sure this is doable in acre, but I couldn't code my way out of a paper bag, so don't quote me on that. Jeff _____ From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Ed Laurent Sent: Friday, September 11, 2009 1:04 PM To: Freebase data modeling mailing list Subject: Re: [Data-modeling] Refactoring Physical Geography a bit Yes, please do include the inclusion as part of the migration. I guess I need to rename my type... This discussion brings up an interesting duplication of phylogeny patterns. 
In this case, "Code category" could always contain the proposed "Geographical Feature Category" phylogeny but not vice versa. Is there a way to sync the two, to funnel the data from a more specific type into a more general one when the general one may be too general to be an included type? -Ed On Fri, Sep 11, 2009 at 3:18 PM, Jeff Prucher wrote: Thanks for the pointers. I like the idea of the phylogeny pattern on your Code Category type, which should definitely be replicated on the new Geographical Feature Category type. (Code Category is much more general than Feature Category, so we can't just use the properties wholesale.) It looks like your Geographical Feature Category type would have the exact same instances as the commons one, so I'll add those to the migration task, and the new type should probably be made an included type of your Feature Category type (which we can just do as part of the migration, if you like). I don't think any of the properties work for the commons, though. Jeff _____ From: data-modeling-bounces at freebase.com [mailto:data-modeling-bounces at freebase.com] On Behalf Of Ed Laurent Sent: Friday, September 11, 2009 11:16 AM To: Freebase data modeling mailing list Subject: Re: [Data-modeling] Refactoring Physical Geography a bit I've been doing this a bit already with my "Geographical feature category" type that includes my "Code category" type. While some of their properties are not for the commons, others may be, so it would be nice if you would consider/migrate them with the refactoring if possible. http://www.freebase.com/type/schema/base/landcover/geographical_feature_category?domain=%2Fbase%2Flandcover http://www.freebase.com/type/schema/base/landcover/code_category?domain=%2Fbase%2Flandcover -Ed On Thu, Sep 10, 2009 at 4:34 PM, Jeff Prucher wrote: Cross-posted from Developers List: Based on this discussion (), I'd like to refactor some types in the Physical Geography domain. 1.
The type /geography/geographical_feature is currently a bucket holding different kinds of features (tarn, isthmus, cave, etc.). I'd like to change the semantics of this type so that it instead holds actual features (Lake Geneva, Isthmus of Panama, Mammoth Cave, etc.). A new property will be added to this type ("Class" or something similar) which will connect to a new type "Geographical Feature Type". This new type will have as instances the same sorts of topics that Geographical Feature currently does. (In fact, all will simply be migrated to the new type.) 2. These types, which have no properties, will be deleted, since the is-a relationship with the kind of feature it is will be handled by a property instead: /geography/desert /geography/bay /geography/strait /geography/wetland The JIRA task is here: https://bugs.freebase.com/browse/DA-916 Will this affect anyone's code? Jeff _______________________________________________ Data-modeling mailing list Data-modeling at freebase.com http://lists.freebase.com/mailman/listinfo/data-modeling _______________________________________________ Data-modeling mailing list Data-modeling at freebase.com http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090911/b4f07299/attachment.htm From philip-freebase at shadowmagic.org.uk Mon Sep 14 12:16:52 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Mon, 14 Sep 2009 13:16:52 +0100 Subject: [Data-modeling] Schema Visualisation app In-Reply-To: <20090911124229.GK13273@sphinx.mythic-beasts.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> Message-ID: <20090914121651.GU13273@sphinx.mythic-beasts.com> On Fri, Sep 11, 2009 at 01:42:29PM +0100, Philip Kendall wrote: > Some of you may have already seen Kirrily's tweet, but if not... 
a little > app which people may be interested in: > > http://schemaviz.freebaseapps.com/
Mildly updated version now published:
* Node shapes now indicate the display style of the type (ellipse: standard, rectangle: CVT, triangle: enumeration)
* New option to show included types via dotted edges (this deliberately excludes /common/topic)
* New option to ignore "common types" (those in /type, e.g. integer, text, etc.). This can make the graph significantly simpler and (I think) easier to understand in some cases (try American Football)
* One bugfix: if a domain contains more than 100 types (I'm looking at you, /location), then the app would fail as it only fetched half the data. There's still an issue with /location in that the graphviz server times out if given the full graph, but it does work if you turn *off* "show included types" and turn *on* "ignore common types".
Thanks to Jeff, Iain and Ed for their suggestions. Any more welcome :-)
Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/
From stefano at metaweb.com Mon Sep 14 19:29:58 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Mon, 14 Sep 2009 12:29:58 -0700 Subject: [Data-modeling] [Developers] Schema Visualisation app In-Reply-To: <20090914121651.GU13273@sphinx.mythic-beasts.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> <20090914121651.GU13273@sphinx.mythic-beasts.com> Message-ID: <4AAE99B6.8070709@metaweb.com> Philip Kendall wrote: > On Fri, Sep 11, 2009 at 01:42:29PM +0100, Philip Kendall wrote: >> Some of you may have already seen Kirrily's tweet, but if not...
a little >> app while people may be interested in: >> >> http://schemaviz.freebaseapps.com/ > > Mildly updated version now published: > > * Node shapes now indicate the display style of the type (ellipse: > standard, rectangle: CVT, triangle: enumeration) > * New option to show included types via dotted edges (this deliberately > excludes /common/topic) > * New option to ignore "common types" (those in /type, eg integer, text, > etc). This can make the graph significantly simpler and (I think) > easier to understand in some cases (try American Football) > * One bugfix: if a domain contains more than 100 types (I'm looking at > you, /location), then the app would fail as it only fetched half the > data. > There's still an issue with /location in that the graphviz server > times out if given the full graph, but it does work if you turn *off* > "show included types" and turn *on" "ignore common types". > > Thanks to Jeff, Iain and Ed for their suggestions. Any more welcome :-) Phil, I'm trying to integrate schemaviz into the freebase schema explorer and I'm thinking that it would be way cool to show directly a thumbnail for people to click on, for that I would need the schemaviz to be able to able to answer something like http://schemaviz.freebaseapps.com/image?domain=/whatever and redirect to the right image, that way I could embed it as a clickable thumbnail in the domain page of the schema explorer. Thoughts? -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. 
stefano at metaweb.com ------------------------------------------------------------------- From stefano at metaweb.com Mon Sep 14 22:09:20 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Mon, 14 Sep 2009 15:09:20 -0700 Subject: [Data-modeling] [Developers] Schema Visualisation app In-Reply-To: <4AAE99B6.8070709@metaweb.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> <20090914121651.GU13273@sphinx.mythic-beasts.com> <4AAE99B6.8070709@metaweb.com> Message-ID: <4AAEBF10.9000202@metaweb.com> Stefano Mazzocchi wrote: > Philip Kendall wrote: >> On Fri, Sep 11, 2009 at 01:42:29PM +0100, Philip Kendall wrote: >>> Some of you may have already seen Kirrily's tweet, but if not... a little >>> app while people may be interested in: >>> >>> http://schemaviz.freebaseapps.com/ >> Mildly updated version now published: >> >> * Node shapes now indicate the display style of the type (ellipse: >> standard, rectangle: CVT, triangle: enumeration) >> * New option to show included types via dotted edges (this deliberately >> excludes /common/topic) >> * New option to ignore "common types" (those in /type, eg integer, text, >> etc). This can make the graph significantly simpler and (I think) >> easier to understand in some cases (try American Football) >> * One bugfix: if a domain contains more than 100 types (I'm looking at >> you, /location), then the app would fail as it only fetched half the >> data. >> There's still an issue with /location in that the graphviz server >> times out if given the full graph, but it does work if you turn *off* >> "show included types" and turn *on" "ignore common types". >> >> Thanks to Jeff, Iain and Ed for their suggestions. 
Any more welcome :-) > > Phil, > > I'm trying to integrate schemaviz into the freebase schema explorer and > I'm thinking that it would be way cool to show directly a thumbnail for > people to click on, for that I would need the schemaviz to be able to > able to answer something like > > http://schemaviz.freebaseapps.com/image?domain=/whatever > > and redirect to the right image, that way I could embed it as a > clickable thumbnail in the domain page of the schema explorer. > > Thoughts? Nevermind, found something close enough, here we go: http://schemas.freebaseapps.com/domain?id=/food -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. stefano at metaweb.com ------------------------------------------------------------------- From philip-freebase at shadowmagic.org.uk Tue Sep 15 07:08:55 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Tue, 15 Sep 2009 08:08:55 +0100 Subject: [Data-modeling] [Developers] Schema Visualisation app In-Reply-To: <4AAEBF10.9000202@metaweb.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> <20090914121651.GU13273@sphinx.mythic-beasts.com> <4AAE99B6.8070709@metaweb.com> <4AAEBF10.9000202@metaweb.com> Message-ID: <20090915070855.GA13273@sphinx.mythic-beasts.com> On Mon, Sep 14, 2009 at 03:09:20PM -0700, Stefano Mazzocchi wrote: > > Nevermind, found something close enough, here we go: > > http://schemas.freebaseapps.com/domain?id=/food Thanks! (and thanks Kirrily for the blog post) However... you may want to back this off for now, or at the very least wrap it in a try/catch block - some domains (including at least /time and /music) are erroring out for reasons I don't understand, and unfortunately I won't have time to look at this today and that's now taking down the schema viewer as well. [ If anyone wants to look at it, it's something to do with a reverse property being found when the master hasn't been found. I apologise for the lack of comments in the source. 
] Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From spatial.db at gmail.com Tue Sep 15 18:16:44 2009 From: spatial.db at gmail.com (Ed Laurent) Date: Tue, 15 Sep 2009 14:16:44 -0400 Subject: [Data-modeling] Scripts for moving types, properties, and data In-Reply-To: References: Message-ID: Cross referencing https://bugs.freebase.com/browse/DOC-117 On Fri, Sep 11, 2009 at 3:31 PM, Ed Laurent wrote: > Some more ideas for scripts to be included in a modelers hub: > > Increasingly, common types are being modeled that duplicate some types and > properties that I have already created. This causes a couple kinds of > problems: a) data that should be linked through the same type are split into > two or more types, b) I am forced to move what is sometimes a lot of data to > a type with little data, and c) data are no longer associated with my base > if I move them to the other types. For these reasons, I prefer to delegate > the property to the common type but keep my type with the delegated property > to hold other properties that are not part of the common type. Making this > change is impossible through the UI without exporting the data, deleting the > property, adding the new delegated property, and importing the exported > data. Migrating the data in my property to that of the common type and > including the common type in my type is sometimes, but not always, another > option. Thus, some other needed scripts: > > 4) Change the expected type of a property to delegate a property from > another type. > 5) Add all included types of a type to all topics of that type. > > -Ed > > > > On Fri, Sep 11, 2009 at 2:49 PM, Ed Laurent wrote: > >> Hi everyone, >> >> As my types have evolved and are getting used, I'm finding that I would >> like to do some refactoring to separate general and specific types (specific >> types would include the general types). 
I would also like to move some of >> the schema to different bases so that they are organized a little >> differently because the UI and several new tools rely so heavily on base >> membership of schema. I know that there are probably a few simple MQL >> scripts that I can run to do this but don't want to invest a lot of time in >> learning the ins and outs of MQL to write them. Could someone be so kind as >> to provide me with some annotated code for the following uses (assuming of >> course that I am an administrator of all the relevant bases): >> >> 1) Move a property and its data from one type to another type >> 2) Move a type from one base to another base >> 3) Move all data from a property of a type to an existing property of >> another type >> >> Maybe these scripts are already documented somewhere? If not, it would be >> great to start a hub for these kinds of general use modeling scripts with a >> tiny(!) bit of documentation on where to paste them, what parts to modify, >> and how to run. Such a hub would be very helpful to those of us without mad >> programming skillz who learn better by doing than by manual. >> >> Thanks! >> -Ed >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090915/6e4d0d24/attachment.htm From stefano at metaweb.com Tue Sep 15 18:59:15 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Tue, 15 Sep 2009 11:59:15 -0700 Subject: [Data-modeling] [Developers] Schema Visualisation app In-Reply-To: <20090915070855.GA13273@sphinx.mythic-beasts.com> References: <20090911124229.GK13273@sphinx.mythic-beasts.com> <20090914121651.GU13273@sphinx.mythic-beasts.com> <4AAE99B6.8070709@metaweb.com> <4AAEBF10.9000202@metaweb.com> <20090915070855.GA13273@sphinx.mythic-beasts.com> Message-ID: <4AAFE403.9080907@metaweb.com> Philip Kendall wrote: > On Mon, Sep 14, 2009 at 03:09:20PM -0700, Stefano Mazzocchi wrote: >> Nevermind, found something close enough, here we go: >> >> http://schemas.freebaseapps.com/domain?id=/food > > Thanks! (and thanks Kirrily for the blog post) > > However... you may want to back this off for now, or at the very least > wrap it in a try/catch block - some domains (including at least /time > and /music) are erroring out for reasons I don't understand, and > unfortunately I won't have time to look at this today and that's now > taking down the schema viewer as well. Will do. > [ If anyone wants to look at it, it's something to do with a reverse > property being found when the master hasn't been found. I apologise > for the lack of comments in the source. ] If you grant me access to the app, I can try to fix it. -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc. stefano at metaweb.com ------------------------------------------------------------------- From sm at metaweb.com Wed Sep 16 17:55:35 2009 From: sm at metaweb.com (Scott Meyer) Date: Wed, 16 Sep 2009 10:55:35 -0700 Subject: [Data-modeling] Scripts for moving types, properties, and data In-Reply-To: References: Message-ID: <4AB12697.5000308@metaweb.com> Ed Laurent wrote: > Cross referencing https://bugs.freebase.com/browse/DOC-117 Sorry I missed your original email. 
Have you looked at the schema support in the latest revision of freebase-python? -Scott From kirrily at metaweb.com Thu Sep 17 19:55:52 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Thu, 17 Sep 2009 12:55:52 -0700 Subject: [Data-modeling] Freebase wiki Message-ID: <61451E93-ED0E-43A2-B707-871F5C59615F@metaweb.com> Hey everyone, We've set up a Mediawiki installation for community-contributed docs, recipes, FAQs, and more. You can check it out at http://wiki.freebase.com/ We'll need volunteers to trawl through the dev and d-m list archives and find interesting content to wikify. Anyone up for it? K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From nirmal at fractalanalytics.com Thu Sep 17 19:58:48 2009 From: nirmal at fractalanalytics.com (Nirmal) Date: Fri, 18 Sep 2009 01:28:48 +0530 Subject: [Data-modeling] Freebase wiki In-Reply-To: <61451E93-ED0E-43A2-B707-871F5C59615F@metaweb.com> References: <61451E93-ED0E-43A2-B707-871F5C59615F@metaweb.com> Message-ID: <0252333A-16F5-4C2C-B87B-A6F64095D3DC@fractalanalytics.com> Sure. What do I do? -myPhone On 18-Sep-2009, at 1:25 AM, Kirrily Robert wrote: > Hey everyone, > > We've set up a Mediawiki installation for community-contributed docs, > recipes, FAQs, and more. You can check it out at http://wiki.freebase.com/ > > We'll need volunteers to trawl through the dev and d-m list archives > and find interesting content to wikify. Anyone up for it? > > K. 
> > -- > Kirrily Robert > Freebase Community Director > kirrily at metaweb.com > http://freebase.com/ > > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From kirrily at metaweb.com Thu Sep 17 20:10:14 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Thu, 17 Sep 2009 13:10:14 -0700 Subject: [Data-modeling] Freebase wiki In-Reply-To: <0252333A-16F5-4C2C-B87B-A6F64095D3DC@fractalanalytics.com> References: <61451E93-ED0E-43A2-B707-871F5C59615F@metaweb.com> <0252333A-16F5-4C2C-B87B-A6F64095D3DC@fractalanalytics.com> Message-ID: <88A6E667-1B8C-4B80-B98B-73A4E673A72B@metaweb.com> Well, you could start with the archives either on http://lists.freebase.com/ or http://freebase.markmail.org/ (whichever you find nicer to work with) and then work back through time, looking for interesting posts that provide useful information, answer frequently asked questions, etc. Then go to the wiki and find (or make) a page on that topic, and paste the email in, linking to the original mailing list post. We can edit it up as we go forward. I've done an example at http://wiki.freebase.com/wiki/ISBNs K. On Sep 17, 2009, at 12:58 PM, Nirmal wrote: > Sure. What do I do? > > -myPhone > > On 18-Sep-2009, at 1:25 AM, Kirrily Robert > wrote: > >> Hey everyone, >> >> We've set up a Mediawiki installation for community-contributed docs, >> recipes, FAQs, and more. You can check it out at http://wiki.freebase.com/ >> >> We'll need volunteers to trawl through the dev and d-m list archives >> and find interesting content to wikify. Anyone up for it? >> >> K. 
>> >> -- >> Kirrily Robert >> Freebase Community Director >> kirrily at metaweb.com >> http://freebase.com/ >> >> >> >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -- Kirrily Robert Freebase Community Director kirrily at metaweb.com http://freebase.com/ From cblow at meedan.net Sat Sep 19 02:27:49 2009 From: cblow at meedan.net (Chris Blow) Date: Fri, 18 Sep 2009 19:27:49 -0700 Subject: [Data-modeling] Freebase wiki In-Reply-To: <88A6E667-1B8C-4B80-B98B-73A4E673A72B@metaweb.com> References: <61451E93-ED0E-43A2-B707-871F5C59615F@metaweb.com> <0252333A-16F5-4C2C-B87B-A6F64095D3DC@fractalanalytics.com> <88A6E667-1B8C-4B80-B98B-73A4E673A72B@metaweb.com> Message-ID: <9AD64EA4-3EC3-4665-AD22-38397EE77668@meedan.net> This is a really great idea, I think, and potentially a good way for me to get some additional Freebase education while helping the community. I will try to help out where I can. Cheers! c Chris Blow | Meedan User Experience | 415.309.7900 | cblow at meedan.net | skype cgblow | www.meedan.net On Sep 17, 2009, at 1:10 PM, Kirrily Robert wrote: > Well, you could start with the archives either on http://lists.freebase.com/ > or http://freebase.markmail.org/ (whichever you find nicer to work > with) and then work back through time, looking for interesting posts > that provide useful information, answer frequently asked questions, > etc. Then go to the wiki and find (or make) a page on that topic, and > paste the email in, linking to the original mailing list post. We can > edit it up as we go forward. I've done an example at http://wiki.freebase.com/wiki/ISBNs > > K. > > > On Sep 17, 2009, at 12:58 PM, Nirmal wrote: > >> Sure. What do I do? 
>> >> -myPhone >> >> On 18-Sep-2009, at 1:25 AM, Kirrily Robert >> wrote: >> >>> Hey everyone, >>> >>> We've set up a Mediawiki installation for community-contributed >>> docs, >>> recipes, FAQs, and more. You can check it out at http://wiki.freebase.com/ >>> >>> We'll need volunteers to trawl through the dev and d-m list archives >>> and find interesting content to wikify. Anyone up for it? >>> >>> K. >>> >>> -- >>> Kirrily Robert >>> Freebase Community Director >>> kirrily at metaweb.com >>> http://freebase.com/ >>> >>> >>> >>> >>> _______________________________________________ >>> Data-modeling mailing list >>> Data-modeling at freebase.com >>> http://lists.freebase.com/mailman/listinfo/data-modeling >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > > -- > Kirrily Robert > Freebase Community Director > kirrily at metaweb.com > http://freebase.com/ > > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090918/48e3eb33/attachment-0001.htm From spencerkelly86 at gmail.com Sat Sep 19 21:12:50 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Sat, 19 Sep 2009 17:12:50 -0400 Subject: [Data-modeling] Scripts for moving types, properties, and data In-Reply-To: <4AB12697.5000308@metaweb.com> References: <4AB12697.5000308@metaweb.com> Message-ID: hey all, great timing for this post. I've got a general purpose data-mover finally working at http://movingday.freebaseapps.com i'm really pleased with it. give it a 'from property' and a 'to property' and off it goes. 
i've used it to move /airliner_accident/fatalities to /disaster/fatalities and /inventor/inventions to /innovator/original_idea. its much easier than fighting schema wars/ nagging. cant do cvts yet though. could be used to clone a whole schema one property at a time. cheers ;) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090919/3121c935/attachment.htm From pauljmackay at gmail.com Wed Sep 23 05:41:05 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Tue, 22 Sep 2009 22:41:05 -0700 Subject: [Data-modeling] Filtering topics by location Message-ID: Hi, I am interested in populating information about England, but many things that have a location tend to have addresses containing county names and then a country of UK. If I find these topics, would it be valid to add an "England" entry to the "Contained by" field, such that it is easy to filter on? I'm not sure if the Contained by field is intended to be just the immediate entity that contains the topic in question. thanks paul -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090922/375d0c8f/attachment.htm From tfmorris at gmail.com Wed Sep 23 06:02:47 2009 From: tfmorris at gmail.com (Tom Morris) Date: Wed, 23 Sep 2009 02:02:47 -0400 Subject: [Data-modeling] Filtering topics by location In-Reply-To: References: Message-ID: On Wed, Sep 23, 2009 at 1:41 AM, Paul Mackay wrote: > I'm not sure if the Contained by field is intended to be just the immediate > entity that contains the topic in question. Currently users/developers are encouraged to enter multiple contained_by values as a hack to work around the fact that MQL can't do multi-level searches easily, so your proposal to add additional information is probably fine. 
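To illustrate the multi-level containment problem described above, here is a minimal Python sketch. It assumes a small hand-written dictionary of "contained by" links (not real Freebase data) and a hypothetical helper, `all_containers`, that walks those links upward. It shows why a single-level filter on "England" misses a town whose only direct container is its county, and why adding "England" directly to the field is a workable shortcut.

```python
# Hypothetical sample data: each location maps to the set of locations listed
# directly in its "contained by" field. These entries are made up for
# illustration and are not taken from Freebase.
CONTAINED_BY = {
    "Cambridge": {"Cambridgeshire"},
    "Cambridgeshire": {"England"},
    "England": {"United Kingdom"},
    "Edinburgh": {"Scotland", "United Kingdom"},
    "Scotland": {"United Kingdom"},
}


def all_containers(place, contained_by):
    """Return every location that contains `place`, directly or indirectly."""
    seen = set()
    stack = [place]
    while stack:
        current = stack.pop()
        for parent in contained_by.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_within(place, region, contained_by):
    """True if `region` appears anywhere up the containment chain of `place`."""
    return region in all_containers(place, contained_by)


if __name__ == "__main__":
    # A direct (single-level) lookup does not find England for Cambridge...
    print("England" in CONTAINED_BY["Cambridge"])
    # ...but walking the chain does.
    print(is_within("Cambridge", "England", CONTAINED_BY))
```

Storing the extra "England" value on the town itself trades some redundancy for not having to do this walk at query time.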
Tom From iainsproat at gmail.com Wed Sep 23 06:17:14 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Wed, 23 Sep 2009 10:17:14 +0400 Subject: [Data-modeling] Filtering topics by location In-Reply-To: References: Message-ID: Contained by is certainly the best way to model that. Additional info can't hurt. There's been some discussion on this in the past, and if I remember correctly there was a proposal for a 2-up contained_by & 2-down contains threshold. That is a city would be contained_by its county and country (or state in the US); and would contain neighbourhoods and buildings. But it's not always clear cut - given there's various ways to define a location based on government boundaries, voting boundaries, religious boundaries etc. etc.. Iain On Wed, Sep 23, 2009 at 10:02 AM, Tom Morris wrote: > On Wed, Sep 23, 2009 at 1:41 AM, Paul Mackay wrote: > >> I'm not sure if the Contained by field is intended to be just the immediate >> entity that contains the topic in question. > > Currently users/developers are encouraged to enter multiple > contained_by values as a hack to work around the fact that MQL can't > do multi-level searches easily, so your proposal to add additional > information is probably fine. > > Tom > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From jg at metaweb.com Wed Sep 23 06:48:12 2009 From: jg at metaweb.com (John Giannandrea) Date: Tue, 22 Sep 2009 23:48:12 -0700 Subject: [Data-modeling] Filtering topics by location In-Reply-To: References: Message-ID: <4F105C37-E54B-40EC-BF9D-DE8D111A138E@metaweb.com> Paul Mackay wrote: > I am interested in populating information about England, but many > things that have a location tend to have addresses containing county > names and then a country of UK. 
If I find these topics, would it be > valid to add an "England" entry to the "Contained by" field, such > that it is easy to filter on? Most towns in the UK will have a "/location/location/contained_by": {"id":"/en/united_kingdom"} This will get you towns in Scotland and Northern Ireland also. > I'm not sure if the Contained by field is intended to be just the > immediate entity that contains the topic in question. The convention we have used in data loads is at least two ply of containment. -jg From pauljmackay at gmail.com Wed Sep 23 20:03:03 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Wed, 23 Sep 2009 13:03:03 -0700 Subject: [Data-modeling] Filtering topics by location In-Reply-To: <4F105C37-E54B-40EC-BF9D-DE8D111A138E@metaweb.com> References: <4F105C37-E54B-40EC-BF9D-DE8D111A138E@metaweb.com> Message-ID: I guess in that case this is more an issue of whether England or United Kingdom should be used for a location's Country field. Related to this, when viewing tables of entries with locations, I wanted to add a column to filter on location by country, but cannot see how to do this. Under Address it only includes Street Address City/Town State/Province/Region Postal Code Shouldnt Country be in that list? paul On Tue, Sep 22, 2009 at 11:48 PM, John Giannandrea wrote: > > Paul Mackay wrote: > > I am interested in populating information about England, but many > > things that have a location tend to have addresses containing county > > names and then a country of UK. If I find these topics, would it be > > valid to add an "England" entry to the "Contained by" field, such > > that it is easy to filter on? > > Most towns in the UK will have a "/location/location/contained_by": > {"id":"/en/united_kingdom"} > This will get you towns in Scotland and Northern Ireland also. > > > I'm not sure if the Contained by field is intended to be just the > > immediate entity that contains the topic in question. 
> > The convention we have used in data loads is at least two ply of > containment. > > -jg > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20090923/2767e6dc/attachment.htm From jeff at metaweb.com Tue Sep 29 04:27:45 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Mon, 28 Sep 2009 21:27:45 -0700 Subject: [Data-modeling] Two crossposts from developers list Message-ID: <1AE4777E1B8741B290C3CD00CF87BE7C@amd> I posted these some time ago, and they met with no objections, but the tasks were never executed (oops). Since it's been some time, I'm resubmitting them to the list in case anyone has created new code that uses these properties: 1. The type Transit System has a property System Length (/metropolitan_transit/transit_system/system_length) which expects a unique floating point number. Tfmorris has suggested that this property should really be a CVT with values for date and length, and should therefore additionally be non-unique. I think this makes good sense; would this interfere with anyone's code? Or does anyone have further comments on this? 2. There's been a request to add a way to store "From" and "To" dates for Websites and Website Owners. This will involve breaking the link between the properties /internet/website/owner and /internet/website_owner/websites_owned. Will this break anyone's code? 
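As a rough illustration of the first proposal above, the following Python sketch models System Length as a list of dated measurements instead of one unique float. The class names, the `length_km` unit, and the sample figures are all assumptions made for the example; the actual CVT schema has not been defined in this thread.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SystemLengthMeasurement:
    """Hypothetical stand-in for the proposed CVT: one length as of one date."""
    length_km: float  # unit is an assumption; the thread does not specify one
    as_of: date


@dataclass
class TransitSystem:
    name: str
    # Non-unique: a system can carry several dated measurements.
    system_length: List[SystemLengthMeasurement] = field(default_factory=list)

    def length_on(self, day: date) -> Optional[float]:
        """Return the most recent recorded length on or before `day`, if any."""
        known = [m for m in self.system_length if m.as_of <= day]
        if not known:
            return None
        return max(known, key=lambda m: m.as_of).length_km


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    metro = TransitSystem(
        "Example Metro",
        [
            SystemLengthMeasurement(10.0, date(1990, 1, 1)),
            SystemLengthMeasurement(25.5, date(2005, 6, 1)),
        ],
    )
    print(metro.length_on(date(2009, 9, 28)))
```

Code that currently reads a single number would need to choose one measurement, for example the latest, which is the kind of breakage the question above is asking about.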
Jeff From zenkat at metaweb.com Wed Sep 30 04:15:36 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Tue, 29 Sep 2009 21:15:36 -0700 Subject: [Data-modeling] test load: genders Message-ID: Hello All -- You may have read the blog post from a few weeks back where we talked about automatically deducing the genders of people in Freebase by their names: http://blog.freebase.com/2009/09/09/gender-and-names-in-freebase/ A test sampling of ~1800 of these gender assignments has been loaded into the main graph. The risk of error is low, since all of these assignments were done for names which showed 100% concordance on at least 100 exemplars, but we figured we'd still do a test load and let folks take a gander before we loaded the full set of ~190K gender assignments. If you're in the mood to take a look before we run the complete load, you can see the first 500 of them with this ACRE app: http://gendercheck.zenkat.user.dev.freebaseapps.com/index Thanks, Brian From iainsproat at gmail.com Wed Sep 30 06:16:07 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Wed, 30 Sep 2009 10:16:07 +0400 Subject: [Data-modeling] test load: genders In-Reply-To: References: Message-ID: The only anomaly I could spot was "Patriarch of Alexandria" - but that was probably due to a previous incorrect typing as a person. I've now detyped it as a person (and added the Religious Leadership Title type). It's actually a fairly regular problem, where Religious titles, Noble titles and Political titles are incorrectly typed as Person. If it's possible, it might be worth manually checking names of people which are very similar to names of titles. Iain On Wed, Sep 30, 2009 at 8:15 AM, Brian Karlak wrote: > Hello All -- > > You may have read the blog post from a few weeks back where we talked > about automatically deducing the genders of people in Freebase by > their names: > >
http://blog.freebase.com/2009/09/09/gender-and-names-in-freebase/ > > A test sampling of ~1800 of these gender assignments has been loaded > into the main graph. The risk of error is low, since all of these > assignments were done for names which showed 100% concordance on at > least 100 exemplars, but we figured we'd still do a test load and > let folks take a gander before we loaded the full set of ~190K gender > assignments. > > If you're in the mood to take a look before we run the complete > load, you can see the first 500 of them with this ACRE app: > > http://gendercheck.zenkat.user.dev.freebaseapps.com/index > > Thanks, > Brian > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From kurt at spaceship.com Wed Sep 30 07:23:37 2009 From: kurt at spaceship.com (Kurt Bollacker) Date: Wed, 30 Sep 2009 00:23:37 -0700 Subject: [Data-modeling] test load: genders In-Reply-To: References: Message-ID: <20090930072337.GC8300@spaceship.com> On Wed, Sep 30, 2009 at 10:16:07AM +0400, Iain Sproat wrote: > The only anomaly I could spot was "Patriarch of Alexandria" - but that > was probably due to a previous incorrect typing as a person. I've now > detyped it as a person (and added the Religious Leadership Title > type). > > It's actually a fairly regular problem, where Religious titles, Noble > titles and Political titles are incorrectly typed as Person. If it's > possible, it might be worth manually checking names of people which > are very similar to names of titles. I think that while typing "Patriarch of Alexandria" as a person is wrong, giving the title the gender "male" is not, since this position has never been filled by a woman and is not expected to be in the future. In that case we would have a "Matriarch". Should this be a different property?
Kurt :-) From iainsproat at gmail.com Wed Sep 30 08:06:13 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Wed, 30 Sep 2009 12:06:13 +0400 Subject: [Data-modeling] test load: genders In-Reply-To: <20090930072337.GC8300@spaceship.com> References: <20090930072337.GC8300@spaceship.com> Message-ID: On Wed, Sep 30, 2009 at 11:23 AM, Kurt Bollacker wrote: > giving the title the gender "male" is not [...] > In that case we would have a "Matriarch". Should this be a > different property? In the Royalty domain, there is a Noble title gender equivalency relationship between titles, so the title "Duke of x" can be denoted as equivalent to the title "Duchess of x".
The actual gender associated with the title can be derived from the gender of the title holders. A similar sort of relationship might be appropriate for Religious titles. http://www.freebase.com/view/royalty/noble_title_gender_equivalency Iain From zenkat at metaweb.com Wed Sep 30 15:13:47 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Wed, 30 Sep 2009 08:13:47 -0700 Subject: [Data-modeling] test load: genders In-Reply-To: References: Message-ID: <11E626A0-0AE6-428A-ADCC-834BF9AE903E@metaweb.com> On Sep 29, 2009, at 11:16 PM, Iain Sproat wrote: > It's actually a fairly regular problem, where Religious titles, Noble > titles and Political titles are incorrectly typed as Person. If it's > possible, it might be worth manually checking names of people which > are very similar to names of titles. Well, we can certainly set up an Incompatible Types relationship between /people/person and each of /religion/religious_leadership_title, /royalty/noble_title and /government/government_office_or_title. This will help catch the cases where there are obvious co-type mismatches. Brian
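For anyone who wants to picture what an Incompatible Types check would catch, here is a minimal sketch in Python. It is an illustration only, not how Freebase implements the relationship: the INCOMPATIBLE_TYPES table and the find_cotype_mismatches function are hypothetical names, and only the type IDs come from the message above.

```python
# Hypothetical table of incompatible type pairs (not a Freebase API).
# Each entry says the two type IDs should not both appear on one topic.
INCOMPATIBLE_TYPES = {
    frozenset({"/people/person", "/religion/religious_leadership_title"}),
    frozenset({"/people/person", "/royalty/noble_title"}),
    frozenset({"/people/person", "/government/government_office_or_title"}),
}


def find_cotype_mismatches(topic_types):
    """Return the incompatible type pairs that are both present on a topic.

    topic_types is any iterable of type ID strings for a single topic.
    The result is a sorted list of (type_a, type_b) tuples.
    """
    present = set(topic_types)
    return sorted(
        tuple(sorted(pair))
        for pair in INCOMPATIBLE_TYPES
        if pair <= present  # both members of the pair are on the topic
    )


# A topic typed as both a person and a religious title is flagged.
mismatches = find_cotype_mismatches(
    ["/people/person", "/religion/religious_leadership_title"]
)
```

A topic like "Patriarch of Alexandria", while it was still typed as a person, would have come back with one flagged pair; a topic carrying only /people/person comes back with an empty list.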