From freebase-data at clark-communications.com Thu Oct 1 16:05:00 2009
From: freebase-data at clark-communications.com (Don Jackson)
Date: Thu, 1 Oct 2009 09:05:00 -0700
Subject: [Data-modeling] Cask Ale is a cuisine?
Message-ID: <05C1E5AA-EBB0-40DE-9696-89CE933EC648@clark-communications.com>

Cask Ale is considered a Cuisine within Freebase. This seems wrong to me.

http://www.freebase.com/view/en/cask_ale

May I change this?

Also, when I have future inquiries of this type, where best should I air them? To me, this is not really a "data modeling" issue, but a possible miscategorization of a specific entry.

From jon at metaweb.com Thu Oct 1 16:49:19 2009
From: jon at metaweb.com (Jon Reitsma)
Date: Thu, 1 Oct 2009 09:49:19 -0700 (PDT)
Subject: [Data-modeling] Cask Ale is a cuisine?
Message-ID: <646524058.139301254415759253.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>

Yeah, looks like a user noted it as a cuisine of a particular restaurant, so it picked up the type. I fixed it and typed it as a beer style, and yes, you should feel free to do the same. If discussion is needed, post it on the topic and perhaps cross-post to the domain or wherever is appropriate.

Thanks Don!

j

_______________________________________________
Data-modeling mailing list
Data-modeling at freebase.com
http://lists.freebase.com/mailman/listinfo/data-modeling

From tfmorris at gmail.com Thu Oct 1 17:15:56 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Thu, 1 Oct 2009 13:15:56 -0400
Subject: [Data-modeling] Cask Ale is a cuisine?
In-Reply-To: <646524058.139301254415759253.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
References: <646524058.139301254415759253.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID:

> If discussion is needed post it on the topic and perhaps cross post to the
> domain or wherever is appropriate.

And generally you're better off asking forgiveness than permission. If it's a simple change, like this, and you're pretty sure that what you're doing is OK, go ahead and do it and, if necessary, post a note in one of the discussion forums asking for confirmation/review. That way you won't be held up waiting for a domain administrator. It's easy enough to reverse changes that were made in error later if necessary.

Tom

From zenkat at metaweb.com Thu Oct 1 22:26:05 2009
From: zenkat at metaweb.com (Brian Karlak)
Date: Thu, 1 Oct 2009 15:26:05 -0700
Subject: [Data-modeling] [Freebase-experts] open library editions test load
In-Reply-To:
References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com>
Message-ID: <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com>

Hello All --

The second test load of 10K Open Library Editions has been completed. This load implements the blacklists described in the original email below, along with other miscellaneous cleanup.

For those of you who are interested in looking at the data, we've created a data game to help with paging through a random sample of the load:

http://book-edition-qa.vtalwar.user.dev.freebaseapps.com/queue?experts=1

When reviewing records, you may want to check:

Did the edition attach to the correct book?
If the edition is marked as "existed", did we match the correct Freebase edition?
Is the edition data from Open Library correct, according to the LOC or Amazon?

If you feel that any of the data is incorrect, you can click "No" to mark the record as suspect. Notes on why you clicked "No" (or "Yes", for that matter) are most appreciated -- you can type them in the box under the buttons. Of course, feel free to bring issues up for discussion on this list!

Finally, please note that this load is only of edition information: we did not attempt to fix the /book/book and /book/author topics they are attached to. This means that the known problems with the earlier Open Library loads (missing first word of titles, punctuation in author names, etc.) are not addressed. Once we have all of the editions, we'll work on this cleanup.

As always, your feedback is appreciated!

Thanks,
Brian

PS -- This load was run under attribution node http://www.freebase.com/tools/explore/user/book_bot/attr/31 if you want to query for the data.

On Sep 21, 2009, at 1:40 PM, Brian Karlak wrote:

> Thanks for your feedback on this test load -- it is definitely
> appreciated!
>
> Based upon your feedback, we're going to make the following changes
> to the load process:
>
> 1) Create a blacklist of subjects. Do not load editions for
> matching OL entries, and mark the books for deletion:
>
>    Computer Software Packages
>    Gifts
>    Novelty
>    Blank Books/Journals
>    Calendar[s] *
>
> 2) Create a blacklist of formats. Again, do not load / mark for
> delete:
>
>    Calendar
>    Stationary
>
> 3) Offline gardening task: Delete everything by "Pet Prints, Inc.".
>
> 4) Offline gardening task: delete trailing punctuation from Open
> Library author names.
>
> Interesting, but probably not implementable in the next 10K load:
>
> 5) Search for books with >50% non-English words that are not marked
> as being for a foreign language, delete from load.
>
> A test load of 10K editions should be coming down the pike shortly.
> We'll let you know before we start the load.
>
> Brian
>
> On Sep 17, 2009, at 5:31 PM, Brian Karlak wrote:
>
>> Hello All --
>>
>> We hinted a few weeks back that we're trying a new process for
>> loading massive data sets. Instead of doing a single huge data load
>> (and letting everyone know about it afterwards :-), we're doing
>> incrementally larger loads, systematically QA'ing them, and notifying
>> the general Freebase community after each load so that we can get
>> feedback on potential problems.
>>
>> So ...
>>
>> We've just loaded the first test set of 144 OpenLibrary editions to
>> sandbox. This test set came from sampling 1000 editions from the
>> entire OL corpus, and taking those that matched titles and authors
>> from books in our July OpenLibrary book load. This is the first tiny
>> dribble of what we hope will ultimately be a load of 2.5M editions.
>>
>> The ISBN nodes for these books can be found in the "Links Created"
>> section of this page:
>>
>> http://www.sandbox-freebase.com/tools/explore/user/book_bot/attr/27?limit=200
>>
>> (Yes, we know it's not the most beautiful display ... we're looking
>> at prettier ways of showing this. However, once you click one of the
>> /soft/isbn/ keys, you'll be back in Freebase proper, on an ISBN node.
>> From this node, you can follow back to book edition, and then book.)
>>
>> Our initial QC of this load has already pointed out some areas for
>> improvement. For instance, Open Library contains some non-book items
>> like blank journals, audio cassettes and art prints. Future loads
>> will filter these out by creating a blacklist of forbidden subjects
>> and formats ("stationary", "audio cassette", "gift", etc.) We'll
>> also probably end up deleting the books in these categories as well.
>>
>> Any feedback you might have is definitely appreciated!
>>
>> Thanks,
>> Brian
>>
>> _______________________________________________
>> Freebase-experts mailing list
>> Freebase-experts at freebase.com
>> http://lists.freebase.com/mailman/listinfo/freebase-experts

From jeff at metaweb.com Fri Oct 2 21:11:53 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Fri, 2 Oct 2009 14:11:53 -0700
Subject: [Data-modeling] Deleting "Film Crew Role" type
Message-ID:

Crosspost from dev list:

The type /film/film_crew_role has no instances and no properties, and is completely redundant with another type, /film/film_job (which does have instances and a property). We'd like to delete the type Film Crew Role. If, for some reason, deleting this type will break anything for you, please let us know.

https://bugs.freebase.com/browse/DA-950

Jeff Prucher
Type Librarian & Ontologist
Metaweb Technologies, Inc.

From tyler at metaweb.com Mon Oct 5 19:25:28 2009
From: tyler at metaweb.com (Tyler Pirtle)
Date: Mon, 05 Oct 2009 12:25:28 -0700
Subject: [Data-modeling] vocal range on music artists
Message-ID: <4ACA4828.90707@metaweb.com>

Can we please defenestrate this property?

It's got 77 instances, and it's also completely irrelevant for most musical acts, as I'm sure that about 99% of the music artists in Freebase don't even know their own vocal ranges.

I say we take the 77 instances, make them their own types or something. Opera singers, sure, very crucial information. Why not move them into their own type? "Opera Singer". Done. Vocal Range? Sure.

(bad example) Does Green Day have a "vocal range"? Even if they did, would anyone ever care?
There are far fewer opera singers than there are other bands, and I'm simply tired of looking at properties knowing that they'll never be filled out in their lifetime.

Thoughts?

T

From crism at maden.org Mon Oct 5 19:50:32 2009
From: crism at maden.org (Christopher R. Maden)
Date: Mon, 05 Oct 2009 15:50:32 -0400
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA4828.90707@metaweb.com>
References: <4ACA4828.90707@metaweb.com>
Message-ID: <4ACA4E08.1050705@maden.org>

Tyler Pirtle wrote:
> Why not move them into their own type? "Opera Singer". Done. Vocal
> Range? Sure.

Opera Singer is inappropriate for this; moving a property to a type just because the type is a near match and already exists is not a good strategy.

> (bad example) Does Green Day have a "vocal range"? Even if they did,
> would anyone ever care?

Analogously, what instruments does Green Day play?

> Thoughts?

The problem here is the conflation of Musical Artist with Musician. There really should be a separation; Musical Artist is the generic, and can be an individual or a band. A Musician can play instruments or have a vocal range. (We could instead create Instrumentalist and Vocalist, each with one property, but I think that's unnecessarily complicated.)

Thoughts?

~Chris
--
Chris Maden, text nerd
"What a dream life would seem if you could see the world from
inside an Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine"
GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 0077 C319

From tyler at metaweb.com Mon Oct 5 20:30:51 2009
From: tyler at metaweb.com (Tyler Pirtle)
Date: Mon, 05 Oct 2009 13:30:51 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA4E08.1050705@maden.org>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org>
Message-ID: <4ACA577B.8070403@metaweb.com>

Christopher R. Maden wrote:

Mr. Maden! Good to hear from you. I hope all's well. I left you comments below.

> Tyler Pirtle wrote:
>> Why not move them into their own type? "Opera Singer". Done. Vocal
>> Range? Sure.
>
> Opera Singer is inappropriate for this; moving a property to a type just
> because the type is a near match and already exists is not a good strategy.
>
>> (bad example) Does Green Day have a "vocal range"? Even if they did,
>> would anyone ever care?
>
> Analogously, what instruments does Green Day play?

Understood.

>> Thoughts?
>
> The problem here is the conflation of Musical Artist with Musician.
> There really should be a separation; Musical Artist is the generic, and
> can be an individual or a band. A Musician can play instruments or have
> a vocal range. (We could instead create Instrumentalist and Vocalist,
> each with one property, but I think that's unnecessarily complicated.)

Agreed, but in either case it seems simplest to me to abandon this property wholesale. Like you said, it doesn't make any sense. So, let's stop trying to find somewhere to put it and chuck it until we develop some strong need for it.

Your suggestion for a 'musician' or 'instrumentalist' type is well heard, and yes, if the data existed and we had easy access to all of it, then sure, I'd be all for this. But that data escapes me every time I try to find it. I literally cannot find information relating not only to 'what this guy plays' down to 'who played on what track' in a performance. Amazing and interesting data? Certainly. Easy to get a hold of? Impossible. It leads me to believe that anything much outside a 'role' of a person in a musical group is superfluous.

These kinds of properties also just make me sad; they're huge sinkholes in the music data. In the spirit of tightening it up a bit, I propose we just start chucking these kinds of properties until we find a need for them.

T

From crism at maden.org Mon Oct 5 20:57:39 2009
From: crism at maden.org (Christopher R. Maden)
Date: Mon, 05 Oct 2009 16:57:39 -0400
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA577B.8070403@metaweb.com>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com>
Message-ID: <4ACA5DC3.6070804@maden.org>

Tyler Pirtle wrote:
> Your suggestion for a 'musician' or 'instrumentalist' type is well heard
> and yes if the data existed and we had easy access to all of it then
> sure, I'd be all for this. But that data escapes me every time I try to
> find it. I literally cannot find information relating not only to 'what
> this guy plays' down to 'who played on what track' in a performance.
> Amazing and interesting data? Certainly. Easy to get a hold of? Impossible.
> It leads me to believe that anything much outside a 'role' of a person
> in a musical group is superfluous.

I disagree. Wikipedia infoboxen are full of instruments played and vocal ranges; those were the inspiration for these properties in the first place. And MusicBrainz is rife with specific performance contributions. We just aren't harvesting the information aggressively enough.

~Chris
--
Chris Maden, text nerd
"What a dream life would seem if you could see the world from
inside an Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine"
GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 0077 C319

From tyler at metaweb.com Mon Oct 5 21:34:12 2009
From: tyler at metaweb.com (Tyler Pirtle)
Date: Mon, 05 Oct 2009 14:34:12 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA5DC3.6070804@maden.org>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org>
Message-ID: <4ACA6654.60900@metaweb.com>

Christopher R. Maden wrote:
> I disagree. Wikipedia infoboxen are full of instruments played and
> vocal ranges; those were the inspiration for these properties in the
> first place. And MusicBrainz is rife with specific performance
> contributions. We just aren't harvesting the information aggressively
> enough.
>
> ~Chris

Hold on tiger. 'Role' is fine for a person playing a guitar, etc. I'm including that; I think that should stay. Vocal Ranges, though, are kind of pointless IMO, for all musical genres except maybe Opera. Same deal for 'instruments played'. As you've pointed out, this is a better fit for the particular member of the group.

And I'm totally agreeing with you on the second point: anything we can do to rip specific performance information out of musicbrainz would be _amazing_. Amazing. Like game-changing amazing.

In summary: Of course, keep role for instruments played, etc., but we don't need a vocal range. Not for an artist. While we're at it, as you pointed out, we can also get rid of Instruments Played for the same reasons.

Again, not to belabor this further: simplify.

If we don't need the properties, let's get rid of them. Slim it down a touch. Streamline. If we want it back, it's quite simple, we add it back. But we don't need it, and it doesn't really make much sense.

From crism at maden.org Mon Oct 5 21:49:10 2009
From: crism at maden.org (Christopher R. Maden)
Date: Mon, 05 Oct 2009 17:49:10 -0400
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA6654.60900@metaweb.com>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com>
Message-ID: <4ACA69D6.2010704@maden.org>

Tyler Pirtle wrote:
> Hold on tiger. 'Role' is fine for a person playing a guitar, etc., I'm
> including that, I think that should stay. Vocal Ranges, though, kind of
> pointless IMO, for all musical genres except maybe Opera. Same deal for
> 'instruments played'. As you've pointed out this is better fit for the
> particular member of the group.

The Wikipedia community seems to think it's interesting.

> And I'm totally agreeing with you on the second point, anything we can
> do to rip specific performance information out of musicbrainz would be
> _amazing_. Amazing. Like game-changing amazing.

The code is in Metaweb's subversion somewhere.

> In summary: Of course, keep role for instruments played, etc., but we
> don't need a vocal range. Not for an artist. While we're at it, as you
> pointed out, we can also get rid of Instruments Played for the same
> reasons.

I don't understand the problem that these properties present. They are partly populated. If they don't make sense on the current type, move them to another type. Why nuke the information that's already present?

~Chris
--
Chris Maden, text nerd
"What a dream life would seem if you could see the world from
inside an Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine"
GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 0077 C319

From gordon at metaweb.com Mon Oct 5 21:52:16 2009
From: gordon at metaweb.com (Gordon Mackenzie)
Date: Mon, 5 Oct 2009 14:52:16 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA6654.60900@metaweb.com>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com>
Message-ID: <679A1304-6270-4D61-8B78-1ECA131B6BF1@metaweb.com>

Well, I see data for singers in musicals about their vocal range.

The woman from Wicked and in Pushing Daisies, Kristin Chenoweth: 4 Octaves!

"Chenoweth is a classically trained coloratura soprano, and well known for her skilled singing technique and artistic interpretations. She has a vocal range of four octaves.[3] Chenoweth is able to sing the note "F6" (1396.913Hz), also known as "F above High C".[4]"

What about Mariah Carey, Cyndi Lauper, Mahalia Jackson? Interested minds want to know.

~ Gordon
<<< gordon at metaweb.com >>>

From micah at metaweb.com Mon Oct 5 21:58:44 2009
From: micah at metaweb.com (Micah Saul)
Date: Mon, 5 Oct 2009 14:58:44 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <4ACA69D6.2010704@maden.org>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org>
Message-ID: <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com>

This comes down to an issue I've had with the music schema all along.

Musical Artist vs. Musical Group vs. Musical Group Member:

There are some properties on Musical Artist that apply to both bands and people. Some apply to only people. We created Musical Group to capture those properties that only apply to bands; why have we not created Musical Person (or repurposed Musical Group Member) to capture only the properties relevant to people? And we could then make Musical Person and Musical Group incompatible types, which would prevent all these bad merges between bands and their members (see Tom Petty or Justin Vernon/Bon Iver).

Plus, this might make Tyler happy. Vocal Range and Instruments Played make a lot more sense when you can't have an assertion for the Vocal Range of Metallica.

m

From jeff at metaweb.com Mon Oct 5 22:06:16 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Mon, 5 Oct 2009 15:06:16 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com>
Message-ID: <8E3351A81A4446098B398620D097FCB1@p4>

> -----Original Message-----
> From: data-modeling-bounces at freebase.com
> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul
>
> There are some properties on Musical Artist that apply to
> both bands and people. Some apply to only people. We created
> Musical Group to capture those properties that only apply to
> bands, why have we not created Musical Person (or repurposed
> Musical Group Member) to capture only the properties relevant
> to people? And we could then make Musical Person and Musical
> Group incompatible types which would prevent all these bad
> merges between bands and their members (see Tom Petty or
> Justin Vernon/Bon Iver).

I think this is a good idea. We could call it "Musician" rather than "Musical Person".

> Plus, this might make Tyler happy.

Wait, when did that become a priority?

Jeff

From tyler at metaweb.com Mon Oct 5 22:04:37 2009
From: tyler at metaweb.com (Tyler Pirtle)
Date: Mon, 05 Oct 2009 15:04:37 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com>
Message-ID: <4ACA6D75.30004@metaweb.com>

Micah Saul wrote:
> Plus, this might make Tyler happy. Vocal Range and Instruments Played
> make a lot more sense when you can't have an assertion for the Vocal
> Range of Metallica.
>
> m

Yes please and thank you.

From micah at metaweb.com Mon Oct 5 22:50:56 2009
From: micah at metaweb.com (Micah Saul)
Date: Mon, 5 Oct 2009 15:50:56 -0700
Subject: [Data-modeling] vocal range on music artists
In-Reply-To: <8E3351A81A4446098B398620D097FCB1@p4>
References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <8E3351A81A4446098B398620D097FCB1@p4>
Message-ID: <44A1105A-B1C9-4943-957B-0045F578F350@metaweb.com>

... Yes, "Musician", that's the word I was looking for.

So, aside from Instruments Played and Vocal Range, are there any other properties on Musical Artist that are Musician-specific?

Also, I'd propose that Musical Group Member be rolled into Musician, as well. That would give Musician three properties: Instruments Played, Vocal Range, and Member of.

There is, however, a lot of bad data that needs to be cleaned up if we choose to do that second part. A quick search brings up a lot of spurious Musical Groups with names like "Queen & David Bowie" to capture the Group that performed "Under Pressure." Clearly, that song was performed by two musical artists, "Queen" and "David Bowie." This is a fairly prevalent issue stemming from the old Musicbrainz loads. Any ideas on how to resolve that?
m On Oct 5, 2009, at 3:06 PM, Jeff Prucher wrote: > > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul >> >> There are some properties on Musical Artist that apply to >> both bands and people. Some apply to only people. We created >> Musical Group to capture those properties that only apply to >> bands, why have we not created Musical Person (or repurposed >> Musical Group Member) to capture only the properties relevant >> to people? And we could then make Musical Person and Musical >> Group incompatible types which would prevent all these bad >> merges between bands and their members (see Tom Petty or >> Justin Vernon/Bon Iver). > > I think this is a good idea. We could call it "Musician" rather than > "Musical Person". > >> Plus, this might make Tyler happy. > > Wait, when did that become a priority? > > Jeff > >> Vocal Range and >> Instruments Played make a lot more sense when you can't have >> an assertion for the Vocal Range of Metallica. >> >> m >> >> >> On Oct 5, 2009, at 2:49 PM, Christopher R. Maden wrote: >> >>> Tyler Pirtle wrote: >>>> Hold on tiger. 'Role' is fine for a person playing a guitar, etc., >>>> I'm including that, I think that should stay. Vocal >> Ranges, though, >>>> kind of pointless IMO, for all musical genres except maybe Opera. >>>> Same deal for 'instruments played'. >>>> As you've pointed >>>> out this is better fit for the particular member of the group. >>> >>> The Wikipedia community seems to think it's interesting. >>> >>>> And I'm totally agreeing with you on the second point, anything we >>>> can do to rip specific performance information out of musicbrainz >>>> would be _amazing_. >>>> Amazing. >>>> Like game-changing amazing. >>> >>> The code is in Metaweb's subversion somewhere. >>> >>>> In summary: Of course, keep role for instruments played, >> etc., but we >>>> don't need a vocal range. >>>> Not for an artist. 
While we're at it, as you pointed out, >> we can also >>>> get rid of Instruments Played for the same reasons. >>> >>> I don't understand the problem that these properties present. They >>> are partly populated. If they don't make sense on the >> current type, >>> move them to another type. Why nuke the information that's already >>> present? >>> >>> ~Chris >>> -- >>> Chris Maden, text nerd >> "What a dream >>> life would seem if you could see the world from inside an >>> Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine" >>> GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 >> 0077 C319 >>> _______________________________________________ >>> Data-modeling mailing list >>> Data-modeling at freebase.com >>> http://lists.freebase.com/mailman/listinfo/data-modeling >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling >> > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Mon Oct 5 23:09:35 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Mon, 5 Oct 2009 16:09:35 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <44A1105A-B1C9-4943-957B-0045F578F350@metaweb.com> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org><4ACA6654.60900@metaweb.com><4ACA69D6.2010704@maden.org><79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com><8E3351A81A4446098B398620D097FCB1@p4> <44A1105A-B1C9-4943-957B-0045F578F350@metaweb.com> Message-ID: <8F026C5D915440B4AF5E0361D20980BF@p4> > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul > Sent: Monday, October 05, 2009 3:51 PM > To: Freebase data modeling mailing list > Subject: Re: 
[Data-modeling] vocal range on music artists > > ... Yes, "Musician", that's the word I was looking for. > > So, aside from Instruments Played and Vocal Range, are there > any other properties on Musical Artist that are Musician specific? > > Also, I'd propose that Musical Group Member be rolled into > Musician, as well. That would give Musician three properties: > Instruments Played, Vocal Range, and Member of. > > There is, however, a lot of bad data that needs to be cleaned > up if we choose to do that second part. A quick search brings > up a lot of spurious Musical Groups with names like "Queen & > David Bowie" to capture the Group that performed "Under > Pressure." Clearly, that song was performed by two musical > artists, "Queen" and "David Bowie." This is a fairly > prevalent issue stemming from the old Musicbrainz loads. > Any ideas on how to resolve that? Musicbrainz does (now) distinguish these types of things as collaborations (see http://wiki.musicbrainz.org/Collaboration_Relationship_Type), so it should theoretically be possible to garden these based on the MusicBrainz data. Jeff > m > > On Oct 5, 2009, at 3:06 PM, Jeff Prucher wrote: > > > > > > >> -----Original Message----- > >> From: data-modeling-bounces at freebase.com > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul > >> > >> There are some properties on Musical Artist that apply to > both bands > >> and people. Some apply to only people. We created Musical Group to > >> capture those properties that only apply to bands, why have we not > >> created Musical Person (or repurposed Musical Group Member) to > >> capture only the properties relevant to people? And we could then > >> make Musical Person and Musical Group incompatible types > which would > >> prevent all these bad merges between bands and their > members (see Tom > >> Petty or Justin Vernon/Bon Iver). > > > > I think this is a good idea. We could call it "Musician" > rather than > > "Musical Person". 
> > > >> Plus, this might make Tyler happy. > > > > Wait, when did that become a priority? > > > > Jeff > > > >> Vocal Range and > >> Instruments Played make a lot more sense when you can't have an > >> assertion for the Vocal Range of Metallica. > >> > >> m > >> > >> > >> On Oct 5, 2009, at 2:49 PM, Christopher R. Maden wrote: > >> > >>> Tyler Pirtle wrote: > >>>> Hold on tiger. 'Role' is fine for a person playing a > guitar, etc., > >>>> I'm including that, I think that should stay. Vocal > >> Ranges, though, > >>>> kind of pointless IMO, for all musical genres except maybe Opera. > >>>> Same deal for 'instruments played'. > >>>> As you've pointed > >>>> out this is better fit for the particular member of the group. > >>> > >>> The Wikipedia community seems to think it's interesting. > >>> > >>>> And I'm totally agreeing with you on the second point, > anything we > >>>> can do to rip specific performance information out of > musicbrainz > >>>> would be _amazing_. > >>>> Amazing. > >>>> Like game-changing amazing. > >>> > >>> The code is in Metaweb's subversion somewhere. > >>> > >>>> In summary: Of course, keep role for instruments played, > >> etc., but we > >>>> don't need a vocal range. > >>>> Not for an artist. While we're at it, as you pointed out, > >> we can also > >>>> get rid of Instruments Played for the same reasons. > >>> > >>> I don't understand the problem that these properties > present. They > >>> are partly populated. If they don't make sense on the > >> current type, > >>> move them to another type. Why nuke the information > that's already > >>> present? > >>> > >>> ~Chris > >>> -- > >>> Chris Maden, text nerd > >> "What a dream > >>> life would seem if you could see the world from inside an > >>> Etch-A-Sketch." 
- Andrew Bird, "Tea & Thorazine" > >>> GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 > >> 0077 C319 > >>> _______________________________________________ > >>> Data-modeling mailing list > >>> Data-modeling at freebase.com > >>> http://lists.freebase.com/mailman/listinfo/data-modeling > >> > >> _______________________________________________ > >> Data-modeling mailing list > >> Data-modeling at freebase.com > >> http://lists.freebase.com/mailman/listinfo/data-modeling > >> > > > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From kurt at spaceship.com Mon Oct 5 23:25:31 2009 From: kurt at spaceship.com (Kurt Bollacker) Date: Mon, 5 Oct 2009 16:25:31 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <8F026C5D915440B4AF5E0361D20980BF@p4> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <44A1105A-B1C9-4943-957B-0045F578F350@metaweb.com> <8F026C5D915440B4AF5E0361D20980BF@p4> Message-ID: <20091005232531.GK8300@spaceship.com> On Mon, Oct 05, 2009 at 04:09:35PM -0700, Jeff Prucher wrote: > > > > -----Original Message----- > > From: data-modeling-bounces at freebase.com > > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul > > Sent: Monday, October 05, 2009 3:51 PM > > To: Freebase data modeling mailing list > > Subject: Re: [Data-modeling] vocal range on music artists > > > > ... Yes, "Musician", that's the word I was looking for. > > > > So, aside from Instruments Played and Vocal Range, are there > > any other properties on Musical Artist that are Musician specific? > > > > Also, I'd propose that Musical Group Member be rolled into > > Musician, as well. 
That would give Musician three properties: > > Instruments Played, Vocal Range, and Member of. > > > > There is, however, a lot of bad data that needs to be cleaned > > up if we choose to do that second part. A quick search brings > > up a lot of spurious Musical Groups with names like "Queen & > > David Bowie" to capture the Group that performed "Under > > Pressure." Clearly, that song was performed by two musical > > artists, "Queen" and "David Bowie." This is a fairly > > prevalent issue stemming from the old Musicbrainz loads. > > Any ideas on how to resolve that? > > Musicbrainz does (now) distinguish these types of things as collaborations > (see http://wiki.musicbrainz.org/Collaboration_Relationship_Type), so it > should theoretically be possible to garden these based on the MusicBrainz > data. If there is an authority that can distinguish a collaboration from a musical group, then great, but in the real world, it's not always clear. How frequently does a collaboration have to happen before it's treated as a group? What is the cost of conflation? Also, there should always be an expectation that when trying to find a group or collaboration, that they may be typed as one rather than the other. If the music schema is actually refactored, I'd prefer that "collaboration" simply be a boolean property or cotype on "music group" rather than a mutually exclusive typing. Would that break anything or create bad data? Kurt :-) > Jeff > > > m > > > > On Oct 5, 2009, at 3:06 PM, Jeff Prucher wrote: > > > > > > > > > > >> -----Original Message----- > > >> From: data-modeling-bounces at freebase.com > > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Micah Saul > > >> > > >> There are some properties on Musical Artist that apply to > > both bands > > >> and people. Some apply to only people. 
We created Musical Group to > > >> capture those properties that only apply to bands, why have we not > > >> created Musical Person (or repurposed Musical Group Member) to > > >> capture only the properties relevant to people? And we could then > > >> make Musical Person and Musical Group incompatible types > > which would > > >> prevent all these bad merges between bands and their > > members (see Tom > > >> Petty or Justin Vernon/Bon Iver). > > > > > > I think this is a good idea. We could call it "Musician" > > rather than > > > "Musical Person". > > > > > >> Plus, this might make Tyler happy. > > > > > > Wait, when did that become a priority? > > > > > > Jeff > > > > > >> Vocal Range and > > >> Instruments Played make a lot more sense when you can't have an > > >> assertion for the Vocal Range of Metallica. > > >> > > >> m > > >> > > >> > > >> On Oct 5, 2009, at 2:49 PM, Christopher R. Maden wrote: > > >> > > >>> Tyler Pirtle wrote: > > >>>> Hold on tiger. 'Role' is fine for a person playing a > > guitar, etc., > > >>>> I'm including that, I think that should stay. Vocal > > >> Ranges, though, > > >>>> kind of pointless IMO, for all musical genres except maybe Opera. > > >>>> Same deal for 'instruments played'. > > >>>> As you've pointed > > >>>> out this is better fit for the particular member of the group. > > >>> > > >>> The Wikipedia community seems to think it's interesting. > > >>> > > >>>> And I'm totally agreeing with you on the second point, > > anything we > > >>>> can do to rip specific performance information out of > > musicbrainz > > >>>> would be _amazing_. > > >>>> Amazing. > > >>>> Like game-changing amazing. > > >>> > > >>> The code is in Metaweb's subversion somewhere. > > >>> > > >>>> In summary: Of course, keep role for instruments played, > > >> etc., but we > > >>>> don't need a vocal range. > > >>>> Not for an artist. While we're at it, as you pointed out, > > >> we can also > > >>>> get rid of Instruments Played for the same reasons. 
> > >>> > > >>> I don't understand the problem that these properties > > present. They > > >>> are partly populated. If they don't make sense on the > > >> current type, > > >>> move them to another type. Why nuke the information > > that's already > > >>> present? > > >>> > > >>> ~Chris > > >>> -- > > >>> Chris Maden, text nerd > > >> "What a dream > > >>> life would seem if you could see the world from inside an > > >>> Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine" > > >>> GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 > > >> 0077 C319 > > >>> _______________________________________________ > > >>> Data-modeling mailing list > > >>> Data-modeling at freebase.com > > >>> http://lists.freebase.com/mailman/listinfo/data-modeling > > >> > > >> _______________________________________________ > > >> Data-modeling mailing list > > >> Data-modeling at freebase.com > > >> http://lists.freebase.com/mailman/listinfo/data-modeling > > >> > > > > > > _______________________________________________ > > > Data-modeling mailing list > > > Data-modeling at freebase.com > > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From crism at maden.org Tue Oct 6 02:29:38 2009 From: crism at maden.org (Christopher R. 
Maden) Date: Mon, 05 Oct 2009 22:29:38 -0400 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> Message-ID: <4ACAAB92.5060604@maden.org> Micah Saul wrote: > This comes down to an issue I've had with the music schema all along. > > Musical Artist vs. Musical Group vs. Musical Group Member: I agree with this split. Jeff, can you open a gardening task to do it? As a historical note, any time there is something weird like this in Music, it can be explained by two things: 1) Trying to parallel the MusicBrainz model to ease import. 2) Deciding to conserve primitives in the first load, which meant dropping a lot of interesting properties and not populating others. > There is, however, a lot of bad data that needs to be cleaned up if we > choose to do that second part. A quick search brings up a lot of > spurious Musical Groups with names like "Queen & David Bowie" to > capture the Group that performed "Under Pressure." Clearly, that song > was performed by two musical artists, "Queen" and "David Bowie." This > is a fairly prevalent issue stemming from the old Musicbrainz loads. > Any ideas on how to resolve that? I believe there is an open Jira task on this; I started to address this at one point. The problem is that MusicBrainz, though it distinguishes between collaborations and groups, still has a single entry for *the* artist of "Under Pressure." If we drop the topic for "Queen & David Bowie," we lose the ability to correlate anything in Freebase with the corresponding MusicBrainz entry. We also potentially complicate any future re-import from MusicBrainz.
Since Metaweb seems uninterested in continuing an ongoing correlation with MusicBrainz, throwing out the MusicBrainz link for collaborations is probably the right thing to do. ~Chris -- Chris Maden, text nerd "What a dream life would seem if you could see the world from inside an Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine" GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 0077 C319 From zenkat at metaweb.com Tue Oct 6 05:08:35 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Mon, 5 Oct 2009 22:08:35 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <4ACAAB92.5060604@maden.org> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org> Message-ID: On Oct 5, 2009, at 7:29 PM, Christopher R. Maden wrote: > The problem is that MusicBrainz, though it distinguishes > between collaborations and groups, still has a single entry for *the* > artist of "Under Pressure." If we drop the topic for "Queen & David > Bowie," we lose the ability to correlate anything in Freebase with the > corresponding MusicBrainz entry. We also potentially complicate any > future re-import from MusicBrainz. I believe this is only true of MusicBrainz's old schema. The Next Generation Schema (http://wiki.musicbrainz.org/NGS) addresses this issue by creating "Artist Credit" and "Release Group" entities to mediate between groups and releases. "Queen & David Bowie" is given as a specific example in the NGS docs.
It is modeled with: -- Two artist records, one for "Queen" and one for "David Bowie" -- One artist_credit record for their collaboration -- Two join records between the artist_credit and each artist to note the name the artist performed under (and other join phrases, like "&") -- One release_group record for "Under Pressure" So MusicBrainz no longer has a single artist entry for "Queen & David Bowie" (unless you count the artist_credit mediator). However, it does beg the question: how should we best model these relationships in Freebase? As it stands, the /music/track --> /music/artist relationship is unique, so even if MusicBrainz no longer says there has to be a single entity responsible for recording a track, we still do. > Since Metaweb seems uninterested in continuing an ongoing correlation > with MusicBrainz, throwing out the MusicBrainz link for collaborations > is probably the right thing to do. Not at all! Synchronizing with MusicBrainz is a high priority for Metaweb. We're actively working on creating an updated triples-based pipeline for the Next Generation Schema. (Updates soon -- honest!) Part of the deployment of the NGS pipeline will be a review & cleanup of any old MusicBrainz data that is crufty, misreconciled, or otherwise dirty. I would prefer that we do this as a unified effort, instead of as a series of ad hoc actions. We'll definitely bounce our findings off this list. Brian From crism at maden.org Tue Oct 6 05:20:08 2009 From: crism at maden.org (Christopher R.
Maden) Date: Tue, 06 Oct 2009 01:20:08 -0400 Subject: [Data-modeling] vocal range on music artists In-Reply-To: References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org> Message-ID: <4ACAD388.7030500@maden.org> Brian Karlak wrote: > I believe this is only true of MusicBrainz's old schema. The Next > Generation Schema (http://wiki.musicbrainz.org/NGS) addresses this > issue by creating "Artist Credit" and "Release Group" entities to > mediate between groups and releases. Holy cow! I helped design NGS in London, over two years ago, but I had given up on it ever being implemented. Rock on! > So MusicBrainz no longer has a single artist entry for "Queen & David > Bowie" (unless you count the artist_credit mediator). The artist_credit would be analogous to our "credited as" property on Creative Work. Time to re-open that old Jira ticket. > However, it does beg the question: how should we best model these > relationships in Freebase? As it stands, the /music/track --> /music/ > artist relationship is unique, so even if MusicBrainz no longer says > there has to be a single entity responsible for recording a track, we > still do. It shouldn't be unique. It hasn't been on Album for a while, and shouldn't be on Track, either. > Not at all! Synchronizing with MusicBrainz is a high priority for > Metaweb. We're actively working on creating an updated triples-based > pipeline for the Next Generation Schema. (Updates soon -- honest!) That is a delightfully pleasant change. > Part of the deployment of the NGS pipeline will be a review & cleanup > of any old MusicBrainz data that is crufty, misreconciled, or > otherwise dirty. I would prefer that we do this as a unified effort, > instead of as a series of ad hoc actions. We'll definitely bounce our > findings off this list.
Keep me in the loop, please. I am emotionally invested in both sides of this operation. ~Chris -- Chris Maden, text nerd "What a dream life would seem if you could see the world from inside an Etch-A-Sketch." - Andrew Bird, "Tea & Thorazine" GnuPG Fingerprint: C6E4 E2A9 C9F8 71AC 9724 CAA3 19F8 6677 0077 C319 From zenkat at metaweb.com Tue Oct 6 05:37:38 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Mon, 5 Oct 2009 22:37:38 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <4ACAD388.7030500@maden.org> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org> <4ACAD388.7030500@maden.org> Message-ID: <506CE8E9-2DE3-4895-9D83-05286A00BB2B@metaweb.com> On Oct 5, 2009, at 10:20 PM, Christopher R. Maden wrote: >> I believe this is only true of MusicBrainz's old schema. The Next >> Generation Schema (http://wiki.musicbrainz.org/NGS) addresses this >> issue by creating "Artist Credit" and "Release Group" entities to >> mediate between groups and releases. > > Holy cow! I helped design NGS in London, over two years ago, but I > had > given up on it ever being implemented. Rock on! I believe that MusicBrainz's collaboration with the Beeb helped catalyze things. So you know, Robert Kaye definitely credited your input as critical for the design of the NGS. I believe you'll find many of the concepts you proposed in the final schema. >> So MusicBrainz no longer has a single artist entry for "Queen & David >> Bowie" (unless you count the artist_credit mediator). > The artist_credit would be analogous to our "credited as" property on > Creative Work. > >> As it stands, the /music/track --> /music/ artist relationship is >> unique > It shouldn't be unique. It hasn't been on Album for a while, and > shouldn't be on Track, either.
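A minimal sketch of the artist-credit idea discussed above, assuming a mediator that holds an ordered list of (artist, credited name, join phrase) records. The class and field names here are illustrative only; they are not the actual MusicBrainz NGS tables or Freebase property ids.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Artist:
    name: str


@dataclass
class CreditedName:
    """Join record: one artist, the name used, and the phrase that follows it."""
    artist: Artist
    credited_as: str
    join_phrase: str = ""


@dataclass
class ArtistCredit:
    """Mediator between a track or release and one or more artists."""
    names: List[CreditedName]

    def display(self) -> str:
        # Concatenate each credited name with its trailing join phrase.
        return "".join(n.credited_as + n.join_phrase for n in self.names)

    def artists(self) -> List[Artist]:
        return [n.artist for n in self.names]


queen = Artist("Queen")
bowie = Artist("David Bowie")
credit = ArtistCredit([
    CreditedName(queen, "Queen", " & "),
    CreditedName(bowie, "David Bowie"),
])
print(credit.display())
print([a.name for a in credit.artists()])
```

With this shape, a track points at two separate artists (so the relationship is non-unique), while the combined string survives only as a display credit rather than as a disposable group topic.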
Yes, the combination of a non-unique artist plus a credited_as string property gets us most of the way there. The only potential hiccup I see is with keeping track of reconciliation; there's not a good place to keep track of the NGS artist_credit mediator relationships. That said, we may not need to manage the recon info for that mediator in the graph; we may be able to handle it offline. I need to ponder ... > Time to re-open that old Jira ticket. Actually ... one thing I'd rather not do is handle the MusicBrainz NGS revamp as a series of JIRA tickets. I'd rather take a bit more of a holistic view when putting it together. So, for instance, while I think the Musician split is a good idea, we should probably bundle it in with the rest of the changes that may be coming down the pike ... That said, we should definitely keep the open JIRA tickets in mind, to make sure we *have* a holistic view of all of the former hotspots we've discussed! Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091005/22dbba0a/attachment-0001.htm From philip-freebase at shadowmagic.org.uk Wed Oct 7 12:51:17 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Wed, 7 Oct 2009 13:51:17 +0100 Subject: [Data-modeling] Data game of the day: IATA codes Message-ID: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> Add IATA codes to airports (or merge with existing topics[1]): http://airportcoder.freebaseapps.com/ Let's clear this lot out by this time tomorrow :-) Cheers, Phil [1] flag for merge currently only very lightly tested. 
Please let me know of any problems -- Philip Kendall http://www.shadowmagic.org.uk/ From philip-freebase at shadowmagic.org.uk Wed Oct 7 14:09:44 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Wed, 7 Oct 2009 15:09:44 +0100 Subject: [Data-modeling] Books that aren't Message-ID: <20091007140944.GY31466@sphinx.int.mythic-beasts.com> What's the best procedure when finding a "book" uploaded from OpenLibrary which isn't a book? The particular case I've found is "Unhappy Homes" /guid/9202a8c04000641f800000000ca13f55 which isn't a book at all, but is really an expansion for the Gloom card game - will it be sufficient to just detype it as a book (and written work), or should we also delete the (synthetic) OpenLibrary key? Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From tfmorris at gmail.com Wed Oct 7 15:47:39 2009 From: tfmorris at gmail.com (Tom Morris) Date: Wed, 7 Oct 2009 11:47:39 -0400 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> Message-ID: On Wed, Oct 7, 2009 at 8:51 AM, Philip Kendall wrote: > Add IATA codes to airports (or merge with existing topics[1]): > > http://airportcoder.freebaseapps.com/ > > Let's clear this lot out by this time tomorrow :-) > If you need more grist for the mill, there are another 500+ topics here: http://untyped.freebaseapps.com/?topic_keyword=+airport%24&type_name=Airport&page=2&type=%2Faviation%2Fairport which are a combination of Wikipedia disambiguation pages for airports, untyped airports, and airport subway/rail stations. Untyped is sporting a new look that I did a few weeks ago, but the mass type/delete feature isn't implemented yet, so you can just ignore those buttons (maybe this'll be enough of an incentive for me to put a few minutes into it). Tom p.s. 
When I was doing some of this merging by hand yesterday, I came across an unmergeable topic which had been created by global_airport_db_bot, so you may run into more of these. Not sure who owns that bot, it seems to have stopped running a year ago. From bryan.cheung at metaweb.com Wed Oct 7 17:15:06 2009 From: bryan.cheung at metaweb.com (Bryan Cheung) Date: Wed, 7 Oct 2009 10:15:06 -0700 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> Message-ID: <02C48FE7-68CE-4AB6-9193-6B20F8F28BD4@metaweb.com> > p.s. When I was doing some of this merging by hand yesterday, I came > across an unmergeable topic which had been created by > global_airport_db_bot, so you may run into more of these. Not sure > who owns that bot, it seems to have stopped running a year ago. Global airport db was a one time load - I was the operator. Can you link me to the topic that you found that was unmergeable? On Oct 7, 2009, at 8:47 AM, Tom Morris wrote: > On Wed, Oct 7, 2009 at 8:51 AM, Philip Kendall > wrote: >> Add IATA codes to airports (or merge with existing topics[1]): >> >> http://airportcoder.freebaseapps.com/ >> >> Let's clear this lot out by this time tomorrow :-) >> > > If you need more grist for the mill, there are another 500+ topics > here: > > http://untyped.freebaseapps.com/?topic_keyword=+airport%24&type_name=Airport&page=2&type=%2Faviation%2Fairport > > which are a combination of Wikipedia disambiguation pages for > airports, untyped airports, and airport subway/rail stations. > > Untyped is sporting a new look that I did a few weeks ago, but the > mass type/delete feature isn't implemented yet, so you can just ignore > those buttons (maybe this'll be enough of an incentive for me to put a > few minutes into it). > > Tom > > p.s. 
When I was doing some of this merging by hand yesterday, I came > across an unmergeable topic which had been created by > global_airport_db_bot, so you may run into more of these. Not sure > who owns that bot, it seems to have stopped running a year ago. > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From zenkat at metaweb.com Wed Oct 7 17:16:27 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Wed, 7 Oct 2009 10:16:27 -0700 Subject: [Data-modeling] Books that aren't In-Reply-To: <20091007140944.GY31466@sphinx.int.mythic-beasts.com> References: <20091007140944.GY31466@sphinx.int.mythic-beasts.com> Message-ID: Hi Phil -- In this case, I'd say go ahead and put it on the delete queue. There's very little value in having it as an untyped entity in Freebase. Even if we were focusing on a load of card game extension packs, the untyped "Unhappy Homes" topic would just be noise, an unreconcilable topic that we would likely need to inspect manually. Thanks, Brian On Oct 7, 2009, at 7:09 AM, Philip Kendall wrote: > What's the best procedure when finding a "book" uploaded from > OpenLibrary which isn't a book? > > The particular case I've found is "Unhappy Homes" > /guid/9202a8c04000641f800000000ca13f55 which isn't a book at all, > but is > really an expansion for the Gloom card game - will it be sufficient to > just detype it as a book (and written work), or should we also delete > the (synthetic) OpenLibrary key? 
> > Cheers, > > Phil > > -- > Philip Kendall > http://www.shadowmagic.org.uk/ > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Wed Oct 7 18:33:58 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Wed, 7 Oct 2009 11:33:58 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <506CE8E9-2DE3-4895-9D83-05286A00BB2B@metaweb.com> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org><4ACAD388.7030500@maden.org> <506CE8E9-2DE3-4895-9D83-05286A00BB2B@metaweb.com> Message-ID: <3D121F234E76499A8C87E998D1C62D35@p4> > Actually ... one thing I'd rather not do is handle the MusicBrainz NGS revamp as a series of JIRA tickets. I'd rather take a bit > more of a holistic view when putting it together. So, for instance, while I think the Musician split is a good idea, we should probably > bundle it in with the rest of the changes that may be coming down the pike ... > > That said, we should definitely keep the open JIRA tickets in mind, to make sure we *have* a holistic view of all of the former > hotspots we've discussed! > > Brian With that in mind, I've made a catch-all task in JIRA to link all the various tickets related to MusicBrainz and the /music schema together, for ease of reference.
https://bugs.freebase.com/browse/DTG-123 Jeff From zenkat at metaweb.com Wed Oct 7 18:42:16 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Wed, 7 Oct 2009 11:42:16 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <3D121F234E76499A8C87E998D1C62D35@p4> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org><4ACAD388.7030500@maden.org> <506CE8E9-2DE3-4895-9D83-05286A00BB2B@metaweb.com> <3D121F234E76499A8C87E998D1C62D35@p4> Message-ID: <8B26D608-6C5E-4EFA-983E-A13A64373C3B@metaweb.com> On Oct 7, 2009, at 11:33 AM, Jeff Prucher wrote: > With that in mind, I've made a catch-all task in JIRA to link all the > various tickets related to MusicBrainz and the /music schema > together, for > ease of reference. https://bugs.freebase.com/browse/DTG-123 Sweet. Thanks, Jeff! Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091007/5d0269eb/attachment.htm From faye at metaweb.com Wed Oct 7 18:54:46 2009 From: faye at metaweb.com (Faye Harris) Date: Wed, 07 Oct 2009 11:54:46 -0700 Subject: [Data-modeling] vocal range on music artists In-Reply-To: <4ACAD388.7030500@maden.org> References: <4ACA4828.90707@metaweb.com> <4ACA4E08.1050705@maden.org> <4ACA577B.8070403@metaweb.com> <4ACA5DC3.6070804@maden.org> <4ACA6654.60900@metaweb.com> <4ACA69D6.2010704@maden.org> <79DD14E5-0B15-4353-99B5-D5F5EAEFA4BF@metaweb.com> <4ACAAB92.5060604@maden.org> <4ACAD388.7030500@maden.org> Message-ID: <4ACCE3F6.2020807@metaweb.com> Christopher R. Maden wrote: > Brian Karlak wrote: > >> However, it does beg the question: how should we best model these >> relationships in Freebase? 
As it stands, the /music/track --> /music/artist >> relationship is unique, so even if MusicBrainz no longer says >> there has to be a single entity responsible for recording a track, we >> still do. >> > > It shouldn't be unique. It hasn't been on Album for a while, and > shouldn't be on Track, either. > +1 on removing the unique constraint on /music/track --> /music/artist relationship, and modeling artist_credit. It's time we move beyond modeling collaborations between musical artists as (disposable) musical groups. (Going off to vote on individual Jira tasks...) -- Faye From tfmorris at gmail.com Wed Oct 7 19:01:14 2009 From: tfmorris at gmail.com (Tom Morris) Date: Wed, 7 Oct 2009 15:01:14 -0400 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <02C48FE7-68CE-4AB6-9193-6B20F8F28BD4@metaweb.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> <02C48FE7-68CE-4AB6-9193-6B20F8F28BD4@metaweb.com> Message-ID: On Wed, Oct 7, 2009 at 1:15 PM, Bryan Cheung wrote: >> p.s. When I was doing some of this merging by hand yesterday, I came >> across an unmergeable topic which had been created by >> global_airport_db_bot, so you may run into more of these. Not sure >> who owns that bot, it seems to have stopped running a year ago. > Global airport db was a one time load - I was the operator. Can you > link me to the topic that you found that was unmergeable? Here's the discussion thread. According to Jeff, the problem seems to have been that it wasn't typed /common/topic http://www.freebase.com/view/guid/9202a8c04000641f800000000f51444f Tom From rfh at metaweb.com Wed Oct 7 19:13:31 2009 From: rfh at metaweb.com (Reilly Hayes) Date: Wed, 7 Oct 2009 12:13:31 -0700 Subject: [Data-modeling] Deceased Person, Measured Person Message-ID: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> It is my understanding that these types were originally created for aesthetic reasons.
For example, many people found the display of empty "Date of death" fields on living people jarring. Laundry lists of empty fields (many of which may not apply) made data harder to navigate. I propose that these concerns are no longer relevant and that it would make the schema easier for developers to use if the fields were merged back into /People/Person. I suspect there are other cases like this hiding elsewhere in commons. -r -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2434 bytes Desc: not available Url : http://lists.freebase.com/pipermail/data-modeling/attachments/20091007/d6beaac6/attachment.bin From bryan.cheung at metaweb.com Wed Oct 7 19:19:44 2009 From: bryan.cheung at metaweb.com (Bryan Cheung) Date: Wed, 7 Oct 2009 12:19:44 -0700 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <02C48FE7-68CE-4AB6-9193-6B20F8F28BD4@metaweb.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> <02C48FE7-68CE-4AB6-9193-6B20F8F28BD4@metaweb.com> Message-ID: Jeff brought it to my attention that some airports are not typed as / common/topic and that is probably the reason that the airport you found could not be merged. I've gone through and added the /common/ topic cotype to all airports. Bryan On Oct 7, 2009, at 10:15 AM, Bryan Cheung wrote: >> p.s. When I was doing some of this merging by hand yesterday, I came >> across an unmergeable topic which had been created by >> global_airport_db_bot, so you may run into more of these. Not sure >> who owns that bot, it seems to have stopped running a year ago. > Global airport db was a one time load - I was the operator. Can you > link me to the topic that you found that was unmergeable? 
> > On Oct 7, 2009, at 8:47 AM, Tom Morris wrote: > >> On Wed, Oct 7, 2009 at 8:51 AM, Philip Kendall >> wrote: >>> Add IATA codes to airports (or merge with existing topics[1]): >>> >>> http://airportcoder.freebaseapps.com/ >>> >>> Let's clear this lot out by this time tomorrow :-) >>> >> >> If you need more grist for the mill, there are another 500+ topics >> here: >> >> http://untyped.freebaseapps.com/?topic_keyword=+airport%24&type_name=Airport&page=2&type=%2Faviation%2Fairport >> >> which are a combination of Wikipedia disambiguation pages for >> airports, untyped airports, and airport subway/rail stations. >> >> Untyped is sporting a new look that I did a few weeks ago, but the >> mass type/delete feature isn't implemented yet, so you can just >> ignore >> those buttons (maybe this'll be enough of an incentive for me to >> put a >> few minutes into it). >> >> Tom >> >> p.s. When I was doing some of this merging by hand yesterday, I came >> across an unmergeable topic which had been created by >> global_airport_db_bot, so you may run into more of these. Not sure >> who owns that bot, it seems to have stopped running a year ago. 
>> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From jeff at metaweb.com Wed Oct 7 19:27:04 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Wed, 7 Oct 2009 12:27:04 -0700 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> Message-ID: <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> The Deceased Person and Deceased Organism types do, themselves, carry semantic value -- that is, if you know someone (or some organism) is dead, but you don't know when or how they died, you can still mark them as being deceased. (This has always seemed like a more compelling rationale for the types to me than any concern about seeming morbid.) Jeff > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly Hayes > Sent: Wednesday, October 07, 2009 12:14 PM > To: Freebase data modeling mailing list > Subject: [Data-modeling] Deceased Person, Measured Person > > > It is my understanding that these types were originally > created for aesthetic reasons. For example, many people > found the display of empty "Date of death" fields on living > people jarring. Laundry lists of empty fields (many of which > may not apply) made data harder to navigate. > > I propose that these concerns are no longer relevant and that > it would make the schema easier for developers to use if the > fields were merged back into /People/Person. I suspect there > are other cases like this hiding elsewhere in commons. 
> > -r > > From faye at metaweb.com Wed Oct 7 19:26:48 2009 From: faye at metaweb.com (Faye Harris) Date: Wed, 07 Oct 2009 12:26:48 -0700 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> Message-ID: <4ACCEB78.6030702@metaweb.com> Sometimes a type is more than a collection of properties. Often it's a label. In the case of Deceased Person, it serves to note a person as deceased, even without filling out or knowing any of the property values, such as date and cause of death. To maintain the label separate from its properties after merge Deceased Person to Person would require creating a boolean "dead or alive" property for where no death info is known. That to me is clumsy. -- Faye Reilly Hayes wrote: > > It is my understanding that these types were originally created for > aesthetic reasons. For example, many people found the display of > empty "Date of death" fields on living people jarring. Laundry lists > of empty fields (many of which may not apply) made data harder to > navigate. > > I propose that these concerns are no longer relevant and that it would > make the schema easier for developers to use if the fields were merged > back into /People/Person. I suspect there are other cases like this > hiding elsewhere in commons. 
> > -r > > ------------------------------------------------------------------------ > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From robert at metaweb.com Wed Oct 7 19:30:32 2009 From: robert at metaweb.com (robert at metaweb.com) Date: Wed, 7 Oct 2009 12:30:32 -0700 (PDT) Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> Message-ID: <0AE16774-9AA7-43EF-9418-8ECD8E62A1B1@metaweb.com> Reilly is right about the original source of the deceased person type, but I agree with jeff that it should remain. On Oct 7, 2009, at 12:26 PM, "Jeff Prucher" wrote: > The Deceased Person and Deceased Organism types do, themselves, carry > semantic value -- that is, if you know someone (or some organism) is > dead, > but you don't know when or how they died, you can still mark them as > being > deceased. (This has always seemed like a more compelling rationale > for the > types to me than any concern about seeming morbid.) > > Jeff > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly Hayes >> Sent: Wednesday, October 07, 2009 12:14 PM >> To: Freebase data modeling mailing list >> Subject: [Data-modeling] Deceased Person, Measured Person >> >> >> It is my understanding that these types were originally >> created for aesthetic reasons. For example, many people >> found the display of empty "Date of death" fields on living >> people jarring. Laundry lists of empty fields (many of which >> may not apply) made data harder to navigate. >> >> I propose that these concerns are no longer relevant and that >> it would make the schema easier for developers to use if the >> fields were merged back into /People/Person. 
I suspect there >> are other cases like this hiding elsewhere in commons. >> >> -r From iainsproat at gmail.com Wed Oct 7 19:42:03 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Wed, 7 Oct 2009 23:42:03 +0400 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <0AE16774-9AA7-43EF-9418-8ECD8E62A1B1@metaweb.com> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> <0AE16774-9AA7-43EF-9418-8ECD8E62A1B1@metaweb.com> Message-ID: I agree it should stay. For a huge number of historical people, we have no date of death information. As Faye said, the alternative is to use a boolean dead/alive property on /people/person. The presence of /people/deceased_person indicates the same and is less clumsy; plus it has the benefit of keeping information about place of death, cremation and burial separate (which probably reduces some vandalism on topics of living persons). Iain On Wed, Oct 7, 2009 at 11:30 PM, wrote: > Reilly is right about the original source of the deceased person type, > but I agree with jeff that it should remain. > > On Oct 7, 2009, at 12:26 PM, "Jeff Prucher" wrote: > >> The Deceased Person and Deceased Organism types do, themselves, carry >> semantic value -- that is, if you know someone (or some organism) is >> dead, >> but you don't know when or how they died, you can still mark them as >> being >> deceased. (This has always seemed like a more compelling rationale >> for the >> types to me than any concern about seeming morbid.)
>> >> Jeff >> >>> -----Original Message----- >>> From: data-modeling-bounces at freebase.com >>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly Hayes >>> Sent: Wednesday, October 07, 2009 12:14 PM >>> To: Freebase data modeling mailing list >>> Subject: [Data-modeling] Deceased Person, Measured Person >>> >>> >>> It is my understanding that these types were originally >>> created for aesthetic reasons. For example, many people >>> found the display of empty "Date of death" fields on living >>> people jarring. Laundry lists of empty fields (many of which >>> may not apply) made data harder to navigate. >>> >>> I propose that these concerns are no longer relevant and that >>> it would make the schema easier for developers to use if the >>> fields were merged back into /People/Person. I suspect there >>> are other cases like this hiding elsewhere in commons. >>> >>> -r From rfh at metaweb.com Wed Oct 7 20:46:22 2009 From: rfh at metaweb.com (Reilly Hayes) Date: Wed, 7 Oct 2009 13:46:22 -0700 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> Message-ID: <89829E44-B0A6-4568-B683-35215079C3BE@metaweb.com> Modeling this in a type seems silly to me. If we were to do this today, I imagine we would use a property. /people/person/are_they_dead_yet would be a fine boolean.
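For comparison, here is a rough sketch of how the two modelings being debated might look to someone writing a query. It is only an illustration: /people/deceased_person is the type named in this thread, but the short property name date_of_death, the boolean property are_they_dead_yet (which does not exist; the name is taken from the message above), and the helper functions are all assumptions. Nothing here contacts a server; the queries are just built as Python structures that serialize to MQL-style JSON.

```python
import json

DECEASED_TYPE = "/people/deceased_person"


def query_by_type():
    """MQL-style query: topics that carry the Deceased Person type.

    The type itself is the "label"; date_of_death may come back null.
    """
    return [{"type": DECEASED_TYPE, "name": None, "date_of_death": None}]


def query_by_boolean():
    """MQL-style query against a HYPOTHETICAL boolean on /people/person."""
    return [{"type": "/people/person", "name": None, "are_they_dead_yet": True}]


def is_deceased_by_type(topic):
    """Local check on a plain dict: is the label type present?"""
    return DECEASED_TYPE in topic.get("types", [])


def is_deceased_by_boolean(topic):
    """Local check on a plain dict: is the hypothetical flag set to True?"""
    return topic.get("are_they_dead_yet") is True


if __name__ == "__main__":
    print(json.dumps(query_by_type(), indent=2))
    print(json.dumps(query_by_boolean(), indent=2))
```

With the type approach, "is this person dead?" and "do the death properties apply?" are answered by the same fact. With a boolean, the flag is stored separately from those properties and has to be kept in step with them.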
On Oct 7, 2009, at 12:27 PM, Jeff Prucher wrote: > The Deceased Person and Deceased Organism types do, themselves, carry > semantic value -- that is, if you know someone (or some organism) is > dead, > but you don't know when or how they died, you can still mark them as > being > deceased. (This has always seemed like a more compelling rationale > for the > types to me than any concern about seeming morbid.) > > Jeff > >> -----Original Message----- >> From: data-modeling-bounces at freebase.com >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly Hayes >> Sent: Wednesday, October 07, 2009 12:14 PM >> To: Freebase data modeling mailing list >> Subject: [Data-modeling] Deceased Person, Measured Person >> >> >> It is my understanding that these types were originally >> created for aesthetic reasons. For example, many people >> found the display of empty "Date of death" fields on living >> people jarring. Laundry lists of empty fields (many of which >> may not apply) made data harder to navigate. >> >> I propose that these concerns are no longer relevant and that >> it would make the schema easier for developers to use if the >> fields were merged back into /People/Person. I suspect there >> are other cases like this hiding elsewhere in commons. >> >> -r >> >> > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2434 bytes Desc: not available Url : http://lists.freebase.com/pipermail/data-modeling/attachments/20091007/a77de852/attachment-0001.bin From robert at metaweb.com Wed Oct 7 21:01:29 2009 From: robert at metaweb.com (Robert Cook) Date: Wed, 7 Oct 2009 14:01:29 -0700 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: <89829E44-B0A6-4568-B683-35215079C3BE@metaweb.com> References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> <89829E44-B0A6-4568-B683-35215079C3BE@metaweb.com> Message-ID: One way to view this problem is that of denormalization. If we use a boolean, then what happens if some of the death fields are filled out but the boolean has no value or is set to false? Does the presence of any death-related value indicate that the person is dead? If death is indicated by that "is-a" relationship to Deceased Person and the presence of a "is-a" relationship connotes to most applications that its property values should be heeded, then there is no such ambiguity. I modeled many of the core non-system types in Freebase and just about every time I was tempted to use a Boolean I found a better way. That's not to say I didn't ever use Booleans, but when I did, it made me feel dirty. R On Oct 7, 2009, at 1:46 PM, Reilly Hayes wrote: > > Modeling this in a type seems silly to me. If we were to do this > today, I imagine we would use a property. > > /people/person/are_they_dead_yet would be a fine boolean. > > > > On Oct 7, 2009, at 12:27 PM, Jeff Prucher wrote: > >> The Deceased Person and Deceased Organism types do, themselves, carry >> semantic value -- that is, if you know someone (or some organism) >> is dead, >> but you don't know when or how they died, you can still mark them >> as being >> deceased. (This has always seemed like a more compelling rationale >> for the >> types to me than any concern about seeming morbid.) 
>> >> Jeff >> >>> -----Original Message----- >>> From: data-modeling-bounces at freebase.com >>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly >>> Hayes >>> Sent: Wednesday, October 07, 2009 12:14 PM >>> To: Freebase data modeling mailing list >>> Subject: [Data-modeling] Deceased Person, Measured Person >>> >>> >>> It is my understanding that these types were originally >>> created for aesthetic reasons. For example, many people >>> found the display of empty "Date of death" fields on living >>> people jarring. Laundry lists of empty fields (many of which >>> may not apply) made data harder to navigate. >>> >>> I propose that these concerns are no longer relevant and that >>> it would make the schema easier for developers to use if the >>> fields were merged back into /People/Person. I suspect there >>> are other cases like this hiding elsewhere in commons. >>> >>> -r >>> >>> >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From tfmorris at gmail.com Wed Oct 7 21:25:29 2009 From: tfmorris at gmail.com (Tom Morris) Date: Wed, 7 Oct 2009 17:25:29 -0400 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> <89829E44-B0A6-4568-B683-35215079C3BE@metaweb.com> Message-ID: I'm not sure what the issue with "types as labels" is. They don't bother me at all. Having said that, my preferred way of modeling this would be with a more flexible date schema. A death date of "before 1930" conveys a lot more information than a simple "Dead/Not Dead" boolean, however it's encoded. 
For someone you're sure is dead now, but don't know how long ago they died, just use "before today" and refine it later. Tom On Wed, Oct 7, 2009 at 5:01 PM, Robert Cook wrote: > One way to view this problem is that of denormalization. If we use a > boolean, then what happens if some of the death fields are filled out > but the boolean has no value or is set to false? Does the presence of > any death-related value indicate that the person is dead? > > If death is indicated by that "is-a" relationship to Deceased Person > and the presence of a "is-a" relationship connotes to most > applications that its property values should be heeded, then there is > no such ambiguity. > > I modeled many of the core non-system types in Freebase and just about > every time I was tempted to use a Boolean I found a better way. > That's not to say I didn't ever use Booleans, but when I did, it made > me feel dirty. > > R > > On Oct 7, 2009, at 1:46 PM, Reilly Hayes wrote: > >> >> Modeling this in a type seems silly to me. If we were to do this >> today, I imagine we would use a property. >> >> /people/person/are_they_dead_yet would be a fine boolean. >> >> >> >> On Oct 7, 2009, at 12:27 PM, Jeff Prucher wrote: >> >>> The Deceased Person and Deceased Organism types do, themselves, carry >>> semantic value -- that is, if you know someone (or some organism) >>> is dead, >>> but you don't know when or how they died, you can still mark them >>> as being >>> deceased. (This has always seemed like a more compelling rationale >>> for the >>> types to me than any concern about seeming morbid.)
>>> >>> Jeff >>> >>>> -----Original Message----- >>>> From: data-modeling-bounces at freebase.com >>>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Reilly >>>> Hayes >>>> Sent: Wednesday, October 07, 2009 12:14 PM >>>> To: Freebase data modeling mailing list >>>> Subject: [Data-modeling] Deceased Person, Measured Person >>>> >>>> >>>> It is my understanding that these types were originally >>>> created for aesthetic reasons. For example, many people >>>> found the display of empty "Date of death" fields on living >>>> people jarring. Laundry lists of empty fields (many of which >>>> may not apply) made data harder to navigate. >>>> >>>> I propose that these concerns are no longer relevant and that >>>> it would make the schema easier for developers to use if the >>>> fields were merged back into /People/Person. I suspect there >>>> are other cases like this hiding elsewhere in commons. >>>> >>>> -r From philip-freebase at shadowmagic.org.uk Wed Oct 7 21:42:40 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Wed, 7 Oct 2009 22:42:40 +0100 Subject: [Data-modeling] Books that aren't In-Reply-To: References: <20091007140944.GY31466@sphinx.int.mythic-beasts.com> Message-ID: <20091007214240.GB31466@sphinx.int.mythic-beasts.com> On Wed, Oct 07, 2009 at 10:16:27AM -0700, Brian Karlak wrote: > > In this case, I'd say go ahead and put it on the delete queue.
> There's very little value in having it as an untyped entity in > Freebase. Even if we were focusing on a load of card game extension > packs, the untyped "Unhappy Homes" topic would just be noise, an > unreconcilable topic that we would likely need to inspect manually. Perhaps I should have put a bit more context in :-) I found this topic while looking for the expansions of Gloom so I could type them as the soon-to-be-promoted /user/pak21/default_domain/game_expansion type, so this topic is nicely typed and reconcilable. If it were just "not a book", I agree with deleting it, but it's now "a game expansion with a synthetic OpenLibrary key", which is probably a different question. Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From iainsproat at gmail.com Thu Oct 8 07:47:01 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Thu, 8 Oct 2009 11:47:01 +0400 Subject: [Data-modeling] Deceased Person, Measured Person In-Reply-To: References: <4E78CE66-DD04-495D-A3BE-2381D3F0AF26@metaweb.com> <1B2A45AC044B4BDD95CD1BC464E3F91B@p4> <89829E44-B0A6-4568-B683-35215079C3BE@metaweb.com> Message-ID: On Thu, Oct 8, 2009 at 1:25 AM, Tom Morris wrote: > I'm not sure what the issue with "types as labels" is. They don't > bother me at all. Having argued against the '/architecture/tower' type a while back, I've had a change of opinion and am OK with 'types as labels', so long as they don't become as abused as Wikipedia categories (i.e. used as loosely defined tags). > Having said that, my preferred way of modeling this would be with a > more flexible date schema. A death date of "before 1930" conveys a > lot more information than a simple "Dead/Not Dead" boolean, however > it's encoded. For someone you're sure is dead now, but don't know how > long ago they died, just use "before today" and refine it later.
I tried modelling some date uncertainties in my "lost in time" base http://lostintime.freebase.com/ Iain From philip-freebase at shadowmagic.org.uk Thu Oct 8 09:43:06 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Thu, 8 Oct 2009 10:43:06 +0100 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> Message-ID: <20091008094306.GD31466@sphinx.int.mythic-beasts.com> On Wed, Oct 07, 2009 at 11:47:39AM -0400, Tom Morris wrote: > > If you need more grist for the mill, there are another 500+ topics here: > > http://untyped.freebaseapps.com/?topic_keyword=+airport%24&type_name=Airport&page=2&type=%2Faviation%2Fairport > > which are a combination of Wikipedia disambiguation pages for > airports, untyped airports, and airport subway/rail stations. My belief was that Wikipedia disambiguation pages were meant to be skipped by the importer - is the importer possibly not recognising the "{{airport disambig}}" template used on these? As one example, Wells Municipal Airport was imported on 15 Feb 2008, when the Wikipedia page definitely had the {{airport disambig}} template: Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From tfmorris at gmail.com Thu Oct 8 19:21:29 2009 From: tfmorris at gmail.com (Tom Morris) Date: Thu, 8 Oct 2009 15:21:29 -0400 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <20091008094306.GD31466@sphinx.int.mythic-beasts.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> <20091008094306.GD31466@sphinx.int.mythic-beasts.com> Message-ID: On Thu, Oct 8, 2009 at 5:43 AM, Philip Kendall wrote: > My belief was that Wikipedia disambiguation pages were meant to be > skipped by the importer - is the importer possibly not recognising the > "{{airport disambig}}" template used on these? That's the intent, but lots have slipped through historically. 
When we discussed it back in March (http://markmail.org/thread/uy5zvsshgmftirnn), I opened https://bugs.freebase.com/browse/DA-665 about this, but it's since been moved to a restricted Jira group, so you can't see it any more. From the discussion, it sounded like it was a semi-manual process to identify all the correct disambiguation templates. Also, once topics have been mistakenly imported, it takes a special run to identify and delete them. I've already flagged all the airports for delete, but if you find other categories that we mistakenly imported, especially recently, you can file a bug report to get the list of templates updated. Tom From al at metaweb.com Thu Oct 8 19:28:44 2009 From: al at metaweb.com (Alexander Marks) Date: Thu, 8 Oct 2009 12:28:44 -0700 Subject: [Data-modeling] hulu data Message-ID: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> In the next day or two I'll be switching on a new stream of video metadata from Hulu. You can see how this data will look on Sandbox right now: http://www.sandbox-freebase.com/view/tv/video Here's an explore-view look at a Colbert Report episode video: http://www.sandbox-freebase.com/tools/explore/guid/9202a8c04000641f800000000f4e4a70 The /tv/video schema is new for this purpose, and will probably go through some evolution in the coming weeks and months. Some of its notable properties: - video_of links to the relevant TV episode (or movie) - weblink is a direct link to where the video can be watched - expires is the date and time at which the video stops being available Here's a query that shows how you might use this data in your applications (hint!): http://tinyurl.com/yb8dev2 Let me know if you have any ideas or questions about this data or the video schema.
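The linked query is not reproduced in this message. As a rough sketch only, a query over the properties listed above could be put together like this in Python. The type /tv/video and the property names video_of, weblink and expires come from the description above, and the "optional": "forbidden" clause with "__now__" is the form given elsewhere in this thread; the overall query shape, the limit parameter, and the helper names build_video_query and is_watchable are assumptions for illustration, not the contents of the tinyurl link. The code only builds and serializes the query; it does not call any service.

```python
import json
from datetime import datetime


def build_video_query(limit=10):
    """Build an MQL-style query for /tv/video topics that have not expired.

    The "expires" clause forbids any expiry value at or before now, so
    videos with no expiry date at all still match.
    """
    return [{
        "type": "/tv/video",
        "name": None,
        "video_of": None,   # the TV episode (or movie) the video shows
        "weblink": None,    # where the video can be watched
        "expires": {"optional": "forbidden", "value<=": "__now__"},
        "limit": limit,     # assumed cap on the number of results
    }]


def is_watchable(expires, now):
    """Plain-Python mirror of the intent of the "expires" clause above.

    expires is a datetime, or None when the video has no expiry date.
    """
    return expires is None or expires > now


if __name__ == "__main__":
    print(json.dumps(build_video_query(limit=5), indent=2))
    print(is_watchable(None, datetime(2009, 10, 8, 12, 0)))
```

The is_watchable helper is only there to make the filter's intended meaning explicit: a video counts as available if it has no expiry, or if its expiry is still in the future.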
Al From al at metaweb.com Thu Oct 8 20:29:25 2009 From: al at metaweb.com (Alexander Marks) Date: Thu, 8 Oct 2009 13:29:25 -0700 Subject: [Data-modeling] hulu data In-Reply-To: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> Message-ID: <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Just a correction to the query I posted earlier in that tinyurl link -- you'll want to use: "expires": { "optional": "forbidden", "value<=": "__now__" } instead of "expires>":"__now__" because some videos have no expiry date. Al On Oct 8, 2009, at 12:28 PM, Alexander Marks wrote: > In the next day or two I'll be switching on a new stream of video > metadata from Hulu. You can see how this data will look on Sandbox > right now: http://www.sandbox-freebase.com/view/tv/video > > Here's an explore-view look at a Colbert report episode video: > http://www.sandbox-freebase.com/tools/explore/guid/9202a8c04000641f800000000f4e4a70 > > The /tv/video schema is new for this purpose, and will probably go > through some evolution in the coming weeks and months. Some of its > notable properties: > > - video_of links to the relevant TV episode (or movie) > - weblink is a direct link to where the video can be watched > - expires is the date and time at which the video stops being > available > > Here's a query that shows how you might use this data in your > applications (hint!): > http://tinyurl.com/yb8dev2 > > Let me know if you have any ideas or questions about this data or the > video schema. 
> > Al From tfmorris at gmail.com Thu Oct 8 20:35:33 2009 From: tfmorris at gmail.com (Tom Morris) Date: Thu, 8 Oct 2009 16:35:33 -0400 Subject: [Data-modeling] hulu data In-Reply-To: <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: And you probably want "sort":"expires" instead of "sort":"-expires" so that you can watch the videos that are about to expire first. t. On Thu, Oct 8, 2009 at 4:29 PM, Alexander Marks wrote: > Just a correction to the query I posted earlier in that tinyurl link > -- you'll want to use: > > "expires": { > "optional": "forbidden", > "value<=": "__now__" > } > > instead of > > "expires>":"__now__" > > because some videos have no expiry date. > > Al > > > On Oct 8, 2009, at 12:28 PM, Alexander Marks wrote: > >> In the next day or two I'll be switching on a new stream of video >> metadata from Hulu. You can see how this data will look on Sandbox >> right now: http://www.sandbox-freebase.com/view/tv/video >> >> Here's an explore-view look at a Colbert report episode video: >> http://www.sandbox-freebase.com/tools/explore/guid/9202a8c04000641f800000000f4e4a70 >> >> The /tv/video schema is new for this purpose, and will probably go >> through some evolution in the coming weeks and months.
Some of its >> notable properties: >> >> - video_of links to the relevant TV episode (or movie) >> - weblink is a direct link to where the video can be watched >> - expires is the date and time at which the video stops being >> available >> >> Here's a query that shows how you might use this data in your >> applications (hint!): >> http://tinyurl.com/yb8dev2 >> >> Let me know if you have any ideas or questions about this data or the >> video schema. >> >> Al >> >> _______________________________________________ >> Data-modeling mailing list >> Data-modeling at freebase.com >> http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From philip-freebase at shadowmagic.org.uk Fri Oct 9 11:12:41 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Fri, 9 Oct 2009 12:12:41 +0100 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> Message-ID: <20091009111239.GJ31466@sphinx.int.mythic-beasts.com> On Wed, Oct 07, 2009 at 01:51:17PM +0100, Philip Kendall wrote: > Add IATA codes to airports (or merge with existing topics[1]): > > http://airportcoder.freebaseapps.com/ One small improvement: the app (well, the graph) now remembers which items have been skipped, so you won't have to skip "Subotica Airport" and its incorrect Wikipedia information every time you start the app. > Let's clear this lot out by this time tomorrow :-) OK, that was perhaps a little optimistic! Only 6063 left to go...
Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From spencerkelly86 at gmail.com Fri Oct 9 16:00:33 2009 From: spencerkelly86 at gmail.com (Spencer Kelly) Date: Fri, 9 Oct 2009 12:00:33 -0400 Subject: [Data-modeling] hulu data In-Reply-To: References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: great stuff alex question: would 'video_of' also apply to footage of non-fictional events? i made this type a while ago to connect the jfk assassination to the Zapruder film etc my first thought was that 'footage of event' is a different relation than 'clip from show', .. but maybe it's not. any plans for connecting other video services? if so, i would recommend a 'geographical restrictions' property, as hulu is only available to the united states. From al at metaweb.com Fri Oct 9 16:58:17 2009 From: al at metaweb.com (Alexander Marks) Date: Fri, 9 Oct 2009 09:58:17 -0700 Subject: [Data-modeling] hulu data In-Reply-To: References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: On Oct 9, 2009, at 9:00 AM, Spencer Kelly wrote: > question: would 'video_of' also apply to footage of non-fictional > events? > i made this type a while ago to connect the jfk assassination to > the Zapruder film etc > my first thought was that 'footage of event' is a different relation > than 'clip from show', .. but maybe it's not. I thought about this earlier. I think you're right -- it is a different relation. My sense is that there should always be an intermediate object between the event and the video. So, the Zapruder film is "footage of" the JFK assassination, and the Zapruder film "can be watched" at this Youtube resource, and this Hulu resource, etc. event [JFK assas.]
-> footage [Zapruder] -> video resource [Zapruder @ Youtube] So we'd use your type for the footage node, and link it to a video resource (there's still no property to do this from the topic side; discussions are going on about whether it should be part of /common/topic -- for now you can just use the reverse property "!/tv/video" in MQL). Feel free to try this model out. > any plans for connecting other video services? Definitely. More are in the works. Anything in particular you'd like to see? > if so, i would recommend a 'geographical restrictions' property, as > hulu is only available to the united states. That's a great idea. I wonder if this should connect to the service (like /en/hulu) or to each video. Most of the services I can think of are restricted at the site level, but I'm sure there are or will be cases where only certain videos within a site are geo-restricted. Al From stefano at metaweb.com Fri Oct 9 17:14:18 2009 From: stefano at metaweb.com (Stefano Mazzocchi) Date: Fri, 09 Oct 2009 10:14:18 -0700 Subject: [Data-modeling] hulu data In-Reply-To: References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: <4ACF6F6A.2020901@metaweb.com> Alexander Marks wrote: > Definitely. More are in the works. Anything in particular you'd like > to see? The internet archive has a lot of great old video stuff. For example: http://www.archive.org/details/Houseint1954 which will make you cringe in so many ways. -- Stefano Mazzocchi Application Catalyst Metaweb Technologies, Inc.
stefano at metaweb.com ------------------------------------------------------------------- From zenkat at metaweb.com Fri Oct 9 17:25:50 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Fri, 9 Oct 2009 10:25:50 -0700 Subject: [Data-modeling] hulu data In-Reply-To: <4ACF6F6A.2020901@metaweb.com> References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> <4ACF6F6A.2020901@metaweb.com> Message-ID: <3C1BA7AF-58B1-4AF0-AD2F-7590216308F7@metaweb.com> > http://www.archive.org/details/Houseint1954 > > which will make you cringe in so many ways. "The National Clean Up - Paint Up - Fix Up Bureau"? From tfmorris at gmail.com Fri Oct 9 18:13:09 2009 From: tfmorris at gmail.com (Tom Morris) Date: Fri, 9 Oct 2009 14:13:09 -0400 Subject: [Data-modeling] Data game of the day: IATA codes In-Reply-To: <20091009111239.GJ31466@sphinx.int.mythic-beasts.com> References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> <20091009111239.GJ31466@sphinx.int.mythic-beasts.com> Message-ID: On Fri, Oct 9, 2009 at 7:12 AM, Philip Kendall wrote: > you won't have to skip "Subotica Airport" > and its incorrect Wikipedia information every time you start the app. That actually illustrates the weakness of this approach. The resulting accuracy at the end of the process depends on the accuracy of the Wikipedians who transcribed the codes originally. Of course, they also already did the hard work of reconciliation, which is a huge time saver, when compared to the effort it would take to go back to the original IATA database and re-reconcile. On the data modeling front, what defines an airport? Is a civilian air field that takes over a former military base really the same airport?
They're using the same tarmac, and probably even retained the old IATA code, but they seem qualitatively different to me. Does whether or not they retain the same name make a difference? Also, we don't have any way to indicate when an IATA code was valid. They don't move often, but they do move. Denver's Stapleton Airport is an example that will be familiar to U.S. travelers. I just came across another apparent example: /en/mandalay_chanmyathazi_airport - Mandalay Chanmyathazi Airport - Mandalay Chanmyathazi Airport (IATA: MDL, ICAO: VYCZ) is a domestic airport in Burma that served Mandalay and surrounding areas. It has largely been replaced by Mandalay International Airport. /en/mandalay_international_airport - Mandalay International Airport - Mandalay International Airport (IATA: MDL, ICAO: VYMD), located 35 km south of Mandalay in Tada-U, is one of only two international airports in Myanmar. For people who are interested in the sometimes tortured history of these identifiers, check out http://www.skygod.com/asstd/abc.html Tom From tfmorris at gmail.com Fri Oct 9 18:31:16 2009 From: tfmorris at gmail.com (Tom Morris) Date: Fri, 9 Oct 2009 14:31:16 -0400 Subject: [Data-modeling] hulu data In-Reply-To: References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: On Fri, Oct 9, 2009 at 12:58 PM, Alexander Marks wrote: > > On Oct 9, 2009, at 9:00 AM, Spencer Kelly wrote: > >> question: would 'video_of' also apply to footage of non-fictional >> events? >> i made this type a while ago to connect the jfk assassination to >> the Zapruder film etc >> my first thought was that 'footage of event' is a different relation >> than 'clip from show', .. but maybe it's not. > > I thought about this earlier. I think you're right -- it is a > different relation. My sense is that there should always be an > intermediate object between the event and the video.
So, the Zapruder > film is "footage of" the JFK assassination, and the Zapruder film "can > be watched" at this Youtube resource, and this Hulu resource, etc. > > event [JFK assas.] -> footage [Zapruder] -> video resource > [Zapruder @ Youtube] I think there are more intermediate objects (with associated metadata) than that. JFK assassination Zapruder film (Super 8) 35 mm transfer telecine digital transfer (uncompressed) MPEG-2 encoded video MPEG-4 encoded video YouTube hosted version of MPEG-4 Digitally restored/enhanced copy MPEG-2 encoded video MPEG-4 encoded video and that's before you start adding in different video overlays (e.g. frame numbers) and sound tracks (sync'd audio from the radios). Of course, a lot of the time the provenance will have been lost, so you'll only have one or two links in the chain. Similar chains of representations apply to other media types. For example, for a newspaper page, you might have: newspaper->microfilm image->scanned digital image->compressed digital image->OCR'd text->human corrected OCR'd text with potentially multiple alternatives at each stage. The current video->hosting property in the schema is a unique 1:1 relationship which doesn't seem general enough if you're going to allow multiple hosting services. The other thing to consider when dealing with time-based media (video & audio clips) is that there are instances where you want to refer to a specific segment of the media, whether it be digital or analog. For this you need to record the in/out (start/stop) time codes or some other reference (byte or frame offset). Tom > > So we'd use your type for the footage node, and link it to a video > resource (there's still no property to do this from the topic side, > discussions are going on about whether it should be part of /common/ > topic -- for now you can just use the reverse property "!/tv/video" in > MQL). Feel free to try this model out. > >> any plans for connecting other video services? > > Definitely.
More are in the works. Anything in particular you'd like > to see? > >> if so, i would recommend a 'geographical restrictions' property, as >> hulu is only available to the united states. > > That's a great idea. I wonder if this should connect to the service > (like /en/hulu) or to each video. Most of the services I can think of > are restricted at the site level, but I'm sure there are or will be > cases where only certain videos within a site are geo-restricted. > > Al > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From al at metaweb.com Fri Oct 9 20:23:26 2009 From: al at metaweb.com (Alexander Marks) Date: Fri, 9 Oct 2009 13:23:26 -0700 Subject: [Data-modeling] hulu data In-Reply-To: References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> <5FF14CFE-1AAE-4EFB-904C-BBE794AB28A2@metaweb.com> Message-ID: <31E1A007-41F6-42B2-B0A1-909EC70DDF5C@metaweb.com> On Oct 9, 2009, at 11:31 AM, Tom Morris wrote: > I think there are more intermediate objects (with associated metadata) > than that. > > JFK assassination > Zapruder film (Super 8) > 35 mm transfer > telecine digital transfer (uncompressed) > MPEG-2 encoded video > MPEG-4 encoded video > YouTube hosted version of MPEG-4 > Digitally restored/enhanced copy > MPEG-2 encoded video > MPEG-4 encoded video > > and that's before you start adding in different video overlays (e.g. > frame numbers) and sound tracks (sync'd audio from the radios). I don't think we're going to want this level of detail in most cases, but the model doesn't actually preclude it. > The current video->hosting property in the schema is a unique 1:1 > relationship which doesn't seem general enough if you're going to > allow multiple hosting services. "Video" is probably a poorly chosen name. It's really a CVT to link any sort of video content to a web resource where it can be watched. 
So if there were 3 places to watch the Zapruder film (or a particular transfer of the Zapruder film, if it's being modeled at that detail), you would have 3 of these "Video" nodes, each of which is a "place to watch" this content. Does that work with how you're thinking? If so, can you think of a better name for "Video" to make this more clear? > The other thing to consider when dealing with time-based media (video > & audio clips) is that there are instances where want to refer to a > specific segment of the media, whether it be digital or analog. For > this you need to record the in/out (start/stop) time codes or some > other reference (byte or frame offset). I thought about this too. One real use case would be tagging exactly when some person is interviewed in a TV news show. You could do that like this: Bill Clinton --appearances--> Bill's appearance Bill's appearance --episode--> The Daily Show Sep. 17 Bill's appearance --segment_start--> 15:00 Bill's appearance --segment_end--> 26:00 The Daily Show Sep. 17 <--video_of-- Video A The Daily Show Sep. 17 <--video_of-- Video B Video A --source--> Hulu Video A --weblink--> http://www.hulu.com/watch/96402 Video B --source--> thedailyshow.com Video B --weblink--> http://www.thedailyshow.com/watch/thu-september-17-2009 Al From al at metaweb.com Sat Oct 10 00:37:31 2009 From: al at metaweb.com (Alexander Marks) Date: Fri, 9 Oct 2009 17:37:31 -0700 Subject: [Data-modeling] hulu data In-Reply-To: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> References: <8E76B014-75D9-42BA-B24D-C3070F593E9C@metaweb.com> Message-ID: This is now running on OTG. On Oct 8, 2009, at 12:28 PM, Alexander Marks wrote: > In the next day or two I'll be switching on a new stream of video > metadata from Hulu. 
You can see how this data will look on Sandbox > right now: http://www.sandbox-freebase.com/view/tv/video > > Here's an explore-view look at a Colbert report episode video: > http://www.sandbox-freebase.com/tools/explore/guid/9202a8c04000641f800000000f4e4a70 > > The /tv/video schema is new for this purpose, and will probably go > through some evolution in the coming weeks and months. Some of its > notable properties: > > - video_of links to the relevant TV episode (or movie) > - weblink is a direct link to where the video can be watched > - expires is the date and time at which the video stops being > available > > Here's a query that shows how you might use this data in your > applications (hint!): > http://tinyurl.com/yb8dev2 > > Let me know if you have any ideas or questions about this data or the > video schema. > > Al > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From pauljmackay at gmail.com Sat Oct 10 18:37:26 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Sat, 10 Oct 2009 11:37:26 -0700 Subject: [Data-modeling] Feedback on Seafood Base Message-ID: Hi, I would welcome any comments on the Seafood base I started: http://seafood.freebase.com/ The schema is reasonable at this point in terms of modelling what I hoped to capture, but it really needs good data sources to populate it. thanks paul
From iainsproat at gmail.com Sat Oct 10 19:30:30 2009 From: iainsproat at gmail.com (Iain Sproat) Date: Sat, 10 Oct 2009 23:30:30 +0400 Subject: [Data-modeling] Feedback on Seafood Base In-Reply-To: References: Message-ID: On Sat, Oct 10, 2009 at 10:37 PM, Paul Mackay wrote: > Hi, > > I would welcome any comments on the Seafood base I started: > > http://seafood.freebase.com/ Nice. > > The schema is reasonable at this point in terms of modelling what I hoped to > capture, but it really needs good data sources to populate it. As you requested some comments, I've done so! Mostly you're missing a description for each type. /base/seafood/fishing_method --type description missing --consider reciprocating incoming links (from /base/seafood/fishery & /base/seafood/seafood_rating) /base/seafood/fishery --type description missing --reciprocate incoming link from /base/seafood/seafood /base/seafood/fishery_location --type description missing --include the /location/location type /base/seafood/seafood_sustainability_category --type description missing (is there an authority/standards agency which defines these categories?) --reciprocate /base/seafood/seafood_rating /base/seafood/seafood_rating --type description missing --description for property /base/seafood/seafood_rating/how_caught is missing /base/seafood/certification_program --type description missing --/base/seafood/certification_program/organization -> the expected types should be more specific. Instead of expecting /organization/organization, rather expect /base/seafood/certifying_organization. The certifying organization can be applied to an organization or a company topic, so you don't need the duplicate property /base/seafood/certification_program/company --/base/seafood/certification_program/applies_to -> again, the expected type could be more specific.
Say, having a new seafood certification region type (/base/seafood/seafood_certification_organization). /base/seafood/certification_partner --type description missing --includes /location/location? surely this is incorrect - I'd expect a partner to be a company or organization, not a location? /base/seafood/contaminant --type description missing /base/seafood/seafood_guide --type description missing --should include /book/written_work from the commons. --/base/seafood/seafood_guide/published is not necessary if you include the written work type. --/base/seafood/seafood_guide/organization. The expected type could be more specific, e.g. seafood guide author (/base/seafood/seafood_guide_author). /base/seafood/seafood --type description missing --should include the /food/food type Great effort, and I hope this is of help! Iain From paul at ontology2.com Mon Oct 12 14:57:16 2009 From: paul at ontology2.com (Paul Houle) Date: Mon, 12 Oct 2009 10:57:16 -0400 Subject: [Data-modeling] Vocal Range II In-Reply-To: References: <20091007125114.GX31466@sphinx.int.mythic-beasts.com> <20091009111239.GJ31466@sphinx.int.mythic-beasts.com> Message-ID: <4AD343CC.10004@ontology2.com> Over the weekend I thought about another aspect of the vocal range issue. The vocal range of non-opera singers isn't of great interest to laymen, but it is an important characteristic if you want to understand a person's body of work. Singers like John Flansburgh and Bob Dylan aren't "great vocalists" in the generic sense, but they are both great lyricists and have had excellent careers, creating music that fits them. An A&R person, on the other hand, could be as analytical about the prospects of, say, somebody like Miley Cyrus, as opera experts would be about an opera singer. Lately I've been practicing character work (acting) and thinking a lot about vocal adjustments; taxonomizing them for my own good so I can learn how to do more of them, do them better, and do them consistently.
I'd really like to see some way to 'fingerprint' the skills of a performer, and particularly distinguish the case of actors like Don Adams or Rick Moranis (a typecast actor who is famous for playing a particular sort of character extremely well) from actors like Robin Williams or Bruce Willis (who have broader careers). [Overall, the first approach I'd try for that would be a vector-space approach, something like what nanocrowd does for movies.] There's the vocal range of a singer, but there's also the vocal range required to sing a song. Lately I've been "programming" music for our car to find stuff my son will enjoy and will also be able to hear the words, sing along and think about what the words mean: this doesn't need to be marketed as children's music: both Shonen Knife and They Might Be Giants are often good, as are (often) TV show themes. It would be nice to see some characteristics of songs that would be useful for selecting this sort of thing. I know Pandora's got the "Music Genome Project" and it does a great job as a recommendation system, but most of their data is locked up. From pauljmackay at gmail.com Mon Oct 12 16:58:53 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Mon, 12 Oct 2009 09:58:53 -0700 Subject: [Data-modeling] Filtering topics by location In-Reply-To: <4F105C37-E54B-40EC-BF9D-DE8D111A138E@metaweb.com> References: <4F105C37-E54B-40EC-BF9D-DE8D111A138E@metaweb.com> Message-ID: There were replies here about the contained_by field that mentioned a 2 level threshold and "at least" 2 levels. The description for the Location type says "For geopolitical locations, containment two levels up and down is the ideal minimum". So could it be acceptable to have 3 levels, in the England example I gave? It seems to me there is some logic to this, given that the countries of England, Wales, Scotland, etc are important in their own right, but so is using United Kingdom also. Of course the first level is the county.
If this is not possible I'm really not sure how easy it would ever be to filter to get Locations that are based in England while excluding ones in other countries that are part of the UK. paul On Tue, Sep 22, 2009 at 11:48 PM, John Giannandrea wrote: > > Paul Mackay wrote: > > I am interested in populating information about England, but many > > things that have a location tend to have addresses containing county > > names and then a country of UK. If I find these topics, would it be > > valid to add an "England" entry to the "Contained by" field, such > > that it is easy to filter on? > > Most towns in the UK will have a "/location/location/contained_by": > {"id":"/en/united_kingdom"} > This will get you towns in Scotland and Northern Ireland also. > > > I'm not sure if the Contained by field is intended to be just the > > immediate entity that contains the topic in question. > > The convention we have used in data loads is at least two ply of > containment. > > -jg > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From etbmfstar at yahoo.com Wed Oct 14 06:58:54 2009 From: etbmfstar at yahoo.com (ELIEH TAFAKORY) Date: Tue, 13 Oct 2009 23:58:54 -0700 (PDT) Subject: [Data-modeling] (no subject) Message-ID: <100412.99164.qm@web34505.mail.mud.yahoo.com> hello! I am happy to be joining this group. I need information about "MQL4". please help me. thanks.
From philip-freebase at shadowmagic.org.uk Wed Oct 14 07:04:24 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Wed, 14 Oct 2009 08:04:24 +0100 Subject: [Data-modeling] (no subject) In-Reply-To: <100412.99164.qm@web34505.mail.mud.yahoo.com> References: <100412.99164.qm@web34505.mail.mud.yahoo.com> Message-ID: <20091014070424.GU31466@sphinx.int.mythic-beasts.com> On Tue, Oct 13, 2009 at 11:58:54PM -0700, ELIEH TAFAKORY wrote: > hello! > I am happy to be joining this group. > I need information about "MQL4". > please help me. You're on entirely the wrong list. This is about Freebase, a website for structured data and nothing to do with MQL4. Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From pauljmackay at gmail.com Sun Oct 18 18:14:55 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Sun, 18 Oct 2009 11:14:55 -0700 Subject: [Data-modeling] Naming conventions Message-ID: I started this page (http://wiki.freebase.com/wiki/Naming_conventions) to capture a summary of naming conventions. Types is the only entity I've seen a convention for described in the documentation. If anyone knows what should be the conventions for other Freebase entities please could they update this page? Despite a suggested convention there are many types with varied capitalisation though. paul
From kirrily at metaweb.com Mon Oct 19 16:56:51 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Mon, 19 Oct 2009 09:56:51 -0700 Subject: [Data-modeling] Naming conventions In-Reply-To: References: Message-ID: On 18/10/2009, at 11:14 AM, Paul Mackay wrote: > I started this page (http://wiki.freebase.com/wiki/Naming_conventions) > to capture a summary of naming conventions. > Types is the only entity I've seen a convention for described in the > documentation. If anyone knows what should be the conventions for > other Freebase entities please could they update this page? > > Despite a suggested convention there are many types with varied > capitalisation though. Well, capitalisation only appears in display names, so I don't think it's that big a deal. I'm more interested in naming conventions of type/domain keys and properties. For instance, most properties use_this_format but then there's /location/location/containedby which catches me nearly every time. K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From pauljmackay at gmail.com Mon Oct 19 21:33:30 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Mon, 19 Oct 2009 14:33:30 -0700 Subject: [Data-modeling] Address and/or Location Message-ID: I'm trying to define types for Community garden and Allotment in the Local food Base. I'd like to ask if it makes sense to have an Address for these. Initially I defined a specific Agricultural location that included the Location type, hence a Community garden topic would also be a Location. But typically people locate a Community garden using a crossroads, such as Arbutus and 12th, for example.
Having this and the containing city would be useful, but I'm not sure if Address is correct as this is for mailing addresses. Should it be used and just fill out the relevant fields? Address though is also a compound type, so Community garden would have to have it as a property. Is it right to have a type that includes Location, and also uses Address which also includes Location? How to be clear on which one would get filled out? thanks paul From tfmorris at gmail.com Tue Oct 20 03:54:54 2009 From: tfmorris at gmail.com (Tom Morris) Date: Mon, 19 Oct 2009 23:54:54 -0400 Subject: [Data-modeling] Address and/or Location In-Reply-To: References: Message-ID: On Mon, Oct 19, 2009 at 5:33 PM, Paul Mackay wrote: > I'm trying to define types for Community garden and Allotment in the Local > food Base. I'd like to ask if it makes sense to have an Address for these. > Initially I defined a specific Agricultural location that included the > Location type, hence a Community garden topic would also be a Location. > > But typically people locate a Community garden using a crossroads, such as > Arbutus and 12th, for example. Having this and the containing city would be > useful, but I'm not sure if Address is correct as this is for mailing > addresses. Should it be used and just fill out the relevant fields? > > Address though is also a compound type, so Community garden would have to > have it as a property. Is it right to have a type that includes Location, > and also uses Address which also includes Location? How to be clear on which > one would get filled out?
I think we agreed some months ago that Location needed a Street Address property and that Address properties (unless they represent a mailing address as opposed to a street address) should be migrated off the types that have them now such as Building. I'm not sure where implementation of this schema change stands. If it's still a ways away, you may need to do something for the short term and then change it later (unfortunately). I think the initial term of an address is a free form string, so I don't see any reason that an intersection or crossroads should be prohibited (and most geocoders will deal with it just fine). Tom From pauljmackay at gmail.com Tue Oct 20 05:32:44 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Mon, 19 Oct 2009 22:32:44 -0700 Subject: [Data-modeling] Address and/or Location In-Reply-To: References: Message-ID: Are you suggesting using an Address type initially, or just a basic text property to store the crossroads info? It would also be useful to store the city too. Will Location have all the relevant elements of the current Address type to store full addresses? paul On Mon, Oct 19, 2009 at 8:54 PM, Tom Morris wrote: > On Mon, Oct 19, 2009 at 5:33 PM, Paul Mackay > wrote: > > I'm trying to define types for Community garden and Allotment in the > Local > > food Base. I'd like to ask if it makes sense to have an Address for > these. > > Initially I defined a specific Agricultural location that included the > > Location type, hence a Community garden topic would also be a Location. > > > > But typically people locate a Community garden using a crossroads, such > as > > Arbutus and 12th, for example. Having this and the containing city would > be > > useful, but I'm not sure if Address is correct as this is for mailing > > addresses. Should it be used and just fill out the relevant fields? > > > > Address though is also a compound type, so Community garden would have to > > have it as a property.
Is it right to have a type that includes Location, > > and also uses Address which also includes Location? How to be clear on > which > > one would get filled out? > > I think we agreed some months ago that Location needed a Street > Address property and that Address properties (unless they represent a > mailing address as opposed to a street address) should be migrated off > the types that have them now such as Building. > > I'm not sure where implementation of this schema change stands. If > it's still a ways away, you may need to do something for the short > term and then change it later (unfortunately). I think the initial > term of an address is a free form string, so I don't see any reason > that an intersection or crossroads should be prohibited (and most > geocoders will deal with it just fine). > > Tom > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From tfmorris at gmail.com Tue Oct 20 21:38:16 2009 From: tfmorris at gmail.com (Tom Morris) Date: Tue, 20 Oct 2009 17:38:16 -0400 Subject: [Data-modeling] Address and/or Location In-Reply-To: References: Message-ID: I'm really hoping Jeff is going to chime in here since he's the Metaweb staff member in charge of all this stuff. On Tue, Oct 20, 2009 at 1:32 AM, Paul Mackay wrote: > Are you suggesting using an Address type initially, or just a basic text > property to store the crossroads info? I'm suggesting /location/mailing_address which, despite its name, is also used for street addresses, as in the case of /architecture/structure/address. > It would also be useful to store the > city too. Will Location have all the relevant elements of the current > Address type to store full addresses?
If I had to guess, I'd guess that it'd probably lose the postal code and
perhaps the levels of place hierarchy, but that it would include at least
the free text street address property and a city/town (or other location)
property.

Tom

From jeff at metaweb.com Wed Oct 21 20:03:28 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Wed, 21 Oct 2009 13:03:28 -0700
Subject: [Data-modeling] Address and/or Location
In-Reply-To:
References:
Message-ID:

> -----Original Message-----
> From: data-modeling-bounces at freebase.com
> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Tom Morris
> Sent: Tuesday, October 20, 2009 2:38 PM
> To: Freebase data modeling mailing list
> Subject: Re: [Data-modeling] Address and/or Location
>
> I'm really hoping Jeff is going to chime in here since he's
> the Metaweb staff member in charge of all this stuff.
>
> On Tue, Oct 20, 2009 at 1:32 AM, Paul Mackay wrote:
> > Are you suggesting using an Address type initially, or just a basic
> > text property to store the crossroads info?
>
> I'm suggesting /location/mailing_address which, despite its
> name, is also used for street addresses, as in the case of
> /architecture/structure/address.

I agree -- /location/address would be the type to use here. (As an aside,
it has two keys: /location/address and the older
/location/mailing_address; mailing_address probably shows up more in
queries since, if you ask MQL to return a single key, it picks the oldest
(or something like that).)

> > It would also be useful to store the
> > city too. Will Location have all the relevant elements of
> > the current Address type to store full addresses?
>
> If I had to guess, I'd guess that it'd probably lose the
> postal code and perhaps the levels of place hierarchy, but
> that it would include at least the free text street address
> property and a city/town (or other location) property.

Location will just point to the existing Address type.
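As a rough illustration of the arrangement discussed above (a Location
topic pointing at an Address compound value whose street address is free
text), here is a minimal Python sketch. It is not the Freebase schema
definition: the class names, the field names (street_address, citytown,
postal_code), the garden name, and the city are all assumptions made for
the example. Only the crossroads "Arbutus and 12th" comes from the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    """Stand-in for the Address compound value type (field names assumed)."""
    street_address: str                 # free-form text, so a crossroads fits
    citytown: Optional[str] = None      # containing city or town
    postal_code: Optional[str] = None   # may be absent for a garden plot


@dataclass
class Location:
    """Stand-in for a Location topic that points to an Address value."""
    name: str
    address: Optional[Address] = None


def one_line(location: Location) -> str:
    """Format a location and its address for display, skipping empty fields."""
    if location.address is None:
        return location.name
    parts = [
        location.address.street_address,
        location.address.citytown,
        location.address.postal_code,
    ]
    return location.name + ": " + ", ".join(p for p in parts if p)


# Hypothetical topic: the name and city are placeholders, not real data.
garden = Location(
    name="Example community garden",
    address=Address(street_address="Arbutus and 12th", citytown="Example City"),
)
print(one_line(garden))
```

The point of the sketch is only that nothing in this shape requires a
house number or postal code, so a crossroads plus a city can be stored
without a separate property.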
Jeff

From jeff at metaweb.com Wed Oct 21 20:57:17 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Wed, 21 Oct 2009 13:57:17 -0700
Subject: [Data-modeling] Address refactoring
Message-ID: <057DA903FBBB428C8A3606CFB057CF92@amd>

It's been pointed out (thanks Tom and Brendan!) that the proposal for
refactoring how we handle addresses has yet to be acted upon.

For reference, the original discussions are here:
http://markmail.org/thread/a3embfd2zhwziqy4
http://markmail.org/thread/iroh55rbd3usb4j4

And the JIRA tasks all start here:
https://bugs.freebase.com/browse/DA-808

But to sum up, the proposed plan is this:

1. Add an "address" property to the Location type. Any type that
   represents something that Is-A location would have Location as an
   included type.

2. Types that only have a Has-A relationship to location will not be
   cotyped as Location, and will continue to have properties for their
   addresses; ideally the labels for these addresses would make the
   relationship clearer: "mailing address", "headquarters address", etc.

3. Geobot (the process that assigns geocodes to addresses) will be
   updated to assert geocodes on the base Location topic for all Is-A
   topics (meaning that /location/address will not be co-typed as
   Location in these instances), but will continue to type other Address
   instances as Location and assert geocodes there.

(#3 is a bit of a change from the original plan, but several people
pointed out that it would be weird not to have, for example, the company
headquarters plotted on the maps of instances of the Company type.)

It's entirely possible for locations to have both kinds of addresses, if
they are typed as both /location/location and something that has a Has-A
relationship (in many schemas, we assert identity between a building and
the institution that it houses -- museums or libraries, for example).

But, in order to kick this off, we need consensus on which types to change
into Is-As. There are 14 commons types with a property that expects
/location/address:

/architecture/landscape_project
/architecture/museum
/architecture/structure
/award/hall_of_fame
/business/business_location
/business/shopping_center
/education/educational_institution
/library/public_library
/religion/religious_organization
/medicine/hospital

Of these, I propose that only the following should have an Is-A
relationship, and be cotyped with /location/location:

/architecture/structure (already includes Location)
/business/shopping_center (already includes Location)
/business/business_location

And maybe /architecture/landscape_project; I'm less clear about how this
type functions.

The rest can stay as they are.

Comments?

Jeff

From pauljmackay at gmail.com Thu Oct 22 00:34:34 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Wed, 21 Oct 2009 17:34:34 -0700
Subject: [Data-modeling] Address refactoring
In-Reply-To: <057DA903FBBB428C8A3606CFB057CF92@amd>
References: <057DA903FBBB428C8A3606CFB057CF92@amd>
Message-ID:

Are there any more specific guidelines, perhaps with more examples such as
explaining the museum one, that illustrate how to model Is-A and Has-A
Location relationships? It seems to me that a Community garden (a plot of
land) Is-A Location, but I wonder should a Farmers market (which only is
present at its location once a week during a portion of the year) be a
Location or Has-A Location?
Would the address property added to Location be of the Address type, which
is /location/mailing_address, and is that suitable for the example I gave
of the Community garden type I'm trying to define that may often be
identified by just a crossroads and city?

thanks

paul

From pauljmackay at gmail.com Thu Oct 22 00:37:52 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Wed, 21 Oct 2009 17:37:52 -0700
Subject: [Data-modeling] Address refactoring
In-Reply-To:
References: <057DA903FBBB428C8A3606CFB057CF92@amd>
Message-ID:

> Would the address property added to Location be of the Address type,
> which is /location/mailing_address, and is that suitable for the example
> I gave of the Community garden type I'm trying to define that may often
> be identified by just a crossroads and city?

I think a separate mail answered this point, thanks :)

From kirrily at metaweb.com Thu Oct 22 17:24:27 2009
From: kirrily at metaweb.com (Kirrily Robert)
Date: Thu, 22 Oct 2009 10:24:27 -0700
Subject: [Data-modeling] Address refactoring
In-Reply-To:
References: <057DA903FBBB428C8A3606CFB057CF92@amd>
Message-ID:

On 21/10/2009, at 5:34 PM, Paul Mackay wrote:
> Are there any more specific guidelines, perhaps with more examples
> such as explaining the museum one, that illustrate how to model Is-A
> and Has-A Location relationships? It seems to me that a Community
> garden (a plot of land) Is-A Location, but I wonder should a Farmers
> market (which only is present at its location once a week during a
> portion of the year) be a Location or Has-A Location?

The simple test is: can it move?
A company's HQ or a Farmer's Market can easily move, but a bridge or a
national park is unlikely to. Movable = has-a, unmovable = is-a. (Yes,
there will be rare exceptions, but you get the idea.)

K.

--
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com

From duncan.oliver at gmail.com Fri Oct 23 10:44:42 2009
From: duncan.oliver at gmail.com (Duncan Oliver)
Date: Fri, 23 Oct 2009 05:44:42 -0500
Subject: [Data-modeling] Non-Unique Platform (Video Games)
Message-ID:

Cross-post from Video Game Commons:
http://www.freebase.com/discuss/threads/guid/9202a8c04000641f800000000fae89c7

---
Duncan

From pauljmackay at gmail.com Sun Oct 25 20:19:17 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Sun, 25 Oct 2009 13:19:17 -0700
Subject: [Data-modeling] Address refactoring
In-Reply-To:
References: <057DA903FBBB428C8A3606CFB057CF92@amd>
Message-ID:

I've added a brief note here http://wiki.freebase.com/wiki/Schema to
capture that guideline about Location.

On Thu, Oct 22, 2009 at 10:24 AM, Kirrily Robert wrote:
> The simple test is: can it move? A company's HQ or a Farmer's Market
> can easily move, but a bridge or a national park is unlikely to.
> Movable = has-a, unmovable = is-a. (Yes, there will be rare
> exceptions, but you get the idea.)

From pauljmackay at gmail.com Mon Oct 26 02:51:47 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Sun, 25 Oct 2009 19:51:47 -0700
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To:
References:
Message-ID:

Hi Iain,

Thanks very much for this feedback, it was very useful! I've made all the
changes as suggested.

I was wondering, could there be an app that checks for obvious things such
as missing descriptions? Sort of a lint for Bases?

paul

On Sat, Oct 10, 2009 at 12:30 PM, Iain Sproat wrote:
> On Sat, Oct 10, 2009 at 10:37 PM, Paul Mackay wrote:
> > Hi,
> >
> > I would welcome any comments on the Seafood base I started:
> >
> > http://seafood.freebase.com/
>
> Nice.
>
> > The schema is reasonable at this point in terms of modelling what I
> > hoped to capture, but it really needs good data sources to populate it.
>
> As you requested some comments, I've done so! Mostly you're missing a
> description for each type.
>
> /base/seafood/fishing_method
> --type description missing
> --consider reciprocating incoming links (from /base/seafood/fishery &
>   /base/seafood/seafood_rating)
>
> /base/seafood/fishery
> --type description missing
> --reciprocate incoming link from /base/seafood/seafood
>
> /base/seafood/fishery_location
> --type description missing
> --include the /location/location type
>
> /base/seafood/seafood_sustainability_category
> --type description missing (is there an authority/standards agency
>   which defines these categories?)
> --reciprocate /base/seafood/seafood_rating
>
> /base/seafood/seafood_rating
> --type description missing
> --description for property /base/seafood/seafood_rating/how_caught is
>   missing
>
> /base/seafood/certification_program
> --type description missing
> --/base/seafood/certification_program/organization -> the expected
>   types should be more specific. Instead of expecting
>   /organization/organization, rather expect
>   /base/seafood/certifying_organization. The certifying organization
>   can be applied to an organization or a company topic, so you don't
>   need the duplicate property
>   /base/seafood/certification_program/company
> --/base/seafood/certification_program/applies_to -> again, the
>   expected type could be more specific. Say having a new seafood
>   certification region type
>   (/base/seafood/seafood_certification_organization).
>
> /base/seafood/certification_partner
> --type description missing
> --includes /location/location? surely this is incorrect - I'd expect
>   a partner to be a company or organization, not a location?
>
> /base/seafood/contaminant
> --type description missing
>
> /base/seafood/seafood_guide
> --type description missing
> --should include /book/written_work from the commons.
> --/base/seafood/seafood_guide/published is not necessary if you
>   include the written work type.
> --/base/seafood/seafood_guide/organization. The expected type could
>   be more specific, e.g. seafood guide author
>   (/base/seafood/seafood_guide_author).
>
> /base/seafood/seafood
> --type description missing
> --should include the /food/food type
>
> Great effort, and I hope this is of help!
>
> Iain
From narphorium at gmail.com Mon Oct 26 03:17:16 2009
From: narphorium at gmail.com (Shawn Simister)
Date: Sun, 25 Oct 2009 23:17:16 -0400
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To:
References:
Message-ID: <4AE514BC.9000802@gmail.com>

Phil's excellent SchemaViz app has a feature where it highlights
undocumented types in red. It would be neat though to have an app to test
whether a base meets certain criteria for promotion to the commons .. if
that's even possible.

Shawn

From pauljmackay at gmail.com Mon Oct 26 06:12:12 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Sun, 25 Oct 2009 23:12:12 -0700
Subject: [Data-modeling] Adding Location information to Address objects
Message-ID:

If a topic has an Address property this is a compound type. Is it possible
to add Location geo-data to this object? I cannot see how to edit the
Location properties that would be also included in the Address CVT topic.

If such data were added would it enable seeing an exact view of the
location of the address? I'm thinking of a Business Location for example
that would otherwise just show to the level of the city.

thanks

paul

From faye at metaweb.com Mon Oct 26 07:41:59 2009
From: faye at metaweb.com (Faye Harris)
Date: Mon, 26 Oct 2009 00:41:59 -0700 (PDT)
Subject: [Data-modeling] Adding Location information to Address objects
In-Reply-To: <758500981.199421256542861066.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID: <685286595.199441256542919650.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>

For displaying a business location on a map, you can fill out the
geolocation property on type /location/location with its coordinates
(long/lat).
That should be sufficient for your purposes.

For displaying or querying against more complex geodata, such as the shape
representing a city, you'll want to upload its geodata as a blob and link
it to the geometry property of type /location/location.

An example of the latter can be seen using the Explorer2 view on a location
topic with this data, such as San Francisco. Look for the value of
/location/location/geometry on this page:

http://www.freebase.com/tools/explore2/en/san_francisco

The guid, /guid/9202a8c04000641f8000000008056a3b, links to the blob that
stores the city's geodata as a MultiPolygon, whose values can be examined
here:

http://freebase.com/api/trans/raw/guid/9202a8c04000641f8000000008056a3b

--
Faye
From philip-freebase at shadowmagic.org.uk Mon Oct 26 08:15:32 2009
From: philip-freebase at shadowmagic.org.uk (Philip Kendall)
Date: Mon, 26 Oct 2009 08:15:32 +0000
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To:
References:
Message-ID: <20091026081532.GG31466@sphinx.int.mythic-beasts.com>

On Sun, Oct 25, 2009 at 07:51:47PM -0700, Paul Mackay wrote:
>
> I was wondering, could there be an app that checks for obvious things such
> as missing descriptions? Sort of a lint for Bases?

http://undocumented.pak21.user.dev.freebaseapps.com/ checks for Commons
types without descriptions. Currently a tool for hitting Commons admins
over the head with, but it would be trivial to modify for other purposes.

http://undoc.vtalwar.user.dev.freebaseapps.com/properties lists (and lets
you document) undocumented properties on types in domains you administer.

Cheers,

Phil

--
Philip Kendall
http://www.shadowmagic.org.uk/

From philip-freebase at shadowmagic.org.uk Mon Oct 26 08:33:49 2009
From: philip-freebase at shadowmagic.org.uk (Philip Kendall)
Date: Mon, 26 Oct 2009 08:33:49 +0000
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To: <4AE514BC.9000802@gmail.com>
References: <4AE514BC.9000802@gmail.com>
Message-ID: <20091026083349.GH31466@sphinx.int.mythic-beasts.com>

On Sun, Oct 25, 2009 at 11:17:16PM -0400, Shawn Simister wrote:
> It would be neat
> though to have an app to test whether a base meets certain criteria for
> promotion to the commons .. if that's even possible.

What criteria are we thinking of here?

* All types must be documented
* All properties must be documented
* All included types must be from this base or the Commons
* All expected types must be from this base or the Commons
* All reversed properties must be from this base or the Commons

Anything else?

Cheers,

Phil

--
Philip Kendall
http://www.shadowmagic.org.uk/

From iainsproat at gmail.com Mon Oct 26 08:35:34 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Mon, 26 Oct 2009 12:35:34 +0400
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To: <4AE514BC.9000802@gmail.com>
References: <4AE514BC.9000802@gmail.com>
Message-ID:

On Mon, Oct 26, 2009 at 7:17 AM, Shawn Simister wrote:
> certain criteria for promotion to the commons

I dumped some thoughts on criteria onto the wiki -
http://wiki.freebase.com/wiki/Schema#Commons_Criteria

Please feel free to edit it and add your thoughts.

Iain

From iainsproat at gmail.com Mon Oct 26 09:29:03 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Mon, 26 Oct 2009 13:29:03 +0400
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To: <20091026083349.GH31466@sphinx.int.mythic-beasts.com>
References: <4AE514BC.9000802@gmail.com> <20091026083349.GH31466@sphinx.int.mythic-beasts.com>
Message-ID:

On Mon, Oct 26, 2009 at 12:33 PM, Philip Kendall wrote:
> * All types must be documented
> * All properties must be documented
> * All included types must be from this base or the Commons
> * All expected types must be from this base or the Commons
> * All reversed properties must be from this base or the Commons
>
> Anything else?

* expected types to be specific where possible, i.e. expect a pilot, not
  a person, to fly a plane (but pilot includes person).
* reciprocate properties where appropriate.
* Use CVT and enumerations appropriately (No CVT within CVT,
  enumerations only where list is well defined)
* follow naming conventions
* be, or delegate, someone responsible for maintaining the commons,
  answering queries on your schema, reviewing suggestions to the schema,
  warning others of changes to the schema and gardening data.
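The mechanically checkable criteria quoted above (documented types and
properties; included types, expected types, and reversed properties drawn
only from this base or the Commons) could be sketched as a small lint pass
like the one below. This is an illustration under stated assumptions, not
an existing Freebase tool: the dict layout of the schema, the key names
(description, included_types, expected_type, reverse_property), the rule
that anything outside /base/ and /user/ counts as Commons, and the
surveyed_by property are all hypothetical.

```python
from typing import Dict, List


def is_base_or_commons(type_id: str, base_prefix: str) -> bool:
    """True if the id is in this base, or outside every /base/ and /user/ domain.

    Assumption: ids that start with neither /base/ nor /user/ are Commons ids.
    """
    if type_id.startswith(base_prefix):
        return True
    return not type_id.startswith(("/base/", "/user/"))


def lint_schema(schema: Dict[str, dict], base_prefix: str) -> List[str]:
    """Return one message per violated criterion, in a stable order."""
    problems: List[str] = []
    for type_id, type_def in sorted(schema.items()):
        if not type_def.get("description", "").strip():
            problems.append(f"{type_id}: type description missing")
        for included in type_def.get("included_types", []):
            if not is_base_or_commons(included, base_prefix):
                problems.append(
                    f"{type_id}: included type {included} is outside this base and the Commons"
                )
        for prop_name, prop in sorted(type_def.get("properties", {}).items()):
            prop_id = f"{type_id}/{prop_name}"
            if not prop.get("description", "").strip():
                problems.append(f"{prop_id}: property description missing")
            # Expected types and reversed properties share the same origin rule.
            for key in ("expected_type", "reverse_property"):
                target = prop.get(key)
                if target and not is_base_or_commons(target, base_prefix):
                    problems.append(
                        f"{prop_id}: {key} {target} is outside this base and the Commons"
                    )
    return problems


# Hypothetical schema fragment; the type id appears in the thread, the
# properties and descriptions are invented for the example.
example = {
    "/base/seafood/fishery_location": {
        "description": "",
        "included_types": ["/location/location"],
        "properties": {
            "fishery": {
                "description": "Fishery operating at this location.",
                "expected_type": "/base/seafood/fishery",
            },
            "surveyed_by": {
                "description": "",
                "expected_type": "/user/example/default_domain/surveyor",
            },
        },
    },
}

problems = lint_schema(example, "/base/seafood/")
for message in problems:
    print(message)
```

The remaining criteria in the list (naming conventions, sensible use of
CVTs, having a maintainer) need human judgment and are left out of the
sketch.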
Iain

From narphorium at gmail.com Mon Oct 26 17:45:02 2009
From: narphorium at gmail.com (Shawn Simister)
Date: Mon, 26 Oct 2009 13:45:02 -0400
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To:
References: <4AE514BC.9000802@gmail.com> <20091026083349.GH31466@sphinx.int.mythic-beasts.com>
Message-ID: <4AE5E01E.7060609@gmail.com>

Iain Sproat wrote:
> On Mon, Oct 26, 2009 at 12:33 PM, Philip Kendall wrote:
>
>> * All types must be documented
>> * All properties must be documented
>> * All included types must be from this base or the Commons
>> * All expected types must be from this base or the Commons
>> * All reversed properties must be from this base or the Commons
>>
>> Anything else?
>
> * expected types to be specific where possible, i.e. expect a pilot, not
>   a person, to fly a plane (but pilot includes person).
> * reciprocate properties where appropriate.
> * Use CVT and enumerations appropriately (No CVT within CVT,
>   enumerations only where list is well defined)
> * follow naming conventions
> * be, or delegate, someone responsible for maintaining the commons,
>   answering queries on your schema, reviewing suggestions to the schema,
>   warning others of changes to the schema and gardening data.

These all look good to me. Although I think in some cases a base could be
promoted to the commons without a commitment to maintain it, as long as it
represents significant value to the community. We might also consider a
minimum number of facts or instances to make sure the schema works with
real-world data.

Shawn

From jeff at metaweb.com Mon Oct 26 18:45:43 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Mon, 26 Oct 2009 11:45:43 -0700
Subject: [Data-modeling] Feedback on Seafood Base
In-Reply-To: <20091026083349.GH31466@sphinx.int.mythic-beasts.com>
References: <4AE514BC.9000802@gmail.com> <20091026083349.GH31466@sphinx.int.mythic-beasts.com>
Message-ID:

> -----Original Message-----
> From: data-modeling-bounces at freebase.com
> [mailto:data-modeling-bounces at freebase.com] On Behalf Of
> Philip Kendall
> Sent: Monday, October 26, 2009 1:34 AM
> To: data-modeling at freebase.com
> Subject: Re: [Data-modeling] Feedback on Seafood Base
>
> On Sun, Oct 25, 2009 at 11:17:16PM -0400, Shawn Simister wrote:
> > It would be neat
> > though to have a app to test whether a base meets certain
> > criteria for promotion to the commons .. if that's even possible.
>
> What criteria are we thinking of here?
>
> * All types must be documented
> * All properties must be documented
> * All included types must be from this base or the Commons
> * All expected types must be from this base or the Commons
> * All reversed properties must be from this base or the Commons

If we're talking about things that can be detected by an app, I would add
that properties expecting a date or number should typically be unique.
(I've added this to the wiki, too.)

Jeff

From iainsproat at gmail.com Mon Oct 26 18:51:31 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Mon, 26 Oct 2009 22:51:31 +0400
Subject: [Data-modeling] Help needed with voting on deleting/merging topics
Message-ID:

If anyone has some spare time today, we could do with your help with
voting on delete/merge tasks. Please head over to
http://www.freebase.com/tools/pipeline/showtask and contribute some
clicks.

There are loads of topics waiting to be voted on.
They were flagged for deletion in bulk, and the flagging has been a bit too
keen, so there is now a bit of a backlog to be worked through.

Some of the topics for deletion are lists - the names of which fit the
pattern "National Register of Historic Places in {location}" and
"{country} films of {year}".

And others are topics about topics e.g. "History of x", "Economy of x",
"Politics of x", "Communications in x" etc. - the data in these really
belongs on the topic x itself, but they can be dealt with later.

Hopefully a bit of crowdsourcing will get the backlog down to usual levels,
so please help out at http://www.freebase.com/tools/pipeline/showtask

Thanks,

Iain

From jeff at metaweb.com Mon Oct 26 18:55:29 2009
From: jeff at metaweb.com (Jeff Prucher)
Date: Mon, 26 Oct 2009 11:55:29 -0700
Subject: [Data-modeling] Adding Location information to Address objects
In-Reply-To:
References:
Message-ID:

The address CVT is cotyped as Location automatically (there's a process
that does this once a day or so), and, for US addresses, geolocation is
added at that time. These values can be edited directly, but it's a bit
tricky: from the edit view of a topic, click the "explore mode" link at the
bottom of the page. In the explore view, look for the appropriate "address"
property in the "outgoing properties" section. Click the guid that appears
next to it. That will take you to the explore view of the address CVT
object. To view the address in edit mode, you can either press F8 (which
will display a new toolbar at the bottom of the screen, on which you can
click "normal view"), or edit the URL and replace /tools/explore/ with
/edit/topic.

Jeff

_____

From: data-modeling-bounces at freebase.com
[mailto:data-modeling-bounces at freebase.com] On Behalf Of Paul Mackay
Sent: Sunday, October 25, 2009 11:12 PM
To: Freebase data modeling mailing list
Subject: [Data-modeling] Adding Location information to Address objects

If a topic has an Address property this is a compound type.
Is it possible to add Location geo-data to this object? I cannot see how to edit the Location properties that would be also included in the Address CVT topic. If such data were added would it enable seeing an exact view of the location of the address? I'm thinking of a Business Location for example that would otherwise just show to the level of the city. thanks paul -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091026/4890bcfa/attachment-0001.htm From gordon at metaweb.com Mon Oct 26 19:20:47 2009 From: gordon at metaweb.com (Gordon Mackenzie) Date: Mon, 26 Oct 2009 12:20:47 -0700 Subject: [Data-modeling] Help needed with voting on deleting/merging topics In-Reply-To: References: Message-ID: Well, Brian Karlak has already mentioned he could do this without involving the review queue (as of today, 4881 tasks!). He's volunteered to do some selective mass deletes if we could come up with some easy to run MQL queries to return a list of these patterns. We could get rid of potentially 1000s of tasks, leaving the bulk to be airport merges ;) If someone who is interested and has a little time and some expertise to craft some well defined queries to list just those topics in the third and fourth paragraphs, I can open a data team task in JIRA, and we can reduce the queue, maybe tonight. ~ Gordon <<< gordon at metaweb.com >>> On Oct 26, 2009, at 11:51 AM, Iain Sproat wrote: > If anyone has some spare time today, we could do with your help with > voting on delete/merge tasks. Please head over to > http://www.freebase.com/tools/pipeline/showtask and contribute some > clicks. > > There are loads of topics waiting to be voted on. They were flagged > for delete in bulk and been a bit too keen, so there is now a bit of > backlog to be worked through. 
>
> Some of the topics for deletion are lists - the names of which fit the
> pattern "National Register of Historic Places in {location}" and
> "{country} films of {year}".
>
> And others are topics about topics e.g. "History of x", "Economy of
> x", "Politics of x", "Communications in x" etc. - the data in these
> really belongs on the topic x itself, but they can be dealt with
> later.
>
> Hopefully a bit of crowdsourcing will get the backlog down to usual
> levels, so please help out at
> http://www.freebase.com/tools/pipeline/showtask
>
> Thanks,
>
> Iain

From iainsproat at gmail.com Mon Oct 26 19:49:52 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Mon, 26 Oct 2009 23:49:52 +0400
Subject: [Data-modeling] Help needed with voting on deleting/merging topics
In-Reply-To:
References:
Message-ID:

On Mon, Oct 26, 2009 at 11:20 PM, Gordon Mackenzie wrote:
> If someone who is interested and has a little time and some expertise
> to craft some well defined queries to list just those topics in the
> third and fourth paragraphs, I can open a data team task in JIRA, and
> we can reduce the queue, maybe tonight.

I put my thinking cap on and came up with the following:

[{
  "id":null,
  "/type/reflect/any_value":[{
    "link":"/pipeline/delete_task/delete_guid",
    "value":null
  }],
  "/pipeline/task/status":"open",
  "name~=":"Economy of"
}]

I think that works?
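
[Editorial sketch] The query above covers only the "Economy of" pattern. Assuming the same query shape is acceptable for the other patterns named in this thread, one query per pattern could be generated as below. The function names are hypothetical, "films of" is only a fragment of the "{country} films of {year}" pattern, and whether `name~=` matches these phrases as intended is untested here, as it is in the message itself.

```python
import json

# Name fragments taken from the patterns mentioned in this thread.
NAME_PATTERNS = [
    "History of",
    "Economy of",
    "Politics of",
    "Communications in",
    "National Register of Historic Places in",
    "films of",  # fragment of "{country} films of {year}" (assumption)
]


def build_delete_task_query(name_pattern):
    """Return the query from the message above with the name pattern swapped in."""
    return [{
        "id": None,
        "/type/reflect/any_value": [{
            "link": "/pipeline/delete_task/delete_guid",
            "value": None,
        }],
        "/pipeline/task/status": "open",
        "name~=": name_pattern,
    }]


def build_all_queries(patterns=NAME_PATTERNS):
    """Map each name pattern to its query serialized as a JSON string."""
    return {p: json.dumps(build_delete_task_query(p)) for p in patterns}


if __name__ == "__main__":
    for pattern, query in build_all_queries().items():
        print(pattern)
        print(query)
```

This only builds the query text; submitting it is left out, since the thread does not say how the data team would run it.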
Iain

From iainsproat at gmail.com Mon Oct 26 20:25:39 2009
From: iainsproat at gmail.com (Iain Sproat)
Date: Tue, 27 Oct 2009 00:25:39 +0400
Subject: [Data-modeling] Help needed with voting on deleting/merging topics
In-Reply-To:
References:
Message-ID:

I've started a wiki page for some help/guidelines on using the Review Queue
http://wiki.freebase.com/wiki/Review_Queue

Iain

On Mon, Oct 26, 2009 at 11:49 PM, Iain Sproat wrote:
> On Mon, Oct 26, 2009 at 11:20 PM, Gordon Mackenzie wrote:
>> If someone who is interested and has a little time and some expertise
>> to craft some well defined queries to list just those topics in the
>> third and fourth paragraphs, I can open a data team task in JIRA, and
>> we can reduce the queue, maybe tonight.
>
> I put my thinking cap on and came up with the following:
>
> [{
>   "id":null,
>   "/type/reflect/any_value":[{
>     "link":"/pipeline/delete_task/delete_guid",
>     "value":null
>   }],
>   "/pipeline/task/status":"open",
>   "name~=":"Economy of"
> }]
>
> I think that works?
>
> Iain
>

From pauljmackay at gmail.com Tue Oct 27 03:39:17 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Mon, 26 Oct 2009 20:39:17 -0700
Subject: [Data-modeling] Adding Location information to Address objects
In-Reply-To:
References:
Message-ID:

Thanks Jeff, that's a great tip! I've copied it into
http://wiki.freebase.com/wiki/Editing_topics

Is there a list of bots that run over Freebase anywhere? I'm not quite sure
if the geolocation data could be automatically added, but other Location
fields such as Contained by could be filled with the city field of the
Address I would think...

On Mon, Oct 26, 2009 at 11:55 AM, Jeff Prucher wrote:
> The address CVT is cotyped as Location automatically (there's a process
> that does this once a day or so), and, for US addresses, geolocation is
> added at that time.
These values can be edited directly, but it's a bit > tricky: from the edit view of a topic, click the "explore mode" link at the > bottom of the page. In the explore view, look for the appropriate "address" > property in the "outgoing properties" section. Click the guid that appears > next to it. That will take you to the explore view of the address CVT > object. To view the address in edit mode, you can either press F8 (which > will display a new toolbar at the bottom of the screen, on which you can > click "normal view"), or edit the URL and replace /tools/explore/ with > /edit/topic. > > Jeff > > ------------------------------ > *From:* data-modeling-bounces at freebase.com [mailto: > data-modeling-bounces at freebase.com] *On Behalf Of *Paul Mackay > *Sent:* Sunday, October 25, 2009 11:12 PM > > *To:* Freebase data modeling mailing list > *Subject:* [Data-modeling] Adding Location information to Address objects > > If a topic has an Address property this is a compound type. Is it possible > to add Location geo-data to this object? I cannot see how to edit the > Location properties that would be also included in the Address CVT topic. > > If such data were added would it enable seeing an exact view of the > location of the address? I'm thinking of a Business Location for example > that would otherwise just show to the level of the city. > > thanks > > paul > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091026/0f85c8a9/attachment.htm

From pauljmackay at gmail.com Tue Oct 27 03:44:03 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Mon, 26 Oct 2009 20:44:03 -0700
Subject: [Data-modeling] Adding Location information to Address objects
In-Reply-To: <685286595.199441256542919650.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
References: <758500981.199421256542861066.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <685286595.199441256542919650.JavaMail.root@zimbra01.corp.sjc1.metaweb.com>
Message-ID:

Hi Faye,

Is there a view where the geometry data is actually shown? I couldn't see
this on the Normal page view.

thanks

paul

On Mon, Oct 26, 2009 at 12:41 AM, Faye Harris wrote:
> For displaying a business location on a map, you can fill out the
> geolocation property on type /location/location with its coordinates
> (long/lat). That should be sufficient for your purposes. For displaying or
> querying against more complex geodata, such as the shape representing a
> city, you'll want to upload its geodata as a blob and link to the geometry
> property of type /location/location.
>
> An example of the latter can be seen using the Explorer2 view on a location
> topic with this data such as San Francisco. Look for the value of
> /location/location/geometry on this page:
> http://www.freebase.com/tools/explore2/en/san_francisco
>
> The guid, /guid/9202a8c04000641f8000000008056a3b, links to the blob that
> stores the city's geodata as a MultiPolygon, whose values can be examined
> here:
> http://freebase.com/api/trans/raw/guid/9202a8c04000641f8000000008056a3b
>
> -- Faye
>
>
> ----- Original Message -----
> From: "Paul Mackay"
> To: "Freebase data modeling mailing list"
> Sent: Sunday, October 25, 2009 11:12:12 PM GMT -08:00 US/Canada Pacific
> Subject: [Data-modeling] Adding Location information to Address objects
>
>
> If a topic has an Address property this is a compound type.
Is it possible > to add Location geo-data to this object? I cannot see how to edit the > Location properties that would be also included in the Address CVT topic. > > If such data were added would it enable seeing an exact view of the > location of the address? I'm thinking of a Business Location for example > that would otherwise just show to the level of the city. > > thanks > > paul > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091026/677f8c2d/attachment.htm From jlowe at giswebsite.com Tue Oct 27 05:23:11 2009 From: jlowe at giswebsite.com (Jonathan W. Lowe) Date: Tue, 27 Oct 2009 05:23:11 +0000 Subject: [Data-modeling] Adding Location information to Address objects In-Reply-To: References: <758500981.199421256542861066.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> <685286595.199441256542919650.JavaMail.root@zimbra01.corp.sjc1.metaweb.com> Message-ID: <1256620991.2327.6.camel@brazil> Paul, If you copy the geometry from a topic in geojson format, you can then paste it into this simple little application which draws it in context: http://openlayers.org/dev/examples/vector-formats.html For instance, some of the US Census topics include detailed multi-polygon geometries. This one, for instance... 
http://www.freebase.com/tools/explore/en/us_census_2000_national_geometry/ca/001/400100 ...has a the following geojson geometry: {"type":"MultiPolygon","coordinates":[[[[-122.24421,37.86322],[-122.244242,37.86374],[-122.244307,37.86478],[-122.244379,37.865925],[-122.244279,37.866725],[-122.244579,37.867525],[-122.244479,37.867925],[-122.244679,37.868725],[-122.244679,37.869425],[-122.244692,37.869757],[-122.244779,37.871925],[-122.244879,37.872625],[-122.244933,37.87313],[-122.245179,37.875425],[-122.245279,37.876625],[-122.245579,37.879225],[-122.245679,37.880425],[-122.245679,37.880925],[-122.245779,37.881925],[-122.245879,37.882825],[-122.245979,37.884325],[-122.245979,37.884725],[-122.245479,37.884325],[-122.244079,37.883225],[-122.241979,37.881925],[-122.241379,37.882125],[-122.239879,37.882925],[-122.238679,37.883325],[-122.236779,37.882825],[-122.235679,37.882525],[-122.234379,37.882225],[-122.230979,37.881325],[-122.225779,37.879225],[-122.223879,37.878325],[-122.221378,37.875325],[-122.217378,37.871725],[-122.216278,37.868825],[-122.217078,37.868425],[-122.217778,37.867925],[-122.219178,37.867225],[-122.221479,37.865025],[-122.220378,37.864425],[-122.213778,37.858026],[-122.213578,37.857926],[-122.213178,37.857626],[-122.212278,37.856726],[-122.213778,37.855926],[-122.214078,37.855526],[-122.214478,37.856026],[-122.214978,37.856026],[-122.216278,37.855626],[-122.216278,37.856226],[-122.216278,37.856326],[-122.217478,37.856226],[-122.21761,37.856688],[-122.217678,37.856926],[-122.219578,37.857126],[-122.219779,37.858334],[-122.219878,37.858926],[-122.221679,37.859826],[-122.222879,37.858026],[-122.223179,37.857126],[-122.222179,37.855326],[-122.223179,37.854626],[-122.223779,37.854726],[-122.223479,37.853626],[-122.224279,37.852826],[-122.223679,37.851826],[-122.224579,37.851226],[-122.224279,37.850526],[-122.224752,37.850368],[-122.224879,37.850326],[-122.225379,37.849826],[-122.226379,37.850026],[-122.226679,37.849826],[-122.227179,37.849026],[-122.229879,
37.849326],[-122.232279,37.851126],[-122.233679,37.852126],[-122.234179,37.852526],[-122.234279,37.853626],[-122.234479,37.855326],[-122.234579,37.856326],[-122.234679,37.857026],[-122.234879,37.857426],[-122.237679,37.857226],[-122.238579,37.857326],[-122.239179,37.857226],[-122.240179,37.857226],[-122.241656,37.857312],[-122.241979,37.857526],[-122.242225,37.857697],[-122.242479,37.857826],[-122.243379,37.858826],[-122.243879,37.858626],[-122.243779,37.859126],[-122.243921,37.860191],[-122.243979,37.860626],[-122.243979,37.861026],[-122.244179,37.862726],[-122.24421,37.86322]]]]} ...which, when pasted into the openlayers application reveals that this census tract is on the Eastern border of Berkeley and Oakland, California. Hope this helps, Jonathan On Mon, 2009-10-26 at 20:44 -0700, Paul Mackay wrote: > Hi Faye, > > Is there a view where the geometry data is actually shown? I couldnt > see this on the Normal page view. > > thanks > > paul > > On Mon, Oct 26, 2009 at 12:41 AM, Faye Harris > wrote: > For displaying a business location on a map, you can fill out > the geolocation property on type /location/location with its > coordinates (long/lat). That should be sufficient for your > purposes. For displaying or querying against more complex > geodata, such as the shape representing a city, you'll want to > upload its geodata as a blob and link to the geometry property > of type /location/location. > > An example of the latter can be seen using the Explorer2 view > on a location topic with this data such as San Francisco. 
Look > for the value of /location/location/geometry on this page: > http://www.freebase.com/tools/explore2/en/san_francisco > > The guid, /guid/9202a8c04000641f8000000008056a3b, links to the > blob that stores the city's geodata as a MultiPolygon, whose > values can be examined here: > http://freebase.com/api/trans/raw/guid/9202a8c04000641f8000000008056a3b > > -- Faye > > > > ----- Original Message ----- > From: "Paul Mackay" > To: "Freebase data modeling mailing list" > > Sent: Sunday, October 25, 2009 11:12:12 PM GMT -08:00 > US/Canada Pacific > Subject: [Data-modeling] Adding Location information to > Address objects > > > If a topic has an Address property this is a compound type. Is > it possible to add Location geo-data to this object? I cannot > see how to edit the Location properties that would be also > included in the Address CVT topic. > > If such data were added would it enable seeing an exact view > of the location of the address? I'm thinking of a Business > Location for example that would otherwise just show to the > level of the city. 
> > thanks > > paul > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From zenkat at metaweb.com Wed Oct 28 04:54:35 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Tue, 27 Oct 2009 21:54:35 -0700 Subject: [Data-modeling] [Freebase-experts] open library editions test load In-Reply-To: <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com> <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> Message-ID: Hello All -- The third and final test load of Open Library Editions has been completed. This load consists of 100K /book/book_editions and associated data for the Open Library titles we loaded over the summer. The editions can be found under attribution: http://www.freebase.com/tools/explore/user/book_bot/attr/32 Because there are so many topics in this load, it's hard to page through or query them all. If you're interested in seeing a sample, you may prefer the QA sampling view we've put together: http://book-edition-qa.vtalwar.user.dev.freebaseapps.com/queue?experts=1 This load follows all of the rules previously discussed in this thread, with one modification: no subtitle/title parsing is done on the edition's title before it becomes the /type/object/name property of the edition. Otherwise, everything is the same. As always, your feedback is appreciated. Please let us know if you have any questions or comments. We'll start the complete load of ~1.2M editions as soon as we've finished reviewing this final test load. Thanks! 
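
[Editorial sketch] The subject and format blacklists discussed in this thread could be applied to incoming records roughly as follows. This is a minimal sketch under stated assumptions, not the actual loader: the record layout (a dict with `subjects` and `format` keys) and the function names are hypothetical, matching is assumed to be case-insensitive, and "Calendar[s]" is read as "Calendar" or "Calendars". The blacklist values, including the spelling "Stationary", are copied from the Sep 21 message quoted in this thread.

```python
# Blacklist values as listed in the thread, lowercased for comparison.
SUBJECT_BLACKLIST = {
    "computer software packages",
    "gifts",
    "novelty",
    "blank books/journals",
    "calendar",
    "calendars",
}
FORMAT_BLACKLIST = {"calendar", "stationary"}


def is_blacklisted(record):
    """True if any subject or the format of a (hypothetical) record is blacklisted."""
    subjects = {s.strip().lower() for s in record.get("subjects", [])}
    fmt = (record.get("format") or "").strip().lower()
    return bool(subjects & SUBJECT_BLACKLIST) or fmt in FORMAT_BLACKLIST


def split_records(records):
    """Split records into (to_load, to_skip) according to the blacklists."""
    to_load, to_skip = [], []
    for record in records:
        (to_skip if is_blacklisted(record) else to_load).append(record)
    return to_load, to_skip
```

Records in the skipped list would be the ones whose books the thread proposes marking for deletion; that step is not shown.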
Brian On Oct 1, 2009, at 3:26 PM, Brian Karlak wrote: > Hello All -- > > The second test load of 10K Open Library Editions has been > completed. This load implements the blacklists described in the > original email below, along with other miscellaneous cleanup. > > For those of you who are interested in looking at the data, we've > created a data game to help with paging through a random sample of > the load: > > http://book-edition-qa.vtalwar.user.dev.freebaseapps.com/queue?experts=1 > > When reviewing records, you may want to check: > > Did the edition attach to the correct book? > If the edition is marked as "existed", did we match the correct > Freebase edition? > Is the edition data from Open Library correct, according to the LOC > or Amazon? > > If you feel that any of the data is incorrect, you can click "No" to > mark the record as suspect. Notes on why you clicked "No" (or > "Yes", for that matter) are most appreciated -- you can type them in > the box under the buttons. Of course, feel free to bring issues up > for discussion on this list! > > Finally, please note that this load is only of edition information: > we did not attempt to fix the /book/book and /book/author topics > they are attach to. This means that the known problems with the > earlier Open Library loads (missing first word of titles, > punctuation in author names, etc) are not addressed. Once we have > all of the editions, we'll work on this cleanup. > > As always, your feedback is appreciated! > > Thanks, > Brian > > PS -- This load was run under attribution node http://www.freebase.com/tools/explore/user/book_bot/attr/31 > if you want to query for the data. > > > On Sep 21, 2009, at 1:40 PM, Brian Karlak wrote: > >> Thanks for your feedback on this test load -- it is definitely >> appreciated! >> >> Based upon your feedback, we're going to make the following changes >> to the load process: >> >> 1) Create a blacklist of subjects. 
Do not load editions for
>> matching OL entries, and mark the books for deletion:
>>
>> Computer Software Packages
>> Gifts
>> Novelty
>> Blank Books/Journals
>> Calendar[s] *
>>
>> 2) Create a blacklist of formats. Again, do not load / mark for
>> delete:
>>
>> Calendar
>> Stationary
>>
>> 3) Offline gardening task: Delete everything by "Pet Prints, Inc.".
>>
>> 4) Offline gardening task: delete trailing punctuation from Open
>> Library author names.
>>
>> Interesting, but probably not implementable in the next 10K load:
>>
>> 5) Search for books with >50% non-English words that are not marked
>> as being for a foreign language, delete from load.
>>
>> A test load of 10K editions should be coming down the pike
>> shortly. We'll let you know before we start the load.
>>
>> Brian
>>
>> On Sep 17, 2009, at 5:31 PM, Brian Karlak wrote:
>>
>>> Hello All --
>>>
>>> We hinted a few weeks back that we're trying a new process for
>>> loading massive data sets. Instead of doing a single huge data load
>>> (and letting everyone know about it afterwards :-), we're doing
>>> incrementally larger loads, systematically QA'ing them, and
>>> notifying the general Freebase community after each load so that we
>>> can get feedback on potential problems.
>>>
>>> So ...
>>>
>>> We've just loaded the first test set of 144 OpenLibrary editions to
>>> sandbox. This test set came from sampling 1000 editions from the
>>> entire OL corpus, and taking those that matched titles and authors
>>> from books in our July OpenLibrary book load. This is the first
>>> tiny dribble of what we hope will ultimately be a load of 2.5M
>>> editions.
>>>
>>> The ISBN nodes for these books can be found in the "Links Created"
>>> section of this page:
>>>
>>> http://www.sandbox-freebase.com/tools/explore/user/book_bot/attr/27?limit=200
>>>
>>> (Yes, we know it's not the most beautiful display ... we're
>>> looking at prettier ways of showing this.
However, once you click one of the / >>> soft/isbn/ keys, you'll be back in Freebase proper, on an ISBN node. >>> From this node, you can follow back to book edition, and then book.) >>> >>> Our initial QC of this load has already pointed out some areas for >>> improvement. For instance, Open Library contains some non-book >>> items >>> like blank journals, audio cassettes and art prints. Future loads >>> will filter these out by creating a blacklist of forbidden subjects >>> and formats ("stationary", "audio cassette", "gift", etc.) We'll >>> also >>> probably end up deleting the books in these categories as well. >>> >>> Any feedback you might have is definitely appreciated! >>> >>> Thanks, >>> Brian >>> >>> >>> _______________________________________________ >>> Freebase-experts mailing list >>> Freebase-experts at freebase.com >>> http://lists.freebase.com/mailman/listinfo/freebase-experts >> >> _______________________________________________ >> Freebase-experts mailing list >> Freebase-experts at freebase.com >> http://lists.freebase.com/mailman/listinfo/freebase-experts > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091027/bc8a464d/attachment.htm From zenkat at metaweb.com Wed Oct 28 21:57:25 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Wed, 28 Oct 2009 14:57:25 -0700 Subject: [Data-modeling] [Freebase-experts] open library editions test load In-Reply-To: References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com> <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> Message-ID: <31D87D34-AE6C-489C-9298-8FE99201264F@metaweb.com> Hi Tom -- Thanks for the feedback! > Love the new title format. It's much easier to disambiguate with the > subtitles included. Yes, definitely. It should also help with fixing the main /book/book titles once we've finished the edition load. 
> I presume the ground rules are the same as before with the previously > loaded Books keeping their as-loaded titles until some later stage (so > they won't match the Book Editions for now). Yes. We really need to get clean editions hooked up before we can hope to clean up the books -- which should be happening relatively soon. > As a point of clarification on the publisher, the QA app instructions > say to check the field, but are you looking for all of them to be > filled in or just for them to be correct if they are filled in? We don't always have the publisher information (especially from OL), so it's OK for the field to be blank. If it is filled out, however, it should be populated with a link to the correct topic for the "Publisher". Also, as a side point -- the "Publisher" type has a bit of semantic ambiguity to it, since publishing companies trade imprints like baseball cards and go through the same M&A antics the rest of corporate america loves. To make data loads tractable, we have been focusing on reconciling imprints, not publishing companies. This means "Harper Paperbacks", "HarperCollins" and "HarperPress" are separate topics, which at some later point will all have their "Imprint Of" property pointing to the proper parent company. Brian From jeff at metaweb.com Wed Oct 28 22:20:42 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Wed, 28 Oct 2009 15:20:42 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: <057DA903FBBB428C8A3606CFB057CF92@amd> References: <057DA903FBBB428C8A3606CFB057CF92@amd> Message-ID: <9E1BA6AB176A429FACE6C78F3AAB0C7F@p4> I wrote: > But, in order to kick this off, we need consensus on which > types to change into Is-As. 
There are 14 commons types with > a property that expects > /location/address: > > /architecture/landscape_project > /architecture/museum > /architecture/structure > /award/hall_of_fame > /business/business_location > /business/shopping_center > /education/educational_institution > /library/public_library > /religion/religious_organization > /medicine/hospital > > Of these, I propose that only the following should have an > Is-A relationship, and be cotyped with /location/location: > /architecture/structure (already includes Location) > /business/shopping_center (already includes Location) > /business/business_location And maybe > /architecture/landscape_project; I'm less clear about how > this type functions. > > The rest can stay as they are. Anyone have any thoughts about the Is-a vs. Has-a types listed above? Determining which is which is the first requirement for getting this task going. If I hear nothing further, I'll assume this Is-A list is good, and proceed from there. Jeff From faye at metaweb.com Wed Oct 28 22:58:45 2009 From: faye at metaweb.com (Faye Harris) Date: Wed, 28 Oct 2009 15:58:45 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: <057DA903FBBB428C8A3606CFB057CF92@amd> References: <057DA903FBBB428C8A3606CFB057CF92@amd> Message-ID: <4AE8CCA5.6050800@metaweb.com> DA-812 addresses migrating geodata from Address instances to new "Is-A" Location instances. There should be a similar migration task in the opposite direction, migrating geodata from today's "Is-A" Location instances that will turn into "Has-A" Location instances: If any of today's "Is-A"/tomorrow's "Has-A" Location instances has a value for either /location/location/geolocation or /location/location/geometry, that data should be migrated to its new linked Address (cotyped with Location) instance. Alas, even locations can be has-beens. We live in a Fast World. -- Faye Jeff Prucher wrote: > It's been pointed out (thanks Tom and Brendan!) 
that the proposal for
> refactoring how we handle addresses has yet to be acted upon.
>
> For reference, the original discussions are here:
> http://markmail.org/thread/a3embfd2zhwziqy4
> http://markmail.org/thread/iroh55rbd3usb4j4
>
> And the JIRA tasks all start here:
> https://bugs.freebase.com/browse/DA-808
>
> But to sum up, the proposed plan is this:
> 1. Add an "address" property to the Location type. Any type that represents
> something that Is-A location would have Location as an included type.
>
> 2. Types that only have a Has-A relationship to location will not be cotyped
> as Location, and will continue to have properties for their addresses;
> ideally the labels for these addresses would make the relationship clearer:
> "mailing address", "headquarters address", etc.
>
> 3. Geobot (the process that assigns geocodes to addresses) will be updated
> to assert geocodes on the base Location topic for all Is-A topics (meaning
> that /location/address will not be co-typed as Location in these instances),
> but will continue to type other Address instances as Location and assert
> geocodes there.
>
> (#3 is a bit of a change from the original plan, but several people pointed
> out that it would be weird not to have, for example, the company
> headquarters plotted on the maps of instances of the Company type.)
>
> It's entirely possible for locations to have both kinds of addresses, if
> they are typed as both /location/location and something that has a Has-A
> relationship (in many schemas, we assert identity between a building and the
> institution that houses it -- museums or libraries, for example).
>
> But, in order to kick this off, we need consensus on which types to change
> into Is-As.
There are 14 commons types with a property that expects > /location/address: > > /architecture/landscape_project > /architecture/museum > /architecture/structure > /award/hall_of_fame > /business/business_location > /business/shopping_center > /education/educational_institution > /library/public_library > /religion/religious_organization > /medicine/hospital > > Of these, I propose that only the following should have an Is-A > relationship, and be cotyped with /location/location: > /architecture/structure (already includes Location) > /business/shopping_center (already includes Location) > /business/business_location > And maybe /architecture/landscape_project; I'm less clear about how this > type functions. > > The rest can stay as they are. > > Comments? > > Jeff > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > From jeff at metaweb.com Wed Oct 28 23:41:47 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Wed, 28 Oct 2009 16:41:47 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: <4AE8CCA5.6050800@metaweb.com> References: <057DA903FBBB428C8A3606CFB057CF92@amd> <4AE8CCA5.6050800@metaweb.com> Message-ID: <0DE62372FC09490EB9B4CA8FCCC6F5B0@p4> If we end up having to change any Is-A types to Has-A types, we'll need to do this. But I don't think that there are any on the table that fit this description, are there? > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris > Sent: Wednesday, October 28, 2009 3:59 PM > To: Freebase data modeling mailing list > Subject: Re: [Data-modeling] Address refactoring > > DA-812 addresses migrating geodata from Address instances to > new "Is-A" > Location instances. 
There should be a similar migration task > in the opposite direction, migrating geodata from today's > "Is-A" Location instances that will turn into "Has-A" > Location instances: > > If any of today's "Is-A"/tomorrow's "Has-A" Location > instances has a value for either > /location/location/geolocation or > /location/location/geometry, that data should be migrated to > its new linked Address (cotyped with Location) instance. > > Alas, even locations can be has-beens. We live in a Fast World. > > -- Faye > > > > Jeff Prucher wrote: > > It's been pointed out (thanks Tom and Brendan!) that the > proposal for > > refactoring how we handle addresses has yet to be acted upon. > > > > For reference, the original discussions are here: > > http://markmail.org/thread/a3embfd2zhwziqy4 > > http://markmail.org/thread/iroh55rbd3usb4j4 > > > > And the JIRA tasks all start here: > > https://bugs.freebase.com/browse/DA-808 > > > > But to sum up, the proposed plan is this: > > 1. Add an "address" property to the Location type. Any type that > > represents something that Is-A location would have Location > as an included type. > > > > 2. Types that only have a Has-A relationship to location > will not be > > cotyped as Location, and will continue to have properties for their > > addresses; ideally the labels for these addresses would > make the relationship clearer: > > "mailing address", "headquarters address", etc. > > > > 3. Geobot (the process that assigns geocodes to addresses) will be > > updated to assert geocodes on the base Location topic for all Is-A > > topics (meaning that /location/address will not be co-typed as > > Location in these instances), but will continue to type > other Address > > instances as Location and assert geocodes there. > > > > (#3 is a bit of a change from the original plan, but several people > > pointed out that it would be weird not to have, for example, the > > company headquarters plotted on the maps of instances of > the Company > > type.) 
> > > > It's entirely possible for locations to have both kinds of > addresses, > > if they are typed as both /location/location and something > that has a > > Has-A relationship (in many schemas, we assert identity between a > > building and the instutition that houses it -- museums or > libraries, for example). > > > > But, in order to kick this off, we need consensus on which types to > > change into Is-As. There are 14 commons types with a property that > > expects > > /location/address: > > > > /architecture/landscape_project > > /architecture/museum > > /architecture/structure > > /award/hall_of_fame > > /business/business_location > > /business/shopping_center > > /education/educational_institution > > /library/public_library > > /religion/religious_organization > > /medicine/hospital > > > > Of these, I propose that only the following should have an Is-A > > relationship, and be cotyped with /location/location: > > /architecture/structure (already includes Location) > > /business/shopping_center (already includes Location) > > /business/business_location And maybe > /architecture/landscape_project; > > I'm less clear about how this type functions. > > > > The rest can stay as they are. > > > > Comments? 
> > > > Jeff > > > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From faye at metaweb.com Thu Oct 29 00:05:53 2009 From: faye at metaweb.com (Faye Harris) Date: Wed, 28 Oct 2009 17:05:53 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: <0DE62372FC09490EB9B4CA8FCCC6F5B0@p4> References: <057DA903FBBB428C8A3606CFB057CF92@amd> <4AE8CCA5.6050800@metaweb.com> <0DE62372FC09490EB9B4CA8FCCC6F5B0@p4> Message-ID: <4AE8DC61.5050001@metaweb.com> Jeff Prucher wrote: > If we end up having to change any Is-A types to Has-A types, we'll need to > do this. But I don't think that there are any on the table that fit this > description, are there? > Unless I'm reading your original message wrong, yes. Specifically, geodata migration from Location to Address should be done for the following types in your email that are deemed "Has-A" Location, and not "Is-A" Location. At that point the Location cotype can be removed: /architecture/landscape_project /architecture/museum /award/hall_of_fame /education/educational_institution /library/public_library /religion/religious_organization /medicine/hospital Two queries can be run to find the instances that are currently cotyped as Location and have geo data. I'm using /medicine/hospital in my example, but the query applies to others: Query for /location/location/geolocation: http://tinyurl.com/ygmmrcw Query for /location/location/geometry: http://tinyurl.com/yzmlb6l For /medicine/hospital, the first query returns 148 instances, the second 0. 
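The two shortened query links above are not reproduced in this archive. As a rough sketch only, a query of that kind might be built as shown below. It assumes the usual MQL read-query shape; the `also:` prefix on the second `type` key, the `[{"id": None}]` clause used to require that a value exists, and the helper name `cotyped_geo_query` are illustrative assumptions, not the queries that were actually run.

```python
import json

# Types from the list above that would be treated as "Has-A" Location.
HAS_A_TYPES = [
    "/architecture/landscape_project",
    "/architecture/museum",
    "/award/hall_of_fame",
    "/education/educational_institution",
    "/library/public_library",
    "/religion/religious_organization",
    "/medicine/hospital",
]

# The two geodata properties named in the message.
GEO_PROPERTIES = [
    "/location/location/geolocation",
    "/location/location/geometry",
]


def cotyped_geo_query(type_id, geo_property):
    """Build an MQL-style read query as a plain Python structure.

    Hypothetical helper: asks for topics of `type_id` that are also typed
    as /location/location and carry a value for `geo_property`.
    """
    return [{
        "id": None,
        "name": None,
        "type": type_id,
        # Assumed MQL idiom: a prefix lets the same property appear twice.
        "also:type": "/location/location",
        # Assumed MQL idiom: a nested clause requires at least one value.
        geo_property: [{"id": None}],
    }]


if __name__ == "__main__":
    query = cotyped_geo_query("/medicine/hospital", GEO_PROPERTIES[0])
    print(json.dumps(query, indent=2))
```

Looping over `HAS_A_TYPES` and `GEO_PROPERTIES` would give one query per type and property, which is the per-type count described above (148 and 0 for /medicine/hospital).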
-- Faye

From tfmorris at gmail.com Thu Oct 29 03:01:12 2009
From: tfmorris at gmail.com (Tom Morris)
Date: Wed, 28 Oct 2009 23:01:12 -0400
Subject: [Data-modeling] [Freebase-experts] open library editions test load
In-Reply-To: <31D87D34-AE6C-489C-9298-8FE99201264F@metaweb.com>
References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com> <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> <31D87D34-AE6C-489C-9298-8FE99201264F@metaweb.com>
Message-ID:

On Wed, Oct 28, 2009 at 5:57 PM, Brian Karlak wrote:
>> As a point of clarification on the publisher, the QA app instructions
>> say to check the field, but are you looking for all of them to be
>> filled in or just for them to be correct if they are filled in?
>
> We don't always have the publisher information (especially from OL),
> so it's OK for the field to be blank. If it is filled out, however,
> it should be populated with a link to the correct topic for the
> "Publisher".

I'm being dense, so can we use an example? Looking at
http://www.freebase.com/edit/topic/guid/9202a8c04000641f800000000f9d3b45
http://openlibrary.org/b/OL1353740M/Color_me_bright
http://lccn.loc.gov/92245099

Should this be reported as a miss since it didn't create the publisher or only a miss if there was a reasonable looking publisher already that it failed to reconcile with or acceptable no matter what the prior state of the database was?
I think I already flagged it as a miss, so this is primarily for future reviews.

I understand the issue with imprints versus publishers. Publishers are also hard because there are bazillions of mom and pop specialty publishers.

Tom

From zenkat at metaweb.com Thu Oct 29 04:45:24 2009
From: zenkat at metaweb.com (Brian Karlak)
Date: Wed, 28 Oct 2009 21:45:24 -0700
Subject: [Data-modeling] [Freebase-experts] open library editions test load
In-Reply-To:
References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com> <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> <31D87D34-AE6C-489C-9298-8FE99201264F@metaweb.com>
Message-ID: <087AC528-7C1C-4E53-AF9D-D7510E730AB2@metaweb.com>

On Oct 28, 2009, at 8:01 PM, Tom Morris wrote:
> http://www.freebase.com/edit/topic/guid/9202a8c04000641f800000000f9d3b45
> http://openlibrary.org/b/OL1353740M/Color_me_bright
> http://lccn.loc.gov/92245099
>
> Should this be reported as a miss since it didn't create the publisher
> or only a miss if there was a reasonable looking publisher already
> that it failed to reconcile with or acceptable no matter what the
> prior state of the database was? I think I already flagged it as a
> miss, so this is primarily for future reviews.

I would mark it as "miss" because Open Library seems to have publisher information, but we didn't include it in our load. We should have either reconciled it to an already existing publisher, or we should have created a new one. And I see what you mean -- it does seem to be happening fairly frequently.

In general, missing data doesn't bother me nearly as much as incorrect data; if we have good strong identifiers attached to the edition we should be able to get the missing data from elsewhere later. However, this seems to be happening frequently enough that it bears looking into. It may just be due to when we took the snapshot of OpenLibrary; perhaps the data has changed on their end since we started the process.
In any case, thanks for the heads-up; we'll have more tomorrow. Brian PS -- a regularly-formed annotation in the comments box like "missing publisher" would help a lot ... thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091028/add97f5e/attachment.htm From jeff at metaweb.com Thu Oct 29 17:02:15 2009 From: jeff at metaweb.com (Jeff Prucher) Date: Thu, 29 Oct 2009 10:02:15 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: <4AE8DC61.5050001@metaweb.com> References: <057DA903FBBB428C8A3606CFB057CF92@amd> <4AE8CCA5.6050800@metaweb.com><0DE62372FC09490EB9B4CA8FCCC6F5B0@p4> <4AE8DC61.5050001@metaweb.com> Message-ID: <2AB475307EB5411EBFE012CD739F4AEC@amd> Nice catch -- I was so busy looking at the schema (none of these types have location as an included type), I didn't think to look at this data. This is actually going to be a tricky case -- we do not consistently separate buildings from their uses in the schemata -- lots of libraries, hospitals, museums, and such are modeled in a way that implies unity with their structural container (e.g. the Louvre is considered to be both a building [although Building Complex might be more accurate] and a museum). Colloquially, this is perfectly fine -- people don't generally make this distinction except, perhaps, when something moves to a new building. Jeff > -----Original Message----- > From: data-modeling-bounces at freebase.com > [mailto:data-modeling-bounces at freebase.com] On Behalf Of Faye Harris > Sent: Wednesday, October 28, 2009 5:06 PM > To: Freebase data modeling mailing list > Subject: Re: [Data-modeling] Address refactoring > > Jeff Prucher wrote: > > If we end up having to change any Is-A types to Has-A types, we'll > > need to do this. But I don't think that there are any on the table > > that fit this description, are there? > > > Unless I'm reading your original message wrong, yes. 
> Specifically, geodata migration from Location to Address > should be done for the following types in your email that are > deemed "Has-A" Location, and not "Is-A" Location. At that > point the Location cotype can be removed: > > /architecture/landscape_project > /architecture/museum > /award/hall_of_fame > /education/educational_institution > /library/public_library > /religion/religious_organization > /medicine/hospital > > Two queries can be run to find the instances that are > currently cotyped as Location and have geo data. I'm using > /medicine/hospital in my example, but the query applies to others: > > Query for /location/location/geolocation: > http://tinyurl.com/ygmmrcw > > Query for /location/location/geometry: > http://tinyurl.com/yzmlb6l > > For /medicine/hospital, the first query returns 148 > instances, the second 0. > > -- Faye > > > > > >> -----Original Message----- > >> From: data-modeling-bounces at freebase.com > >> [mailto:data-modeling-bounces at freebase.com] On Behalf Of > Faye Harris > >> Sent: Wednesday, October 28, 2009 3:59 PM > >> To: Freebase data modeling mailing list > >> Subject: Re: [Data-modeling] Address refactoring > >> > >> DA-812 addresses migrating geodata from Address instances to new > >> "Is-A" > >> Location instances. There should be a similar migration > task in the > >> opposite direction, migrating geodata from today's "Is-A" Location > >> instances that will turn into "Has-A" > >> Location instances: > >> > >> If any of today's "Is-A"/tomorrow's "Has-A" Location > instances has a > >> value for either /location/location/geolocation or > >> /location/location/geometry, that data should be migrated > to its new > >> linked Address (cotyped with Location) instance. > >> > >> Alas, even locations can be has-beens. We live in a Fast World. > >> > >> -- Faye > >> > >> > >> > >> Jeff Prucher wrote: > >> > >>> It's been pointed out (thanks Tom and Brendan!) 
that the > >>> > >> proposal for > >> > >>> refactoring how we handle addresses has yet to be acted upon. > >>> > >>> For reference, the original discussions are here: > >>> http://markmail.org/thread/a3embfd2zhwziqy4 > >>> http://markmail.org/thread/iroh55rbd3usb4j4 > >>> > >>> And the JIRA tasks all start here: > >>> https://bugs.freebase.com/browse/DA-808 > >>> > >>> But to sum up, the proposed plan is this: > >>> 1. Add an "address" property to the Location type. Any type that > >>> represents something that Is-A location would have Location > >>> > >> as an included type. > >> > >>> 2. Types that only have a Has-A relationship to location > >>> > >> will not be > >> > >>> cotyped as Location, and will continue to have properties > for their > >>> addresses; ideally the labels for these addresses would > >>> > >> make the relationship clearer: > >> > >>> "mailing address", "headquarters address", etc. > >>> > >>> 3. Geobot (the process that assigns geocodes to > addresses) will be > >>> updated to assert geocodes on the base Location topic for > all Is-A > >>> topics (meaning that /location/address will not be co-typed as > >>> Location in these instances), but will continue to type > >>> > >> other Address > >> > >>> instances as Location and assert geocodes there. > >>> > >>> (#3 is a bit of a change from the original plan, but > several people > >>> pointed out that it would be weird not to have, for example, the > >>> company headquarters plotted on the maps of instances of > >>> > >> the Company > >> > >>> type.) > >>> > >>> It's entirely possible for locations to have both kinds of > >>> > >> addresses, > >> > >>> if they are typed as both /location/location and something > >>> > >> that has a > >> > >>> Has-A relationship (in many schemas, we assert identity between a > >>> building and the instutition that houses it -- museums or > >>> > >> libraries, for example). 
> >> > >>> But, in order to kick this off, we need consensus on > which types to > >>> change into Is-As. There are 14 commons types with a > property that > >>> expects > >>> /location/address: > >>> > >>> /architecture/landscape_project > >>> /architecture/museum > >>> /architecture/structure > >>> /award/hall_of_fame > >>> /business/business_location > >>> /business/shopping_center > >>> /education/educational_institution > >>> /library/public_library > >>> /religion/religious_organization > >>> /medicine/hospital > >>> > >>> Of these, I propose that only the following should have an Is-A > >>> relationship, and be cotyped with /location/location: > >>> /architecture/structure (already includes Location) > >>> /business/shopping_center (already includes Location) > >>> /business/business_location And maybe > >>> > >> /architecture/landscape_project; > >> > >>> I'm less clear about how this type functions. > >>> > >>> The rest can stay as they are. > >>> > >>> Comments? > >>> > >>> Jeff > >>> > >>> _______________________________________________ > >>> Data-modeling mailing list > >>> Data-modeling at freebase.com > >>> http://lists.freebase.com/mailman/listinfo/data-modeling > >>> > >>> > >>> > >> _______________________________________________ > >> Data-modeling mailing list > >> Data-modeling at freebase.com > >> http://lists.freebase.com/mailman/listinfo/data-modeling > >> > >> > > > > _______________________________________________ > > Data-modeling mailing list > > Data-modeling at freebase.com > > http://lists.freebase.com/mailman/listinfo/data-modeling > > > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > From kirrily at metaweb.com Thu Oct 29 17:40:47 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Thu, 29 Oct 2009 10:40:47 -0700 Subject: [Data-modeling] Address refactoring In-Reply-To: 
<2AB475307EB5411EBFE012CD739F4AEC@amd> References: <057DA903FBBB428C8A3606CFB057CF92@amd> <4AE8CCA5.6050800@metaweb.com><0DE62372FC09490EB9B4CA8FCCC6F5B0@p4> <4AE8DC61.5050001@metaweb.com> <2AB475307EB5411EBFE012CD739F4AEC@amd> Message-ID: On 29/10/2009, at 10:02 AM, Jeff Prucher wrote: > Nice catch -- I was so busy looking at the schema (none of these > types have > location as an included type), I didn't think to look at this data. > This is > actually going to be a tricky case -- we do not consistently separate > buildings from their uses in the schemata -- lots of libraries, > hospitals, > museums, and such are modeled in a way that implies unity with their > structural container (e.g. the Louvre is considered to be both a > building > [although Building Complex might be more accurate] and a museum). > Colloquially, this is perfectly fine -- people don't generally make > this > distinction except, perhaps, when something moves to a new building. FWIW, I've got draft schema for museums and libraries that does distinguish: http://mladraft.freebase.com/ K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From zenkat at metaweb.com Thu Oct 29 19:52:05 2009 From: zenkat at metaweb.com (Brian Karlak) Date: Thu, 29 Oct 2009 12:52:05 -0700 Subject: [Data-modeling] open library editions test load In-Reply-To: <087AC528-7C1C-4E53-AF9D-D7510E730AB2@metaweb.com> References: <1FB69464-A638-43DE-9A0C-F181D0D6C9A2@metaweb.com> <1A5B1864-7B59-4585-ABFB-2DCC22BEB28A@metaweb.com> <31D87D34-AE6C-489C-9298-8FE99201264F@metaweb.com> <087AC528-7C1C-4E53-AF9D-D7510E730AB2@metaweb.com> Message-ID: <198961AB-63F5-449B-AB0F-7EB698FE0402@metaweb.com> On Oct 28, 2009, at 9:45 PM, Brian Karlak wrote: >> Should this be reported as a miss since it didn't create the >> publisher >> or only a miss if there was a reasonable looking publisher already >> that it failed to reconcile with or acceptable no matter what the >> prior state of the database was? 
I think I already flagged it as a >> miss, so this is primarily for future reviews. > > I would mark it as "miss" because Open Library seems to have > publisher information, but we didn't include it in our load. We > should have either reconciled it to an already existing publisher, > or we should have created a new one. And I see what you mean -- it > does seem to be happening fairly frequently. Hello All -- Oops -- a correction. The lead developer (/user/vtalwar) reminded me that we had decided to only focus on publishers that were easily reconcilable to /book/publishers already in Freebase, since we had concerns about creating too many duplicate imprint entities because of minor spelling variations (eg, HarperPress Inc vs HarperPress). In other words, data quality trumped completeness for this load. With the strong identifiers (ISBN, LCCN, OL, etc) attached to the editions, we should be able to fully load imprints at some point in the near future. Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091029/7b0487da/attachment.htm From pauljmackay at gmail.com Fri Oct 30 05:26:24 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Thu, 29 Oct 2009 22:26:24 -0700 Subject: [Data-modeling] Benefits of open data, Freebase Message-ID: Hi, I started this page http://wiki.freebase.com/wiki/Benefits with a view to listing any potential benefits of using open data, semantic data and specifically Freebase. When discussing using Freebase with other people and organizations it would be really helpful to have a range of points about the benefits it can offer. I'll try to add more to this as I'm sure others have many more ideas that could be listed - any input would be much appreciated! thanks paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091029/e9881bc6/attachment.htm

From kirrily at metaweb.com Fri Oct 30 06:08:20 2009
From: kirrily at metaweb.com (Kirrily Robert)
Date: Thu, 29 Oct 2009 23:08:20 -0700
Subject: [Data-modeling] Community meeting on Monday, open to all
Message-ID: <07A926BB-9A6B-4BEA-9CBF-FAC8473C1A94@metaweb.com>

Copied from the blog at http://blog.freebase.com/2009/10/29/freebase-community-meeting-on-monday-open-to-all/

For a little while now we've been holding a weekly meeting at Metaweb to discuss issues of interest to the Freebase community. Although it's usually just Metaweb staff who attend, the agenda and notes from each meeting are available on the Freebase wiki (http://wiki.freebase.com/wiki/Community_meeting) and we always welcome contributions/questions/etc from anyone who wants to note them on the wiki in advance.

Starting next Monday, and each first Monday of the month after that, we will be opening up the meeting to anyone who's interested. You can attend in person or via Skype.

When: 2:30pm PST (GMT-8), Monday November 2nd
Where: Metaweb, 631 Howard St, 4th floor, San Francisco
RSVP: Please email kirrily at metaweb.com before 2pm Monday if you plan to attend corporeally or digitally (and provide your Skype username in the latter case).

If you're interested in the Skype option, more information is available at http://wiki.freebase.com/wiki/Community_Meeting_via_Skype

Whether you can make it or not, please feel free to check out the agenda and add anything you want covered.

--
Kirrily Robert
Freebase Community Director
kirrily at metaweb.com

From pauljmackay at gmail.com Fri Oct 30 06:32:58 2009
From: pauljmackay at gmail.com (Paul Mackay)
Date: Thu, 29 Oct 2009 23:32:58 -0700
Subject: [Data-modeling] Combined views
Message-ID:

Is there any way currently to create a combined view of disparate topics?
It's not obvious to me how that could currently be done - accessing "All topics" in a base allows filtering on the topic properties, which is expected. For certain groups of topics it might be possible to create a more generic Type that could be applied to them all, but that seems to be creating the equivalent of base classes for the sake of it, which is not encouraged in the docs.

It might be that the solution is a custom Acre or other web app that can display multiple topic types, but I wondered if this could be done in FB now or if it might be considered as a feature at all?

paul

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091029/a862d1b7/attachment.htm

From kirrily at metaweb.com Fri Oct 30 06:43:50 2009
From: kirrily at metaweb.com (Kirrily Robert)
Date: Thu, 29 Oct 2009 23:43:50 -0700
Subject: [Data-modeling] Combined views
In-Reply-To:
References:
Message-ID: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com>

On 29/10/2009, at 11:32 PM, Paul Mackay wrote:
> Is there any way currently to create a combined view of disparate
> topics? It's not obvious to me how that could currently be done -
> accessing "All topics" in a base allows filtering on the topic
> properties, which is expected. For certain groups of topics it might
> be possible to create a more generic Type that could be applied to
> them all, but that seems to be creating the equivalent of base
> classes for the sake of it, which is not encouraged in the docs.
>
> It might be that the solution is a custom Acre or other web app that
> can display multiple topic types, but I wondered if this could be
> done in FB now or if it might be considered as a feature at all?
>

There is a way to make a saved view show the results of any arbitrary MQL query, but I forget the details. You put something in the URL. Perhaps someone from the client team who's familiar with this can share the details?
In any case, that might do what you want. But I'm wondering... what's your use case here? What is it you want to show in this way? K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From pauljmackay at gmail.com Fri Oct 30 16:36:59 2009 From: pauljmackay at gmail.com (Paul Mackay) Date: Fri, 30 Oct 2009 09:36:59 -0700 Subject: [Data-modeling] Combined views In-Reply-To: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com> References: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com> Message-ID: The initial case I was thinking of was for the local food base. What I would like to be able to do with that data eventually is have a map that can show all useful food-related resources that are nearby. But these could be shops, co-ops, gardens, etc - a range of types. paul On Thu, Oct 29, 2009 at 11:43 PM, Kirrily Robert wrote: > On 29/10/2009, at 11:32 PM, Paul Mackay wrote: > > > Is there any way currently to create a combined view of disparate > > topics? Its not obvious to me how that could currently be done - > > accessing "All topics" in a base allows filtering on the topic > > properties, which is expected. For certain groups of topics it might > > be possible to create a more generic Type that could be applied to > > them all, but that seems to be creating the equivalent of base > > classes for the sake of it, which is not encouraged in the docs. > > > > It might be that the solution is a custom Acre or other web app that > > can display multiple topic types, but I wondered if this could be > > done in FB now or if it might be considered as a feature at all? > > > > There is a way to make a saved view show the results of any arbitrary > MQL query, but I forget the details. You put something in the URL. > Perhaps someone from the client team who's familiar with this can > share the details? In any case, that might do what you want. > > But I'm wondering... what's your use case here? What is it you want > to show in this way? > > K. 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091030/7d1cf8f0/attachment-0001.htm

From robert at metaweb.com Fri Oct 30 17:32:37 2009
From: robert at metaweb.com (Robert Cook)
Date: Fri, 30 Oct 2009 10:32:37 -0700
Subject: [Data-modeling] Combined views
In-Reply-To:
References: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com>
Message-ID:

P -

I don't believe you can compose such a query in MQL, as it would at least naively require OR capability. However, as a workaround, I typically use a pattern where I create a collector node as a constraint. In your case, you could create a singleton type, perhaps called "Food-related resource types", with a property "types" that expects /type/type. You then create a single instance of this type, add in the types you want to aggregate, and then query using that topic's property as the type constraint. Something like:

[{
  "type": [{
    "!/base/food_resources/food_related_resource_types/types": {
      "id": }
  }]
  ... (use fully-qualified properties to get everything you want)
}]

I'm not sure if the Freebase client interface supports such shenanigans, but it's worth a shot. If not, the ACRE code would be pretty easy.

R

On Oct 30, 2009, at 9:36 AM, Paul Mackay wrote:
> The initial case I was thinking of was for the local food base. What
> I would like to be able to do with that data eventually is have a
> map that can show all useful food-related resources that are nearby.
> But these could be shops, co-ops, gardens, etc - a range of types.
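A sketch of the two query shapes that come up in this thread: the collector-node pattern from Robert's message, and the `type|=` ("one of") constraint suggested in a later reply. The queries are only built and printed as JSON here, not sent anywhere. The collector topic id, the three type ids, and both function names are hypothetical placeholders chosen for illustration; only the reverse-property key and the `type|=` form come from the messages themselves.

```python
import json

# Reverse-property key taken from the collector-node example in the thread.
COLLECTOR_PROPERTY = "!/base/food_resources/food_related_resource_types/types"


def collector_query(collector_id):
    """Topics whose type is listed on a single 'collector' topic.

    `collector_id` is a hypothetical id for the one instance of the
    singleton type that holds the list of types to aggregate.
    """
    return [{
        "id": None,
        "name": None,
        "type": [{
            "id": None,
            COLLECTOR_PROPERTY: {"id": collector_id},
        }],
    }]


def one_of_types_query(type_ids):
    """Topics having at least one of `type_ids`, using the `|=` operator."""
    return [{
        "id": None,
        "name": None,
        "type|=": list(type_ids),
    }]


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    print(json.dumps(collector_query("/base/food_resources/all_types"), indent=2))
    print(json.dumps(one_of_types_query([
        "/base/localfood/shop",
        "/base/localfood/co_op",
        "/base/localfood/garden",
    ]), indent=2))
```

The collector pattern keeps the list of types in the data, so it can be edited without touching the query; the `type|=` form is shorter when the list is fixed and known to the caller.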
> > paul > > On Thu, Oct 29, 2009 at 11:43 PM, Kirrily Robert > wrote: > On 29/10/2009, at 11:32 PM, Paul Mackay wrote: > > > Is there any way currently to create a combined view of disparate > > topics? Its not obvious to me how that could currently be done - > > accessing "All topics" in a base allows filtering on the topic > > properties, which is expected. For certain groups of topics it might > > be possible to create a more generic Type that could be applied to > > them all, but that seems to be creating the equivalent of base > > classes for the sake of it, which is not encouraged in the docs. > > > > It might be that the solution is a custom Acre or other web app that > > can display multiple topic types, but I wondered if this could be > > done in FB now or if it might be considered as a feature at all? > > > > There is a way to make a saved view show the results of any arbitrary > MQL query, but I forget the details. You put something in the URL. > Perhaps someone from the client team who's familiar with this can > share the details? In any case, that might do what you want. > > But I'm wondering... what's your use case here? What is it you want > to show in this way? > > K. > > > -- > Kirrily Robert > Freebase Community Director > kirrily at metaweb.com > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.freebase.com/pipermail/data-modeling/attachments/20091030/14277a39/attachment.htm From kirrily at metaweb.com Fri Oct 30 19:30:20 2009 From: kirrily at metaweb.com (Kirrily Robert) Date: Fri, 30 Oct 2009 12:30:20 -0700 Subject: [Data-modeling] Combined views In-Reply-To: References: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com> Message-ID: On 30/10/2009, at 10:32 AM, Robert Cook wrote: > I don't believe you can compose such a query in MQL as it would at > least naively require OR capability. Why can't he do: "type|=" : ["/type/one", "/type/two", "/type/three"] K. -- Kirrily Robert Freebase Community Director kirrily at metaweb.com From robert at metaweb.com Fri Oct 30 20:40:43 2009 From: robert at metaweb.com (robert at metaweb.com) Date: Fri, 30 Oct 2009 13:40:43 -0700 (PDT) Subject: [Data-modeling] Combined views In-Reply-To: References: <3F5C04F1-5228-4F38-B5FD-38859045E756@metaweb.com> Message-ID: <54D8B003-1E5E-4EA5-B62F-7648C748E184@metaweb.com> Yes, indeed, that would be simpler ;-) On Oct 30, 2009, at 12:49 PM, Kirrily Robert wrote: > On 30/10/2009, at 10:32 AM, Robert Cook wrote: >> I don't believe you can compose such a query in MQL as it would at >> least naively require OR capability. > > Why can't he do: > > "type|=" : ["/type/one", "/type/two", "/type/three"] > > K. > > -- > Kirrily Robert > Freebase Community Director > kirrily at metaweb.com > > > > _______________________________________________ > Data-modeling mailing list > Data-modeling at freebase.com > http://lists.freebase.com/mailman/listinfo/data-modeling From kevin.l.neff at gmail.com Sat Oct 31 03:49:22 2009 From: kevin.l.neff at gmail.com (Kevin Neff) Date: Fri, 30 Oct 2009 22:49:22 -0500 Subject: [Data-modeling] Journal Articles Message-ID: I'm writing to ask for advice on how to name journal articles, which I plan to import as topics. *The Plan*: I have a set of several hundred papers that are of interest to people who study molecular crowding. 
I want to upload the articles as topics that have type "PubMed Article". Then, there are some categorizations I want to do with types. For example, one type is "Chemical Reaction Kinetics" which would be added only to articles that include analysis or measurement of reaction progress curves. It could have properties for the type of analysis that was done, etc. Other types will help further categorize the data, making it easy to search and list articles that contain certain types of data, particular facts or values, employ one of various models, etc. My reason for doing this on Freebase is to make it available (for browsing and editing) to the people who are all re-re-re-re-inventing the bibliography in a field that is small enough that the problem may be solved definitively. *My Question*: But I'm having trouble figuring out how to name the articles at the topic level. I could use the unique identifier provided by PubMed, but I don't know if that will just pollute the namespace. Or I could use the title of the article, but some of the article names are quite long, which might be really inconvenient later on, especially for display purposes. Any thoughts about how to do this? Thanks --KLN ---------- Kevin Neff Mayo College of Medicine Rochester, MN From tfmorris at gmail.com Sat Oct 31 04:00:54 2009 From: tfmorris at gmail.com (Tom Morris) Date: Sat, 31 Oct 2009 00:00:54 -0400 Subject: [Data-modeling] Journal Articles In-Reply-To: References: Message-ID: On Fri, Oct 30, 2009 at 11:49 PM, Kevin Neff wrote: > > I'm writing to ask for advice on how to name journal articles, which I plan > to import as topics. ... > My Question: But I'm having trouble figuring out how to name the articles at > the topic level.
I could use the unique identifier provided by PubMed, but > I don't know if that will just pollute the namespace. Or I could use the > title of the article, but some of the article names are quite long, which > might be really inconvenient later on, especially for display purposes. Any > thoughts about how to do this? I'd go with the title of the article unless it's exceptionally unwieldy, in which case I'd use a shortened form of the title with the full title in the alias field. A PubMed identifier definitely is not an appropriate topic name for something which has a human-readable title/name. Tom From jason at metaweb.com Sat Oct 31 04:35:55 2009 From: jason at metaweb.com (Jason Douglas) Date: Fri, 30 Oct 2009 21:35:55 -0700 Subject: [Data-modeling] Webpage Proposal (was: Use of "web links" property) In-Reply-To: <470BE75A.6010707@gmail.com> References: <002101c809fb$c91113c0$67fa1eac@amd> <006901c80a0e$9d78aad0$67fa1eac@amd> <470AE7DB.2040403@thefirst.org> <000301c80a8c$211b0fe0$65fa1eac@amd> <470BE75A.6010707@gmail.com> Message-ID: So when this thread popped up a few weeks back, we happened to be talking a lot internally at the same time about how lacking the current /common/topic/webpage model is relative to all the interesting use cases people want to do. Using Freebase to query content on the web seems like such a cool idea that we'd really like to see our schema be more up to the task. After much discussion, here's the proposal we've come up with: http://wiki.freebase.com/wiki/Webpage_Proposal It's a little more complicated than today, but that seems necessary to accommodate the variety of use cases. To compensate, we're also proposing a new approach for simplifying schema complexity in common use cases and for new developers, which is to use the upcoming MQL extensions feature. MQL extensions are virtual properties implemented in code (including Acre!), but more info on that soon...
I just wanted to mention the concept up-front since it's a key part of the proposal. -jason On Oct 9, 2007, at 1:40 PM, Shawn Simister wrote: > I agree with the idea of adding a "type of resource" to categorize > links. I think that this would make it easier for a lot of mashups. > For example, right now if I want to get all the MySpace pages for > music artists I could get the web links for each artist and then > filter out those links from the myspace.com domain. Similarly, I > could filter out all the links from wordpress.com, blogspot.com, > etc. to try to find a list of the artists blogs but I would never be > able to get all of them. With categories like "blog", "home page", > "social networking page" it would be a lot easier to do this sort of > thing. > > Shawn > > Christoph Pingel wrote: >> To spin that thread a little farther... I think web links deserve a >> little more attention than they are currently given. >> >> One of the reasons given for a semantic web (for humans) was that >> links in HTML usually don't make meaning explicit. >> A link (if there is not textual information available) can mean >> anything from "here's the source of truth about that topic" to >> "look how this other guy disagrees with me" to "maybe there's >> something interesting hidden in that snippet over there". >> >> So I think it would make good sense in a system like freebase to >> not waste what people know about the resource on the other end. >> >> A practical example: Some entities do have "home pages" (humans, >> cities, institutions, companies, products), so this is in itself a >> special kind of resource if linked to an entry for that entity in >> freebase. >> >> I worked a lot with artist's homepages, and I found that many >> artists have something analogous to a homepage for example on an >> art festival website or in some large online collection (wikipedia, >> artnet, manifesta, etc.). 
So perhaps it would be a good idea to >> have a richer compound value type for web links: >> - title >> - url >> - topic of the resource (e.g. a topic of type visual artist) >> [this is what we have right now, additionally:] >> - type of resource (home page, blog entry, video, news article >> about x, ... roughly speaking 'page category') >> - the institution / web site / 'collection', whatever the 'parent' >> of the page is. >> >> This way, it would be possible to request the URLs of pages that >> 'represent' Joseph Beuys as a person and artist in different >> contexts, and pages that are merely somehow 'about' him (to give an >> example). >> >> What do you think? >> >> best regards, >> Christoph >> >> >> >> >> On 09.10.2007 at 17:50, Jeff Prucher wrote: >> >>> Thanks to everyone who's weighed in on this -- it's nice to see so >>> much >>> agreement >>> >>> Jeff P >>> >>>> -----Original Message----- >>>> From: data-modeling-bounces at freebase.com >>>> [mailto:data-modeling-bounces at freebase.com] On Behalf Of Jeff >>>> Thompson >>>> Sent: Monday, October 08, 2007 7:31 PM >>>> To: Freebase data modeling mailing list >>>> Subject: Re: [Data-modeling] Use of "web links" property >>>> >>>> Jeff Prucher wrote: >>>>> Semantic meaning is pretty much the only reason. My question is >>>>> really, is there enough semantic meaning inherent in "related >>>>> websites" to merit its being a distinct property? There are >>>>> definitely cases where that's true -- we have "IMDB Link" >>>> on films, for example. >>>> >>>> If the web site is only going to be parsed for information by >>>> a human, then you only need a textual web link description to >>>> be parsed by a human. If the web site potentially has >>>> information to be parsed by a machine (like IMDB) then it >>>> makes sense for the link to be separated out semantically, so >>>> the machine can find it.
>>>> >>>> In the case of a book site, it's going to be text parsed by a >>>> human, so I'd say it doesn't warrant a separate property. >>>> >>>> >>>> >> >> >> // Christoph Pingel, MA · Mediendesign & Semantische Technologien · >> // Schützenstr. 4 · 76137 Karlsruhe · 0721-9338884 · www.christoph-pingel.de >> · pingel at cognity.de From philip-freebase at shadowmagic.org.uk Sat Oct 31 10:16:56 2009 From: philip-freebase at shadowmagic.org.uk (Philip Kendall) Date: Sat, 31 Oct 2009 10:16:56 +0000 Subject: [Data-modeling] Journal Articles In-Reply-To: References: Message-ID: <20091031101656.GP31466@sphinx.int.mythic-beasts.com> On Sat, Oct 31, 2009 at 12:00:54AM -0400, Tom Morris wrote: > > A PubMed identifier definitely is not > an appropriate topic name for something which has a human-readable > title/name. But please create an enumerated property for the PubMed ID. This is "non trivial" to set up, but people here will be more than willing to help :-) Cheers, Phil -- Philip Kendall http://www.shadowmagic.org.uk/ From jack.alves at gmail.com Sat Oct 31 22:59:53 2009 From: jack.alves at gmail.com (Jack Alves) Date: Sat, 31 Oct 2009 15:59:53 -0700 Subject: [Data-modeling] schema to represent faculty Message-ID: <554723b0910311559u482b339l80afe87bad8b4201@mail.gmail.com> Hi, I'm working with someone to add data about university faculty. This work is part of the Bibliographic Knowledge Network project. For some of the data I found schema in the commons that works.
For name, homepage, university, and title I can use, person/ name webpage employment_history /title /employer It isn't clear what I can use for "department" and "research area". An example of data I have for a professor's research area is, Artificial Intelligence: Automated Reasoning Machine Learning Data Mining Any suggestions? Should I create a new base/domain for these extra properties? Jack