[Developers] [Data-modeling] Data load issues

Jeff Prucher jeff at metaweb.com
Thu May 28 23:24:31 UTC 2009


These publisher topics are loaded from ISFDB, not OpenLibrary, so they're
part of a different problem!

Publishers can be tricky, though, and it's often very hard to tell if the
way something is listed is the publisher, the imprint, both, or some other
weird combination, especially since publishers change their internal
structure all the dang time, so historical data can be a mess to deal with.

In these Dark Horse cases, though, the Amazon links are very helpful: most of
them have "search inside" enabled for at least the covers and copyright page. So
Dark Horse Books is a Publisher, which is a division of the Company "Dark
Horse Comics." Berkley, Putnam, and Boulevard are all imprints of the
Penguin Group; all I can think of is that they're some kind of
co-publication deal, which doesn't seem to be well-supported in the schema.

Jeff

> -----Original Message-----
> From: developers-bounces at freebase.com 
> [mailto:developers-bounces at freebase.com] On Behalf Of Ed Laurent
> Sent: Thursday, May 28, 2009 2:06 PM
> To: For discussions about MQL,Freebase API and apps built on Freebase
> Subject: Re: [Developers] [Data-modeling] Data load issues
> 
> I've flagged a few but am already feeling overwhelmed with 
> the task. For example, all these Dark Horse publishers still 
> need to be assessed for merge: 
> http://www.freebase.com/view/user/spatialed/default_domain/views/how_many_dark_horse
> 
> Should we just flag them for potential merge? I usually only 
> flag topics for merge if I feel pretty confident about it. 
> Unfortunately, I haven't found any good online resources to 
> verify if these publishers are in fact the same topics. Are there any?
> 
> -Ed
> 
> 
> 
> On Thu, May 28, 2009 at 4:42 PM, Kirrily Robert 
> <kirrily at metaweb.com> wrote:
> 
> 
> 	On 28/05/2009, at 1:03 PM, Reilly Hayes wrote:
> 	>> That's kind of a chicken and egg thing since it's going to be hard
> 	>> for folks to clean up the authors without the additional information.
> 	>> Any chance you guys would consider adding a link back to Open Library
> 	>> so people can see an author's works when deciding whether to flag or
> 	>> how to vote?  It would significantly reduce the amount of labor
> 	>> involved for reviewers.
> 	>
> 	> We are not relying on the community for this cleanup.  You will not
> 	> see these in the merge queue.  We did not want to swamp the merge
> 	> queue with these.
> 	>
> 	
> 	What about if the community discover them on their own and flag them
> 	for merge, though?  I think that's the situation we're dealing with
> 	here.
> 	
> 
> 	K.
> 	
> 	--
> 	Kirrily Robert
> 	Freebase Community Director
> 	kirrily at metaweb.com
> 	
> 	
> 	
> 	_______________________________________________
> 	
> 	Developers mailing list
> 	Developers at freebase.com
> 	http://lists.freebase.com/mailman/listinfo/developers
> 	
> 
> 
> 


