[Developers] [Data-modeling] Data load issues

Iain Sproat iainsproat at gmail.com
Fri May 29 13:38:15 UTC 2009


On a similar note, how do we resolve duplicated CVTs? These are harder to
fix through the client, particularly as they don't have a topic name against
which to use the merge flag.
I assume the duplication of CVTs is a common scenario when uploading
properties from various data sources. I've just uploaded a bunch of Irish
Barons <http://www.freebase.com/view/user/sprocketonline/default_domain/views/duplicated_ireland_baron_cvt> twice
by accident. And now there are two topics linking the Peerage of
Ireland to the relevant Noble title.

Is there a bot which detects CVTs with identical properties and merges them,
and detects CVTs with zero or one property and deletes them?
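
To make the question concrete, here is a minimal sketch of the detection step such a bot might perform. It is only an illustration under assumptions: the CVTs are assumed to have already been fetched into plain Python dicts, and the `id` and `properties` keys, the property names, and the function names are all hypothetical rather than part of any Freebase API. It only reports candidates; it does not merge or delete anything.

```python
from collections import defaultdict


def fingerprint(cvt):
    """Build a hashable, order-independent summary of a CVT's property values."""
    return frozenset(
        (prop, value)
        for prop, values in cvt["properties"].items()
        for value in values
    )


def classify_cvts(cvts, min_properties=2):
    """Return (duplicate_groups, sparse_ids) for a list of CVT dicts.

    duplicate_groups: lists of ids whose property values are identical.
    sparse_ids: ids of CVTs with fewer than min_properties filled properties.
    """
    groups = defaultdict(list)
    sparse_ids = []
    for cvt in cvts:
        filled = [p for p, values in cvt["properties"].items() if values]
        if len(filled) < min_properties:
            # Zero or one property: a deletion candidate, not a merge candidate.
            sparse_ids.append(cvt["id"])
            continue
        groups[fingerprint(cvt)].append(cvt["id"])
    duplicate_groups = [ids for ids in groups.values() if len(ids) > 1]
    return duplicate_groups, sparse_ids


# Hypothetical data: cvt_a and cvt_b were loaded twice, cvt_c is nearly empty.
example = [
    {"id": "cvt_a", "properties": {"peerage": ["Peerage of Ireland"], "title": ["Baron X"]}},
    {"id": "cvt_b", "properties": {"title": ["Baron X"], "peerage": ["Peerage of Ireland"]}},
    {"id": "cvt_c", "properties": {"peerage": ["Peerage of Ireland"], "title": []}},
]
duplicates, sparse = classify_cvts(example)
print("merge candidates:", duplicates)
print("delete candidates:", sparse)
```

Keeping the sparse check ahead of the grouping means two nearly empty CVTs are never proposed as a merge just because they share a single value.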

Iain

On Fri, May 29, 2009 at 3:24 AM, Jeff Prucher <jeff at metaweb.com> wrote:

> These publisher topics are loaded from ISFDB, not OpenLibrary, so they're
> part of a different problem!
>
> Publishers can be tricky, though, and it's often very hard to tell if the
> way something is listed is the publisher, the imprint, both, or some other
> weird combination. Especially since publishers change their internal
> structure all the dang time, so historical data can be a mess to deal with.
>
> In these Dark Horse cases though, the Amazon links are very helpful: most of
> them have "search inside" enabled for at least the covers and copyright. So
> Dark Horse Books is a Publisher, which is a division of the Company "Dark
> Horse Comics." Berkley, Putnam, and Boulevard are all imprints of the
> Penguin Group; all I can think of is that they're some kind of
> co-publication deal, which doesn't seem to be well-supported in the schema.
>
> Jeff
>
> > -----Original Message-----
> > From: developers-bounces at freebase.com
> > [mailto:developers-bounces at freebase.com] On Behalf Of Ed Laurent
> > Sent: Thursday, May 28, 2009 2:06 PM
> > To: For discussions about MQL,Freebase API and apps built on Freebase
> > Subject: Re: [Developers] [Data-modeling] Data load issues
> >
> > I've flagged a few but am already feeling overwhelmed with
> > the task. For example, all these Dark Horse publishers still
> > need to be assessed for merge:
> > http://www.freebase.com/view/user/spatialed/default_domain/views/how_many_dark_horse
> >
> > Should we just flag them for potential merge? I usually only
> > flag topics for merge if I feel pretty confident about it.
> > Unfortunately, I haven't found any good online resources to
> > verify if these publishers are in fact the same topics. Are there any?
> >
> > -Ed
> >
> >
> >
> > On Thu, May 28, 2009 at 4:42 PM, Kirrily Robert
> > <kirrily at metaweb.com> wrote:
> >
> >
> >       On 28/05/2009, at 1:03 PM, Reilly Hayes wrote:
> >       >> That's kind of a chicken and egg thing since it's going to be hard
> >       >> for folks to clean up the authors without the additional
> >       >> information. Any chance you guys would consider adding a link back
> >       >> to Open Library so people can see an author's works when deciding
> >       >> whether to flag or how to vote?  It would significantly reduce the
> >       >> amount of labor involved for reviewers.
> >       >
> >       > We are not relying on the community for this cleanup.  You will not
> >       > see these in the merge queue.  We did not want to swamp the merge
> >       > queue with these.
> >       >
> >
> >       What about if the community discover them on their own and flag them
> >       for merge, though?  I think that's the situation we're dealing with
> >       here.
> >
> >
> >       K.
> >
> >       --
> >       Kirrily Robert
> >       Freebase Community Director
> >       kirrily at metaweb.com
> >
> >
> >
> >       _______________________________________________
> >
> >       Developers mailing list
> >       Developers at freebase.com
> >       http://lists.freebase.com/mailman/listinfo/developers
> >
> >
> >
> >
>