[Developers] Attribution/Citation best practices?

Brendan Neutra brendan at metaweb.com
Wed Apr 29 16:12:49 UTC 2009


+1. Clearly, the trend has been toward reducing the barrier to, em, data entry (witness Typewriter, Genderizer). The issue of reconciling manually entered data concerns me. I like your lightweight suggestion for attributing Wikipedia (vertical data entry apps could be even better at this). And yes, care is taken for bulk data loads done by the Freebase staff. Though we may not want to require a heavyweight attribution process at this stage, whatever process we are using should be documented and made available for those who are eager to use it.

brendan

----- Original Message -----
From: "Tom Morris" <tfmorris at gmail.com>
To: "For discussions about MQL, Freebase API and apps built on Freebase" <developers at freebase.com>
Sent: Wednesday, April 29, 2009 8:58:40 AM GMT -08:00 US/Canada Pacific
Subject: Re: [Developers] Attribution/Citation best practices?

So I take it that the lack of response means that nothing like this
exists, at least in written form.  Is it seen as something useful and
worth developing or is the assumption that Metaweb will do all the
bulk data loads and can deal with this type of thing as an internal
matter?

Tom

On Thu, Mar 19, 2009 at 4:14 PM, Tom Morris <tfmorris at gmail.com> wrote:
> Data provenance is key to almost any type of research and good
> citations/attribution are what allow a researcher to follow the
> provenance back and figure out whether or not they should trust a
> piece of information.
>
> In looking at what the Bulk Data Operations do, it appears that there
> is some underlying machinery in place to do most or all of what's needed,
> but I haven't seen any documentation or recommended "best practices"
> for applications which are doing data loading.
>
> What should we be doing to make sure that the data we load is
> attributed properly?  While it's fun having big numbers against my
> name on the leaderboard, I'd really like to be able to point at the
> external data source that was used and include things like the date it
> was accessed, etc.  Is there a little cookbook entry someplace on how
> to do this?
>
> There's a similar issue for individual data entry.  I bet a lot of the
> structured data is coming from people transcribing Wikipedia blurbs,
> yet there's no way for them to indicate this, or the date of the
> Wikipedia article, as they do the data entry.  A little UI flag that
> they could click to say "I'm transcribing from Wikipedia" might help
> resolve conflicting information later.
>
> Tom
>