[Developers] Attribution/Citation best practices?
Tom Morris
tfmorris at gmail.com
Thu Mar 19 20:14:31 UTC 2009
Data provenance is key to almost any type of research, and good
citations/attribution are what allow a researcher to follow the
provenance back and figure out whether or not they should trust a
piece of information.
In looking at what the Bulk Data Operations do, it appears that there
is some underlying machinery in place to do most or all of what's needed,
but I haven't seen any documentation or recommended "best practices"
for applications which are doing data loading.
What should we be doing to make sure that the data we load is
attributed properly? While it's fun having big numbers against my
name on the leaderboard, I'd really like to be able to point at the
external data source that was used and include things like the date it
was accessed, etc. Is there a little cookbook entry someplace on how
to do this?
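
To make the question concrete, here is a minimal sketch of the kind of
attribution record I have in mind. Everything in it is hypothetical: the
class, the field names, and the helper are made up for illustration, and
none of it is an existing API of the Bulk Data Operations.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceAttribution:
    """Hypothetical description of where a batch of loaded data came from."""
    source_name: str             # human-readable name of the external source
    source_url: str              # where the data was obtained
    accessed_on: date            # date the source was accessed
    loaded_by: str               # account that performed the load
    note: Optional[str] = None   # free-form remarks, e.g. dump version


def attach_attribution(records, attribution):
    """Return copies of the records, each carrying the same attribution."""
    info = asdict(attribution)
    # Store the date as an ISO 8601 string so the record serializes cleanly.
    info["accessed_on"] = attribution.accessed_on.isoformat()
    return [{**record, "attribution": info} for record in records]


# Example with placeholder values (not a real source or account).
attribution = SourceAttribution(
    source_name="Example Gazetteer",
    source_url="http://example.org/gazetteer",
    accessed_on=date(2009, 3, 19),
    loaded_by="example_user",
)
rows = attach_attribution([{"name": "Springfield"}], attribution)
```

The point is only that the source and the access date travel with every
loaded record, so someone can follow the provenance back later. How this
would map onto whatever machinery already exists is exactly what I am
asking about.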
There's a similar issue for individual data entry. I bet a lot of the
structured data is coming from people transcribing Wikipedia blurbs,
yet there's no way for them to indicate this, or the date of the
Wikipedia article, as they do the data entry. A little UI flag that
they could click to say "I'm transcribing from Wikipedia" might help
resolve conflicting information later.
Tom