[Data-modeling] Library of Congress and Dewey Classifications
Praveen Paritosh
paritosh at metaweb.com
Tue Mar 18 21:19:17 UTC 2008
Jeff Prucher wrote:
>
> After poking around a bit more, it does seem like the LoC data is less
> completely weird than the Dewey data. What variation I've found (and from an
> admittedly small sample) is that sometimes the entire code is different at
> different libraries, but within that variation, the codes are stable at the
> edition level. E.g., for a 1976 edition of "The Wealth of Nations", one
> library has "AC7 .S59 1976" and another has "HB161 .S65 1976", but I didn't
> see any with something like "HB161 .S64". So maybe LC classification should
> stay on the book edition.
>
> The codes, regardless of how we do this, should definitely be
> machine-readable strings, rather than text strings (as they are now). I'm
> not sure about modeling the entirety of Dewey/LoC classifications as topics
> -- we could do it (modulo any copyright issues -- I haven't looked into
> that), but I wonder which would be more useful -- a simple string or a
> topic?
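As a rough illustration of what "machine-readable" could mean for LC
codes, here is a minimal sketch that splits a call number like the two
quoted above into parts. The regular expression, the field names, and
the parse_lc helper are assumptions made for this example, not an
existing schema or library API, and the pattern covers only the simple
"class, one Cutter, optional year" shape seen in those two samples.

```python
import re
from typing import NamedTuple, Optional


class LCCallNumber(NamedTuple):
    """Hypothetical structured form of a simple LC call number."""
    class_letters: str       # e.g. "HB"
    class_number: str        # e.g. "161"
    cutter: str              # e.g. "S65"
    year: Optional[str]      # e.g. "1976", or None if absent


# Assumed shape: letters, digits (optionally decimal), ".Cutter", optional year.
_LC_PATTERN = re.compile(
    r"^(?P<letters>[A-Z]{1,3})"
    r"(?P<number>\d+(?:\.\d+)?)"
    r"\s*\.(?P<cutter>[A-Z]\d+)"
    r"(?:\s+(?P<year>\d{4}))?$"
)


def parse_lc(code: str) -> Optional[LCCallNumber]:
    """Return the parsed parts, or None if the code does not fit the pattern."""
    m = _LC_PATTERN.match(code.strip())
    if m is None:
        return None
    return LCCallNumber(m["letters"], m["number"], m["cutter"], m["year"])


a = parse_lc("AC7 .S59 1976")
b = parse_lc("HB161 .S65 1976")
```

With the parts separated, two libraries' codes for the same edition can
be compared field by field (here the class and Cutter differ while the
year agrees) instead of as opaque text.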
For the 1,000 top-level Dewey Decimal Classification (DDC) numbers, we
have both a number and a label[1], e.g.,
121 Epistemology (Theory of knowledge)
For the vast majority of the rest (we currently have 170,000 DDC
numbers that we have not loaded), we do not know the corresponding
labels, which seem to be copyrighted information that OCLC[2] provides.
However, because we have alternate subject information for those books,
one can map between /book/written_work/subjects and DDC numbers. The
subjects themselves are topics, so it seems unnecessary to model DDC as
topics.
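A minimal sketch of how such a mapping might be derived, assuming each
book record carries a DDC number and a list of subject names. The
record shape, the function names, and the sample rows are invented for
illustration; they are not loaded data or an existing API.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (ddc_number, subject names for that book)
BookRecord = Tuple[str, List[str]]


def subjects_by_ddc(books: Iterable[BookRecord]) -> Dict[str, Counter]:
    """Count, for each DDC number, how many books carry each subject."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for ddc, subjects in books:
        # set() so a subject repeated on one book is counted once
        counts[ddc].update(set(subjects))
    return counts


def top_subjects(counts: Dict[str, Counter], ddc: str, n: int = 3) -> List[str]:
    """Return up to n subjects most often seen with the given DDC number."""
    return [subject for subject, _ in counts.get(ddc, Counter()).most_common(n)]


# Invented sample rows, for illustration only.
sample = [
    ("121", ["Epistemology"]),
    ("121", ["Epistemology", "Philosophy"]),
    ("330", ["Economics"]),
]
counts = subjects_by_ddc(sample)
best = top_subjects(counts, "121", n=1)
```

The DDC number stays a plain string key here; the subjects it maps to
are what would link to topics.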
[1] http://www.tnrdlib.bc.ca/dewey.html
[2] http://www.oclc.org/dewey/
Thanks,
Praveen.