[Data-modeling] Animal breeds and owners
Paul Houle
paul at ontology2.com
Fri Feb 6 22:34:59 UTC 2009
John Giannandra wrote:
> Paul, that's a nice site.
>
>
Thanks. It's actually got a lot of problems. For one thing, it's
really a modified installation of WordPress, and it's reached a point
where the navigation mechanisms have broken down. The head end could
probably add 50,000 more pictures of birds, reptiles, fish, you name
it, in a week, but then the site would become completely impossible to use.

We're working on a second-generation publishing platform for similar
kinds of sites (on different topics), and I'll be launching the first
of the new sites soon. One of these days the animals will get ported to
the new system.
> W.r.t. ITIS numbers, we have about 40K ITIS TSNs mapped in Freebase (and about 23K NCBI numbers).
> So you can view something like:
> http://freebase.com/view/biology/itis/180528
>
> And you can download the data directly at
> http://download.freebase.com/datadumps/2009-01-13/browse/biology/organism_classification.tsv
>
> There are currently ~84K organism classifications in Freebase, and the 40K mostly reflects the overlap between ITIS and Wikipedia when we last did the import. There has been some discussion about importing all of ITIS even though we don't have blurbs and images for most of those taxa. Let me know if that would be useful for your project, or if there is particular data that you would like to contribute or see contributed to Freebase.
>
We did an alignment between ITIS and Wikipedia for Animalphotos.
There are errors in both ITIS and Wikipedia, and of course there is
some disagreement between taxonomists. My (inexpert) opinion from
looking at the diffs is that Wikipedia tends to be more reflective of
recent thinking, but I've found some places where Wikipedia is wrong.
Animalphotos uses ITIS as the skeleton for the animal taxonomy because
the integrity of the tree is better: there was funkiness in the tree we
extracted from Wikipedia that we didn't want to deal with.
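
To make the comparison concrete, here is a minimal Python sketch of the kind of diff and tree-integrity check described above. It is not the actual Animalphotos code: the function names, the child-to-parent dictionary layout, and the idea of keying both sources by a shared taxon name are assumptions made purely for illustration.

```python
def diff_taxonomies(skeleton, other):
    """Compare two child -> parent maps keyed by a shared taxon name.

    Returns (disagreements, only_in_other):
      disagreements: name -> (parent in skeleton, parent in other)
      only_in_other: sorted names that the skeleton does not know about
    """
    shared = skeleton.keys() & other.keys()
    disagreements = {
        name: (skeleton[name], other[name])
        for name in sorted(shared)
        if skeleton[name] != other[name]
    }
    only_in_other = sorted(other.keys() - skeleton.keys())
    return disagreements, only_in_other


def has_valid_tree(parents, root):
    """Return True if every node reaches `root` by following parent links.

    A cycle, or a parent that is neither the root nor itself a key in
    `parents`, counts as broken tree integrity.
    """
    for start in parents:
        node = start
        seen = set()
        while node != root:
            if node in seen or node not in parents:
                return False
            seen.add(node)
            node = parents[node]
    return True
```

A check like `has_valid_tree` is one way to decide which source is safe to use as the skeleton; `diff_taxonomies` then surfaces the places where the two sources disagree so they can be reviewed by hand.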
Recently we've built a taxonomy of car makes and models based on
government databases and Wikipedia. It identifies many car models that
were not tagged as car models in Freebase the last time I looked. Our
process starts with a set of objects that have a known identity,
expands it to a larger set where the precision isn't so good, and then
uses various filtering processes to clean up the taxonomy. I think
we'll have something acceptable after the next processing phase -- we'd
be happy to contribute data of this sort to freebase.
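
The seed, expand, filter process can be sketched roughly as follows. This is only an illustration under stated assumptions, not our production pipeline: the function name, the `links` dictionary, the single expansion hop, and the predicate-style filters are all hypothetical.

```python
def seed_expand_filter(seeds, links, filters):
    """Grow a trusted seed set, then prune the noisy additions.

    seeds:   set of identifiers whose identity is already known
    links:   dict mapping an identifier to related identifiers
             (assumed to come from some lower-precision source)
    filters: list of predicates; a non-seed candidate is kept only
             if every predicate returns True for it
    """
    # Expansion: one hop out from each seed. Recall goes up, precision drops.
    candidates = set(seeds)
    for seed in seeds:
        candidates.update(links.get(seed, ()))

    # Filtering: seeds are trusted as-is; everything else must pass all checks.
    return {
        c for c in candidates
        if c in seeds or all(check(c) for check in filters)
    }
```

In practice the filtering step would likely be several passes with different rules, which is why the sketch takes a list of predicates rather than a single test.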