[Data-modeling] Chemistry modeling questions

Bryan Cheung bryan.cheung at metaweb.com
Thu Feb 12 18:54:20 UTC 2009


https://bugs.freebase.com/browse/FREEBASE-390

/user/davidar has graciously offered to load chemistry data from the
Blue Obelisk Data Repository (BODR).  This will bring some additions
and changes to the /chemistry/chemical_element and /chemistry/isotope
schemas, and I would like community feedback on how some of them
should be modeled.  In particular, how would the community like to see
these areas modeled:

 > Elements
 > It looks like WP (the original source of the data for the
 > atomic_mass property) uses IUPAC weight. We could update that
 > property with your data. As for uncertainty, there are two
 > proposals - we could add another int property to represent
 > uncertainty, or we can change the ect of atomic_mass to a new
 > CVT type with atomic mass and uncertainty as properties.
Personally I'm in favour of using a CVT. Also, the dataset provides
uncertainties in both concise (1.234(5)) and extended (1.234 +- 0.005)
notation, so to keep things consistent I've converted all instances of
the former to the latter, as I felt that this was the more robust form.
The only catch is that the extended form is going to be affected by
bug CLI-4191. I'm interested to hear your thoughts on the issue.
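
For anyone who wants to check the concise-to-extended conversion, here
is a minimal Python sketch of how it could be done. It is not the script
that was actually used for the BODR data; the function names
(parse_concise, to_extended) are made up for illustration, and the
second input value is only a sample. It assumes the bracketed digits
give the uncertainty in units of the last quoted digits of the value.

```python
import re
from decimal import Decimal

# Matches concise notation such as "1.234(5)": a decimal value followed by
# the uncertainty in units of the last quoted digit(s).
_CONCISE = re.compile(r"(-?\d+(?:\.(\d+))?)\((\d+)\)")


def parse_concise(text):
    """Return (value, uncertainty) as Decimals for a string like '1.234(5)'."""
    match = _CONCISE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in concise notation: {text!r}")
    value_text, fraction_digits, uncertainty_digits = match.groups()
    decimals = len(fraction_digits or "")
    # Shift the bracketed digits so they line up with the value's last digits.
    uncertainty = Decimal(uncertainty_digits).scaleb(-decimals)
    return Decimal(value_text), uncertainty


def to_extended(text):
    """Render concise notation in the extended 'value +- uncertainty' form."""
    value, uncertainty = parse_concise(text)
    return f"{value:f} +- {uncertainty:f}"


print(to_extended("1.234(5)"))    # 1.234 +- 0.005
print(to_extended("1.00794(7)"))  # 1.00794 +- 0.00007
```

Decimal is used rather than float so that the digits come out exactly
as written. With a CVT, the two values returned by parse_concise would
map onto the atomic mass and uncertainty properties.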

 > - electron configuration
 > Unsure of how to represent this in FB. Rawstring?
I'm not sure. Taking oxygen ("He 2s2 2p4") as an example, would it be
best to leave it as a rawstring, to convert it to a CVT with a link to
helium and the rawstring "2s2 2p4", or even to expand the noble gas,
leaving "1s2 2s2 2p4"?
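
To make the second and third options concrete, here is a rough Python
sketch. The table and function names (NOBLE_GAS_CORES, split_core,
expand_configuration) are hypothetical and not part of any Freebase or
BODR tooling. It assumes the leading token is a noble-gas symbol,
optionally written in square brackets.

```python
# Each noble-gas core is written in terms of the previous one, so the table
# stays short; expansion follows the chain back to "1s2".
NOBLE_GAS_CORES = {
    "He": "1s2",
    "Ne": "He 2s2 2p6",
    "Ar": "Ne 3s2 3p6",
    "Kr": "Ar 3d10 4s2 4p6",
    "Xe": "Kr 4d10 5s2 5p6",
    "Rn": "Xe 4f14 5d10 6s2 6p6",
}


def split_core(config):
    """Split 'He 2s2 2p4' into ('He', '2s2 2p4'), as a CVT might store it.

    Returns (None, config) when there is no noble-gas prefix.
    """
    tokens = config.split()
    if tokens and tokens[0].strip("[]") in NOBLE_GAS_CORES:
        return tokens[0].strip("[]"), " ".join(tokens[1:])
    return None, " ".join(tokens)


def expand_configuration(config):
    """Replace a leading noble-gas symbol with its full configuration."""
    core, rest = split_core(config)
    if core is None:
        return rest
    expanded_core = expand_configuration(NOBLE_GAS_CORES[core])
    return f"{expanded_core} {rest}".strip()


print(split_core("He 2s2 2p4"))            # ('He', '2s2 2p4')
print(expand_configuration("He 2s2 2p4"))  # 1s2 2s2 2p4
```

Since the expanded form can be derived from the core plus the
remainder, storing the CVT form would not lose anything relative to the
fully expanded string.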

 > Isotopes
 > - spin
 > Should this be represented as a unique rawstring? Examples include
 > 1/2+, 0+, 1+, 3/2-
Either that or we could create a spin type and link to instances of  
that, but I'm really not sure.
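
If a structured form is preferred over a rawstring, the example values
above split cleanly into a spin value and a parity sign. A small Python
sketch, assuming every value looks like "<integer or fraction><+ or ->"
(the Spin type and parse_spin name are made up here, and real data may
contain forms this does not handle):

```python
import re
from fractions import Fraction
from typing import NamedTuple


class Spin(NamedTuple):
    value: Fraction  # spin quantum number, e.g. 3/2
    parity: int      # +1 for "+", -1 for "-"


_SPIN = re.compile(r"(\d+(?:/\d+)?)([+-])")


def parse_spin(text):
    """Parse strings such as '1/2+' or '0+' into a Spin tuple."""
    match = _SPIN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised spin string: {text!r}")
    value = Fraction(match.group(1))
    parity = 1 if match.group(2) == "+" else -1
    return Spin(value, parity)


for raw in ["1/2+", "0+", "1+", "3/2-"]:
    print(raw, parse_spin(raw))
```

Either way the rawstring stays recoverable, so a spin type could be
keyed on the original string and still expose the two parts.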

Feedback is appreciated.

Bryan

