[Developers] Problem writing floating point value twice (or "What value of epsilon does MQL use?")
Warren Harris
warren at metaweb.com
Wed Jul 29 19:59:26 UTC 2009
Aside from the epsilon comparison issue, there is probably a bug here
in that our python layer will often munge the number of significant
digits before it ever gets to our database where the comparison is
performed. I plan to fix this in the next major release of MQL.
Warren
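A minimal sketch of how digits could get lost before the comparison, assuming (a guess; nothing in this thread confirms it) that some layer formats the number with about 12 significant digits, which is what Python 2's str() did for floats. The helper name format_12g is made up for illustration.

```python
# Hypothetical illustration only: format_12g is not part of MQL.
# Formatting with 12 significant digits drops the trailing digits.
def format_12g(x):
    return "%.12g" % x

new_value = 0.0012141000000000003
stored = format_12g(new_value)
round_tripped = float(stored)

print(stored)                                # 0.0012141
print(round_tripped == new_value)            # False: the two doubles differ
print(float(repr(new_value)) == new_value)   # True: repr() round-trips
```

If something like this happens on the write path but not on the read path (or the other way round), the same input will not compare equal to itself on a second write.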
On Jul 29, 2009, at 12:17 PM, Scott Meyer wrote:
> Tom Morris wrote:
>> I guess maybe I should define my use of "epsilon." In any given
>> floating point representation, not all numbers can be represented
>> exactly, even if they are in range. Because of this and the way
>> processors and software libraries do calculations, rounding, etc.,
>> it's bad practice to check for exact equality of floating point
>> numbers by comparing their binary representation. Instead, the difference
>> between the two numbers is calculated and compared to some small
>> value, often referred to as "epsilon," and if the difference is less
>> than this value, the numbers are considered equal.
>>
>> It would appear that MQL is not using this technique or perhaps the
>> value of epsilon is set too low to accommodate variability introduced
>> by the software stack and the various conversions that are done.
>>
>> Is my interpretation accurate (and thus this should be filed as a
>> bug), or is something else going on here?
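The epsilon comparison Tom describes can be sketched roughly as follows. This version uses a relative tolerance instead of one fixed epsilon, so the threshold scales with the size of the numbers; the function name nearly_equal and the tolerance values are assumptions for illustration, not anything MQL is known to use.

```python
import math

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Return True if a and b differ by no more than a tolerance ("epsilon")."""
    # Scale the tolerance by the larger magnitude so it works across ranges.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

a = 0.0012141000000000003
b = 0.0012141

print(a == b)              # False: exact comparison of the two doubles
print(nearly_equal(a, b))  # True: they differ by far less than the tolerance
print(math.isclose(a, b))  # True: standard library equivalent (Python 3.5+)
```

A fixed absolute epsilon is simpler but has to be chosen per property, since a threshold that suits areas in square kilometres may be far too coarse or too fine for other units.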
>
> In the database, we store floating point values as strings so there's
> no representational limit. Equality means "exactly equal," not "close
> to".
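A small sketch of what exact equality on string-stored values implies. It assumes the values are held as decimal strings as described above; how the database compares them internally is not spelled out in this thread, so plain string comparison and Python's Decimal are only stand-ins.

```python
from decimal import Decimal

# Assumption: values are kept as decimal strings, digits exactly as written.
stored = "0.0012141"
incoming = "0.0012141000000000003"

print(stored == incoming)                    # False: different digit strings
print(Decimal(stored) == Decimal(incoming))  # False: different exact values
print(str(Decimal(incoming)))                # every digit is preserved
```

Under exact equality, any extra digit the client sends makes the value a different value, whatever precision the client meant.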
>
>> On Tue, Jul 28, 2009 at 10:20 AM, Tom Morris<tfmorris at gmail.com>
>> wrote:
>>> If I run this query twice
>>>
>>> [{"guid": "#9202a8c04000641f80000000087c7629",
>>>   "type": "/location/location",
>>>   "area": {
>>>     "connect": "insert",
>>>     "value": 0.0012141000000000003
>>>   }}]
>>>
>>> The second run will return the following error
>>>
>>>   "info": {
>>>     "key": "value",
>>>     "newvalue": 0.0012141000000000003,
>>>     "value": 0.0012141
>>>   },
>>>   "message": "Found existing value for unique property, try update",
>>>
>>> where I'd expect it to return "present."
>>>
>>> Is there any way around this behavior other than pre-reading and
>>> comparing myself (doubling the latency)?
>
> Case 1: your area computation really is accurate to 18 significant
> digits
>
> I think that we're doing exactly the right thing. Picking some
> general purpose epsilon is virtually guaranteed to make the guy
> who has lovingly hand-crafted a fixed point area computation which is
> really accurate to 18 SD apoplectic with rage.
>
> Case 2: your area computation isn't really that accurate
>
> How about clamping to a modest 5 SD?
>
> I suppose we could come up with guidelines for how many significant
> digits should go into particular properties; even 5 seems like
> overkill for the purposes of describing real estate.
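Clamping to a fixed number of significant digits on the client side, before writing, might look like the sketch below. The helper name clamp_sig is hypothetical, and the default of 5 digits simply follows the suggestion above.

```python
def clamp_sig(x, digits=5):
    """Round x to the given number of significant digits."""
    # "%.*g" formats with `digits` significant digits; float() parses it back.
    return float("%.*g" % (digits, x))

first = clamp_sig(0.0012141000000000003)
second = clamp_sig(0.0012141)

print(first)            # 0.0012141
print(first == second)  # True: both writes now carry the same value
```

Applying the same clamp to every write of a property means a repeated insert sends identical digits each time, so exact equality in the database is enough.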
>
> -Scott
> _______________________________________________
> Developers mailing list
> Developers at freebase.com
> http://lists.freebase.com/mailman/listinfo/developers