[Developers] RFE: fuzzy search

Andi Vajda vajda at metaweb.com
Fri Jul 18 15:38:14 UTC 2008


On Fri, 18 Jul 2008, Shug Boabby wrote:

> Excellent!
>
> All the examples I have seen of this API are URL based... is there a
> way to send JSON queries? Specifically, I want to get the Wikipedia ID
> in the response.

I'm not sure I understand your question correctly:

If you're asking about passing all the parameters to the Lucene query as JSON:

   Not currently, no. I could add support for another parameter that takes a
   JSON struct of all the query parameters. From Python, this is not
   really needed since you can use urllib2.urlopen(url, urlencode(query)).
   urlencode() (from the urllib module) takes your dict and converts it to
   a string of URL-encoded key/value pairs.
   But you're not the first one to request this. I can certainly add support
   for it if there is consensus around a reasonable use case.
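
   A minimal sketch of the urlencode() approach, written against the
   Python 3 standard library (urllib.parse and urllib.request) rather than
   urllib2. The parameter names "query" and "limit" are assumptions made
   for illustration, not a confirmed list of what the search service
   accepts, and the request is only built here, not sent:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Search endpoint as mentioned in this thread.
SEARCH_URL = "http://www.freebase.com/api/service/search"


def build_search_request(params):
    """Turn a dict of parameters into a form-encoded request.

    Passing data= makes urllib issue a POST, which mirrors
    urllib2.urlopen(url, urlencode(query)) in Python 2.
    """
    body = urlencode(params).encode("ascii")
    return Request(SEARCH_URL, data=body)


# "query" and "limit" are hypothetical parameter names.
params = {"query": "smashin pumpkins", "limit": 5}
req = build_search_request(params)

print(req.get_method())
print(req.data.decode("ascii"))
```

   To actually send it, pass req to urllib.request.urlopen() and read the
   response body.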

If you're asking about retrieving the Wikipedia ID:

  Almost. The new relevance search server, currently under development, makes
  it possible to pass an MQL query that is run against the ids returned by
  the Lucene query. By default, a canned MQL query is used. With the new
  server you can replace it with your own to retrieve extra information such
  as the wiki id (which the Lucene index doesn't have; it only returns topic
  guids and a score).

Andi..

>
> 2008/7/18 Andi Vajda <vajda at metaweb.com>:
>>
>> On Fri, 18 Jul 2008, Andi Vajda wrote:
>>
>>> On Fri, 18 Jul 2008, Shug Boabby wrote:
>>>
>>>> This RFE got lost in one of my posts from a few days ago. Copied here.
>>>>
>>>> It would be awesome to be able to send off a "fuzzy text match" query.
>>>> I tend to use combinations of "a:key~=" etc, but it would be really
>>>> really good to be able to basically do a search engine style search
>>>> across the keys that can perhaps do simple things like stemming of
>>>> words, removal of stop words, rearranging words, minor spelling
>>>> corrections, alternative words and arbitrary dropping of some words in
>>>> longer queries... like one would expect from a search bar. I believe
>>>> Apache Lucene does much of this. As a solid example, I would like it
>>>> if a search for "smashin pumpkins" [sic] would return "The Smashing
>>>> Pumpkins", or if a search for "The Queen of England" would include
>>>> "Elizabeth_II_of_the_United_Kingdom" along with
>>>> "Elizabeth_I_of_England".
>>>
>>> The Freebase search server is Lucene-based and implements a lot of this
>>> already. See http://www.freebase.com/api/service/search for more
>>> information.
>>
>> Sorry, that URL should be:
>> http://www.freebase.com/view/guid/9202a8c04000641f8000000006ad84c9
>>
>> Andi..
>> _______________________________________________
>> Developers mailing list
>> Developers at freebase.com
>> http://lists.freebase.com/mailman/listinfo/developers
>>
>

