atom feed26 messages in org.apache.lucene.java-devRe: FST and FieldCache?
FromSent OnAttachments
Davi...@MITRE.org)May 18, 2011 9:52 pm 
Dawid WeissMay 18, 2011 11:30 pm 
Earwin BurrfootMay 19, 2011 1:24 am 
Dawid WeissMay 19, 2011 1:42 am 
Michael McCandlessMay 19, 2011 3:08 am 
Dawid WeissMay 19, 2011 3:16 am 
Michael McCandlessMay 19, 2011 3:21 am 
Dawid WeissMay 19, 2011 3:36 am 
Earwin BurrfootMay 19, 2011 5:30 am 
Michael McCandlessMay 19, 2011 5:44 am 
Dawid WeissMay 19, 2011 5:45 am 
Earwin BurrfootMay 19, 2011 6:02 am 
Robert MuirMay 19, 2011 6:04 am 
Dawid WeissMay 19, 2011 6:20 am 
Jason RutherglenMay 19, 2011 6:22 am 
Earwin BurrfootMay 19, 2011 6:29 am 
Jason RutherglenMay 19, 2011 6:36 am 
Michael McCandlessMay 19, 2011 6:40 am 
Jason RutherglenMay 19, 2011 7:08 am 
Davi...@MITRE.org)May 19, 2011 7:53 am 
Davi...@MITRE.org)May 19, 2011 8:58 am 
Michael McCandlessMay 19, 2011 8:59 am 
Jason RutherglenMay 19, 2011 9:35 am 
Michael McCandlessMay 19, 2011 9:42 am 
Earwin BurrfootMay 19, 2011 11:00 am 
Dawid WeissMay 19, 2011 11:48 am 
Subject:Re: FST and FieldCache?
From:Earwin Burrfoot (ear@gmail.com)
Date:May 19, 2011 6:02:22 am
List:org.apache.lucene.java-dev

On Thu, May 19, 2011 at 16:45, Dawid Weiss <dawi@cs.put.poznan.pl> wrote:

That's what I invented, and yes, it was invented by countless people before :)

You know I didn't mean to sound rude, right? I'm really admiring your ability to come up with these solutions by yourself, I'm merely copying other folks' ideas.

I tried to prevent another reference to mr. Daciuk :)

Anyway, the optimization you're describing is sure possible. Lucene's FST implementation can actually combine both approaches because always expanding nodes is inefficient and those already expanded will allow a binary search (assuming the automaton structure is known to the implementation). Another refinement of this idea creates a detached table (err.. index :) of states to start from inside the automaton, so that you don't have to go through the initial 2-3 states which are more or less always large and even binary search is costly there. Dawid

But you have to lookup this err..index somehow. And that's either binary or hash lookup. Where's the win?