| Subject: | Re: FST and FieldCache? | |
|---|---|---|
| From: | Earwin Burrfoot (ear...@gmail.com) | |
| Date: | May 19, 2011 6:29:45 am | |
| List: | org.apache.lucene.java-dev | |
This is more about compressing strings in the TermsIndex, I think, and the ability to use said TermsIndex directly in some cases that required FieldCache before. (Maybe FC is still needed, but it can be degraded to a docId->ord map, storing the actual strings in TI.) This yields fat space savings when we, e.g., need to both look up on a field and build facets out of it.
mmap is cool :) What I want to see is an FST-based TermsDict that is simply mmapped into memory, without building intermediate indexes, like Lucene does now. And docvalues are orthogonal to that, no?
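To make the docId->ord idea concrete, here is a minimal sketch, not Lucene code. The class `OrdFieldSketch` and its methods are hypothetical names, and a plain sorted `String[]` stands in for the terms index. Each distinct string is stored once, documents carry only an integer ordinal, and facet counting never touches the strings.

```java
import java.util.Arrays;

/** Hypothetical sketch: a per-document ordinal map over a sorted term list. */
public class OrdFieldSketch {
    // Stands in for the terms index (TI): each distinct value stored once, sorted.
    private final String[] sortedTerms;
    // The "degraded" field cache: docId -> ordinal into sortedTerms.
    private final int[] docToOrd;

    /** Assumes exactly one non-null value per document. */
    public OrdFieldSketch(String[] valuePerDoc) {
        sortedTerms = Arrays.stream(valuePerDoc).distinct().sorted().toArray(String[]::new);
        docToOrd = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // Every value is present in sortedTerms, so the result is always >= 0.
            docToOrd[doc] = Arrays.binarySearch(sortedTerms, valuePerDoc[doc]);
        }
    }

    /** Ordinal of the document's value; ordinals compare like the strings do. */
    public int ord(int docId) {
        return docToOrd[docId];
    }

    /** Resolves an ordinal back to its string via the term list. */
    public String lookup(int ord) {
        return sortedTerms[ord];
    }

    /** Facet counts per ordinal for the given matching documents. */
    public int[] facetCounts(int[] matchingDocs) {
        int[] counts = new int[sortedTerms.length];
        for (int doc : matchingDocs) {
            counts[docToOrd[doc]]++;
        }
        return counts;
    }
}
```

Strings are resolved through `lookup` only for the facet values that are actually displayed, which is where the space saving would come from if the term list were the index's own structure rather than a second copy.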
On Thu, May 19, 2011 at 17:22, Jason Rutherglen <jaso...@gmail.com> wrote:
maybe that's because we have one huge monolithic implementation
Doesn't the DocValues branch solve this?
Also, instead of trying to implement clever ways of compressing strings in the field cache, which probably won't bear fruit, I'd prefer to look at [eventually] MMap'ing (using DV) the field caches to avoid the loading and heap costs, which are significant. I'm not sure if we can easily MMap packed ints and the shared byte[], though it seems fairly doable?
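As a rough illustration of why mmapping packed ints seems doable, here is a sketch that reads fixed-width bit-packed values straight from a memory-mapped file using only the JDK. The class `MappedPackedInts` and its file layout are assumptions made for this example; this is not Lucene's packed ints format or the DocValues API.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: fixed-width packed ints read from a memory-mapped file. */
public class MappedPackedInts {
    private final ByteBuffer buf;
    private final int bitsPerValue;

    private MappedPackedInts(ByteBuffer buf, int bitsPerValue) {
        this.buf = buf;
        this.bitsPerValue = bitsPerValue;
    }

    /** Writes each value using bitsPerValue bits (1..31), most significant bit first. */
    public static void write(Path file, int[] values, int bitsPerValue) throws IOException {
        long totalBits = (long) values.length * bitsPerValue;
        byte[] out = new byte[(int) ((totalBits + 7) / 8)];
        long bitPos = 0;
        for (int v : values) {
            for (int b = bitsPerValue - 1; b >= 0; b--, bitPos++) {
                if (((v >>> b) & 1) != 0) {
                    out[(int) (bitPos >>> 3)] |= (byte) (0x80 >>> (bitPos & 7));
                }
            }
        }
        Files.write(file, out);
    }

    /** Maps the file read-only; nothing is copied onto the Java heap. */
    public static MappedPackedInts open(Path file, int bitsPerValue) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            ByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return new MappedPackedInts(mapped, bitsPerValue);
        }
    }

    /** Decodes the value at the given index bit by bit from the mapped bytes. */
    public int get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int value = 0;
        for (int b = 0; b < bitsPerValue; b++, bitPos++) {
            int bit = (buf.get((int) (bitPos >>> 3)) >>> (7 - (int) (bitPos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }
}
```

The bit-at-a-time decoding is kept simple for clarity; a real implementation would read whole words. The point is only that random access needs nothing beyond absolute reads on the mapped buffer, so there is no load step and no heap copy.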
On Thu, May 19, 2011 at 6:05 AM, Robert Muir <rcm...@gmail.com> wrote:
2011/5/19 Michael McCandless <luc...@mikemccandless.com>:
Of course, for certain apps that perf hit is justified, so probably we should make this an option when populating field cache (ie, in-memory storage option of using an FST vs using packed ints/byte[]).
or should we actually try to have different fieldcacheimpls?
I see all these missions to refactor the thing, which always fail.
maybe that's because we have one huge monolithic implementation.
-- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785





