| Subject: | Re: FST and FieldCache? | |
|---|---|---|
| From: | Earwin Burrfoot (ear...@gmail.com) | |
| Date: | May 19, 2011 1:24:25 am | |
| List: | org.apache.lucene.java-dev | |
You cannot get a string out of an automaton by its ordinal without storing additional data. A string is stored there not as a single arc but as a sequence of arcs (basically.. err.. as a string), so a reference to it amounts to writing the string out as is. The space savings come from sharing arcs between strings.
It is possible, though, if you associate an additional number with each node. (I invented a way to do it, shared it with Mike and forgot.. good grief :/)
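For illustration only, and not necessarily the scheme I had in mind back then: one commonly described way to use a per-node number is to store, at each node, how many terms end at or below it. The sketch below does this on a plain, unminimized trie to keep it short; the same counting idea carries over to a minimized automaton, because a shared node accepts the same set of suffixes no matter how you reach it. The class `OrdinalTrie` and its methods `termAt` and `ordOf` are made-up names, not Lucene API.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a trie whose nodes carry subtree term counts. */
public class OrdinalTrie {
    private static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>();
        boolean isFinal; // true if a term ends exactly at this node
        int count;       // number of terms ending at or below this node
    }

    private final Node root = new Node();

    /** Adds a term; duplicates are skipped so the counts stay exact. */
    public void add(String term) {
        if (contains(term)) return;
        Node node = root;
        node.count++;
        for (char c : term.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
            node.count++;
        }
        node.isFinal = true;
    }

    public boolean contains(String term) {
        Node node = root;
        for (char c : term.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return false;
        }
        return node.isFinal;
    }

    public int size() {
        return root.count;
    }

    /** Ordinal to string: returns the term with the given rank in sorted order. */
    public String termAt(int ord) {
        if (ord < 0 || ord >= root.count) {
            throw new IndexOutOfBoundsException("ord=" + ord);
        }
        StringBuilder sb = new StringBuilder();
        Node node = root;
        while (true) {
            if (node.isFinal) {
                if (ord == 0) return sb.toString();
                ord--; // skip the term that ends at this node
            }
            // Skip whole subtrees using their counts until ord falls inside one.
            for (Map.Entry<Character, Node> e : node.children.entrySet()) {
                Node child = e.getValue();
                if (ord < child.count) {
                    sb.append(e.getKey().charValue());
                    node = child;
                    break;
                }
                ord -= child.count;
            }
        }
    }

    /** String to ordinal: returns the term's rank, or -1 if it is absent. */
    public int ordOf(String term) {
        int ord = 0;
        Node node = root;
        for (char c : term.toCharArray()) {
            if (node.isFinal) ord++; // a shorter term sorts before this one
            Node next = node.children.get(c);
            if (next == null) return -1;
            for (Node sibling : node.children.headMap(c).values()) {
                ord += sibling.count; // all terms under smaller labels sort first
            }
            node = next;
        }
        return node.isFinal ? ord : -1;
    }
}
```

With something like this, a FieldCache-style array could hold one int per document and `termAt` would rebuild the string on demand, at the cost of a walk proportional to the term length times the fan-out of each node.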
Perfect hashing, on the other hand, is like a Map<String, Integer> that accepts a predefined set of N strings and returns an int in the interval 0..N-1. By design it can't do the reverse lookup: all good perfect hashing algorithms are a lossy compression. So it's irrelevant here, huh?
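To make the "lossy" point concrete, here is a toy sketch. It is a brute-force seed search, not one of the real perfect hashing algorithms, and the class `TinyPerfectHash`, its `mix` function and the seed limit are all assumptions made up for this example. Note what the object keeps after construction: a seed and N, nothing else. There is no string data left from which a reverse lookup could be answered, and a string outside the original set still lands in some slot.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical toy: finds a seed that maps N distinct strings onto 0..N-1. */
public class TinyPerfectHash {
    private final int seed; // the only state kept besides n
    private final int n;

    /** Expects distinct terms; only practical for a handful of strings. */
    public TinyPerfectHash(String[] terms) {
        this.n = terms.length;
        for (int candidate = 0; candidate < 10_000_000; candidate++) {
            Set<Integer> used = new HashSet<>();
            boolean collisionFree = true;
            for (String t : terms) {
                if (!used.add(Math.floorMod(mix(t, candidate), n))) {
                    collisionFree = false;
                    break;
                }
            }
            if (collisionFree) {
                this.seed = candidate;
                return;
            }
        }
        throw new IllegalStateException("no seed found within the search limit");
    }

    /** Seeded string hash: FNV-style per-char step plus a final bit mixer. */
    private static int mix(String s, int seed) {
        int h = seed;
        for (int i = 0; i < s.length(); i++) {
            h = (h ^ s.charAt(i)) * 0x01000193;
        }
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Forward lookup only; a non-member string also gets some slot. */
    public int slot(String term) {
        return Math.floorMod(mix(term, seed), n);
    }
}
```

The forward direction (`slot`) works for every string in the predefined set, but nothing here can turn a slot number back into a string, which is exactly what the FieldCache use case would need.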
On Thu, May 19, 2011 at 08:53, David Smiley (@MITRE.org) <DSMI...@mitre.org> wrote:
I've been pondering how to reduce the size of FieldCache entries when there are a large number of Strings. I'd like to facet on such a field with Solr but with less memory. As I understand it, FSTs are a highly compressed representation of a set of Strings (among other possibilities). The FieldCache would need to point to an FST entry (an "arc"?) using something small, say an integer. Is there a way to point to an FST entry with an integer, and then somehow with relative efficiency construct the String from the arcs to get there?
~ David Smiley
----- Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context:
http://lucene.472066.n3.nabble.com/FST-and-FieldCache-tp2960030p2960030.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
-- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785





