| Subject: | Re: FST and FieldCache? | |
|---|---|---|
| From: | Jason Rutherglen (jaso...@gmail.com) | |
| Date: | May 19, 2011 9:35:31 am | |
| List: | org.apache.lucene.java-dev | |
And I do agree there are times when mmap is appropriate, eg if query latency is unimportant to you, but it's not a panacea and it comes with serious downsides
Do we have a benchmark of ByteBuffer vs. byte[]'s in RAM?
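The thread does not include such a benchmark. As a rough starting point, the sketch below times a sequential scan over a heap `byte[]` against the same bytes read through a `MappedByteBuffer`. The class name, the 16 MB size, and the temp-file name are all made up for illustration. It measures only the warm case, where the mapped pages are already in the OS cache, and says nothing about the page-fault latency discussed later in this message. A serious comparison would use a harness such as JMH.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Hypothetical micro-benchmark; names and sizes are illustrative assumptions.
final class HeapVsMmapRead {

    // Sum every byte of a heap-resident array.
    static long sumHeap(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Sum every byte of a buffer using absolute reads (position is not changed).
    static long sumBuffer(ByteBuffer buf) {
        long sum = 0;
        for (int i = 0; i < buf.limit(); i++) {
            sum += buf.get(i);
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        byte[] heap = new byte[16 * 1024 * 1024];
        new Random(42).nextBytes(heap);

        // Write the same bytes to a temp file so both paths read identical data.
        Path file = Files.createTempFile("fieldcache-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, heap);

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

            // Repeat a few rounds so the JIT has a chance to compile both loops.
            for (int round = 0; round < 5; round++) {
                long t0 = System.nanoTime();
                long a = sumHeap(heap);
                long t1 = System.nanoTime();
                long b = sumBuffer(mapped);
                long t2 = System.nanoTime();
                System.out.printf("round %d: byte[] %d us, mapped %d us (sums equal: %b)%n",
                        round, (t1 - t0) / 1000, (t2 - t1) / 1000, a == b);
            }
        }
    }
}
```

Timings from a loop like this vary by JVM, OS, and hardware, so treat any single run as indicative at best.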
There are also RAM-based SSDs whose performance could be comparable with, well, RAM. Also, with our heap-based field caches, the first sorted search requires that they be loaded into RAM. Then we don't unload them until the reader is closed? With MMap the unloading would happen automatically?
On Thu, May 19, 2011 at 8:59 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
On Thu, May 19, 2011 at 10:09 AM, Jason Rutherglen <jaso...@gmail.com> wrote:
When you mmap them you let the OS decide when to swap stuff out, which means you pick up potentially high query latency waiting for these pages to swap back in.
Right, however if one is using, let's say, SSDs, and the query time is less important, then MMap'ing would be fine. Also it prevents deadly OOMs in favor of basic 'slowness' of the query. If there is no performance degradation I think MMap'ing is a great option. A common case is an index that's far too large for a given server: it simply will not work today, whereas with MMap'ed field caches the query would complete, just extremely slowly. If the user wishes to improve performance it's easy enough to add more hardware.
Well, be careful: if you just don't have enough memory to accommodate all the RAM data structures Lucene needs... you're gonna be in trouble with mmap too. True, you won't hit OOMEs anymore, but instead you'll be in a swap fest and your app is nearly unusable.
SSDs, while orders of magnitude faster than spinning magnets, are still orders of magnitude slower than RAM.
But, yes, they obviously help substantially. It's a one-way door... you'll never go back once you've switched to SSDs.
And I do agree there are times when mmap is appropriate, eg if query latency is unimportant to you, but it's not a panacea and it comes with serious downsides.
I wish I could have the opposite of mmap from Java -- the ability to pin the pages that hold important data structures.
Mike
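For context on the pinning wish: the closest call in the standard Java API is `MappedByteBuffer.load()`, which makes a best-effort attempt to bring the mapped pages into physical memory. It does not pin them, and the OS may evict them again at any time, so this is a prefetch hint rather than the feature Mike asks for. A minimal sketch, with a hypothetical class name and temp file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; not part of Lucene.
final class PrefetchMapped {

    // Map a file read-only and ask the OS to page it in.
    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer mapAndPrefetch(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.load(); // best-effort prefetch; pages are NOT locked in RAM
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("prefetch-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[1024 * 1024]);

        MappedByteBuffer buf = mapAndPrefetch(file);
        // isLoaded() is itself only a hint and may be stale by the time it returns.
        System.out.println("capacity=" + buf.capacity() + " isLoaded=" + buf.isLoaded());
    }
}
```

Actually locking pages requires a native call such as POSIX `mlock`, which plain Java does not expose.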





