atom feed8 messages in org.apache.lucene.java-userRe: searchWithFilter bug?
FromSent OnAttachments
Peter KeeganDec 4, 2009 7:32 am 
Michael McCandlessDec 4, 2009 7:38 am 
Peter KeeganDec 4, 2009 7:47 am 
Simon WillnauerDec 4, 2009 8:01 am 
Peter KeeganDec 4, 2009 8:27 am 
Simon WillnauerDec 4, 2009 9:53 am 
Michael McCandlessDec 4, 2009 10:09 am 
Simon WillnauerDec 4, 2009 10:29 am 
Subject:Re: searchWithFilter bug?
From:Peter Keegan (pete@gmail.com)
Date:Dec 4, 2009 8:27:03 am
List:org.apache.lucene.java-user

The filter is just a java.util.BitSet. I use the top level reader to create the filter, and call IndexSearcher.search (Query, Filter, HitCollector). So, there is no 'docBase' at this level of the api.

Peter

On Fri, Dec 4, 2009 at 11:01 AM, Simon Willnauer < simo@googlemail.com> wrote:

Peter, which filter do you use, do you respect the IndexReaders maxDoc() and the docBase?

simon

On Fri, Dec 4, 2009 at 4:47 PM, Peter Keegan <pete@gmail.com> wrote:

I think the Filter's docIdSetIterator is using the top level reader for each segment, because the cardinality of the DocIdSet from which it's created is the same for all readers (and what I expect to see at the top level.

On Fri, Dec 4, 2009 at 10:38 AM, Michael McCandless < luc@mikemccandless.com> wrote:

That doesn't sound good.

Though, in searchWithFilter, we seem to ask for the Query's scorer, and the Filter's docIdSetIterator, using the same reader (which may be toplevel, for the legacy case, or per-segment, for the normal case). So I'm not [yet] seeing where the issue is...

Can you boil it down to a smallish test case?

Mike

On Fri, Dec 4, 2009 at 10:32 AM, Peter Keegan <pete@gmail.com> wrote:

I'm having a problem with 'searchWithFilter' on Lucene 2.9.1. The Filter wraps a simple BitSet. When doing a 'MatchAllDocs' query with this filter, I get only a subset of the expected results, even accounting for

deletes.

The

index has 10 segments. In IndexSearcher->searchWithFilter, it looks

like

the

scorer is advancing to the filter's docId, which is the index-wide value, but the scorer is using the segment-relative value. If I optimize the index, I get the expected results. Does this look like a bug?