Deleting Documents using a Multi-Term Query


Introduction

Deleting documents from a Lucene index is easy when you have a primary key field in your document (like in traditional SQL databases).

However, you sometimes need to delete a set of documents that match on several fields at once. The Lucene API supports this by letting you pass a query that selects the documents to delete.

To do this, pick the right Analyzer, construct the query, and pass the query to the IndexWriter to delete the documents.
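The three steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes Lucene 5.3 or later (where BooleanQuery.Builder is available) with the core and common-analyzers modules on the classpath, and the class name DeleteByQueryDemo, the field names "author" and "status", and the sample values are made up for the example.

```java
import java.nio.file.Files;

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

final class DeleteByQueryDemo {

    // Hypothetical helper: builds a document with two untokenized fields,
    // so each value is indexed verbatim as a single term.
    static Document doc(String author, String status) {
        Document d = new Document();
        d.add(new StringField("author", author, Field.Store.YES));
        d.add(new StringField("status", status, Field.Store.YES));
        return d;
    }

    public static void main(String[] args) throws Exception {
        // A temporary on-disk index; any Directory implementation would do.
        try (Directory dir = FSDirectory.open(Files.createTempDirectory("lucene-demo"));
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new KeywordAnalyzer()))) {

            writer.addDocument(doc("alice", "draft"));
            writer.addDocument(doc("alice", "published"));
            writer.addDocument(doc("bob", "draft"));
            writer.commit();

            // Both clauses are MUST, so only documents matching both fields are selected.
            // TermQuery values are not analyzed; they must equal the indexed terms exactly.
            BooleanQuery query = new BooleanQuery.Builder()
                .add(new TermQuery(new Term("author", "alice")), BooleanClause.Occur.MUST)
                .add(new TermQuery(new Term("status", "draft")), BooleanClause.Occur.MUST)
                .build();

            writer.deleteDocuments(query);
            writer.commit(); // deletions become visible to readers opened after this point

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                System.out.println("Remaining documents: " + reader.numDocs());
            }
        }
    }
}
```

Because the fields here are indexed with StringField, the analyzer passed to IndexWriterConfig does not alter them. It would matter for analyzed fields such as TextField, or when the query is built with QueryParser.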

Syntax

  1. indexWriter.deleteDocuments(multiTermQuery);
  2. Query multiTermQuery = new QueryParser("", analyzer).parse("field_name1:\"field value 1\" AND field_name2:\"field value 2\"");
  3. BooleanQuery multiTermQuery = new BooleanQuery(); multiTermQuery.add(new TermQuery(new Term("field_name1", "field value 1")), BooleanClause.Occur.MUST); multiTermQuery.add(new TermQuery(new Term("field_name2", "field value 2")), BooleanClause.Occur.MUST);
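When the query string for QueryParser is assembled from runtime values, the embedded quotes are easy to get wrong. The helper below is a small sketch in plain Java (no Lucene dependency) that produces a string of the form shown in the second syntax entry. The class and method names (QueryStrings, quote, allOf) are hypothetical. It assumes field names contain no characters that are special to the query syntax, and it escapes only the backslash and the double quote, which are the characters that need escaping inside a quoted phrase.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class QueryStrings {

    // Wraps a value in double quotes, escaping backslashes first and then quotes.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Joins field:"value" pairs with AND, in the map's iteration order.
    static String allOf(Map<String, String> fieldValues) {
        StringJoiner joiner = new StringJoiner(" AND ");
        for (Map.Entry<String, String> e : fieldValues.entrySet()) {
            joiner.add(e.getKey() + ":" + quote(e.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("field_name1", "field value 1");
        criteria.put("field_name2", "field value 2");
        // prints: field_name1:"field value 1" AND field_name2:"field value 2"
        System.out.println(allOf(criteria));
    }
}
```

The resulting string would then be passed to the parse method of a QueryParser. Building the BooleanQuery from TermQuery objects, as in the third syntax entry, avoids string escaping altogether.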

Remarks

Caveats with the Choice of Analyzer

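It's not immediately obvious, but the analyzer you use makes a big difference to how your query is matched. Analyzers transform text before it is indexed or searched: StandardAnalyzer, for example, splits text into lowercased tokens and, depending on the Lucene version and its configuration, filters out common English words like "the" and "a". If you need values to match exactly, you might want to pick a different analyzer (like KeywordAnalyzer), which treats the whole field value as a single token. The right choice depends on your application of Lucene.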
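To make the effect concrete, here is a toy model in plain Java. It does not use Lucene's analyzers. The two methods only imitate, in simplified form, a tokenizing analyzer with a stop-word list and a keyword-style analyzer, and the stop-word list itself is an assumption chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AnalyzerToy {

    // Hypothetical stop-word list, for this illustration only.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "an", "of"));

    // "Standard-like": lowercase, split on non-alphanumeric characters, drop stop words.
    static List<String> standardLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Keyword-like": the whole value is kept as one unchanged token.
    static List<String> keywordLike(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String value = "The Art of War";
        System.out.println(standardLike(value)); // prints [art, war]
        System.out.println(keywordLike(value));  // prints [The Art of War]
        // An exact term equal to the original value only exists in the keyword-like output.
        System.out.println(standardLike(value).contains(value)); // prints false
        System.out.println(keywordLike(value).contains(value));  // prints true
    }
}
```

In this model, a deletion that looks for the exact term "The Art of War" would find nothing in an index built from the standard-like tokens, which is the kind of silent mismatch to watch for when choosing an analyzer.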