Chapter 7. Index Optimization

From time to time, the Lucene index needs to be optimized. The process is essentially a defragmentation. Until an optimization is triggered Lucene only marks deleted documents as such, no physical deletions are applied. During the optimization process the deletions will be applied which also effects the number of files in the Lucene Directory.

Optimising the Lucene index speeds up searches but has no effect on the indexation (update) performance. During an optimization, searches can be performed, but will most likely be slowed down. All index updates will be stopped. It is recommended to schedule optimization:

7.1. Automatic optimization

Hibernate Search can automatically optimize an index after:

  • a certain amount of operations (insertion, deletion)

  • or a certain amout of transactions

The configuration for automatic index optimization can be defined on a global level or per index:

Example 7.1. Defining automatic optimization parameters

hibernate.search.default.optimizer.operation_limit.max = 1000
hibernate.search.default.optimizer.transaction_limit.max = 100
hibernate.search.Animal.optimizer.transaction_limit.max = 50

An optimization will be triggered to the Animal index as soon as either:

  • the number of additions and deletions reaches 1000

  • the number of transactions reaches 50 (hibernate.search.Animal.optimizer.transaction_limit.max having priority over hibernate.search.default.optimizer.transaction_limit.max)

If none of these parameters are defined, no optimization is processed automatically.

7.2. Manual optimization

You can programmatically optimize (defragment) a Lucene index from Hibernate Search through the SearchFactory:

Example 7.2. Programmatic index optimization

FullTextSession fullTextSession = Search.getFullTextSession(regularSession);
SearchFactory searchFactory = fullTextSession.getSearchFactory();

searchFactory.optimize(Order.class);
// or
searchFactory.optimize();

The first example optimizes the Lucene index holding Orders; the second, optimizes all indexes.

Note

searchFactory.optimize() has no effect on a JMS backend. You must apply the optimize operation on the Master node.

7.3. Adjusting optimization

Apache Lucene has a few parameters to influence how optimization is performed. Hibernate Search exposes those parameters.

Further index optimisation parameters include:

  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].max_buffered_docs
  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].max_field_length
  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].max_merge_docs
  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].merge_factor
  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].ram_buffer_size
  • hibernate.search.[default|<indexname>].indexwriter.[batch|transaction].term_index_interval

See Section 3.8, “Tuning Lucene indexing performance” for more details.