3.8. Tuning Lucene indexing performance


Hibernate Search lets you tune Lucene indexing performance through a set of parameters that are passed to the underlying Lucene IndexWriter, such as mergeFactor, maxMergeDocs and maxBufferedDocs. You can specify these parameters as default values that apply to all indexes, on a per-index basis, or even per shard.
There are two sets of parameters allowing for different performance settings depending on the use case. During indexing operations triggered by database modifications, the parameters are grouped by the transaction keyword:
hibernate.search.[default|<indexname>].indexwriter.transaction.<parameter_name>
When indexing occurs via FullTextSession.index() (see Chapter 6, Manual indexing), the properties used are those grouped under the batch keyword:
hibernate.search.[default|<indexname>].indexwriter.batch.<parameter_name>
Unless the corresponding .batch property is explicitly set, the value defaults to the .transaction property. If no .batch value is set in a specific shard configuration, Hibernate Search looks at the index section, then at the default section, and only after that looks for a .transaction value in the same order (shard, index, default):
hibernate.search.Animals.2.indexwriter.transaction.max_merge_docs 10
hibernate.search.Animals.2.indexwriter.transaction.merge_factor 20
hibernate.search.default.indexwriter.batch.max_merge_docs 100
This configuration results in the following settings being applied to the second shard of the Animals index:
  • transaction.max_merge_docs = 10
  • batch.max_merge_docs = 100
  • transaction.merge_factor = 20
  • batch.merge_factor = 20
All other values will use the defaults defined in Lucene.
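The lookup order described above can be sketched in a few lines of Java. This is only an illustration of the documented fallback rules, not Hibernate Search's actual implementation: the class name IndexWriterSettingLookup and the resolve method are hypothetical, and the sketch uses nothing but java.util.Properties.

```java
import java.util.List;
import java.util.Optional;
import java.util.Properties;

// Hypothetical illustration of the documented fallback order; not Hibernate Search code.
public class IndexWriterSettingLookup {

    private static final String PREFIX = "hibernate.search.";

    // Resolves one parameter for a shard. mode is "transaction" or "batch".
    static Optional<String> resolve(Properties props, String index, String shard,
                                    String mode, String parameter) {
        // A batch lookup tries every batch scope first, then every transaction scope.
        List<String> modes = mode.equals("batch")
                ? List.of("batch", "transaction")
                : List.of("transaction");
        // Scopes from most to least specific: shard, index, default.
        List<String> scopes = List.of(index + "." + shard, index, "default");
        for (String m : modes) {
            for (String scope : scopes) {
                String key = PREFIX + scope + ".indexwriter." + m + "." + parameter;
                String value = props.getProperty(key);
                if (value != null) {
                    return Optional.of(value);
                }
            }
        }
        // Nothing configured: Lucene's own default applies.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("hibernate.search.Animals.2.indexwriter.transaction.max_merge_docs", "10");
        props.setProperty("hibernate.search.Animals.2.indexwriter.transaction.merge_factor", "20");
        props.setProperty("hibernate.search.default.indexwriter.batch.max_merge_docs", "100");

        for (String mode : List.of("transaction", "batch")) {
            for (String parameter : List.of("max_merge_docs", "merge_factor")) {
                String value = resolve(props, "Animals", "2", mode, parameter)
                        .orElse("<Lucene default>");
                System.out.println(mode + "." + parameter + " = " + value);
            }
        }
    }
}
```

Run against the three properties from the example, the sketch reports 10 and 20 for the transaction values and 100 and 20 for the batch values, matching the list above. Any parameter that is not configured anywhere resolves to an empty result, which stands for Lucene's own default.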
By default every value is left at Lucene's own default, so the defaults listed in the following table depend on the version of Lucene you are using; the values shown are those of version 2.4. For more information about Lucene indexing performance, refer to the Lucene documentation.
Table 3.3. List of indexing performance and behavior properties
Property Description Default Value
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].max_buffered_delete_terms
Determines the minimal number of delete terms required before the buffered in-memory delete terms are applied and flushed. If there are documents buffered in memory at the time, they are merged and a new segment is created.
Disabled (flushes by RAM usage)
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].max_buffered_docs
Controls the number of documents buffered in memory during indexing. The larger the value, the more RAM is consumed.
Disabled (flushes by RAM usage)
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].max_field_length
The maximum number of terms that will be indexed for a single field. This limits the amount of memory required for indexing so that very large data will not crash the indexing process by running out of memory. This setting refers to the number of running terms, not to the number of different terms.
This silently truncates large documents, excluding from the index all terms that occur beyond the limit. If you know your source documents are large, be sure to set this value high enough to accommodate the expected size. If you set it to Integer.MAX_VALUE, the only limit is your memory, but you should anticipate an OutOfMemoryError.
If you set this value differently for batch than for transaction, you may get different data (and results) in your index depending on the indexing mode.
10000
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].max_merge_docs
Defines the largest number of documents allowed in a segment. Larger values are best for batched indexing and speedier searches. Small values are best for transaction indexing.
Unlimited (Integer.MAX_VALUE)
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].merge_factor
Controls segment merge frequency and size.
Determines how often segment indices are merged when insertion occurs. With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained. The value must not be lower than 2.
10
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].ram_buffer_size
Controls the amount of RAM in MB dedicated to document buffers. When used together with max_buffered_docs, a flush occurs for whichever event happens first.
Generally for faster indexing performance it's best to flush by RAM usage instead of document count and use as large a RAM buffer as you can.
16 MB
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].term_index_interval
Expert: Set the interval between indexed terms.
Large values cause an IndexReader to use less memory, but slow random access to terms. Small values cause an IndexReader to use more memory, and speed up random access to terms. See the Lucene documentation for more details.
128
hibernate.search.[default|<indexname>].indexwriter.[transaction|batch].use_compound_file
The advantage of using the compound file format is that fewer file descriptors are used. The disadvantage is that indexing takes more time and temporary disk space. You can set this parameter to false in an attempt to improve indexing time, but you could run out of file descriptors if mergeFactor is also large.
Boolean parameter; use "true" or "false".
true