14.4. Manual Index Changes
14.4.1. Adding Instances to the Index
Using FullTextSession.index(T entity) you can directly add a specific object instance to the index or update it there. If the entity was already indexed, its index entry is updated. Changes to the index are only applied at transaction commit.
Example 14.61. Indexing an entity via FullTextSession.index(T entity)
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
Object customer = fullTextSession.load( Customer.class, 8 );
fullTextSession.index(customer);
tx.commit(); //index only updated at commit time
MassIndexer: see Section 14.4.3.2, “Using a MassIndexer” for more details.
14.4.2. Deleting Instances from the Index
To remove an entity from the index without removing it from the database, purge it through the FullTextSession.
Example 14.62. Purging a specific instance of an entity from the index
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
for (Customer customer : customers) {
fullTextSession.purge( Customer.class, customer.getId() );
}
tx.commit(); //index is updated at commit time
To remove all entities of a given type from the index, use the purgeAll method. This operation removes all entities of the type passed as a parameter, as well as all entities of its subtypes.
Example 14.63. Purging all instances of an entity from the index
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
fullTextSession.purgeAll( Customer.class );
//optionally optimize the index
//fullTextSession.getSearchFactory().optimize( Customer.class );
tx.commit(); //index changes are applied at commit time
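The subtype behaviour of purgeAll can be illustrated with a small stand-alone sketch. It does not use Hibernate Search at all: ToyIndex, ToyCustomer, ToyPremiumCustomer and ToySupplier are hypothetical names invented for this illustration, and the toy purgeAll only mimics the "type plus subtypes" rule described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity hierarchy, used only for this illustration.
class ToyCustomer {
    final int id;
    ToyCustomer(int id) { this.id = id; }
}

class ToyPremiumCustomer extends ToyCustomer {
    ToyPremiumCustomer(int id) { super(id); }
}

class ToySupplier {
    final int id;
    ToySupplier(int id) { this.id = id; }
}

// Toy stand-in for an index; NOT part of Hibernate Search.
class ToyIndex {
    private final List<Object> entries = new ArrayList<>();

    void add(Object entity) {
        entries.add(entity);
    }

    // Removes every entry whose class is the given type or one of its subtypes.
    void purgeAll(Class<?> type) {
        entries.removeIf(type::isInstance);
    }

    int size() {
        return entries.size();
    }
}

class PurgeAllDemo {
    // Indexes one entity of each type, purges ToyCustomer, and reports what is left.
    static int remainingAfterPurge() {
        ToyIndex index = new ToyIndex();
        index.add(new ToyCustomer(1));
        index.add(new ToyPremiumCustomer(2));
        index.add(new ToySupplier(3));
        index.purgeAll(ToyCustomer.class);
        return index.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left after purgeAll: " + remainingAfterPurge());
    }
}
```

In this model, purging ToyCustomer also removes the ToyPremiumCustomer entry, while the unrelated ToySupplier entry stays in the index.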
Note
The methods index, purge, and purgeAll are available on FullTextEntityManager as well.
Note
All manual indexing methods (index, purge, and purgeAll) only affect the index, not the database. Nevertheless, they are transactional, so they are not applied until the transaction is successfully committed or you call flushToIndexes.
14.4.3. Rebuilding the Index
- Use FullTextSession.flushToIndexes() periodically, while using FullTextSession.index() on all entities.
- Use a MassIndexer.
14.4.3.1. Using flushToIndexes()
This strategy consists of removing the existing index and then adding all entities back to it using FullTextSession.purgeAll() and FullTextSession.index(). However, there are some memory and efficiency constraints. For maximum efficiency, Hibernate Search batches index operations and executes them at commit time. If you expect to index a lot of data, be careful about memory consumption, since all documents are kept in a queue until the transaction commits. You can run into an OutOfMemoryError if you don't empty the queue periodically; to do this, use fullTextSession.flushToIndexes(). Every time fullTextSession.flushToIndexes() is called (or the transaction is committed), the batch queue is processed and all index changes are applied. Be aware that, once flushed, the changes cannot be rolled back.
Example 14.64. Index rebuilding using index() and flushToIndexes()
fullTextSession.setFlushMode(FlushMode.MANUAL);
fullTextSession.setCacheMode(CacheMode.IGNORE);
transaction = fullTextSession.beginTransaction();
//Scrollable results will avoid loading too many objects in memory
ScrollableResults results = fullTextSession.createCriteria( Email.class )
    .setFetchSize(BATCH_SIZE)
    .scroll( ScrollMode.FORWARD_ONLY );
int index = 0;
while( results.next() ) {
    index++;
    fullTextSession.index( results.get(0) ); //index each element
    if (index % BATCH_SIZE == 0) {
        fullTextSession.flushToIndexes(); //apply changes to indexes
        fullTextSession.clear(); //free memory since the queue is processed
    }
}
transaction.commit();
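The effect of flushing every BATCH_SIZE entities can be modelled without Hibernate Search. The following is a minimal sketch, not the library's implementation: ToyWorkQueue and BatchFlushDemo are hypothetical classes, and the batch size of 100 is an arbitrary value chosen for the illustration. It shows that periodic flushing bounds the size of the pending queue, and that a rollback only discards work that has not yet been flushed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a transactional work queue; NOT the Hibernate Search implementation.
class ToyWorkQueue {
    private final List<String> pending = new ArrayList<>();
    private final List<String> applied = new ArrayList<>();
    private int peakPending = 0;

    // Queues a document and tracks the largest queue size seen so far.
    void index(String doc) {
        pending.add(doc);
        peakPending = Math.max(peakPending, pending.size());
    }

    // Applies queued work immediately; applied work can no longer be undone.
    void flushToIndexes() {
        applied.addAll(pending);
        pending.clear();
    }

    // Committing applies whatever is still queued.
    void commit() {
        flushToIndexes();
    }

    // Rolling back discards only the work that is still queued.
    void rollback() {
        pending.clear();
    }

    int appliedCount() {
        return applied.size();
    }

    int peakPending() {
        return peakPending;
    }
}

class BatchFlushDemo {
    static final int BATCH_SIZE = 100; // arbitrary value for this sketch

    // Queues `total` documents, flushing every BATCH_SIZE, then commits or rolls back.
    static ToyWorkQueue run(int total, boolean rollbackAtEnd) {
        ToyWorkQueue queue = new ToyWorkQueue();
        for (int i = 1; i <= total; i++) {
            queue.index("doc-" + i);
            if (i % BATCH_SIZE == 0) {
                queue.flushToIndexes();
            }
        }
        if (rollbackAtEnd) {
            queue.rollback();
        } else {
            queue.commit();
        }
        return queue;
    }

    public static void main(String[] args) {
        ToyWorkQueue committed = run(250, false);
        ToyWorkQueue rolledBack = run(250, true);
        System.out.println("committed: applied=" + committed.appliedCount()
                + ", peak queue=" + committed.peakPending());
        System.out.println("rolled back: applied=" + rolledBack.appliedCount());
    }
}
```

With 250 documents and a batch size of 100, the queue in this model never holds more than 100 entries. A rollback at the end discards only the last 50 queued documents; the 200 that were already flushed remain applied.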
Note
hibernate.search.default.worker.batch_size has been deprecated in favor of this explicit API, which provides better control.
14.4.3.2. Using a MassIndexer
MassIndexer uses several parallel threads to rebuild the index. You can optionally select which entities need to be reloaded, or have it reindex all entities. This approach is optimized for best performance but requires putting the application in maintenance mode. Querying the index is not recommended while a MassIndexer is busy.
Example 14.65. Rebuild the Index Using a MassIndexer
fullTextSession.createIndexer().startAndWait();
Example 14.66. Using a Tuned MassIndexer
fullTextSession
    .createIndexer( User.class )
    .batchSizeToLoadObjects( 25 )
    .cacheMode( CacheMode.NORMAL )
    .threadsToLoadObjects( 12 )
    .idFetchSize( 150 )
    .progressMonitor( monitor ) //a MassIndexerProgressMonitor implementation
    .startAndWait();
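Example 14.66 rebuilds the index of User instances, using 12 parallel threads to load them in batches of 25 objects per query. The same threads also process any custom FieldBridges or ClassBridges to output a Lucene document. The threads trigger lazy loading of additional attributes during the conversion process. Because of this, a high number of threads working in parallel is required. The number of threads working on actual index writing is defined by the backend configuration of each index.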
It is recommended to leave cacheMode set to CacheMode.IGNORE (the default), as in most reindexing situations the cache is a useless additional overhead. Depending on your data, enabling another CacheMode might be useful: it could increase performance if the main entity relates to enum-like data included in the index.
Note
hibernate.search.[default|<indexname>].exclusive_index_use
hibernate.search.[default|<indexname>].indexwriter.max_buffered_docs
hibernate.search.[default|<indexname>].indexwriter.max_merge_docs
hibernate.search.[default|<indexname>].indexwriter.merge_factor
hibernate.search.[default|<indexname>].indexwriter.merge_min_size
hibernate.search.[default|<indexname>].indexwriter.merge_max_size
hibernate.search.[default|<indexname>].indexwriter.merge_max_optimize_size
hibernate.search.[default|<indexname>].indexwriter.merge_calibrate_by_deletes
hibernate.search.[default|<indexname>].indexwriter.ram_buffer_size
hibernate.search.[default|<indexname>].indexwriter.term_index_interval
Previous versions also had a max_field_length parameter, but this was removed from Lucene. A similar effect can be obtained by using a LimitTokenCountAnalyzer.
The .indexwriter parameters are Lucene specific, and Hibernate Search passes these parameters through.
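As a rough illustration of how the [default|<indexname>] part of these property names is filled in, the sketch below builds a java.util.Properties object with a few of the keys listed above. The class name IndexWriterSettingsDemo, the index name "Customer" and all values are hypothetical placeholders, not recommendations; how the properties are handed to your configuration is outside the scope of this sketch.

```java
import java.util.Properties;

class IndexWriterSettingsDemo {
    // Builds a property key by substituting a scope ("default" or an index name)
    // into the hibernate.search.[default|<indexname>].<suffix> pattern.
    static String key(String scope, String suffix) {
        return "hibernate.search." + scope + "." + suffix;
    }

    static Properties tunedSettings() {
        Properties props = new Properties();
        // Values below are illustrative placeholders, not recommendations.
        props.setProperty(key("default", "indexwriter.ram_buffer_size"), "64");
        props.setProperty(key("default", "indexwriter.merge_factor"), "20");
        // "Customer" is a hypothetical index name that overrides settings for one index.
        props.setProperty(key("Customer", "indexwriter.max_buffered_docs"), "1000");
        props.setProperty(key("Customer", "exclusive_index_use"), "true");
        return props;
    }

    public static void main(String[] args) {
        tunedSettings().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Keys in the default scope apply to every index, while keys that name an index apply to that index only.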
MassIndexer uses a forward-only scrollable result to iterate over the primary keys to be loaded, but MySQL's JDBC driver will load all values in memory. To avoid this "optimization", set idFetchSize to Integer.MIN_VALUE.