6.6. Search and Text Extraction

Both the full-text search language and JCR-SQL2's full-text search constraint let you find nodes using simple, search-engine-like expressions that support wildcards and phrases.
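As a rough illustration of the JCR-SQL2 side, the sketch below builds a query string whose `CONTAINS` constraint carries a search expression with a quoted phrase and a wildcard term. The `FullTextQueries` class and its `contains` method are hypothetical helpers introduced only for this example, and the exact wildcard syntax the hierarchical database accepts is assumed here rather than taken from this section.

```java
// Hypothetical helper (not part of any product API): builds a JCR-SQL2
// statement that applies a full-text search constraint to a node type.
final class FullTextQueries {
    private FullTextQueries() {}

    /**
     * Returns a JCR-SQL2 statement that matches nodes of the given type
     * against a full-text search expression across all of their properties.
     */
    static String contains(String nodeType, String searchExpression) {
        // A single quote inside a JCR-SQL2 string literal is escaped by doubling it.
        String escaped = searchExpression.replace("'", "''");
        return "SELECT * FROM [" + nodeType + "] AS n WHERE CONTAINS(n.*, '" + escaped + "')";
    }
}
```

For example, `FullTextQueries.contains("nt:base", "\"text extraction\" extract*")` produces a statement that looks for the phrase "text extraction" together with any term beginning with "extract". In a JCR application, a statement like this would typically be passed to `QueryManager.createQuery(statement, Query.JCR_SQL2)` and then executed.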

It is easy to see how the hierarchical database performs these matches against a node's name and against properties containing STRING, LONG, DATE, DOUBLE, DECIMAL, NAME, and PATH values. For BINARY values, however, the hierarchical database must first determine what text each value contains: it can match a search expression against a BINARY value only if it can extract text from that value. This is where text extraction comes into play.

A text extractor is a component that knows how to extract searchable text from a BINARY value. Each text extractor declares whether it can process files of a particular MIME type. If it can, the hierarchical database will (when necessary) call the extractor to obtain the searchable text for a supplied BINARY value.
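The selection logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's actual extractor API: the `TextExtractor` interface, the `PlainTextExtractor` implementation, and the `ExtractorRegistry` class are all hypothetical names, and the example assumes plain-text content is UTF-8 encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical interface; the real extractor contract may differ.
interface TextExtractor {
    /** Reports whether this extractor can process content of the given MIME type. */
    boolean supportsMimeType(String mimeType);

    /** Returns the searchable text contained in the binary value. */
    String extractFrom(byte[] binaryValue);
}

// Hypothetical extractor for plain-text content (assumes UTF-8 encoding).
final class PlainTextExtractor implements TextExtractor {
    @Override
    public boolean supportsMimeType(String mimeType) {
        return mimeType != null && mimeType.startsWith("text/plain");
    }

    @Override
    public String extractFrom(byte[] binaryValue) {
        return new String(binaryValue, StandardCharsets.UTF_8);
    }
}

// Hypothetical registry: asks each extractor in turn whether it supports
// the MIME type and uses the first one that does.
final class ExtractorRegistry {
    private final List<TextExtractor> extractors;

    ExtractorRegistry(List<TextExtractor> extractors) {
        this.extractors = new ArrayList<>(extractors);
    }

    /** Returns the extracted text, or empty if no extractor supports the MIME type. */
    Optional<String> searchableText(String mimeType, byte[] binaryValue) {
        for (TextExtractor extractor : extractors) {
            if (extractor.supportsMimeType(mimeType)) {
                return Optional.of(extractor.extractFrom(binaryValue));
            }
        }
        return Optional.empty();
    }
}
```

In this sketch, a BINARY value whose MIME type no registered extractor supports yields an empty result, which mirrors the point that such a value cannot be matched by a full-text search expression.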
