6.6. Search and Text Extraction


Both the full-text search language and the full-text search constraint in JCR-SQL2 can find nodes using simple, search-engine-like expressions that support wildcards and phrases.
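As an illustration, a JCR-SQL2 query can carry such an expression in its CONTAINS constraint. The following is a minimal sketch only: the class name FullTextQueries and the method containsQuery are hypothetical helpers, not part of any product API, and the sketch only builds the statement text rather than executing it.

```java
// Hypothetical helper (not a product API): builds the text of a JCR-SQL2
// statement whose CONTAINS constraint holds a full-text search expression.
final class FullTextQueries {
    private FullTextQueries() {}

    /**
     * @param nodeType         node type to select from, for example "nt:base"
     * @param searchExpression search-engine-like expression; phrases use double quotes
     */
    static String containsQuery(String nodeType, String searchExpression) {
        // JCR-SQL2 string literals escape a single quote by doubling it.
        String escaped = searchExpression.replace("'", "''");
        return "SELECT * FROM [" + nodeType + "] AS n WHERE CONTAINS(n.*, '" + escaped + "')";
    }
}
```

Calling containsQuery("nt:base", "\"text extraction\" extract*") produces a statement that searches all properties of every node for the phrase "text extraction" together with a term starting with "extract". The resulting string would typically be handed to the JCR QueryManager with the JCR-SQL2 language constant.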
It is easy to see how the hierarchical database performs these matches against a node's name and against properties containing STRING, LONG, DATE, DOUBLE, DECIMAL, NAME, and PATH values. For BINARY values, however, the hierarchical database must first determine what text each value contains before it can decide whether a search expression matches. It can therefore match against a BINARY value only if it can extract text from that value. This is where text extraction comes into play.
A text extractor is a component that knows how to extract searchable text from a BINARY value. Each text extractor declares whether it can process files of a particular MIME type. If it can, the hierarchical database will (when necessary) call the extractor to obtain the searchable text for a supplied BINARY value.
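The contract described above can be sketched as follows. This is a simplified model under stated assumptions, not the product's extractor API: the names SketchTextExtractor, PlainTextSketchExtractor, and SketchExtractorRegistry are hypothetical, binary content is represented as a byte array, and plain text is assumed to be UTF-8 encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical contract: an extractor states which MIME types it handles
// and turns binary content into searchable text.
interface SketchTextExtractor {
    boolean supportsMimeType(String mimeType);
    String extract(byte[] content);
}

// Example extractor for text/* content (assumes UTF-8 encoding).
final class PlainTextSketchExtractor implements SketchTextExtractor {
    @Override
    public boolean supportsMimeType(String mimeType) {
        return mimeType != null && mimeType.startsWith("text/");
    }

    @Override
    public String extract(byte[] content) {
        return new String(content, StandardCharsets.UTF_8);
    }
}

// Stand-in for the repository side: asks each registered extractor whether
// it supports the MIME type and calls the first one that does.
final class SketchExtractorRegistry {
    private final List<SketchTextExtractor> extractors = new ArrayList<>();

    void register(SketchTextExtractor extractor) {
        extractors.add(extractor);
    }

    Optional<String> extractText(String mimeType, byte[] content) {
        for (SketchTextExtractor extractor : extractors) {
            if (extractor.supportsMimeType(mimeType)) {
                return Optional.of(extractor.extract(content));
            }
        }
        // No extractor supports this MIME type, so the value is not searchable.
        return Optional.empty();
    }
}
```

In this sketch, a BINARY value whose MIME type no registered extractor supports yields an empty result, which mirrors the point that the database can only match against BINARY values from which text can be extracted.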