Using Apache Solr for Content Search
Concept
iKnowBase comes with ready-to-use components for integration with the Apache Solr open source enterprise search platform (http://lucene.apache.org/solr/).
When using iKnowBase together with Apache Solr, the following components will be in use:
- The Apache Solr server is installed on one or more computers. Inside Apache Solr, an iKnowBase search component is installed to handle security considerations.
- The database repository and the ikbBatch application cooperate to send index updates from iKnowBase to Apache Solr.
- The ikbViewer application has user interface components to easily generate search requests and navigate the search results.
The process for indexing works somewhat like this:
- Content is inserted, updated or deleted in iKnowBase (1). This can be done through Forms, the Service API or any other user interface made to update content in iKnowBase.
- A Solr event is triggered for the document if the event condition matches the document (2).
- If triggered, an “index request” is queued to the Oracle Database AQ-system (3). The operation is either update or delete.
- The ikbBatch application listens to index requests (4).
- If the document is new or updated, ikbBatch will retrieve the document (5), as described in the Solr Configuration. The Solr Configuration defines how the document should be represented in Apache Solr and maps the document/attribute values to Solr fields.
- Finally, ikbBatch creates and sends “update” or “delete” messages to Apache Solr for updating the Solr index (6).
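As a rough illustration of the last step, the sketch below builds messages in Solr's standard XML update format (`<add>` and `<delete>`). It is written in Python purely for illustration and is not the ikbBatch implementation; the request dictionary layout and the field names `title` and `keyword` are assumptions made for this example, since the real field names come from your Solr Configuration.

```python
import xml.etree.ElementTree as ET


def build_update_message(doc_id, fields):
    """Return a Solr XML <add> message for a single document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = str(doc_id)
    for name, value in fields.items():
        # Multi-valued attributes become repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            ET.SubElement(doc, "field", name=name).text = str(item)
    return ET.tostring(add, encoding="unicode")


def build_delete_message(doc_id):
    """Return a Solr XML <delete> message that removes one document by id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


def build_index_message(request):
    """Map a (hypothetical) index request to an update or delete message."""
    if request["operation"] == "delete":
        return build_delete_message(request["id"])
    return build_update_message(request["id"], request["fields"])


if __name__ == "__main__":
    print(build_index_message({
        "operation": "update",
        "id": 42,
        "fields": {"title": "Hello", "keyword": ["a", "b"]},
    }))
    print(build_index_message({"operation": "delete", "id": 42}))
```

A message like this would typically be posted to the Solr update handler, followed by a commit; how and when ikbBatch commits is not covered here.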
The process for search works somewhat like this:
- Whenever a user initiates a search operation (7), the user interface component creates a SolrQuery and submits it to an iKnowBase SolrSearchClient component for execution.
- The search client adds security related information to the request, signs the request using a secure Message Authentication Code, and sends the request to Apache Solr.
- Inside Apache Solr, the iKnowBase search component verifies the security information, and adds the required search conditions (8).
- Apache Solr returns a normal search response which is rendered by the user interface components in iKnowBase.
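To make the signing step more concrete, here is a minimal sketch of how a request can be protected with a Message Authentication Code and verified on the receiving side. It is not the iKnowBase implementation: the choice of HMAC-SHA256, the shared secret, the canonical form and the parameter names `user` and `sig` are all assumptions made for this example.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical secret shared between the search client and the Solr component.
SHARED_SECRET = b"change-me"


def canonical(params):
    """Serialize parameters in sorted order so both sides sign the same bytes."""
    return urlencode(sorted(params.items())).encode("utf-8")


def sign_request(params, secret=SHARED_SECRET):
    """Return a copy of params with an added 'sig' parameter (hex HMAC-SHA256)."""
    signed = dict(params)
    signed["sig"] = hmac.new(secret, canonical(params), hashlib.sha256).hexdigest()
    return signed


def verify_request(params, secret=SHARED_SECRET):
    """Recompute the MAC over everything except 'sig' and compare in constant time."""
    received = dict(params)
    signature = received.pop("sig", "")
    expected = hmac.new(secret, canonical(received), hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)


if __name__ == "__main__":
    request = sign_request({"q": "solr", "user": "alice"})
    print(verify_request(request))                    # signature matches
    print(verify_request(dict(request, user="bob")))  # tampered user is rejected
```

The point of the design is that the security information (here, the user) cannot be altered between the search client and Apache Solr without invalidating the signature.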
Installation and setup
See the Installation Guide for installation guidelines.
After you’re done with the installation you are ready to index your content.
iKnowBase security
The Apache Solr Search Server, which is distributed together with iKnowBase, includes iKnowBase Solr components that enforce authorized access to documents. The iKnowBase security search component is defined in the configuration file “solrconfig.xml” and is enabled for the search handler “/select”. Search handlers that use this component load security information from iKnowBase and filter the result set by iKnowBase access control lists.
Note: If you configure new search handlers, include the iKnowBase security search component to ensure authorized access to iKnowBase data. For information regarding security in autocomplete operations, see Configuring search suggestions below.
Configuring the indexing process
Before you start indexing your content you need to decide the following:
- What kind of documents/information should be indexed? Investigate your content and define what to index. When you have defined all indexable content, represent the set as one or more Indexing events in Development Studio (available under the Advanced tab); see the Development Reference. An Indexing event triggers for all documents that match the given condition and issues an update statement to the Content Indexer.
- When you have decided which documents to index, define what to index within a document. You must create a Solr configuration in Development Studio (available under the Advanced tab), where you define all the attributes to be indexed and how they should be represented in Apache Solr; see the Development Reference. Should the attribute itself be indexed? Should the attribute value text be searchable in a freetext search? Should it be possible to display the value in the result set? The Solr configuration screen helps by presenting the most common attributes in your database; these are normally good candidates for indexing.
Building a search page
To build a search page, you will use a Groovy-based template (HtmlViewer, ScriptViewer or ScriptAction), where the iKnowBase SolrSearchClient component provides access to the Solrj library.
The basic flow for a search page is as follows, see below for examples:
- Acquire a search client from the available beans
- Acquire a SolrQuery using the search client. The search client can either provide a new SolrQuery every time, or provide a SolrQuery which is stored in the user session and can be reused. Using a session-based SolrQuery avoids having to send all query configuration on the URL, as the current state is already present.
- Use the search client to apply URL-parameters to the SolrQuery, or set up the required parameters manually
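The difference between a fresh and a session-based query can be modelled with plain dictionaries, as in the sketch below. This is a conceptual model in Python only; it does not use or reproduce the SolrSearchClient or SolrJ API, and the function names, the session key and the list of allowed parameters are hypothetical.

```python
def acquire_query(session, key="solrQuery", reuse=True):
    """Return a query dict, either fresh or reused from the user session."""
    if reuse and key in session:
        return session[key]
    query = {"q": "*:*", "start": 0, "rows": 10}
    if reuse:
        # Store the query so later requests continue from the current state.
        session[key] = query
    return query


def apply_url_params(query, url_params, allowed=("q", "start", "rows", "sort")):
    """Copy only known URL parameters onto the query; paging values become ints."""
    for name in allowed:
        if name in url_params:
            value = url_params[name]
            query[name] = int(value) if name in ("start", "rows") else value
    return query


if __name__ == "__main__":
    session = {}
    # First request sets the search text and page size.
    apply_url_params(acquire_query(session), {"q": "solr", "rows": "20"})
    # Second request only needs to send the paging offset.
    print(apply_url_params(acquire_query(session), {"start": "20"}))
```

With the session-based variant, the second request sends only `start`; the search text and page size are already part of the stored query.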
In terms of documentation, you will be interested in the following:
Sample search page structure
A transport set is provided as an example where a basic search page is included with faceting, ordering and autocomplete based on a facet search. Import the file /Database/source/upgrade/transport_sets/EXP-IKB_SYSTEST-D0D0606DCC1420DAE040000A180076CE-SOLR_Example.dmp from ikbStudio/advanced/importjobs. It contains the “essentials” for using a search client in the page /cs/solrsearch.
Configuring search suggestions
Search suggestions (autocomplete) can be implemented in several ways.
On the provided sample page (see Sample search page structure), autocomplete is implemented using a faceted search. On the search page there is an Ext JS combobox which loads data through an iKnowBase script action. The script action forwards an autocomplete request, which is actually a faceted search, to Solr.
Note: The property facetMinCount is set to 1 to prevent unauthorized access to data, i.e. a facet value is only returned if it is in use in at least one of the documents available to the user.
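At the HTTP level, a facet-based autocomplete request could look roughly like the one built below. The sketch uses standard Solr request parameters (`facet.field`, `facet.prefix`, `facet.mincount`); the base URL, the field name `title` and the function name are assumptions for illustration, and the sample page itself builds the request through the search client rather than by hand.

```python
from urllib.parse import urlencode


def build_autocomplete_url(base_url, field, prefix, limit=10):
    """Build a facet query URL that returns field values starting with prefix."""
    params = [
        ("q", "*:*"),              # match all documents the user may see
        ("rows", 0),               # only facet counts are needed, not documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),  # the text typed so far
        ("facet.mincount", 1),     # skip values with no matching documents
        ("facet.limit", limit),
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host, port and handler path.
    print(build_autocomplete_url("http://localhost:8983/solr/select", "title", "apa"))
```

Because the request goes through a search handler that includes the iKnowBase security search component, the facet counts are computed over the filtered result set, and `facet.mincount=1` then hides values that occur only in documents the user cannot access.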
The Solr Suggester component provides an alternative way to implement autocomplete.
Note: The Solr Suggester provides no mechanism to add a security filter, so the user may get access to unauthorized data, i.e. if the completion comes from the title index, the user may see a document title even though they are not authorized to view the document.
Monitoring the Solr solution
A few key items to check:
- On the ikbBatch page, check that the content indexing queue is emptied.
- Monitor the Solr instance using the console at http://<hostname>:<solr-port>/solr/#/
- Optimize the Solr index using the console at http://<hostname>:<solr-port>/solr/#/~cores/<corename>