Content Transformation Providers

See Content Transformation Service in the Development Guide for concept and usage details.

These providers are automatically enabled

These providers have specific installation requirements detailed below

FileConverter

The FileConverter is a content transformation provider that converts documents from a number of file formats to PDF, HTML, or one of several image formats.

Note that the FileConverter is licensed separately from the core iKnowBase product.
Note that the FileConverter requires a 64-bit Linux (“x86_64”) architecture.

Understanding the FileConverter

Usage of the FileConverter works like this:

The process above implies that for the FileConverter to work, you also need to install a separate Outside In program on the server.

Installing Outside In technology

The Outside In programs are delivered separately from iKnowBase, in a zip-file that will typically be named something like fileConverter-linux-x86-64-outsidein-835.zip. Install this file using the following steps:

$ cd /opt/iknowbase
$ unzip fileConverter-linux-x86-64-outsidein-835.zip

Configuration properties

After installing the Outside In technology, you must configure the FileConverter. The FileConverterConfiguration accepts these configuration properties:

Property name | Description
com.iknowbase.batch.fileConverter.outsideInDirectory | Location of the Outside In installation. The FileConverter is disabled when this is not set.
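
Before setting the property, it can help to confirm that the directory really contains the conversion program. The following is a minimal sketch and not part of iKnowBase: the function name check_outside_in_dir is hypothetical, and it assumes that the exsimple executable used in the tests below sits directly in the installation directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with iKnowBase): succeeds only if the
# given directory exists and contains an executable file named exsimple.
check_outside_in_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -x "$dir/exsimple" ]; then
        echo "exsimple not found or not executable in: $dir" >&2
        return 1
    fi
    echo "ok: $dir"
}
```

For example, check_outside_in_dir /opt/iknowbase/fileConverter, where the path is an assumption; use the directory where the zip file was actually unpacked.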

Testing and troubleshooting

Running tests

The first step is to verify that the conversion program itself runs. Go to the installation directory, and verify that you can run a document conversion from the command line:

$ ./exsimple Test.docx Test.pdf pdf.cfg
EX_CALLBACK_ID_PAGECOUNT: The File had 5 pages.
Export successful: 1 output file(s) created.
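
If this first step should be repeatable, for instance after a server upgrade, the check can be scripted. The sketch below is only an illustration: the function name export_succeeded is hypothetical, and it assumes that a successful run always prints a line starting with “Export successful”, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read exsimple output on stdin and succeed only if it
# contains a line reporting a successful export.
export_succeeded() {
    grep -q '^Export successful'
}

# Typical use, run from the installation directory:
#   ./exsimple Test.docx Test.pdf pdf.cfg 2>&1 | export_succeeded && echo "conversion OK"
```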

The second step is to run a “local” conversion from the web application. Using a browser, open the “/ikbBatch” application. In the tab named “fileconverter”, you will find a number of links for test conversions, which convert a Microsoft Word document and a Microsoft PowerPoint presentation to a number of export formats. Clicking one of these links runs the server-side conversion and returns the converted document. The tests named “Test.docx (local)” and “Test.pptx (local)” run locally, without any database involvement.

The third step is to run a “queue-based” conversion. The procedure is the same as above, but the tests named “Test.docx (queue)” and “Test.pptx (queue)” send the document through the database for conversion, the same way most production usage works.

Missing libraries

A common problem is for conversion to image formats to fail under Linux, due to missing libraries:

$ ./exsimple Test.docx Test.pdf pdf.cfg
./exsimple: error while loading shared libraries: libstdc++.so.5: cannot open shared object file: No such file or directory
./exsimple: error while loading shared libraries: libXm.so.3: cannot open shared object file: No such file or directory

Search for the missing file using the “locate”-command, as shown below. If the file is missing, or only available as a stub, the proper library must be installed.
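
As a sketch of that search: ldd lists the shared libraries a program needs and marks unresolved ones as “not found”, and locate then shows whether a file of that name exists anywhere on the system. The helper name missing_libs is hypothetical, and locate only finds files that are already in its database (refreshed with updatedb).

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the name of each
# library that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use, run from the installation directory:
#   ldd ./exsimple | missing_libs
#   locate libstdc++.so.5
```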

Fonts not found

If the font path is not set, you get the following message:

[root@build fileConverter]# ./exsimple test.docx test.tiff tiff.cfg
        EX_CALLBACK_ID_PAGECOUNT: The File had 0 pages.
EXRunExport() failed: No valid fonts found (0x0B03)

This can be fixed by setting the environment variable GDFONTPATH to a directory containing TrueType fonts.

One way of doing this is to add a file named “fonts.sh”, with the following content, to the /etc/profile.d directory:

export GDFONTPATH=/usr/share/fonts/liberation/
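
To confirm that the chosen directory is usable, a small check such as the following may help. It is a sketch under assumptions: the function name has_truetype_fonts is hypothetical, and because this guide does not state whether subdirectories are searched, it only looks for .ttf files at the top level of the directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory exists and holds at least
# one .ttf (or .TTF) file at its top level.
has_truetype_fonts() {
    dir="$1"
    [ -d "$dir" ] || return 1
    for f in "$dir"/*.ttf "$dir"/*.TTF; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Typical use:
#   has_truetype_fonts "$GDFONTPATH" && echo "fonts found"
```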

Missing fonts

Another common problem is missing fonts:

[root@ip-10-53-107-93 fileConverter]# ./exsimple Test.docx Test.pdf pdf.cfg
EX_CALLBACK_ID_PAGECOUNT: The File had 1 page.
EXRunExport() failed: The font directory does not contain any font files or the directory is invalid (0x0B02)

This can often be fixed by installing the liberation fonts:

$ yum install liberation-fonts-common liberation-mono-fonts liberation-sans-fonts liberation-serif-fonts libreoffice-opensymbol-fonts

CloudConvert

The CloudConvert content transformation provider supports a large set of transformations via www.cloudconvert.com, including among other things format transformation and image manipulation.

It requires a valid account with www.cloudconvert.com as well as HTTPS network access to www.cloudconvert.com.

Understanding the CloudConvert provider

Concept and transformation instructions are covered in detail in the Development Guide.

Installation

Create an account with www.cloudconvert.com and configure iKnowBase with the API Token from the user account.

The provider is now ready for use through the Content Server or the PLSQL API.

Configuration properties

You may add new transformation types, or reconfigure existing ones, to use the “cloudconvert” provider where applicable.

The CloudConvertConfiguration accepts these configuration properties:

Property name | Description
com.iknowbase.content.transformation.cloudconvert.apiToken | Subscription token for accessing the www.cloudconvert.com account.
com.iknowbase.content.transformation.cloudconvert.supportedFormats | List of supported formats for input or output.
com.iknowbase.content.transformation.cloudconvert.http.connectTimeout | Optional: www.cloudconvert.com HTTP client connect timeout (ms).
com.iknowbase.content.transformation.cloudconvert.http.readTimeout | Optional: www.cloudconvert.com HTTP client read timeout (ms) if no data has been received.
com.iknowbase.content.transformation.cloudconvert.http.proxy.type | Optional: www.cloudconvert.com HTTP client outgoing proxy (, HTTP or SOCKS).
com.iknowbase.content.transformation.cloudconvert.http.proxy.hostname | Optional: www.cloudconvert.com HTTP client outgoing proxy hostname.
com.iknowbase.content.transformation.cloudconvert.http.proxy.port | Optional: www.cloudconvert.com HTTP client outgoing proxy port.

Testing and troubleshooting

The CloudConvert content transformation provider is available through the Content Transformation Service, which is enabled in both the Content Server (Viewer) and Batch server.

See the www.cloudconvert.com API description and console for valid input and output formats and the applicable configuration options.

Test using Content Server (Viewer) by accessing the URLs (examples):

Transformation fails with HTTP status 40x

Verify that the transformation, including all of its options, is valid using the www.cloudconvert.com API description and console.

Enable TRACE logging to get more information about the content transformation process.

Transformation times out

You may need to increase the various timeout settings for the transformation; see the Development Guide for details.