Regular Expression Converter for Watson Explorer Engine

Sometimes it’s useful to extract data from a Watson Explorer content node using regular expressions. In this post, I’ll show you how to extract data using a regular expression and create a new content node for that specific data.

To start off, we will use the default example-metadata collection. To keep the regex simple, we will extract any three-digit number from the snippet content. The same approach works with much more advanced regular expressions if you need them.

First, go to the example-metadata collection and click “Test-it”.

Then click on “Test-it” next to the first result:

Now scroll down and look at the output of the “Create Metadata from Content” converter:

In the output, you will see the snippet content has the number 500 in it.


We will make a converter that extracts any three-digit number into a new content node. First, add a new converter:

Select the Regex entity extraction converter and click Add.

In the converter configuration, enter “my-regex-node” as the list of entities node names and “snippet” as the target node. Then click OK.

Now, in the Watson Explorer (WEX) sidebar, click the + next to XML.

Enter the following names:

Now update the XML node to include your regular expression. Note that my regex is “[0-9]{3}”, which matches three consecutive digits. Save the node.
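If you want to check a pattern before putting it in the converter, you can try it outside Watson Explorer first. The sketch below uses Python's standard `re` module, not the Watson Explorer engine, so treat it only as a rough preview of what “[0-9]{3}” will match. The sample text and the function name are made up for illustration, and the word-boundary variant assumes a regex flavor that supports `\b`, which this post does not confirm for the converter.

```python
import re

# The pattern from this post: any three consecutive ASCII digits.
THREE_DIGITS = re.compile(r"[0-9]{3}")

# Stricter variant (an assumption, not from the post): only standalone
# three-digit numbers, using word boundaries on both sides.
STANDALONE = re.compile(r"\b[0-9]{3}\b")


def extract_three_digit_runs(text):
    """Return every non-overlapping three-digit match, left to right."""
    return THREE_DIGITS.findall(text)


# Hypothetical snippet text, not the real example-metadata content.
sample = "The company ranked in the top 500 and was founded in 1998."

print(extract_three_digit_runs(sample))  # ['500', '199']
print(STANDALONE.findall(sample))        # ['500']
```

Notice that the plain pattern also picks up the first three digits of a longer number such as 1998. If that is not what you want, tighten the expression before saving it in the XML node.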

Return to the collection and run Test-it again, as we did above, down to that same first result. If you look at the converter trace, you will see that the regex converter is running.

Click on the 910 output to see your new content node:

Now you can use the new “regex-rule” content in your search application.

Published by

John Ward

Hi, I'm John. By day I'm an IBM Watson Explorer Consultant with several years of experience deploying and customizing Watson Explorer solutions. I'm also a fairly experienced web developer, and I like to write tutorials as well as posts about business and life experiences.