Copy and Modify Documents with a Watson Explorer Converter

A common task when crawling and indexing a document in Watson Explorer Engine (WEX) is making changes to the document during the conversion process. The most common case is copying all the contents in the application/vxml document while changing one or a few of them. There is a recursive copy template that does exactly this, and I’ll show you how to apply it.

First, I’m going to use the out-of-box “example-metadata” collection. Navigate to that collection and click the test-it button.

[Screenshot: the test-it button on the WEX collection page]

After clicking test-it you will see a listing of documents. Click on the test-it button for the “blowout” record.

[Screenshot: Watson Explorer test-it results]

On the resulting page, scroll down and look at the conversion trace. There is a converter called “Create Metadata from Content”. This is the converter that ships with WEX to convert the HTML files into V:XML documents. The links on the left side represent the input and output of each conversion step. We want to click on the output of this converter to see what the document looks like.

[Screenshot: Watson Explorer conversion trace]

You will see the output of your current V:XML document. Note that I have a Google Chrome plugin that is converting my XML output for display.

[Screenshot: Watson Explorer converter output]

For the sake of this exercise, let’s change the title field to contain both the actual title and the author, like this: Blowout - Lucy Spring. To do this, go back to the previous page and click “add new converter” further down the page.

[Screenshot: Watson Explorer add converter]

We want a custom converter.

[Screenshot: Watson Explorer add custom converter]

Now you will see the configuration screen for a custom converter.

Set both type-in and type-out to application/vxml-unnormalized: the template takes application/vxml-unnormalized as input and produces application/vxml-unnormalized as output. I use “unnormalized” because I want the normal WEX normalization functions to still apply after this transformation. Also give your converter a name.

[Screenshot: WEX custom converter configuration]

The next section is the conditional setting. This is where you define the conditions under which the converter applies. In this case we want to match everything, so I just add a wildcard (*).

[Screenshot: WEX converter conditional settings]

You can skip the advanced section and focus on the Action section. First, the action needs to be set to XSL, since we’re applying an XSL template to an XML document.

[Screenshot: Watson Explorer custom converter action]

Now we’ll use a standard template that allows you to copy nodes with special processing.

<!-- Match the root, recur -->
<xsl:template match="/">
  <xsl:apply-templates select="." mode="copy" />
</xsl:template>

<!-- Specialty nodes go here -->

<!-- End specialty nodes -->

<xsl:template match="@* | text() | comment()" mode="copy">
  <xsl:copy />
</xsl:template>

<!-- Default action, keep recurring and copying -->
<xsl:template match="*" mode="copy">
  <xsl:copy>
    <xsl:apply-templates select="@*" mode="copy" />
    <xsl:apply-templates mode="copy" />
  </xsl:copy>
</xsl:template>

Run as-is, the template above only copies the document unchanged. We want to modify it to merge our title and author: match the title content, copy its attributes, and write out the combined value.

<!-- Match the root, recur -->
<xsl:template match="/">
  <xsl:apply-templates select="." mode="copy" />
</xsl:template>

<!-- Specialty nodes go here -->

<!-- match the title content -->
<xsl:template match="content[@name='title']" mode="copy">

  <!-- create a new content node -->
  <content>
    <!-- copy all the attributes -->
    <xsl:copy-of select="@*" />

    <!-- copy the value of the current node (title) and add a dash and the author content value-->

    <xsl:value-of select="concat(.,' - ',//content[@name='author'])" />


  </content>

</xsl:template>

<!-- End specialty nodes -->

<xsl:template match="@* | text() | comment()" mode="copy">
  <xsl:copy />
</xsl:template>

<!-- Default action, keep recurring and copying -->
<xsl:template match="*" mode="copy">
  <xsl:copy>
    <xsl:apply-templates select="@*" mode="copy" />
    <xsl:apply-templates mode="copy" />
  </xsl:copy>
</xsl:template>

As you can see, I’ve added comments in the code above. The important thing to note is that I match the title content because that is the node I want to modify, and the mode is always copy because that is how this template recurses. Then I just copy the attributes and concat the two values I want.
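One thing worth knowing about the title template: `//content[@name='author']` searches the whole input, and if no author content exists the result ends with a dangling " - ". Here is a minimal sketch of a more defensive variant. It assumes the author content is a sibling of the title under the same parent element, which is my assumption about the document layout and not something WEX guarantees. It would replace the title template between the specialty node comments.

```xml
<!-- Sketch: defensive variant of the title template -->
<xsl:template match="content[@name='title']" mode="copy">
  <!-- Assumption: the author content is a sibling of the title -->
  <xsl:variable name="author" select="../content[@name='author'][1]" />
  <content>
    <!-- copy all the attributes of the original title content -->
    <xsl:copy-of select="@*" />
    <xsl:choose>
      <!-- append the author only when one exists and is non-empty -->
      <xsl:when test="normalize-space($author) != ''">
        <xsl:value-of select="concat(., ' - ', $author)" />
      </xsl:when>
      <!-- otherwise keep the title unchanged -->
      <xsl:otherwise>
        <xsl:value-of select="." />
      </xsl:otherwise>
    </xsl:choose>
  </content>
</xsl:template>
```

For a single document that has an author, this produces the same title as the version above. The difference only shows up when the author is missing or the input holds more than one document.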

Save this converter and click test-it again at the top of the Watson Explorer page. You will now see your new converter in the conversion trace.

[Screenshot: the custom converter in the WEX conversion trace]

Now if we check the input and output we’ll see the difference.

The before:

[Screenshot: WEX document before the converter]

Now the title after:

[Screenshot: WEX document after the converter]
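As a simplified illustration of the same change in text form, the input and output of the converter look roughly like this. The element layout is trimmed down and is my assumption; a real V:XML record carries more attributes and contents than shown here.

```xml
<!-- Input to the custom converter (simplified, hypothetical layout) -->
<document>
  <content name="title">Blowout</content>
  <content name="author">Lucy Spring</content>
</document>

<!-- Output of the custom converter: only the title content changes -->
<document>
  <content name="title">Blowout - Lucy Spring</content>
  <content name="author">Lucy Spring</content>
</document>
```

Everything other than the title passes through the default copy templates untouched, so the author content is still there as its own field.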

Now, if you crawl this collection, your titles will include the author name in the search results.

[Screenshot: Watson Explorer search results]
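The same specialty node slot can hold other overrides. A template that matches a specific content has a higher default priority in XSLT than the generic `*` copy template, so it wins for that node. Here is a hedged sketch of two more overrides; the content names `internal-notes` and `byline` are hypothetical and do not come from the example-metadata collection.

```xml
<!-- Sketch: drop a content entirely by matching it and emitting nothing.
     "internal-notes" is a hypothetical content name. -->
<xsl:template match="content[@name='internal-notes']" mode="copy" />

<!-- Sketch: rename a content while keeping its other attributes and its value.
     "byline" is a hypothetical new name. -->
<xsl:template match="content[@name='author']" mode="copy">
  <content name="byline">
    <!-- copy every attribute except the old name -->
    <xsl:copy-of select="@*[name() != 'name']" />
    <xsl:value-of select="." />
  </content>
</xsl:template>
```

Both go between the specialty node comments, just like the title template.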

Published by

John Ward

I've been working in the tech space since about 2004. I've spent time working with Artificial Intelligence, Machine Learning, Natural Language Processing, and Advertising technology.