Using crawl-url nodes to enqueue data along with a url in Watson Explorer

viv:crawl-enqueue-url is documented as taking one argument: the url to enqueue.
However, it also has an optional two-argument form that allows much more flexibility. Set the first argument to the url and the second to a variable containing a crawl-url node, and that works too. (Note the use of entities to create the content tags inside crawl-data.)
For example:
<xsl:variable name="my-crawl-url">
  <crawl-url another-attribute="value">
    <crawl-data content-type="application/vxml-unnormalized">
      &lt;content name="inside"&gt;I am a content!&lt;/content&gt;
    </crawl-data>
  </crawl-url>
</xsl:variable>
<xsl:value-of select="viv:crawl-enqueue-url('www.someurl.com', $my-crawl-url)" />
When www.someurl.com is crawled, the content ‘inside’ will be added to it (probably after running through the normalization converter). From the converter, the attributes ‘url’ (added automatically) and ‘another-attribute’ (added manually) will be available to you via viv:current-node()/@attribute-name. If you want to add multiple contents in your crawl-data node, you’ll need to give them a root node; ‘document’ will work:
<crawl-data content-type="application/vxml-unnormalized">
  &lt;document&gt;
    &lt;content name="inside"&gt;I am a content!&lt;/content&gt;
    &lt;content name="inside2"&gt;I am another content!&lt;/content&gt;
  &lt;/document&gt;
</crawl-data>
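If you generate these nodes outside of XSL, the escaping is easy to get wrong by hand. The following is a minimal Python sketch, not part of Watson Explorer, of one way to produce the same shape of node: the contents are built as ordinary XML, wrapped in a ‘document’ root, and then stored as escaped text inside crawl-data. The function name build_crawl_url and its parameters are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def build_crawl_url(contents, attributes):
    """Serialize a <crawl-url> whose <crawl-data> holds escaped content tags.

    contents:   dict mapping content name -> content text
    attributes: dict of attributes to place on the <crawl-url> node
    (Both parameters are hypothetical, for illustration only.)
    """
    # Build the payload as real XML first: a <document> root wrapping
    # one <content> element per entry.
    document = ET.Element("document")
    for name, text in contents.items():
        content = ET.SubElement(document, "content", {"name": name})
        content.text = text
    payload = ET.tostring(document, encoding="unicode")

    # Assigning the payload as *text* makes ElementTree escape it on output,
    # so the tags come out as &lt;content ...&gt; inside <crawl-data>.
    crawl_url = ET.Element("crawl-url", attributes)
    crawl_data = ET.SubElement(
        crawl_url, "crawl-data", {"content-type": "application/vxml-unnormalized"}
    )
    crawl_data.text = payload
    return ET.tostring(crawl_url, encoding="unicode")


if __name__ == "__main__":
    print(
        build_crawl_url(
            {"inside": "I am a content!", "inside2": "I am another content!"},
            {"another-attribute": "value"},
        )
    )
```

Letting the serializer do the escaping avoids a stray unescaped ‘<’ turning a content tag into a real child element of crawl-data.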
The prototype of the two-argument form actually looks something like: viv:crawl-enqueue-url(object, node). If the first argument evaluates to true, it is used as the url attribute on the node, which is why the url was added automatically in the first example. If it evaluates to false, the node is used as-is. Thus, an equivalent form of the first example is:
<xsl:variable name="my-crawl-url">
  <crawl-url url="www.someurl.com" another-attribute="value">
    <crawl-data content-type="application/vxml-unnormalized">
      &lt;content name="inside"&gt;I am a content!&lt;/content&gt;
    </crawl-data>
  </crawl-url>
</xsl:variable>
<xsl:value-of select="viv:crawl-enqueue-url(false(), $my-crawl-url)" />
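
To make the rule concrete, here is a small Python model of the behavior described above. It is only a sketch of the rule as I understand it, not Watson Explorer's implementation, and the function name resolve_crawl_url is hypothetical. One assumption to flag: when the first argument is true and the node already carries a url attribute, the sketch overwrites it, which is a case I have not verified.

```python
import copy
import xml.etree.ElementTree as ET


def resolve_crawl_url(first, node):
    """Model of the described rule for the two-argument form.

    If `first` is truthy it becomes the url attribute on the node;
    otherwise the node is used as-is. (Hypothetical helper.)
    """
    resolved = copy.deepcopy(node)  # leave the caller's node untouched
    if first:
        # Assumption: an existing url attribute is overwritten here.
        resolved.set("url", str(first))
    return resolved


without_url = ET.fromstring('<crawl-url another-attribute="value"/>')
with_url = ET.fromstring(
    '<crawl-url url="www.someurl.com" another-attribute="value"/>'
)

a = resolve_crawl_url("www.someurl.com", without_url)  # url supplied as argument
b = resolve_crawl_url(False, with_url)                 # url already on the node

print(a.attrib == b.attrib)  # prints True: both forms yield the same attributes
```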

 

Published by

John Ward

I've been working in the tech space since about 2004. I've spent time working with Artificial Intelligence, Machine Learning, Natural Language Processing, and Advertising technology.