This pipeline processes a large XML data file. Each 'training-scenario' element can be processed by XSLT on its own, but the whole document is too big to load into a single in-memory tree. The output is a text data file that statistical software can read.
<p:pipe xmlns:p="urn:publicid:IDN+smallx.com:pipeline:1.0"
        xmlns:c="urn:publicid:IDN+smallx.com:component-language:1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        name="scenario2text">

   <!-- add the aggregation specification to the input -->
   <p:template>
      <result>
         <c:file href="header.xml"/>
         <xsl:copy-of select="."/>
         <c:file href="trailer.xml"/>
      </result>
   </p:template>

   <!-- Aggregate by running through the file component -->
   <p:file/>

</p:pipe>
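The per-element XSLT step mentioned in the description is not shown here. As a rough illustration only, a stylesheet applied to one 'training-scenario' element at a time might look like the sketch below. It assumes, hypothetically, that the element is in no namespace, that each field is a direct child element, and that the statistical software accepts one tab-separated line per scenario; none of these details come from the original pipeline.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: turn one training-scenario element into one line of text.
     The field layout (direct children, tab-separated) is an assumption. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Emit plain text rather than XML -->
   <xsl:output method="text" encoding="UTF-8"/>

   <xsl:template match="training-scenario">
      <!-- Write each child element's text value, separated by tabs -->
      <xsl:for-each select="*">
         <xsl:value-of select="normalize-space(.)"/>
         <xsl:if test="position() != last()">
            <xsl:text>&#9;</xsl:text>
         </xsl:if>
      </xsl:for-each>
      <!-- End the record with a newline -->
      <xsl:text>&#10;</xsl:text>
   </xsl:template>

</xsl:stylesheet>
```

Because the template only ever sees a single 'training-scenario' subtree, memory use depends on the size of one scenario rather than on the size of the whole document.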