
handling large files

21 messages
handling large files

Reju Mathew
I need to move large files (around 800 MB) using a Camel route via ActiveMQ.

The split method does not give any option to split on size. Besides, when I use a route like
from("file:data").split(..).to("jms:myq").to("file:dest") - it loads the entire file into memory and then tries to split, causing an out-of-memory error.

Is there a better way?

Thanks,
Reju

Re: handling large files

tide08
You can use camel-stream - http://camel.apache.org/stream.html
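For the record-per-line case, a rough sketch of what such a route could look like, in the same Spring XML style used elsewhere in this thread (the endpoint names are illustrative, and this assumes Camel 2.0's streaming splitter with the tokenize language):

```xml
<route>
  <from uri="file:data"/>
  <!-- streaming="true" feeds the splitter lazily instead of
       loading the whole 800 MB body into memory first -->
  <split streaming="true">
    <tokenize token="\n"/>
    <to uri="jms:myq"/>
  </split>
</route>
```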



Re: handling large files

mcarson
I have to solve this exact same problem!  I am also trying to:
   -- read an extremely large file from a file system
   -- split that file into records
   -- send each record to a JMS queue

However, I am relatively new to Camel. I do not understand how the stream component can help specify how much data to read before the split occurs. Also, when I tried to stream the file, I got exceptions in my split, because when streaming, the document is no longer well-formed XML.

Can someone explain in more detail how using the stream component solves this problem?


This is the Camel context I was using successfully to perform these tasks for SMALL files:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
       http://www.springframework.org/schema/beans   
       http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
       http://camel.apache.org/schema/spring 
       http://camel.apache.org/schema/spring/camel-spring.xsd">
 
  <bean id="jms" class="org.apache.camel.component.jms.JmsComponent">
    <property name="connectionFactory">
      <bean class="org.apache.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
      </bean>
    </property>
  </bean>

  <camelContext xmlns="http://camel.apache.org/schema/spring"
                xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
    <package>camel.myprototype</package>

    <route>
      <from uri="file:src/inbox?delete=true"/>
      <split>
        <xpath>/s:Envelope/s:Body/person</xpath>
        <to uri="jms:queue:Q_People"/>
      </split>
    </route>
  </camelContext>
</beans>

Re: handling large files

Claus Ibsen-2
Hi

How do you want to split the file?
Is there a special character that denotes a new "record"?

java.util.Scanner is great here, as it can do streaming. Camel can also do that if you, for example, want to split by new line, etc.
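A minimal, JDK-only sketch of that Scanner idea (the delimiter and file contents are made up for the example):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerSplit {

    // Scanner streams the file token by token. For demonstration the
    // records are collected into a list; in a real route you would send
    // each record onwards (e.g. to JMS) as it is read, so only one
    // record is ever held in memory at a time.
    public static List<String> split(File file, String delimiter) throws Exception {
        List<String> records = new ArrayList<String>();
        Scanner scanner = new Scanner(file);
        scanner.useDelimiter(delimiter);
        while (scanner.hasNext()) {
            records.add(scanner.next());
        }
        scanner.close();
        return records;
    }
}
```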






--
Claus Ibsen
Apache Camel Committer

Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus

Re: handling large files

mcarson
I would like to split using XPath if possible:

<xpath>/Envelope/Body/person</xpath> 


Re: handling large files

mcarson
It looks like the Scanner might provide the capabilities I was looking for to read a file in delimited chunks.  I'm assuming I would implement this as a bean. Can the bean component be used as a "from" in a Camel route?  I'm new to Camel and have never seen that done.  Is there an example bean (acting as a consumer of some sort) that I could model my code after?



Re: handling large files

Claus Ibsen-2
Hi


Since you use XPath, I took a dive into how to split big files.
Using InputSource seems to do the trick, as it allows XPath to use SAX events, which fits with streaming.

I will work a bit to get this supported nicely out of the box, and provide details on how to do it in 2.0.
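In plain JDK terms, the InputSource trick looks roughly like this (the element names are illustrative; note the NODESET result is still materialized in memory):

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InputSourceSplit {

    // Evaluates the XPath expression directly against an InputSource,
    // so the parser reads the document SAX-style rather than needing
    // a pre-built DOM handed in as the body.
    public static NodeList select(String xml, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));
        return (NodeList) xpath.evaluate(expr, source, XPathConstants.NODESET);
    }
}
```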




Re: handling large files

Claus Ibsen-2

Ah, the XPath approach will still hold the entire result in memory.

You can only get a result of the types listed here:
http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathConstants.html

And none of them is stream based.

So even when using SAX to parse the big XML file, evaluating the XPath expression will load all the data into memory, or at least the NodeList containing all the split entries.

So maybe the Scanner is better if you can do some custom clipping. I believe it is regexp based, so you may be able to find a good regexp that can split on </person> or something similar.
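A hedged sketch of that clipping idea with a Scanner (the <person> markup is invented for the example; real SOAP-wrapped input would need the envelope handled as well):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class PersonClipper {

    // Splits on the closing tag and re-appends it, so each chunk is
    // again a well-formed <person> fragment. With a Scanner over a
    // file or stream, only one record is in memory at a time.
    public static List<String> clip(String xml) {
        List<String> records = new ArrayList<String>();
        Scanner scanner = new Scanner(xml);
        scanner.useDelimiter("</person>");
        while (scanner.hasNext()) {
            String chunk = scanner.next().trim();
            if (chunk.contains("<person")) {
                records.add(chunk + "</person>");
            }
        }
        scanner.close();
        return records;
    }
}
```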

Re: handling large files

Claus Ibsen-2
Hi

This is as far as I got with the XPath expression for splitting:
http://svn.apache.org/viewvc?rev=825156&view=rev




Re: handling large files

mcarson
In order to get the Scanner solution to work, I would still need some way to start polling the directory at the beginning of my Camel route, correct?  Is there a way to use the "file" component (or any component) as a "from" to detect that a file has arrived in a directory, but NOT actually read the file?  It would be nice if I could simply detect a large file and pass along only the file name to the next step in the route. The next step (which could be a "scanner" bean) could then scan through and split the received file effectively.



Re: handling large files

Claus Ibsen-2

Yeah, the file component does NOT read the content. It just holds a java.io.File object (in fact a GenericFile, as it also works for FTP files).
Anyway, you just grab the java.io.File using:

File file = exchange.getIn().getBody(File.class);



Re: handling large files

mcarson
Using the Scanner seems to work for parsing the huge file based on a delimiter.  However, it appears that either the JmsTemplate I'm using to send messages or ActiveMQ cannot keep pace.

Somewhere between 250K and 500K sends, I get this stack trace:

Exception in thread "main" org.springframework.jms.UncategorizedJmsException: Uncategorized exception occured during JMS processing; nested exception is javax.jms.JMSException: java.io.EOFException
        at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:308)
        at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:168)
        at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:474)
        at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:548)
        at asa.camel.TestScanner.parseRecord(TestScanner.java:62)
        at asa.camel.TestScanner.readFile(TestScanner.java:34)
        at asa.camel.TestScanner.main(TestScanner.java:82)
Caused by: javax.jms.JMSException: java.io.EOFException
        at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49)
        at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1244)
        at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1339)
        at org.apache.activemq.ActiveMQConnection.createSession(ActiveMQConnection.java:298)
        at org.springframework.jms.support.JmsAccessor.createSession(JmsAccessor.java:196)
        at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:462)
        ... 4 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
        at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:210)
        at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:202)
        at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:185)
        at java.lang.Thread.run(Thread.java:619)

Any ideas what could cause this?


Re: handling large files

Claus Ibsen-2
Hi

Try searching with Google and asking on the ActiveMQ forum.

Is your AMQ queue big enough to contain all this data? Do you have consumers reading off the queues? AMQ also has many parameters that need to be configured correctly.

Remember to report which AMQ version you are using.

Also, how big are the messages you send? Are they text based or binary based?
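For reference, broker-side limits are set in activemq.xml via systemUsage; a sketch with illustrative values (the limits here are assumptions, not a recommendation):

```xml
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost">
  <systemUsage>
    <systemUsage>
      <!-- memory available for in-flight messages -->
      <memoryUsage>
        <memoryUsage limit="64 mb"/>
      </memoryUsage>
      <!-- disk available for the persistent store -->
      <storeUsage>
        <storeUsage limit="1 gb"/>
      </storeUsage>
      <!-- disk available for spooling non-persistent messages -->
      <tempUsage>
        <tempUsage limit="100 mb"/>
      </tempUsage>
    </systemUsage>
  </systemUsage>
</broker>
```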


Re: handling large files

Bruce Snyder
I'm not sure what kind of configuration you're using for ActiveMQ, but
are you utilizing a connection pooler so that the connection,
session and producer are not recreated for every call to send()? See
the following for more info:

http://activemq.apache.org/jmstemplate-gotchas.html

Bruce
--
perl -e 'print unpack("u30","D0G)U8V4\@4VYY9&5R\"F)R=6-E+G-N>61E<D\!G;6%I;\"YC;VT*"
);'

ActiveMQ in Action: http://bit.ly/2je6cQ
Blog: http://bruceblog.org/
Twitter: http://twitter.com/brucesnyder

Re: handling large files

mcarson

bsnyder wrote
On Fri, Oct 16, 2009 at 7:41 AM, mcarson <mcarson@amsa.com> wrote:
>
> Using the scanner seems to work for parsing down the huge file based upon a
> delimiter.  However it appears that either the JmsTemplate I'm using to send
> messages or ActiveMQ cannot keep pace.
>
> Somewhere between 250K - 500K sends, I get this stack trace:
>
> [stack trace snipped; identical to the trace quoted above]
>
> Any ideas what could cause this?

I'm not sure what kind of configuration you're using for ActiveMQ, but
are you utilizing a connection pooler so that the connection,
session and producer are not recreated for every call to send()? See
the following for more info:

http://activemq.apache.org/jmstemplate-gotchas.html

Bruce
--
perl -e 'print unpack("u30","D0G)U8V4\@4VYY9&5R\"F)R=6-E+G-N>61E<D\!G;6%I;\"YC;VT*"
);'

ActiveMQ in Action: http://bit.ly/2je6cQ
Blog: http://bruceblog.org/
Twitter: http://twitter.com/brucesnyder

Re: handling large files

mcarson
In reply to this post by Bruce Snyder
This problem was solved by loosening the configuration of ActiveMQ to allow for fast producers.
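
mcarson doesn't show the exact change, but "allow for fast producers" in ActiveMQ usually points at producer flow control, which throttles producers that outrun their consumers. A sketch of what such a loosened broker policy might look like in activemq.xml; the queue wildcard and memory limit here are illustrative values, not taken from the thread:

```xml
<broker xmlns="http://activemq.apache.org/schema/core">
  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <!-- Stop throttling producers that outrun consumers; bound broker
             memory with a per-destination limit instead. -->
        <policyEntry queue=">" producerFlowControl="false" memoryLimit="64mb"/>
      </policyEntries>
    </policyMap>
  </destinationPolicy>
</broker>
```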

bsnyder wrote
I'm not sure what kind of configuration you're using for ActiveMQ, but
are you utilizing a connection pooler so that the the connection,
session and producer are not recreated for every call to send()? See
the following for more info:

http://activemq.apache.org/jmstemplate-gotchas.html

Bruce
--

Re: handling large files

Justinson
In reply to this post by Claus Ibsen-2
Unfortunately I'm getting an OutOfMemoryError using XPath splitting the way you showed. I'm parsing a file with about 500,000 XML messages.

How can we use Apache Digester instead?
 
Claus Ibsen-2 wrote
Hi

This is as far I got with the xpath expression for splitting
http://svn.apache.org/viewvc?rev=825156&view=rev



On Wed, Oct 14, 2009 at 4:40 PM, Claus Ibsen <claus.ibsen@gmail.com> wrote:
> On Wed, Oct 14, 2009 at 4:21 PM, Claus Ibsen <claus.ibsen@gmail.com> wrote:
>> Hi
>>
>> On Wed, Oct 14, 2009 at 4:16 PM, mcarson <mcarson@amsa.com> wrote:
>>>
>>> It looks like the scanner might provide me with the capabilities I was
>>> looking for regarding reading in a file in delimited chunks.  I'm assuming I
>>> would implement this as a bean... can the bean component be used as a "from"
>>> in a camel route?  I'm new to Camel, and I have never seen that done.  Is
>>> there an example bean (that is a consumer of some sort) that I could use to
>>> model my code after?
>>>
>>
>> Since you use XPath, I took a dive into how to split big files.
>> Using InputSource seems to do the trick, as it allows XPath to use SAX
>> events, which fits with streaming.
>>
>> I will work a bit to get it supported nicely out of the box, and provide
>> details on how to do it in 2.0.
>>
>
> Ah yeah, the XPath will still at least hold all the results in memory.
>
> As you can only get a result of these types listed here:
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathConstants.html
>
> And none of them is stream-based.
>
> So even with SAX to parse the big XML file, the XPath expression
> evaluation will result in all data being loaded into memory, or at
> least the NodeList which contains all the split entries.
>
> So maybe that Scanner is better if you can do some custom clipping. I
> believe it's regexp-based, so you may be able to find a good regexp that
> can split on </person> or something.
>
>
>
>
>
>
>
>>
>>
>>>
>>>
>>> Claus Ibsen-2 wrote:
>>>>
>>>> Hi
>>>>
>>>> How do you want to split the file?
>>>> Is there a special character that denotes a new "record"
>>>>
>>>> Using java.util.Scanner is great as it can do streaming. And also what
>>>> Camel can do if you for example want to split by new line etc.
>>>>
>>>> --
>>>> Claus Ibsen
>>>> Apache Camel Committer
>>>>
>>>> Open Source Integration: http://fusesource.com
>>>> Blog: http://davsclaus.blogspot.com/
>>>> Twitter: http://twitter.com/davsclaus
>>>>
>>>>
>>>
>>> --
>>> View this message in context: http://www.nabble.com/handling-large-files-tp25826380p25891924.html
>>> Sent from the Camel - Users mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> Claus Ibsen
>> Apache Camel Committer
>>
>> Open Source Integration: http://fusesource.com
>> Blog: http://davsclaus.blogspot.com/
>> Twitter: http://twitter.com/davsclaus
>>
>
>
>
> --
> Claus Ibsen
> Apache Camel Committer
>
> Open Source Integration: http://fusesource.com
> Blog: http://davsclaus.blogspot.com/
> Twitter: http://twitter.com/davsclaus
>



--
Claus Ibsen
Apache Camel Committer

Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus
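
Claus's java.util.Scanner suggestion quoted above (splitting on a closing tag such as </person>) can be sketched like this. The element name is an assumption, and a real file's root wrapper element would need extra handling; Scanner streams its source, so the same approach works on a Reader over a file too big for memory:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerSplit {

    // Splits the input on the closing </person> tag and re-appends the tag to
    // each chunk, so every chunk is one complete record.
    public static List<String> split(String xml) {
        List<String> chunks = new ArrayList<String>();
        Scanner scanner = new Scanner(xml).useDelimiter("</person>");
        while (scanner.hasNext()) {
            String chunk = scanner.next().trim();
            if (!chunk.isEmpty()) {
                chunks.add(chunk + "</person>");
            }
        }
        scanner.close();
        return chunks;
    }
}
```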

Re: handling large files

Claus Ibsen-2
On Tue, Mar 23, 2010 at 8:24 PM, Justinson <[hidden email]> wrote:
>
> Unfortunately I'm getting an OutOfMemoryError using XPath splitting the way
> you shown. I'm parsing a file with about 500000 xml messages.
>

You could pre-process the big file and split it into X smaller files,
maybe by using java.util.Scanner to identify "good places" to
split the big file.

Or you could try using SAX-based XML parsing when splitting to reduce
the memory overhead. Just use a Bean for that. Something like this:

public Iterator splitBigFile(java.io.File file) {
  // SAX-parse the big file and return an iterator (or similar) that
  // can walk the XML messages you are interested in
}

And use the bean with the Camel Split EIP
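
A sketch of such a bean. It uses StAX rather than raw SAX, since a pull parser maps naturally onto an Iterator; the <record> element name and the text-only record bodies are assumptions, and this is not Camel API, just a plain bean the Splitter can call:

```java
import java.io.Reader;
import java.util.Iterator;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class BigFileSplitter {

    // Returns an Iterator so the Splitter can walk the records lazily;
    // only one record is held in memory at a time.
    public Iterator<String> split(Reader source) throws XMLStreamException {
        final XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(source);
        return new Iterator<String>() {
            private String nextRecord = advance();

            // Pull events until the next <record> start tag and grab its text.
            private String advance() {
                try {
                    while (r.hasNext()) {
                        if (r.next() == XMLStreamConstants.START_ELEMENT
                                && "record".equals(r.getLocalName())) {
                            return r.getElementText();
                        }
                    }
                } catch (XMLStreamException e) {
                    throw new RuntimeException(e);
                }
                return null;
            }

            @Override
            public boolean hasNext() { return nextRecord != null; }

            @Override
            public String next() {
                String current = nextRecord;
                nextRecord = advance();
                return current;
            }
        };
    }
}
```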


> How can we use Apache Digester instead?
>
>
> Claus Ibsen-2 wrote:
>>
>> Hi
>>
>> This is as far I got with the xpath expression for splitting
>> http://svn.apache.org/viewvc?rev=825156&view=rev
>>
>> [rest of quoted thread trimmed]
>>
>
> --
> View this message in context: http://old.nabble.com/handling-large-files-tp25826380p28005868.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>
>



--
Claus Ibsen
Apache Camel Committer

Author of Camel in Action: http://www.manning.com/ibsen/
Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus

Re: handling large files

Justinson
Thank you very much for your advice.

Claus Ibsen-2 wrote
> Claus Ibsen-2 wrote:
>>
>> Hi
>>
>> This is as far I got with the xpath expression for splitting
>> http://svn.apache.org/viewvc?rev=825156&view=rev

On Tue, Mar 23, 2010 at 8:24 PM, Justinson <justinson@googlemail.com> wrote:
> Unfortunately I'm getting an OutOfMemoryError using XPath splitting the way
> you shown. I'm parsing a file with about 500000 xml messages.

You could pre process the big file and split it into X files.
Maybe by using the java.util.Scanner to identify "good places" to
split the big file.
I'm just trying to handle the "format stack" properly: it's a byte stream in the base layer but an XML stream in the second layer. In my case the byte stream has no structure of its own, so I cannot split it at that level. Therefore I'd try to apply your second suggestion, using XML-aware parsing.

Claus Ibsen-2 wrote
Or you could try using SAX based XML parsing when splitting to reduce
the memory overhead.
Just use a Bean for that. Something like this:

public Iterator splitBigFile(java.io.File file) {
  // SAX parsing the big file and return an iterator or something that
can walk the XML messages you like
}

And use the bean with the Camel Split EIP
How is it possible to integrate a "push" parser paradigm into Camel more smoothly than hiding it behind an iterator?

(For iterator-based XML splitting, StAX "pull" XML parsing is probably a better fit. Do you know a StAX-based product that supports XPath-like pattern matching?)

Claus Ibsen-2 wrote
> How can we use Apache Digester instead?
The Commons Digester supports an XPath-like pattern-matching syntax and uses SAX behind the scenes. It also exhibits the "push" paradigm of SAX but introduces a stack concept for match results, which is why stream-like handling is supported. Unfortunately, Camel does not have support for Digester at the moment.

Another idea: would you recommend using XStream for this task?

Re: handling large files

Claus Ibsen-2
On Fri, Mar 26, 2010 at 9:15 AM, Justinson <[hidden email]> wrote:

>
> Thank you very much for your advices.
>
>
> Claus Ibsen-2 wrote:
>>
>> On Tue, Mar 23, 2010 at 8:24 PM, Justinson <[hidden email]>
>> wrote:
>>>
>>> Unfortunately I'm getting an OutOfMemoryError using XPath splitting the
>>> way
>>> you shown. I'm parsing a file with about 500000 xml messages.
>>
>> You could pre process the big file and split it into X files.
>> Maybe by using the java.util.Scanner to identify "good places" to
>> split the big file.
>>
>
> I'm just trying to handle the "format stack" properly: It's a byte stream in
> the base layer but an XML stream in the second layer. In my case the byte
> stream has no own structure so I cannot split it. Therefore I'd try to apply
> your second advice using XML-aware parsing.
>
>
> Claus Ibsen-2 wrote:
>>
>>
>> Or you could try using SAX based XML parsing when splitting to reduce
>> the memory overhead.
>> Just use a Bean for that. Something like this:
>>
>> public Iterator splitBigFile(java.io.File file) {
>>   // SAX parsing the big file and return an iterator or something that
>> can walk the XML messages you like
>> }
>>
>> And use the bean with the Camel Split EIP
>>
>>
>
> How it possible to integrate a "push" parser paradigm more smoothly into
> Camel than hinding it behind an iterator?
>
> (For iterator-based XML splitting, using StAX "pull" XML parsing is probably
> a more proper choice.)
>

Try googling for a solution using XPath in Java, as that is what is used
under the covers.
It has an XPathFactory where you can set features and whatnot. It may
offer ways to tweak whether it runs in pull or push mode, and whether
it streams the result, etc.
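
The limitation discussed earlier in the thread (SAX events on the input side, but a fully materialized NODESET on the output side) can be seen directly with the JDK's javax.xml.xpath API; the tiny document and helper method here are just stand-ins:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathMemoryDemo {

    // Evaluates an XPath expression against an InputSource. The parse side can
    // run off SAX events, but the NODESET result is still built fully in
    // memory, which is why huge files can still exhaust the heap.
    public static int countMatches(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }
}
```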



>
> Claus Ibsen-2 wrote:
>>
>>
>>> How can we use Apache Digester instead?
>>
>>
>
> The Commons Digester supports a XPath-like pattern-matching syntax and uses
> SAX behind the scenes. It also exibits the "push" paradigm of SAX but
> introduces a stack concept for match results. Thats why a stream-like
> handling is supported. Unfortunately Camel does not have a support for
> Digester at the moment.
>
> Another idea: Would you recommend using of Xstream for this task?
>
>
> Claus Ibsen-2 wrote:
>>
>>
>>> Claus Ibsen-2 wrote:
>>>>
>>>> Hi
>>>>
>>>> This is as far I got with the xpath expression for splitting
>>>> http://svn.apache.org/viewvc?rev=825156&view=rev
>>
>>
> --
> View this message in context: http://old.nabble.com/handling-large-files-tp25826380p28038839.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>
>



--
Claus Ibsen
Apache Camel Committer

Author of Camel in Action: http://www.manning.com/ibsen/
Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus