|
I need to move large files (around 800 MB) through a Camel route via ActiveMQ.
The split method does not give any option to split on size. Besides, when I use a route like from("file:data").split(..).to("jms:myq").to("file:dest"), it loads the entire file into memory and only then tries to split it, which causes an out-of-memory error. Is there a better way? Thanks, Reju |
|
You can use camel-stream - http://camel.apache.org/stream.html
|
|
I have to solve this exact same problem! I am also trying to:
-- read an extremely large file from a file system
-- do a split on that file
-- send each split record to a JMS queue

However, I am relatively new to Camel. I do not understand how the stream component can help specify how much data to read before the split occurs, and when I tried to stream the file I got exceptions in my split because, when streamed, the document is no longer well-formed XML.

Can someone explain in more detail how using the stream component solves this problem?

This is the Camel context I was using successfully to perform these tasks for SMALL files:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
         http://camel.apache.org/schema/spring
         http://camel.apache.org/schema/spring/camel-spring.xsd">

  <bean id="jms" class="org.apache.camel.component.jms.JmsComponent">
    <property name="connectionFactory">
      <bean class="org.apache.activemq.ActiveMQConnectionFactory">
        <property name="brokerURL" value="tcp://localhost:61616"/>
      </bean>
    </property>
  </bean>

  <camelContext xmlns="http://camel.apache.org/schema/spring"
                xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
    <package>camel.myprototype</package>

    <route>
      <from uri="file:src/inbox?delete=true"/>
      <split>
        <xpath>/s:Envelope/s:Body/person</xpath>
        <to uri="jms:queue:Q_People"/>
      </split>
    </route>
  </camelContext>
</beans> |
|
Hi
How do you want to split the file? Is there a special character that denotes a new "record"?

java.util.Scanner is great for this, as it can do streaming. Camel can also do that if, for example, you want to split by newline.

On Wed, Oct 14, 2009 at 3:01 PM, mcarson <[hidden email]> wrote:
> Can someone help explain in more detail how using the stream component
> solves this problem?

--
Claus Ibsen
Apache Camel Committer

Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus |
|
I would like to split using xpath if possible:
<xpath>/Envelope/Body/person</xpath>
|
|
It looks like the scanner might provide me with the capabilities I was looking for regarding reading in a file in delimited chunks. I'm assuming I would implement this as a bean... can the bean component be used as a "from" in a camel route? I'm new to Camel, and I have never seen that done. Is there an example bean (that is a consumer of some sort) that I could use to model my code after?
|
|
Hi
On Wed, Oct 14, 2009 at 4:16 PM, mcarson <[hidden email]> wrote:
> It looks like the scanner might provide me with the capabilities I was
> looking for regarding reading in a file in delimited chunks. I'm assuming I
> would implement this as a bean... can the bean component be used as a "from"
> in a camel route?

Since you use XPath, I took a dive into how to split big files. Using InputSource seems to do the trick, as it allows XPath to use SAX events, which fits with streaming.

I will work a bit on getting this supported nicely out of the box, and provide details on how to do it in 2.0. |
|
On Wed, Oct 14, 2009 at 4:21 PM, Claus Ibsen <[hidden email]> wrote:
> Using InputSource seems to do the trick as it allow xpath to use SAX
> events which fits with streaming.

Ah yeah, the XPath evaluation will still hold at least the whole result in memory, as you can only get a result of one of the types listed here:
http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/XPathConstants.html

None of them is stream-based.

So even when SAX is used to parse the big XML file, evaluating the XPath expression will result in all the data being loaded into memory, or at least the NodeList that contains all the split entries.

So maybe the Scanner is better, if you can do some custom clipping. I believe it's regexp-based, so you may be able to find a good regexp that can split on </person> or something. |
|
Hi
This is as far as I got with the XPath expression for splitting:
http://svn.apache.org/viewvc?rev=825156&view=rev

On Wed, Oct 14, 2009 at 4:40 PM, Claus Ibsen <[hidden email]> wrote:
> So maybe that Scanner is better if you can do some custom clipping. |
|
In order to get the scanner solution to work, I would still need some way to start polling on the directory at the beginning of my Camel route, correct? Is there a way to use the "file" component (or any component) as a "from" to detect that a file has arrived in a directory, but NOT actually read the file? It would be nice if I could simply detect a large file and pass along only the file name to the next step in the route. That next step (which could be a "scanner" bean) could then scan through the received file and split it effectively.
|
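A "scanner" bean of the kind discussed above might look roughly like the following. This is only a sketch under assumptions: the class name `ScannerRecordSplitter` and its `split` method are hypothetical and not part of Camel, and the records are assumed to be `<person>` elements whose closing tag never appears inside comments or CDATA. It uses the closing tag as the `java.util.Scanner` delimiter, so only one record at a time has to be held in memory.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Camel: streams records out of a large
// document by using the record's closing tag as the Scanner delimiter.
class ScannerRecordSplitter {

    /**
     * Reads from source and calls sink once per record.
     * startTag is matched as a plain substring (e.g. "<person"), endTag literally.
     * Returns the number of records found.
     */
    static int split(Reader source, String startTag, String endTag, Consumer<String> sink) {
        int count = 0;
        try (Scanner scanner = new Scanner(source)) {
            // The delimiter is a regex, so quote the tag to match it literally.
            scanner.useDelimiter(Pattern.quote(endTag));
            while (scanner.hasNext()) {
                String chunk = scanner.next();
                int start = chunk.indexOf(startTag);
                if (start < 0) {
                    continue; // text after the last record, e.g. closing wrapper tags
                }
                // Drop anything before the record and restore the consumed end tag.
                sink.accept(chunk.substring(start) + endTag);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<Envelope><Body>"
                + "<person><name>A</name></person>"
                + "<person><name>B</name></person>"
                + "</Body></Envelope>";
        // Prints each record on its own line.
        split(new StringReader(xml), "<person", "</person>", System.out::println);
    }
}
```

For a real file, the `Reader` could come from `java.nio.file.Files.newBufferedReader(file.toPath())`, and the sink could be whatever sends the record on to JMS. Note that the substring match on "<person" would also match a tag such as "<personnel", so the start tag has to be chosen with the actual data in mind.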
|
On Wed, Oct 14, 2009 at 10:28 PM, mcarson <[hidden email]> wrote:
> Is there a way to use the "file" component (or any components) as a "from"
> to detect that a file has arrived in a directory, but NOT actually read the
> file?

Yeah, the file component does NOT read the content. It just holds a java.io.File object (in fact it's a GenericFile, as it also works for FTP files).

Anyway, you just grab the java.io.File using:

File file = exchange.getIn().getBody(File.class); |
|
Using the scanner seems to work for parsing down the huge file based upon a delimiter. However it appears that either the JmsTemplate I'm using to send messages or ActiveMQ cannot keep pace.
Somewhere between 250K - 500K sends, I get this stack trace:

Exception in thread "main" org.springframework.jms.UncategorizedJmsException: Uncategorized exception occured during JMS processing; nested exception is javax.jms.JMSException: java.io.EOFException
    at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:308)
    at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:168)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:474)
    at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:548)
    at asa.camel.TestScanner.parseRecord(TestScanner.java:62)
    at asa.camel.TestScanner.readFile(TestScanner.java:34)
    at asa.camel.TestScanner.main(TestScanner.java:82)
Caused by: javax.jms.JMSException: java.io.EOFException
    at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49)
    at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1244)
    at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1339)
    at org.apache.activemq.ActiveMQConnection.createSession(ActiveMQConnection.java:298)
    at org.springframework.jms.support.JmsAccessor.createSession(JmsAccessor.java:196)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:462)
    ... 4 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
    at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:210)
    at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:202)
    at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:185)
    at java.lang.Thread.run(Thread.java:619)

Any ideas what could cause this?
|
|
Hi
Try searching with Google and asking on the AMQ forum. Is your AMQ queue big enough to contain all this data? Do you have consumers reading off the queues? AMQ has many parameters that need to be configured correctly.

Remember to report which AMQ version you are using. And how big are the messages you send? Are they text-based or binary?

On Fri, Oct 16, 2009 at 2:41 PM, mcarson <[hidden email]> wrote:
> Any ideas what could cause this? |
|
In reply to this post by mcarson
On Fri, Oct 16, 2009 at 7:41 AM, mcarson <[hidden email]> wrote:
> Using the scanner seems to work for parsing down the huge file based upon a
> delimiter. However it appears that either the JmsTemplate I'm using to send
> messages or ActiveMQ cannot keep pace.
>
> Any ideas what could cause this?

I'm not sure what kind of configuration you're using for ActiveMQ, but are you using a connection pool, so that the connection, session and producer are not recreated for every call to send()? See the following for more info:

http://activemq.apache.org/jmstemplate-gotchas.html

Bruce
--
perl -e 'print unpack("u30","D0G)U8V4\@4VYY9&5R\"F)R=6-E+G-N>61E<D\!G;6%I;\"YC;VT*" );'

ActiveMQ in Action: http://bit.ly/2je6cQ
Blog: http://bruceblog.org/
Twitter: http://twitter.com/brucesnyder |
|
In reply to this post by Bruce Snyder
This problem was solved by loosening the configuration of ActiveMQ to allow for fast producers.
|
|
In reply to this post by Claus Ibsen-2
Unfortunately I'm getting an OutOfMemoryError using XPath splitting the way you showed. I'm parsing a file with about 500,000 XML messages.
How can we use Apache Digester instead?
|
|
On Tue, Mar 23, 2010 at 8:24 PM, Justinson <[hidden email]> wrote:
> Unfortunately I'm getting an OutOfMemoryError using XPath splitting the way
> you shown. I'm parsing a file with about 500000 xml messages.

You could pre-process the big file and split it into X files, maybe by using java.util.Scanner to identify "good places" to split the big file.

Or you could try using SAX-based XML parsing when splitting, to reduce the memory overhead. Just use a bean for that. Something like this:

public Iterator splitBigFile(java.io.File file) {
    // SAX-parse the big file and return an iterator (or something similar)
    // that can walk the XML messages you are interested in
}

And use the bean with the Camel Split EIP.

> How can we use Apache Digester instead?

--
Claus Ibsen
Author of Camel in Action: http://www.manning.com/ibsen/ |
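To make the SAX-based suggestion above a little more concrete, here is a minimal sketch. It is an assumption-laden illustration, not Camel code: the class `SaxRecordPusher` and its `parse` method are hypothetical names, and the fragment is rebuilt by hand from the SAX events. Because SAX is a push API, the sketch hands each record to a callback instead of returning an Iterator as in the `splitBigFile` signature; only the record currently being read is buffered.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: buffers one record element at a time and pushes it to a sink.
class SaxRecordPusher extends DefaultHandler {
    private final String recordName;
    private final Consumer<String> sink;
    private final StringBuilder buffer = new StringBuilder();
    private int depth = 0; // 0 means "currently outside any record"

    SaxRecordPusher(String recordName, Consumer<String> sink) {
        this.recordName = recordName;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth == 0 && !qName.equals(recordName)) {
            return; // wrapper elements outside a record are ignored
        }
        depth++;
        buffer.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            buffer.append(' ').append(atts.getQName(i))
                  .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        buffer.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            buffer.append(escape(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 0) {
            return;
        }
        buffer.append("</").append(qName).append('>');
        if (--depth == 0) {
            sink.accept(buffer.toString()); // one complete record
            buffer.setLength(0);            // free it before reading the next one
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Parses the stream and calls sink once per record element. */
    static void parse(InputStream in, String recordName, Consumer<String> sink) {
        try {
            // The default factory is not namespace-aware, so qName is the raw tag name.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(in, new SaxRecordPusher(recordName, sink));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
    }
}
```

The callback is where each record would be handed on, for example to whatever sends it to the JMS queue. Turning this into a true Iterator would need either a second thread or a pull parser, which is the trade-off discussed in the replies that follow.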
|
Thank you very much for your advice.

I'm just trying to handle the "format stack" properly: it's a byte stream in the base layer but an XML stream in the second layer. In my case the byte stream has no structure of its own, so I cannot split it. Therefore I'd try to apply your second suggestion, using XML-aware parsing.

How is it possible to integrate a "push" parser paradigm more smoothly into Camel than by hiding it behind an iterator?

(For iterator-based XML splitting, StAX "pull" XML parsing is probably a more appropriate choice. Do you know a StAX-based product supporting XPath-like pattern matching?)

The Commons Digester supports an XPath-like pattern-matching syntax and uses SAX behind the scenes. It also exhibits the "push" paradigm of SAX but introduces a stack concept for match results. That's why stream-like handling is supported. Unfortunately Camel does not support Digester at the moment.

Another idea: would you recommend using XStream for this task? |
|
On Fri, Mar 26, 2010 at 9:15 AM, Justinson <[hidden email]> wrote:
>
> Thank you very much for your advice.
>
> Claus Ibsen-2 wrote:
>>
>> On Tue, Mar 23, 2010 at 8:24 PM, Justinson <[hidden email]> wrote:
>>>
>>> Unfortunately I'm getting an OutOfMemoryError using XPath splitting the
>>> way you showed. I'm parsing a file with about 500000 XML messages.
>>
>> You could pre-process the big file and split it into X files.
>> Maybe by using the java.util.Scanner to identify "good places" to
>> split the big file.
>>
>
> I'm just trying to handle the "format stack" properly: it's a byte stream in
> the base layer but an XML stream in the second layer. In my case the byte
> stream has no structure of its own, so I cannot split it. Therefore I'd try
> to apply your second suggestion, using XML-aware parsing.
>
> Claus Ibsen-2 wrote:
>>
>> Or you could try using SAX based XML parsing when splitting to reduce
>> the memory overhead.
>> Just use a Bean for that. Something like this:
>>
>> public Iterator splitBigFile(java.io.File file) {
>>     // SAX parse the big file and return an iterator or something that
>>     // can walk the XML messages you like
>> }
>>
>> And use the bean with the Camel Split EIP
>>
>
> How is it possible to integrate a "push" parser paradigm more smoothly into
> Camel than by hiding it behind an iterator?
>
> (For iterator-based XML splitting, StAX "pull" XML parsing is probably
> a more appropriate choice.)
>

Try googling for a solution using XPath in Java, as that is what is used under the covers. It has an XPathFactory where you can set features and whatnot. It may offer ways to tweak whether it should run in pull or push mode, and whether it can stream the result etc.

> Claus Ibsen-2 wrote:
>>
>>> How can we use Apache Digester instead?
>>
>
> The Commons Digester supports an XPath-like pattern-matching syntax and uses
> SAX behind the scenes. It also exhibits the "push" paradigm of SAX but
> introduces a stack concept for match results. That's why stream-like
> handling is supported. Unfortunately Camel does not support
> Digester at the moment.
>
> Another idea: would you recommend using XStream for this task?
>
> Claus Ibsen-2 wrote:
>>
>>> Claus Ibsen-2 wrote:
>>>>
>>>> Hi
>>>>
>>>> This is as far as I got with the xpath expression for splitting
>>>> http://svn.apache.org/viewvc?rev=825156&view=rev
>>
>
> --
> View this message in context: http://old.nabble.com/handling-large-files-tp25826380p28038839.html
> Sent from the Camel - Users mailing list archive at Nabble.com.

--
Claus Ibsen
Apache Camel Committer
Author of Camel in Action: http://www.manning.com/ibsen/
Open Source Integration: http://fusesource.com
Blog: http://davsclaus.blogspot.com/
Twitter: http://twitter.com/davsclaus |
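The iterator-based idea discussed in this exchange, combined with the StAX "pull" parsing mentioned in the question, might look roughly like the sketch below. It is an assumption-laden illustration rather than anything shipped with Camel: the class name StaxSplitSketch, the method records, and the elementName parameter are invented for the example, and matching is by local element name only (no XPath, namespaces ignored). The parser is pulled only as far as the end of the current record, so the whole document is never materialised.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxSplitSketch {

    /** Lazily yields each element named elementName as a serialized XML fragment. */
    static Iterator<String> records(Reader source, final String elementName) {
        final XMLEventReader events;
        try {
            events = XMLInputFactory.newInstance().createXMLEventReader(source);
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Cannot open XML stream", e);
        }
        return new Iterator<String>() {
            private String next = advance();

            // Pulls events until the next matching start element, or the end of input.
            private String advance() {
                try {
                    while (events.hasNext()) {
                        XMLEvent event = events.nextEvent();
                        if (event.isStartElement() && elementName.equals(
                                event.asStartElement().getName().getLocalPart())) {
                            return copyElement(event);
                        }
                    }
                    events.close();
                    return null;
                } catch (XMLStreamException e) {
                    throw new IllegalStateException("Malformed XML", e);
                }
            }

            // Copies one element, including nested children, into a string.
            private String copyElement(XMLEvent start) throws XMLStreamException {
                StringWriter out = new StringWriter();
                XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
                writer.add(start);
                int depth = 1;
                while (depth > 0) {
                    XMLEvent event = events.nextEvent();
                    if (event.isStartElement()) {
                        depth++;
                    } else if (event.isEndElement()) {
                        depth--;
                    }
                    writer.add(event);
                }
                writer.flush();
                writer.close();
                return out.toString();
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String current = next;
                next = advance();
                return current;
            }
        };
    }

    public static void main(String[] args) {
        String xml = "<Envelope><Body><person><name>Alice</name></person>"
                + "<person><name>Bob</name></person></Body></Envelope>";
        Iterator<String> it = records(new StringReader(xml), "person");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

A bean method returning such an iterator could then be handed to the Split EIP as suggested in the reply above, for example through a method-call expression. Whether the splitter consumes it lazily depends on the Camel version and on the splitter's streaming setting, so that part should be checked against the Camel documentation.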
