[jira] Created: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
splitter() should support batch for processing large files
----------------------------------------------------------

                 Key: CAMEL-876
                 URL: https://issues.apache.org/activemq/browse/CAMEL-876
             Project: Apache Camel
          Issue Type: Improvement
          Components: camel-core
    Affects Versions: 1.4.0
            Reporter: Claus Ibsen
             Fix For: 1.5.0


See nabble:
http://www.nabble.com/Splitter-for-big-files-td19272583s22882.html

Add some kind of batch(size) parameter to the splitter() DSL so that exchanges can be processed in batches instead of all in one go.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45384#action_45384 ]

Claus Ibsen commented on CAMEL-876:
-----------------------------------

See CAMEL-875 with suggestions for a solution and the IRC chat log as of today.


[jira] Assigned: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claus Ibsen reassigned CAMEL-876:
---------------------------------

    Assignee: Gert Vanthienen

Gert wanted to give it a go, as he already had a solution in mind.


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45553#action_45553 ]

Gert Vanthienen commented on CAMEL-876:
---------------------------------------

My plan was to use an Iterable instead of a List wherever possible, so the Splitter can read and send exchanges as it gets them from the splitter expression.  This would allow us to simply add an aggregator in the DSL to do the batching.

This works fine, with one problem: the SPLIT_SIZE attribute requires us to know the number of items being split, and with this approach we cannot know that until we have handled all the messages.  How about adding a streaming() flag to the Splitter so people can opt in to the new approach, with a warning in the javadoc that the split size can then no longer be sent along?
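
To illustrate the Iterable idea, here is a minimal sketch in plain Java. It is not Camel code: the class StreamingSplitSketch and the methods lines() and splitInBatches() are hypothetical names made up for this example, and a Consumer stands in for whatever sits downstream of the aggregator. It reads one line at a time, hands batches on as soon as they are full, and only learns the total once the input is exhausted, which is exactly why the split size cannot be sent along up front.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

// Hypothetical illustration only; none of these names come from Camel.
class StreamingSplitSketch {

    // Lazily yields one line at a time, so the whole input never sits in memory.
    // The returned Iterable is single-use because it consumes the reader.
    static Iterable<String> lines(BufferedReader reader) {
        return () -> new Iterator<String>() {
            private String next = advance();

            private String advance() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public String next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                String current = next;
                next = advance();
                return current;
            }
        };
    }

    // Plays the role of the aggregator: groups parts into batches of at most
    // batchSize and hands each batch downstream as soon as it is full.
    static int splitInBatches(Iterable<String> parts, int batchSize,
                              Consumer<List<String>> downstream) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> batch = new ArrayList<>(batchSize);
        int total = 0;
        for (String part : parts) {
            batch.add(part);
            total++;
            if (batch.size() == batchSize) {
                downstream.accept(batch);
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            downstream.accept(batch); // flush the final, partial batch
        }
        return total; // the size is only known after the last part was read
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\ne"));
        int total = splitInBatches(lines(reader), 2, batch -> System.out.println(batch));
        System.out.println("total=" + total);
    }
}
```

With the five-line input above and a batch size of 2, the downstream consumer receives two full batches followed by one partial batch.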


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45555#action_45555 ]

Claus Ibsen commented on CAMEL-876:
-----------------------------------

+1

Gert, if it is not already there, we should also add the current index as a header, so end users know whether the exchange is number 1, 2, 3, 4, 5 ... n.


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45561#action_45561 ]

Gert Vanthienen commented on CAMEL-876:
---------------------------------------

http://svn.eu.apache.org/viewvc?view=rev&revision=693409 allows the splitter to use an Iterable instead of a List, so that messages get sent as the data for them becomes available.
There's still one TODO in the code: for parallel processing, we need a new CountDownLatch implementation that somehow allows us to 'count up' as well.


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45676#action_45676 ]

Claus Ibsen commented on CAMEL-876:
-----------------------------------

Gert, can you elaborate a bit on the "count up"?


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45689#action_45689 ]

Gert Vanthienen commented on CAMEL-876:
---------------------------------------

http://svn.eu.apache.org/viewvc?view=rev&revision=695358 shows what I meant: something like a CountDownLatch, but one that also lets you increment() the count as more exchanges are sent.
There is now one more step left: for aggregating the exchanges, we still need the entire list of exchanges.  We can aggregate them on the fly as the exchanges get answered, but this no longer guarantees the order of aggregation.  Unless someone objects, I'm going to move the {{streaming}} flag up from the {{Splitter}} to the {{MulticastProcessor}}, once again giving people the choice to use the new approach, but warning them about the out-of-order behavior in the aggregator.
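
For readers without access to that revision, the general idea of a latch that can count up can be sketched as follows. This is an assumption-laden illustration, not the class from the commit: the name CountUpDownLatch and the exact methods are made up here. The dispatching thread calls increment() before handing off each exchange, each worker calls countDown() when done, and await() blocks until the count is back at zero.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the class from the revision above.
class CountUpDownLatch {
    private int count;

    // Called by the dispatcher before each part is handed to a worker.
    synchronized void increment() {
        count++;
    }

    // Called by a worker once its part has been processed.
    synchronized void countDown() {
        if (count == 0) {
            throw new IllegalStateException("countDown() without matching increment()");
        }
        count--;
        if (count == 0) {
            notifyAll();
        }
    }

    // Blocks until every increment() has been matched by a countDown().
    // Call this only after dispatching has finished: the count may touch zero
    // while the dispatcher is still reading, if workers are faster than the reader.
    synchronized void await() throws InterruptedException {
        while (count > 0) {
            wait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountUpDownLatch latch = new CountUpDownLatch();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // The loop stands in for an iterator whose length is not known up front.
        for (int i = 0; i < 10; i++) {
            latch.increment();
            pool.execute(() -> {
                try {
                    processed.incrementAndGet(); // stand-in for real processing
                } finally {
                    latch.countDown();
                }
            });
        }

        latch.await();
        pool.shutdown();
        System.out.println("processed=" + processed.get());
    }
}
```

Unlike java.util.concurrent.CountDownLatch, whose count is fixed at construction, nothing here needs to know the number of parts in advance.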
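
The ordering trade-off can be seen in a small plain-Java sketch. It assumes nothing about the MulticastProcessor internals: StreamingAggregationSketch and aggregateAsCompleted() are hypothetical names, and upper-casing a string stands in for real processing. Replies are folded into the result as soon as they complete, so no full list of pending exchanges is needed, but the result follows completion order rather than submission order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from Camel.
class StreamingAggregationSketch {

    // Processes parts in parallel and aggregates each reply as soon as it is ready.
    static List<String> aggregateAsCompleted(Iterable<String> parts, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<String> replies = new ExecutorCompletionService<>(pool);
            List<String> aggregated = new ArrayList<>();
            int pending = 0;

            for (String part : parts) {
                replies.submit(() -> part.toUpperCase()); // stand-in for real processing
                pending++;

                // Drain whatever has finished so far without blocking the reader.
                Future<String> done;
                while ((done = replies.poll()) != null) {
                    aggregated.add(done.get());
                    pending--;
                }
            }

            // Input exhausted: wait for the replies that are still outstanding.
            while (pending > 0) {
                aggregated.add(replies.take().get());
                pending--;
            }
            return aggregated; // completion order, not submission order
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The order of the printed elements may differ from run to run.
        System.out.println(aggregateAsCompleted(List.of("a", "b", "c", "d"), 3));
    }
}
```

An aggregator that depends on the original order would have to buffer replies and reorder them, which brings back the memory cost the streaming mode is meant to avoid.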


[jira] Commented: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=45690#action_45690 ]

Claus Ibsen commented on CAMEL-876:
-----------------------------------

+1

Great work Gert.


[jira] Resolved: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gert Vanthienen resolved CAMEL-876.
-----------------------------------

    Resolution: Fixed

http://svn.apache.org/viewvc?view=rev&revision=695502 should be the last piece of the puzzle - this makes it work with both streaming and parallel processing enabled.


[jira] Closed: (CAMEL-876) splitter() should support batch for processing large files

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claus Ibsen closed CAMEL-876.
-----------------------------

