[jira] Created: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

[jira] Created: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org
camel-jaxb should filter the control and invalid UTF character
--------------------------------------------------------------

                 Key: CAMEL-2330
                 URL: https://issues.apache.org/activemq/browse/CAMEL-2330
             Project: Apache Camel
          Issue Type: Improvement
          Components: camel-jaxb
            Reporter: Willem Jiang
            Assignee: Willem Jiang
             Fix For: 2.2.0


This issue is discussed in the mail thread [1].
[1]  http://old.nabble.com/JAXB-marshaller---control-characters-td26978215.html


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Willem Jiang resolved CAMEL-2330.
---------------------------------

    Resolution: Fixed

Committed to trunk:
http://svn.apache.org/viewvc?rev=895993&view=rev


[jira] Updated: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Grushetzky updated CAMEL-2330:
------------------------------------

    Attachment: camel-jaxb-test.patch

This patch adds a test that exposes a few issues with JaxbFilterReader. There is no need to merge it: it does not provide a solution, it only highlights the problem.


[jira] Commented: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=56849#action_56849 ]

Willem Jiang commented on CAMEL-2330:
-------------------------------------

@Pavel,
I just committed your patch for this issue.
The highlights of Pavel's patch:
* Marshalling uses a custom XmlStreamWriter, while unmarshalling relies on the same non-XML reader. I realized that wrapping XmlStreamReader does not solve the problem, because the wrapper has no power to prevent the underlying reader from reading bad chars and therefore failing.

* Filtering has changed slightly: it now replaces bad chars with space chars. This a) removes the need for an intermediate buffer and simplifies the code, and b) is consistent with the way e.g. Woodstox performs similar filtering.

* An exchange property or a data format property turns filtering on or off.

* NonXmlCharFilterer logs the replacement in the char[] case, and produces a more readable message in the String case.
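
As a rough illustration of the replace-with-space filtering described above, here is a minimal Java sketch. It is not the code from the patch: the class and method names are hypothetical, and the valid ranges are taken from the Char production of the XML 1.0 specification.

```java
// Hypothetical sketch, not the NonXmlCharFilterer from the patch.
final class XmlCharFilterSketch {

    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isValidXmlCodePoint(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every invalid character with a single space, so the
    // result always has the same length as the input.
    static String replaceInvalid(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            if (isValidXmlCodePoint(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Only BMP characters and lone surrogates can be invalid,
                // so one space replaces exactly one char.
                out.append(' ');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "a" + (char) 0x1 + "b";
        System.out.println("[" + replaceInvalid(dirty) + "]");
    }
}
```

Because the replacement never changes the length, a filter like this can work in place on a char[] without an intermediate buffer, which is the simplification the comment mentions.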
 
Here are the highlights of my changes:
* Fixed the patch so that it builds with JDK 1.5.0, commented out NonXmlCharFiltererTest.testFilter1ArgFiltered(), and added the StAX implementation dependency to camel-jaxb.

* Your patch checks only the Exchange property in JaxbDataFormat marshal(), and only filterNonXmlChars in the JaxbDataFormat unmarshal method.
   My change lets the exchange property override the JaxbDataFormat configuration, which is what other Camel components do.

* When unmarshalling the InputStream, you need to get the charset encoding from the Exchange, like this:
   {code}
    answer = unmarshaller.unmarshal(new NonXmlFilterReader(new InputStreamReader(stream, IOConverter.getCharsetName(exchange))));
   {code}

* Finished the filtering option for the JAXB data format in the Spring DSL.
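
The override rule described above can be sketched as follows. This is a simplified stand-in, not the JaxbDataFormat code: a plain Map plays the role of the exchange properties, and the property key and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "exchange property overrides data format option".
final class FilterOptionSketch {

    // Assumed key for illustration only; Camel may use a different name.
    static final String FILTER_PROPERTY = "filterNonXmlChars";

    // Returns the exchange-level setting when one is present,
    // otherwise falls back to the data format's configured value.
    static boolean needFiltering(Map<String, Object> exchangeProperties,
                                 boolean dataFormatSetting) {
        Object value = exchangeProperties.get(FILTER_PROPERTY);
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        return dataFormatSetting;
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new HashMap<String, Object>();
        properties.put(FILTER_PROPERTY, Boolean.FALSE);
        // The data format asks for filtering, but the exchange turns it off.
        System.out.println(needFiltering(properties, true));
    }
}
```

Using the same lookup in both marshal() and unmarshal() keeps the two directions consistent, which is the point of the change described in the comment.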


[jira] Updated: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/activemq/browse/CAMEL-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Grushetzky updated CAMEL-2330:
------------------------------------

    Attachment: camel-2330_2.patch

Attaching another patch with three changes:

* Converted the mock tests to Mockito.

* Restored NonXmlCharFiltererTest.testFilter1ArgFiltered() and verified it under JDK 1.5.

* Added the missing test for JaxbDataFormat.unmarshal().

======================
I also suggest documenting the required StAX dependencies at http://camel.apache.org/jaxb.html, in the "Ignoring the NonXML Character" section.

| || JDK 1.5 || JDK 1.6+ ||
|| Filtering in use | StAX API and implementation | No |
|| Filtering not in use | StAX API only | No |

This feature has been tested with Woodstox 3.2.9 and the Sun JDK 1.6 StAX implementation.

======================


[jira] Commented: (CAMEL-2330) camel-jaxb should filter the control and invalid UTF character

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/activemq/browse/CAMEL-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=56985#action_56985 ]

Willem Jiang commented on CAMEL-2330:
-------------------------------------


Applied the patch, with thanks to Pavel, and also updated the wiki page.

After reviewing the patch, I found that Mockito is much easier to use :)
