Re: Salesforce Change Data Capture reconnection and replayId

Zoran Regvart-2
Hi Andrés,
folks have run into this problem, and several similar ones, before:

https://issues.apache.org/jira/browse/CAMEL-13170
https://issues.apache.org/jira/browse/CAMEL-12812
https://issues.apache.org/jira/browse/CAMEL-12871
https://issues.apache.org/jira/browse/CAMEL-13577

I'm afraid we don't have a full understanding of the issue or a
reliable way to reproduce it. Any help would be appreciated.

zoran

On Tue, May 5, 2020 at 1:29 AM Andres Q <[hidden email]> wrote:

>
> Hi
>
> I'm subscribing to Salesforce Change Data Capture events. I had the
> problem that after 24 hours the replayId is stale, and if Camel tries
> to reconnect with it, it throws an error (I asked about this here:
> https://mail-archives.apache.org/mod_mbox/camel-users/202002.mbox/%3CCAJrxdruca8yqT7fs4snObY6QoCD8cHPUWDkqF3XdBXbGBd4Spg%40mail.gmail.com%3E)
>
> As suggested by Zoran Regvart, I'm storing the replayId and its
> timestamp, so on app startup, if I know the replayId is too old, I
> pass ?replayId=-2 to replay all events.
>
> This works fine, but the problem I'm facing now is that if the app
> reconnects after some time I get this event:
>
> Connect failure: {advice={reconnect=handshake, interval=0},
> channel=/meta/connect, id=3320, error=403::Unknown client,
> successful=false}
>
> The client then tries to reconnect automatically using the
> latest replayId. Since this logic is handled by SubscriptionHelper, if
> the replayId is too old the reconnect fails and the app
> stops receiving events for good.
>
> The only thing I can think of is to always use replayId=-2 to fetch
> all events and ignore the ones I already processed, but it seems
> suboptimal to say the least.
>
> How should I handle this scenario so that the implementation is robust?
>
> Thanks,
>
> Andrés



--
Zoran Regvart

Re: Salesforce Change Data Capture reconnection and replayId

Andres
My case is the same as

https://issues.apache.org/jira/browse/CAMEL-13170

If my understanding is correct, the problem (not the solution) is simple:

Camel automatically reconnects to a topic, always using the same replayId.

At least that's what happens in my case when I set the replayId in the
endpoint like:

from("salesforce:data/ChangeEvents?replayId=" + replayIdForSalesforce)

The scenario: I start the routes with replayId=X. If a disconnect happens
(and they happen about every 3 hours, I think), Camel tries to
reconnect using replayId=X. That works fine at first, but 24 hours later
that replayId is invalid because Salesforce only stores 24 hours of events.
Hence the reconnection to the topic throws this error:

org.apache.camel.component.salesforce.api.SalesforceException - Error
subscribing to data/ChangeEvents: 400::The replayId {13344} you
provided was invalid.  Please provide a valid ID, -2 to replay all
events, or -1 to replay only new events.

It won't retry reconnecting after that, so I lose all further events.

In the EMP-Connector project, Salesforce worked around this by
completely removing the replayId after the connection starts, so all
subsequent reconnections use -2. This can of course result in
repeated events, so I guess it's up to the client code to handle
those. See: https://github.com/forcedotcom/EMP-Connector/pull/42/commits/19766eca02970658691a7372af4851d3ef10667a
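
A minimal sketch of how client code could skip the repeated events that a -2 replay delivers. It assumes that replayId values increase along the event stream and that events are handled in order by a single consumer; both assumptions are worth checking against the Salesforce documentation. `ReplayDeduplicator` and `shouldProcess` are hypothetical names, not part of Camel or EMP-Connector.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: skips events whose replayId was already processed. */
final class ReplayDeduplicator {
    // Highest replayId handled so far; restore it from persistent storage on startup.
    private final AtomicLong lastProcessed;

    ReplayDeduplicator(long lastProcessedReplayId) {
        this.lastProcessed = new AtomicLong(lastProcessedReplayId);
    }

    /**
     * Returns true if the event is new and records it as processed.
     * ASSUMPTION: replayIds increase along the stream, so anything at or
     * below the stored value is a repeat delivered by a -2 replay.
     */
    boolean shouldProcess(long replayId) {
        while (true) {
            long current = lastProcessed.get();
            if (replayId <= current) {
                return false; // already seen, skip it
            }
            if (lastProcessed.compareAndSet(current, replayId)) {
                return true; // new event, now recorded
            }
        }
    }

    /** The value to persist so that a restart can resume from it. */
    long lastProcessed() {
        return lastProcessed.get();
    }
}
```

A route would call `shouldProcess` with the replayId of each incoming event and drop the event when it returns false. If events are processed concurrently or out of order, a set of recently seen ids would be a safer choice than a single high-water mark.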



Re: Salesforce Change Data Capture reconnection and replayId

Zoran Regvart-2
Hi Andrés,
could storing the age of the last seen id along with its value also work?

Then, if the id is older than 24h, set it to -2 or -1 instead of the
last seen value, as the use case dictates. I kinda think that the
management of replayId needs to be in the client application, not in
Camel. It's going to be difficult to satisfy the different use cases folk
have: some might be interested only in the latest events, some might want
to replay all events, and some might wish to crash and
let the operator choose.

zoran
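
The selection described above could be sketched roughly as follows, assuming the application persists the last seen replayId together with the time it was seen. `ReplayIdChooser`, `StalePolicy` and `choose` are hypothetical names for illustration and do not exist in Camel; the meaning of -2 and -1 is taken from the Salesforce error message quoted earlier in the thread.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical client-side helper; none of these names exist in Camel. */
final class ReplayIdChooser {
    /** What to do when the stored replayId is too old to be accepted. */
    enum StalePolicy { REPLAY_ALL, ONLY_NEW, FAIL }

    static final long REPLAY_ALL_ID = -2L; // -2: replay all retained events
    static final long ONLY_NEW_ID = -1L;   // -1: replay only new events

    /** Returns the replayId to subscribe with, given the stored id and when it was seen. */
    static long choose(long storedId, Instant storedAt, Instant now,
                       Duration maxAge, StalePolicy policy) {
        Duration age = Duration.between(storedAt, now);
        if (age.compareTo(maxAge) < 0) {
            return storedId; // still inside the retention window
        }
        switch (policy) {
            case REPLAY_ALL:
                return REPLAY_ALL_ID;
            case ONLY_NEW:
                return ONLY_NEW_ID;
            default:
                // FAIL: stop and let the operator decide
                throw new IllegalStateException(
                        "Stored replayId " + storedId + " is " + age + " old");
        }
    }
}
```

For example, `ReplayIdChooser.choose(13344L, storedAt, Instant.now(), Duration.ofHours(24), ReplayIdChooser.StalePolicy.REPLAY_ALL)` returns 13344 while the id is younger than 24 hours and -2 afterwards.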


Re: Salesforce Change Data Capture reconnection and replayId

Andres
I'm doing it that way in my RouteBuilder, but I don't have control over
the replayId that SubscriptionHelper uses when reconnecting.

It would help if SubscriptionHelper kept track of the last replayId
together with its timestamp, but it would also need to update the
replayId every time a new event arrives.

As you say, it's tricky, because each use case is different and
people want different results.


Re: Salesforce Change Data Capture reconnection and replayId

Andres
I'm thinking that instead of setting the replayId on the URL such as:

from("salesforce:data/ChangeEvents?replayId=" + replayIdForSalesforce)

I can set it at the SalesforceComponent level, so I can update it
whenever I get a new event and it's not fixed to the startup value.

The problem remains if no events arrive for 24 hours: if a
disconnect happens after that time, it will still try to use the
latest replayId, which would be too old by then.

Is it possible to have a callback or something similar when a
disconnect/reconnect happens, so that we can update the replayId
in SalesforceComponent?

The logic would be that on a reconnect I check the
age of the latest replayId and, if it's older than 24 hours, use -2;
otherwise I use the one I have.


Re: Salesforce Change Data Capture reconnection and replayId

Andres
For posterity, I solved this by scheduling a function, ~24 hours
ahead, that resets the replayId in SalesforceComponent to -2.

When the app starts, I set the replayId (to its stored value, or to -2 if
it's too old; I keep that data in a db) and schedule the reset.
Whenever a new event arrives, I update the replayId and reschedule
the reset. That way, whenever a reconnect happens, it uses either
the latest replayId or -2 if the latest is older than 24 hours.
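
A minimal sketch of the scheduling described above, using only the JDK. How the value reaches the Salesforce component is left as a callback here, because the exact setter depends on the Camel version; `ReplayIdExpiryTimer`, `update` and the `replayIdSetter` callback are hypothetical names, not Camel APIs.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

/** Hypothetical helper: falls back to -2 once the last replayId is older than maxAge. */
final class ReplayIdExpiryTimer implements AutoCloseable {
    private static final long REPLAY_ALL = -2L;

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "replay-id-expiry");
                t.setDaemon(true); // do not keep the JVM alive for the timer
                return t;
            });
    // ASSUMPTION: this callback writes the replayId the component uses on resubscribe.
    private final LongConsumer replayIdSetter;
    private final Duration maxAge;
    private ScheduledFuture<?> pendingReset;
    private long generation; // guards against a reset racing with a newer update

    ReplayIdExpiryTimer(LongConsumer replayIdSetter, Duration maxAge) {
        this.replayIdSetter = replayIdSetter;
        this.maxAge = maxAge;
    }

    /** Call once on startup and again for every received event. */
    synchronized void update(long replayId) {
        if (pendingReset != null) {
            pendingReset.cancel(false); // drop the previously scheduled reset
        }
        replayIdSetter.accept(replayId);
        long scheduledGeneration = ++generation;
        pendingReset = scheduler.schedule(
                () -> expire(scheduledGeneration),
                maxAge.toMillis(), TimeUnit.MILLISECONDS);
    }

    private synchronized void expire(long scheduledGeneration) {
        // Only reset if no newer replayId arrived in the meantime.
        if (scheduledGeneration == generation) {
            replayIdSetter.accept(REPLAY_ALL);
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With `maxAge` set to a bit under 24 hours, calling `update(replayId)` from the route on every event keeps the configured value fresh, and a quiet period longer than `maxAge` leaves it at -2 for the next reconnect.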


Re: Salesforce Change Data Capture reconnection and replayId

Claus Ibsen-2
On Wed, May 6, 2020 at 11:32 PM Andres Q <[hidden email]> wrote:

>
> For posterity, the way I solved this is by scheduling in ~24 hours a
> function that resets the replayId in SalesforceComponent to -2
>
> That way, when app starts I set the replayId (to its value or -2 if
> it's old enough, I have that data on a db) and schedule the reset.
> Whenever a new event arrives, I update the replayId and again schedule
> the reset, that way, whenever a reconnect happens it will either
> reconnect using the latest replayId or with -2 if the latest is older
> than 24 hours
>

I wonder if you could create a JIRA about this. We could add an
interface as a hook in the camel-salesforce component that allows
controlling this replay id, so you can implement this 24h logic yourself,
and we could ship some implementations out of the box if
"after 24h, set to -2" and the like turn out to be common use cases.

And sure, we love contributions, so you are of course welcome to work on
an implementation of this:
https://camel.apache.org/manual/latest/contributing.html

> El mar., 5 de may. de 2020 a la(s) 12:30, Andres Q
> ([hidden email]) escribió:
> >
> > I'm thinking that instead of setting the replayId on the URL such as:
> >
> > from("salesforce:data/ChangeEvents?replayId=" + replayIdForSalesforce)
> >
> > I can set it on the SalesforceComponent level, so I can update it
> > whenever I get a new event so it's not fixed to the startup one.
> >
> > The problem still remains if no events arrive in 24 hours, if a
> > disconnect happens after that time, it will still try to use the
> > latest replayId which would be too old at this point.
> >
> > Is it possible to have a callback or something when a
> > disconnect/reconnect is happening, so that we can update the replayId
> > in SalesforceComponent?
> >
> > The logic would be that if there's a reconnect, then I can check the
> > time of the latest replayId and if it's older than 24 hours use -2, or
> > else use the one I have
> >
> > El mar., 5 de may. de 2020 a la(s) 10:07, Andres Q
> > ([hidden email]) escribió:
> > >
> > > I'm doing it that way in my routebuilder, but I don't have control of
> > > the replayId the SubscriptionHelper uses when reconnecting.
> > >
> > > If SubscriptionHelper keeps track of the last replayId with its time
> > > as well it would help, but it would also need to update the replayId
> > > any time a new event comes.
> > >
> > > As you say, it's tricky, because each use case might be different and
> > > people would want different results depending on their use cases
> > >
> > > El mar., 5 de may. de 2020 a la(s) 10:02, Zoran Regvart
> > > ([hidden email]) escribió:
> > > >
> > > > Hi Andrés,
> > > > could storing the age of the last seen id along with the value also work?
> > > >
> > > > Then if the id is older than 24h instead of that last seen value set
> > > > it to -2 or -1 as the use case dictates. I kinda think that the
> > > > management of replayId needs to be in the client application not in
> > > > Camel, it's going to be difficult to satisfy different use cases folk
> > > > have; meaning some might be interested only in latest some might want
> > > > to replay all events, and there could be some that wish to crash and
> > > > let the operator choose.
> > > >
> > > > zoran
> > > >
> > > > On Tue, May 5, 2020 at 1:32 PM Andres Q <[hidden email]> wrote:
> > > > >
> > > > > My case is the same as
> > > > >
> > > > > https://issues.apache.org/jira/browse/CAMEL-13170
> > > > >
> > > > > And if my understanding is correct, it's simple (the problem, not the solution):
> > > > >
> > > > > Camel reconnects to a topic automatically using always the same replayId
> > > > >
> > > > > At least that's what happens in my case when I set the replayId in the
> > > > > endpoint like:
> > > > >
> > > > > from("salesforce:data/ChangeEvents?replayId=" + replayIdForSalesforce)
> > > > >
> > > > > The scenario: I start the routes with replayId=X. If a disconnect happens
> > > > > (and they happen about every 3 hours, I think) then Camel would try to
> > > > > reconnect using replayId=X. That works fine, but 24 hours later, that
> > > > > replayId is invalid because Salesforce only stores 24 hours of events.
> > > > > Hence the reconnection to the topic would throw the error:
> > > > >
> > > > > org.apache.camel.component.salesforce.api.SalesforceException - Error
> > > > > subscribing to data/ChangeEvents: 400::The replayId {13344} you
> > > > > provided was invalid.  Please provide a valid ID, -2 to replay all
> > > > > events, or -1 to replay only new events.
> > > > >
> > > > > And it won't retry reconnecting, hence I would lose any further events.
> > > > >
> > > > > On the EMP-Connector project by salesforce they worked around this by
> > > > > completely removing the replayId after the connection starts, so all
> > > > > subsequent reconnections would use -2. This can of course result in
> > > > > repeated events, so I guess it's up to the client code to handle
> > > > > those. See: https://github.com/forcedotcom/EMP-Connector/pull/42/commits/19766eca02970658691a7372af4851d3ef10667a
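
If the client code has to cope with repeated events after a replay with -2, one possible approach is to remember recently processed replayIds and skip repeats. The sketch below is only an illustration: `SeenReplayIds` and `markIfNew` are hypothetical names, not Camel or EMP-Connector API, and it assumes the replayId of each event is available as a `long`. The memory is bounded, so ids older than the chosen capacity are forgotten.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Camel: remembers recently processed replayIds.
final class SeenReplayIds {
    private final Map<Long, Boolean> seen;

    SeenReplayIds(final int capacity) {
        // Insertion-ordered map that evicts its oldest entry once it grows
        // beyond the given capacity.
        this.seen = new LinkedHashMap<Long, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true the first time an id is offered, false for a repeat
    // that is still remembered.
    synchronized boolean markIfNew(long replayId) {
        return seen.putIfAbsent(replayId, Boolean.TRUE) == null;
    }
}
```

A route processor could call `markIfNew` for each incoming event and drop the event when it returns false. Camel's idempotent consumer EIP covers the same need and may be a better fit inside a route.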
> > > > >
> > > > >
> > > > > On Tue, May 5, 2020 at 05:16, Zoran Regvart
> > > > > ([hidden email]) wrote:
> > > > > >
> > > > > > Hi Andrés,
> > > > > > folk have found this problem and several similar problems before:
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/CAMEL-13170
> > > > > > https://issues.apache.org/jira/browse/CAMEL-12812
> > > > > > https://issues.apache.org/jira/browse/CAMEL-12871
> > > > > > https://issues.apache.org/jira/browse/CAMEL-13577
> > > > > >
> > > > > > I'm afraid that we don't have the full understanding of the issue or a
> > > > > > reliable way to reproduce it. Any help would be appreciated.
> > > > > >
> > > > > > zoran
> > > > > >
> > > > > > On Tue, May 5, 2020 at 1:29 AM Andres Q <[hidden email]> wrote:
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > I'm subscribing to Salesforce Change Data Capture events. I had the
> > > > > > > problem that after 24 hours the replayId is stale and if Camel tries
> > > > > > > to reconnect to it it throws an error (I asked about this here:
> > > > > > > https://mail-archives.apache.org/mod_mbox/camel-users/202002.mbox/%3CCAJrxdruca8yqT7fs4snObY6QoCD8cHPUWDkqF3XdBXbGBd4Spg%40mail.gmail.com%3E)
> > > > > > >
> > > > > > > As suggested by Zoran Regvart I'm storing the replayId and the
> > > > > > > timestamp, so on app startup if I know the replayId is old enough, I
> > > > > > > pass ?replayId=-2 to replay all events.
> > > > > > >
> > > > > > > This works fine, but the problem I'm facing now is that if the app
> > > > > > > reconnects after some time I get this event:
> > > > > > >
> > > > > > > Connect failure: {advice={reconnect=handshake, interval=0},
> > > > > > > channel=/meta/connect, id=3320, error=403::Unknown client,
> > > > > > > successful=false}
> > > > > > >
> > > > > > > And then the client will try to reconnect automatically using the
> > > > > > > latest replayId. Since this logic is handled by SubscriptionHelper, if
> > > > > > > the replayId is old enough it will fail on reconnect and then the app
> > > > > > > will stop receiving events forever.
> > > > > > >
> > > > > > > The only thing I can think of is to always use replayId=-2 to fetch
> > > > > > > all events and ignore the ones I already processed, but it seems
> > > > > > > suboptimal to say the least.
> > > > > > >
> > > > > > > How should I handle this scenario so that the implementation is robust?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Andrés
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Zoran Regvart
> > > >
> > > >
> > > >
> > > > --
> > > > Zoran Regvart



--
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2