With this question in mind, we came up with a concept that automatically detects when one of the target systems is down and then suspends the triggers that use the connection to that system.
Once the system detects that this target system is back online, we will enable the triggers again and the normal process can continue.
Why did we start this concept?
We’ve seen that we cannot guarantee 24/7 uptime of the target systems, and we wanted a dynamic tool that takes care of these downtimes (scheduled or not), so that we don’t have to worry about failed transactions and resubmitting them. Resubmitting can be a hard task in some cases, for example when there are thousands of failed messages.
Concept requirements:
1. Database
We need to identify the relationship between the triggers and the connections. Why? One connection can be used in different packages, and one package can have multiple triggers pointing to different target systems.
We will store this data in a database which can be easily implemented and cached as well.
Info required for the database:
- Full path name of the trigger
- Package name of the trigger
- Type of connection (e.g. JDBC, SAP, …)
- Full path name of the connection
- An identifier for the target system, which keeps each row unique when one trigger has multiple target systems and lets us find all triggers that belong to one target system
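The fields listed above could be stored roughly as follows. This is only a minimal sketch in Python with an in-memory SQLite database, not the final design: the table name `trigger_connection`, the column names and the helper functions are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical schema: one row per (trigger, connection) pair.
SCHEMA = """
CREATE TABLE trigger_connection (
    trigger_path    TEXT NOT NULL,  -- full path name of the trigger
    trigger_package TEXT NOT NULL,  -- package name of the trigger
    connection_type TEXT NOT NULL,  -- e.g. JDBC, SAP
    connection_path TEXT NOT NULL,  -- full path name of the connection
    target_system   TEXT NOT NULL,  -- identifier of the target system
    PRIMARY KEY (trigger_path, connection_path)
)
"""


def create_store():
    """Create an in-memory database holding the trigger/connection mapping."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_mapping(conn, trigger_path, trigger_package,
                connection_type, connection_path, target_system):
    """Register that a trigger depends on a connection to a target system."""
    conn.execute(
        "INSERT INTO trigger_connection VALUES (?, ?, ?, ?, ?)",
        (trigger_path, trigger_package, connection_type,
         connection_path, target_system),
    )


def triggers_for_target(conn, target_system):
    """Return all trigger paths that depend on the given target system."""
    rows = conn.execute(
        "SELECT DISTINCT trigger_path FROM trigger_connection "
        "WHERE target_system = ? ORDER BY trigger_path",
        (target_system,),
    )
    return [row[0] for row in rows]
```

With a lookup like `triggers_for_target`, the monitoring job only needs the identifier of the failing target system to find every trigger it has to suspend.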
We also need a job that runs at a fixed interval to check whether all connections are still active and responding. We will create a dummy query per connection type so we can see whether the target system is still responding. If it is not, there is an issue and we should not send any data towards that IZ.
Of course, we also need to build in some safety checks: what if a network blip was the reason the dummy query didn’t run successfully? So we will execute the query a second time to make sure that there really is an issue, but we delay this second attempt by some time (e.g. one minute) to rule out one-time issues.
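The double check described above could look something like this. It is a sketch only: `probe` stands for whatever dummy query fits the connection type, and the function names and the default delay of 60 seconds are assumptions for illustration.

```python
import time


def _probe_ok(probe):
    """Run the dummy query; any exception or falsy result counts as a failure."""
    try:
        return bool(probe())
    except Exception:
        return False


def is_target_down(probe, retry_delay_seconds=60.0, sleep=time.sleep):
    """Return True only if the probe fails twice, with a delay in between.

    A single failure may just be a network blip, so we wait and try once more
    before declaring the target system down.
    """
    if _probe_ok(probe):
        return False
    sleep(retry_delay_seconds)
    return not _probe_ok(probe)
```

The `sleep` parameter is there so the delay can be replaced in a test or by a scheduler, instead of blocking a thread for a full minute.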
Let’s say we’ve detected that a target system is down. We now have to start another thread which will suspend the triggers and disable the connection for that target system. Why disable the connection? It’s not required, but it makes monitoring easier: in a healthy environment we won’t have any disabled connections, so a disabled one stands out.
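A rough sketch of that suspend step, assuming the mapping rows are available as dictionaries with the keys `trigger_path`, `connection_path` and `target_system`. The callbacks `suspend_trigger` and `disable_connection` are hypothetical placeholders for whatever the server offers to suspend a trigger and disable a connection; they are not real API names.

```python
def suspend_target(target_system, mappings, suspend_trigger, disable_connection):
    """Suspend every trigger for a target system, then disable its connections.

    Triggers are suspended first so that nothing tries to use a connection
    that has already been disabled.
    """
    rows = [m for m in mappings if m["target_system"] == target_system]
    triggers = sorted({m["trigger_path"] for m in rows})
    connections = sorted({m["connection_path"] for m in rows})

    for trigger in triggers:
        suspend_trigger(trigger)
    for connection in connections:
        disable_connection(connection)

    # Return what was touched so the same lists can be used to re-enable later.
    return triggers, connections
```

Returning the affected triggers and connections makes it easy to enable exactly the same set again once the target system is back online.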
If we suspend the processing of documents, they will remain in the broker, so we have to monitor the broker to make sure it does not grow too large. In the worst case the broker can even crash.
We can use Optimize for that, or develop custom code that checks the total size of the broker and alerts if it reaches a predefined limit. We could even make it “smart” and release the trigger that is causing the broker to pile up. Once the broker is back to a “safe” size, we check whether the target is back online; if not, we suspend this trigger again.
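The “smart” behaviour could be reduced to a small decision function like the one below. This is a sketch under assumptions: the two thresholds `limit` and `safe_size`, the parameter names and the returned action strings are all invented for illustration, and how the broker size is actually measured is left open.

```python
def decide_trigger_action(queue_size, limit, safe_size,
                          target_online, currently_suspended):
    """Decide what to do with a trigger: 'release', 'suspend' or 'keep'.

    limit      -- broker size at which we must release to protect the broker
    safe_size  -- broker size below which suspending again is acceptable
    """
    if currently_suspended:
        # Release when the target is back, or when the broker is filling up.
        if target_online or queue_size >= limit:
            return "release"
        return "keep"

    # Trigger is currently released.
    if not target_online and queue_size <= safe_size:
        # Broker has drained to a safe size but the target is still down.
        return "suspend"
    return "keep"
```

Using two thresholds instead of one avoids flapping: a trigger released at the limit stays released until the broker has really drained, rather than being suspended again the moment the size drops just below the limit.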
Now, with this information in mind, we can start developing the concept. We will keep you posted on the progress and post some code samples as well.
Author : Jeroen W.