Timeout and Retry Issue in Wappler – Possible Bottleneck

@patrick @Teodor, I need your help.

I have built a robust and scalable system in Wappler for the financial sector, where we manage transactions, receivables, and banking integrations. However, we are facing a critical issue that affects the reliability of the application.

:stop_sign: The Problem

When a request is made to our API in Wappler to create a transaction, the process works as follows:

  1. Our API receives the request and processes the data.
  2. The API formats and forwards the request to the bank.
  3. The bank returns a response – which sometimes takes up to 3 seconds.
  4. Here’s the issue: Our API, with a timeout set to 10000ms, does not recognize the response in some cases and retries the request, causing duplicate transactions and failures.

Upon checking the logs, I found that the transactions were successfully created in the bank, but Wappler either does not receive or does not interpret the response within the expected time, which suggests a bottleneck.
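To make the failure mode concrete, here is a minimal Python sketch (not Wappler code) of what the logs seem to describe: the bank commits the transaction, the response is lost, and the retry creates a second one. `FakeBank`, `blind_retry`, and `reconcile_then_retry` are hypothetical names used only for illustration, and the lookup by client reference assumes the bank offers some way to query a transaction by a reference we send, which I have not confirmed.

```python
import uuid


class FakeBank:
    """In-memory stand-in for the bank API (hypothetical, for illustration only)."""

    def __init__(self):
        self.transactions = []          # every transaction the bank has committed
        self.drop_next_response = False

    def create(self, reference, amount):
        # The bank commits first; the response can still be lost afterwards.
        self.transactions.append({"reference": reference, "amount": amount})
        if self.drop_next_response:
            self.drop_next_response = False
            raise TimeoutError("response not received in time")
        return {"reference": reference, "status": "created"}

    def find(self, reference):
        return [t for t in self.transactions if t["reference"] == reference]


def blind_retry(bank, amount, attempts=2):
    """Retry with a fresh reference each time: duplicates if a response is lost."""
    for _ in range(attempts):
        try:
            return bank.create(str(uuid.uuid4()), amount)
        except TimeoutError:
            continue
    raise TimeoutError("all attempts timed out")


def reconcile_then_retry(bank, amount, attempts=2):
    """Keep one reference and ask the bank before sending the request again."""
    reference = str(uuid.uuid4())       # one reference for the whole operation
    for _ in range(attempts):
        try:
            return bank.create(reference, amount)
        except TimeoutError:
            # Check whether the timed-out attempt actually went through.
            if bank.find(reference):
                return {"reference": reference, "status": "created"}
    raise TimeoutError("all attempts timed out")
```

With one dropped response, `blind_retry` leaves two transactions in the fake bank while `reconcile_then_retry` leaves one. That does not explain why the response is missed, but it would stop the duplicates while the cause is being tracked down.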

:mag: What We Have Already Tested

  • Checked the banking API: The service is stable, with no downtime or abnormal delays.
  • Tried different timeout settings: Even with 10000ms, Wappler sometimes ignores the response and retries the request.
  • Other integrated systems do not have this issue: This unexpected behavior only occurs within Wappler.
  • The bank’s response headers include connection: keep-alive, indicating that connections can be reused.

Example of Bank API Response Headers

Here are the response headers from the banking API to our Wappler application:

{
  "date": "Wed, 29 Jan 2025 00:33:10 GMT",
  "content-type": "application/json; charset=utf-8",
  "content-length": "3076",
  "connection": "keep-alive",
  "x-envoy-upstream-service-time": "650",
  "x-amzn-requestid": "7608ae22-4544-4244-b655-8d13e071ed0f",
  "x-amzn-trace-id": "Root=1-67997745-1d64c1b324ea491656107128",
  "set-cookie": [
    "az_asm=F1tvVGlw8EyAURsfigR/GkuQKV9OB4p+3Kb9gSwwyXF+P0q9owVn3PpDGbESNfEoWCzGJOXCFfBRy8Dt; Secure; HttpOnly; SameSite=Lax",
    "az_botm=136943fa7d5676e2c7fdd3ce748773c6; Secure; HttpOnly; SameSite=Lax"
  ]
}

:small_blue_diamond: Note: The cookies are not required for future requests, so we can rule out any dependency on them.

:rocket: What We Need to Know

  1. Does Wappler have any limitations in processing asynchronous responses beyond a certain time threshold?
  2. Is there an internal automatic retry mechanism that could be causing these unnecessary retries?
  3. Are there any specific configurations to improve connection stability and ensure that the banking API response is correctly recognized?
  4. Could Wappler be closing the connection before receiving the response? Even though keep-alive is enabled, this might still be an issue.

I have been working on this for three months without a successful resolution. I really need your help to understand if something within Wappler is causing this behavior. I have traces and detailed logs that I can share if needed.

I look forward to your response and truly appreciate any guidance you can provide!

I had something similar happening a long time ago when I was handling bank transactions. What I ended up doing at the time was adding a repeat over the API data, then a query that searches for an ID field returned by the bank, plus a condition on the result length: if a record is returned, ignore it instead of inserting it again.

It was also suggested to use an index, but I don't believe I ever tested that.
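For reference, this is roughly what the index idea could look like, as a small Python and SQLite sketch rather than Wappler steps. I am assuming the suggestion meant a unique index on the ID returned by the bank; the table and column names (`transactions`, `bank_id`, `amount`) and the helper `save_if_new` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        bank_id TEXT NOT NULL,
        amount INTEGER NOT NULL
    )
""")
# Assumption: the bank's ID is unique per transaction, so the database can enforce it.
conn.execute("CREATE UNIQUE INDEX idx_transactions_bank_id ON transactions (bank_id)")


def save_if_new(conn, bank_id, amount):
    """Insert the record unless bank_id already exists; return True if inserted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO transactions (bank_id, amount) VALUES (?, ?)",
        (bank_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1
```

The difference from the query-then-condition approach is that the check and the insert happen in one statement, so two overlapping requests cannot both pass the check. Other databases use different syntax for the same idea, for example `ON CONFLICT DO NOTHING` in PostgreSQL.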


Hey Scott, thanks for your time.

So, this problem has been persisting for several months. The flow of this API is very simple. The system checks, validates, formats, and makes a POST request to the banking backend (XXXX). Sometimes, this POST request doesn’t receive or recognize the response from the requested API, while other times our backend processes it normally. I use timeout and retry (while + setValue) to attempt again if it returns a 504 (Timed out), and that's the issue.
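To show the shape of that retry loop outside of Wappler, here is a rough Python sketch. It retries only on 504, stops after a fixed number of attempts, and reuses one idempotency key for every attempt. Everything here is an assumption for illustration: `post_with_retry`, the `send` callable, and `DedupingBank` are hypothetical, and I do not know whether the bank actually deduplicates by such a key.

```python
import time
import uuid


def post_with_retry(send, payload, max_attempts=3, base_delay=0.0):
    """Retry on 504 only, reusing one idempotency key for every attempt.

    `send(payload, idempotency_key)` is a hypothetical callable that performs
    the POST and returns (status_code, body).
    """
    key = str(uuid.uuid4())             # generated once, outside the loop
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload, key)
        if status != 504:
            return status, body, attempt
        if attempt < max_attempts:
            time.sleep(base_delay * 2 ** (attempt - 1))   # exponential backoff
    return 504, None, max_attempts


class DedupingBank:
    """Hypothetical bank that creates at most one transaction per key."""

    def __init__(self, timeouts_before_success):
        self.created = {}               # idempotency key -> transaction
        self.timeouts_left = timeouts_before_success

    def send(self, payload, key):
        self.created.setdefault(key, dict(payload))   # commit once per key
        if self.timeouts_left > 0:
            self.timeouts_left -= 1
            return 504, None            # response lost after the commit
        return 201, self.created[key]
```

Usage would look like `post_with_retry(DedupingBank(2).send, {"amount": 100})`: two attempts time out, the third succeeds, and the fake bank still holds a single transaction. If the real bank has no such key, the retry would need a status lookup before each resend instead.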

I want to make it clear that only this system is facing this bottleneck. I’ve tried so many different solutions that I’m exhausted hahaha. I’ve even reached out to Patrick, Teodor, and the Wappler team, but... I just can’t get the help I need.

I’m even considering migrating this specific backend to Golang.

Anyway... just venting hahaha.

Sadly, I cannot speak to some of the features you mention because I do not use them (while, wait, retry). However, what I would recommend is posting a screenshot of the server action (with all steps showing), so members of the community can look at it and see if they can detect any issues.

I understand. It turns out that this backend is using a library, which means there are several other steps. However, since I use Jaeger (OpenTelemetry), I can see precisely where the bottleneck is, along with the processing time for each step of the entire flow.

Flow of the "Problematic API Action":

Traces image (I have more traces):

Could the error actually be on the bank API side, with the bank not returning a response in time?
As per the trace image, the request clearly timed out after waiting for 8 seconds. Try increasing the timeout to, say, 100 seconds, and see if it still fails.

Another reason could be that your server is failing to connect to the bank's remote servers, causing the request to time out. But since you say the request is recorded at the bank, this seems less likely.

From what I know, there is no special mechanism Wappler implements for API calls, so it should just work like any other API call.
Since you have been struggling with this for 3 months, I would suggest building the alternative in Golang and seeing whether the API calls also fail under load there.

P.S. I recently encountered an issue where API calls from Wappler to certain endpoints would just time out in 0ms. I have a post in the community about it, but since it's a very weird issue, I haven't received any reply. The solution I implemented was to set 120 seconds as the default timeout value in the core Wappler file from which API calls are made, and that has worked well for me.


Hey Sid, thanks for your time. Alright, this problem is weird. So let's check this timeout setting and remove the timeout to collect more information requested by the bank and save the logs internally for accurate debugging.
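For the internal logs, something along these lines is what I have in mind, sketched in Python rather than as Wappler steps: wrap each call, and record the outcome and elapsed time per attempt whether it succeeds or fails. The helper name `timed_call` and the record fields are placeholders, not an existing API.

```python
import time


def timed_call(fn, *args, log, label):
    """Run fn, append one log record with outcome and elapsed ms, re-raise errors."""
    start = time.perf_counter()
    record = {"label": label, "outcome": "ok", "error": None}
    try:
        return fn(*args)
    except Exception as exc:
        record["outcome"] = "error"
        record["error"] = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        # Runs on success and on failure, so every attempt leaves a record.
        record["elapsed_ms"] = round((time.perf_counter() - start) * 1000, 1)
        log.append(record)
```

Comparing `elapsed_ms` on the failed attempts with the configured timeout should show whether the request is cut off at the limit or fails well before it.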