Parallel Repeat

The current Parallel block of the NodeJS server target cannot parallelize over an array, only over a fixed series of hard-coded steps. That means you cannot run a database query and then perform actions in parallel for each element/row/item.

An example makes it easier to understand. Let’s say you have a table “websites” with URLs, such as:

https://wappler.io
https://microsoft.com
https://bing.com
https://google.com
https://youtube.com

If you want to perform an API Action (fetch those websites) in parallel, you can’t. The only way to do so is to hard-code one API Action per website inside the Parallel block; you cannot iterate over the database query “websites” and have it perform the requests in parallel.

Previous post of mine, once posted privately to the Wappler team and Ambassadors:

I think a Parallel Repeat would be nice, to run whatever is inside the repeat in parallel. You’d probably need to add some sort of safety to limit how many iterations run at the same time. I found this:
https://caolan.github.io/async/v3/docs.html#parallelLimit

(scroll upwards on that page to see an example of a regular parallel)

Example usage:

Set Value urls = ["http://example.com", "https://wappler.com"]
ParallelRepeat urls:
    API Action $value

So those requests would be fetched in parallel. Anything that goes inside the repeat would run in parallel as a group (iteration 1 is parallel 1, iteration 2 is parallel 2, and so on).

The options of the parallel step could include a parallel limit, with a default value of e.g. 5.
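To make the idea of a parallel limit concrete, here is a minimal sketch in plain NodeJS. It is not Wappler code and it does not use the async library linked above; `parallelRepeat`, `limit` and `worker` are names I made up for illustration.

```javascript
// Hypothetical helper (not part of Wappler): runs `worker` once per item,
// with at most `limit` workers in flight at any time.
async function parallelRepeat(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner keeps pulling the next unprocessed item until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start no more runners than there are items, and at least one.
  const count = Math.max(1, Math.min(limit, items.length));
  const runners = [];
  for (let i = 0; i < count; i++) {
    runners.push(runner());
  }

  await Promise.all(runners);
  return results; // same order as `items`
}
```

With `limit` set to 5 and a list of 20 URLs, this would keep at most 5 requests open at once and start the next one as soon as any of them finishes.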

Just stumbled upon this. Indeed this is a must for the parallel group to be really useful.

I guess this one is a tricky one considering what the parallel group does.

This comes to mind. Take into account that I haven’t even seen or used this functionality yet so I might be talking bs :slight_smile:

How about allowing an array as an input variable for a parallel group? This way you can calculate dynamic data outside the parallel group and pass it to the group.

Inside the group, any reference to the input variable would cause the logic to loop over it automagically. That would mean that only certain action steps should be allowed inside the parallel group, to avoid weird stuff happening.

Also custom extensions should be updated to support a parallel group.

You started well, and then I miserably failed to understand everything you just said :joy:

I don’t think special extension support is required, and I think you might be overcomplicating things by restricting it to certain steps and such.

The way Parallel currently works, it runs all inner steps in parallel (each step runs within its own “Promise”). The difference we want is that, instead of parallelizing an array of steps, we parallelize an array of things.

One could use the Group step inside Parallel Repeat as well, so we could still keep the step parallelization.
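A rough sketch of that difference in plain JavaScript. This is not the actual Server Connect source; `fakeApiAction`, `runStepsInParallel` and `runItemsInParallel` are made-up names, and the API Action is a stand-in that does no real request.

```javascript
// Stand-in for an API Action; a real one would perform an HTTP request.
const fakeApiAction = async (url) => `fetched ${url}`;

// Parallel as described above: a fixed list of steps, each in its own Promise.
function runStepsInParallel(steps) {
  return Promise.all(steps.map((step) => step()));
}

// Parallel Repeat as proposed: the same step, run once per item of an array.
function runItemsInParallel(items, step) {
  return Promise.all(items.map((item) => step(item)));
}

// Steps have to be written out one by one...
const hardCoded = runStepsInParallel([
  () => fakeApiAction('https://wappler.io'),
  () => fakeApiAction('https://google.com'),
]);

// ...while items can come from anywhere, e.g. a database query result.
const urls = ['https://wappler.io', 'https://google.com'];
const dynamic = runItemsInParallel(urls, fakeApiAction);
```

Both calls end up with the same Promises here. The only difference is whether the list is known when the action is designed or only when it runs.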

As said I was probably BSing due to my ignorance of how the thing is working. I haven’t updated yet so I didn’t check what the code looks like. Thanks for the insight into it.

What I basically meant is that whatever dynamic data you need to figure out first (the array of websites) needs to be passed as a variable to the parallel group. Then, based on the size of the array (size n) and the step where it’s used, it would create n parallel executions of that step.

P.S. I reserve my right to continue BSing until I upgrade :joy:

Your computer scientist explanation of size N works for me, sounds good to me :+1:

Let’s pray @patrick doesn’t start with that O notation complexity BS that we all forgot already to excuse himself from working on this.

I spent hours, HOURS, trying to implement this.

Unfortunately, the inner details are very confusing (due to the usage of variables such as this.scope and this.data). I couldn’t get it working properly; variables were lost along the way.

This is in the hands of Patrick now.

Thank you for your effort!

@patrick would you have any indication of where in the pipeline this is?

I’ll share my use case: I’d like to speed up the process of uploading files to S3 and then scraping their contents.
Ideally this would be done in parallel to save time.

Many thanks

This would be extremely useful for me. Would love to see this.

Bump

As @Apple already figured out, it is not easy to implement in the current Server Connect due to how the data scoping works. Having the actions run in parallel would make them overwrite each other’s data, and it would probably require a big rewrite of the core to make it work.
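To illustrate the kind of problem described here, a simplified sketch follows. It is not the real Server Connect code and makes no claim about how this.scope or this.data are actually structured; `shared`, `own` and both function names are hypothetical.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Shared-data version: every iteration writes to the same object.
async function repeatWithSharedData(items) {
  const shared = { value: null };
  return Promise.all(items.map(async (item) => {
    shared.value = item; // first step stores its input
    await delay(5);      // the other iterations run while this one waits
    return shared.value; // second step reads it back, by now overwritten
  }));
}

// Isolated version: each iteration gets its own data object.
async function repeatWithOwnData(items) {
  return Promise.all(items.map(async (item) => {
    const own = { value: item };
    await delay(5);
    return own.value;
  }));
}
```

Called with `['a', 'b', 'c']`, the shared version resolves to `['c', 'c', 'c']` because the last write wins before any iteration reads its value back, while the isolated version resolves to `['a', 'b', 'c']`. Giving every iteration its own data is the part that would presumably need the rewrite of the core.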

Thank you for the feedback. What a pity, I was holding my breath on this one.

Hey @patrick - stumbled on this as I’m looking to speed up some of my complex APIs in the new project. There are a few scenarios where I am repeating over a dataset from the database and calling an API that can take a couple of seconds to respond, so the process sometimes times out.

So I’d like to run the API calls in parallel, but as Parallel (the module) is not an option for this, can you guide us to another way we could achieve this?