Most efficient way to build a system that counts 'impressions' of listings

Hey all,

On one of our projects we have a dynamic directory, which creates thousands of page variations with results.

We already count page views when a specific profile is viewed; however, I'd now like to introduce impressions when the results are displayed on a page.

Initially my assumption was that this would be easy to build:

One new transaction-type table that records the record ID, impression date, etc.
Then, on the actual record table, a simple integer column that counts upwards.

When the page loads and the paged DB query returns the 15 results, my thought (and my current setup) was simply to repeat over the paged query (the 15 results), outputting only the specific ID rather than the whole record, then run an insert query against the impressions table and an update query matching the record ID that adds 1 to the record's impression count, all happening on the server side alongside the results.
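Roughly speaking, each pass of that repeat does the equivalent of the sketch below (table and column names are made up for illustration, and I'm assuming a MySQL database accessed via the mysql2 client, which is not necessarily what the server-side step does internally):

```ts
import mysql from "mysql2/promise";

// Hypothetical tables: `listing_impressions` (one row per impression)
// and `listings.impression_count` (the running integer on the record itself).
async function recordImpression(conn: mysql.Connection, listingId: number): Promise<void> {
  // 1) Insert one row into the new transaction-type table.
  await conn.execute(
    "INSERT INTO listing_impressions (listing_id, impression_date) VALUES (?, NOW())",
    [listingId]
  );
  // 2) Bump the counter on the record itself.
  await conn.execute(
    "UPDATE listings SET impression_count = impression_count + 1 WHERE id = ?",
    [listingId]
  );
}

// The repeat step then calls this once per result on the page.
async function recordPageImpressions(conn: mysql.Connection, ids: number[]): Promise<void> {
  for (const id of ids) {
    await recordImpression(conn, id); // 15 results => 30 statements per page load
  }
}
```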

This is working fine.

However, the speed at which the page results return and display on the front end has dropped quite a lot.

So I was wondering if anyone had other ideas that might improve the logic here, and importantly not impact performance 'too much'. As these are directory listing results, with SEO being the primary goal, we need to maintain rapid page loads.

Appreciate any input from anyone on the forum.

Hi.
Not sure what you mean by “repeat page query on specific id, not the whole data”, but I am assuming it simply translates to using a repeat on the paged query. Am I right?

Also, inside the repeat, are you doing a check first to identify whether you have to insert or update?

Hi Sid

Yep, I'm repeating the paged query, but within the repeat I'm only outputting the ID, which I'm matching against in the update and using for the insert.

I don't need a condition to check, as the process both inserts and then updates every time.

The number of items in the output doesn't really affect processing time; only the response size would be affected.

From the looks of it, 15 repeated inserts and updates should not take much time at all.
Do you see speed improvements when you disable this repeat step?

Another reason could be that this particular table is being written to quite often; that could be the actual culprit and not the SA.

You could use Redis for that if you are concerned about performance and we are talking about millions of hits. It’s meant for that.

Ok thanks Sid. I just prefer to keep things as minimal as possible, but it's good to know that wouldn't really play a role in the increased load time.

Just ran a test: 15.74 seconds load time with the new repeat included in the SC.
With it disabled: 4.52 seconds load time, so the difference is significant.

This hasn't gone to production yet. The insert goes into a brand-new table that is only written to in this one test SC; the update goes into a primary table (which holds publicly viewable data), so that one is read often, but again, not in this test scenario on my local environment.

It’s a super basic workflow.

The first query is a paged query that returns 15 records with 20+ data points each.
Then I repeat over that, output the ID, and use the ID for the insert and to match against for the update.

I have also tested disabling either the insert or the update inside the repeat. With either one disabled, it comes down to around 12.4-12.5 seconds, so it's still far too heavy, and neither of them causes the issue by itself. If I leave the repeat enabled but disable both the update AND the insert, it's back down to 3-4 second load times.

The insert only inserts 5 fields and the update is a simple number + 1 update of a single field.
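To put a number on it, the repeat is effectively issuing 30 separate statements per page load. A hypothetical batched version (same made-up table names as in my earlier sketch, assuming MySQL via mysql2) would issue just two, which would show quickly whether per-statement round trips are where the time goes:

```ts
import mysql from "mysql2/promise";

// Hypothetical alternative: collapse the 15 inserts + 15 updates into two statements.
async function recordPageImpressionsBatched(conn: mysql.Connection, ids: number[]): Promise<void> {
  if (ids.length === 0) return;

  // One multi-row insert into the impressions table.
  const rows = ids.map(() => "(?, NOW())").join(", ");
  await conn.execute(
    `INSERT INTO listing_impressions (listing_id, impression_date) VALUES ${rows}`,
    ids
  );

  // One update that increments all matched counters at once.
  const inList = ids.map(() => "?").join(", ");
  await conn.execute(
    `UPDATE listings SET impression_count = impression_count + 1 WHERE id IN (${inList})`,
    ids
  );
}
```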

I appreciate that the local environment will be a heap slower, and it doesn't take 4+ seconds to load on prod, but the difference is far too high right now for me to consider this approach suitable for prod.

Thanks JonL. I'm pretty entry-level when it comes to Redis - I am using it within Wappler, and also with an external DB on a second project, for maintaining logged-in sessions and caching heavy queries where the data never changes - so excuse my basic question(s).

So in this instance, are you suggesting that I cache the initial paged query, removing that query's cost and helping reduce the overall load time? I don't yet understand whether Redis could be used for the insert/update actions - I assume not?

You can use it to increment a counter. Then if you want the data to live in your database you could schedule a job that updates the records there.

But if you have millions of impressions and you don't want to query the database for each visit, Redis is a good alternative.

https://redis.io/commands/INCR
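
Something along these lines (just a rough sketch, assuming the node-redis v4 client; the impressions:<id> key scheme and the bumpListingCounter callback are made-up names, and the flush relies on GETDEL, so Redis 6.2+):

```ts
import { createClient } from "redis";

const redis = createClient(); // assumes a local Redis; pass { url } otherwise
await redis.connect();

// On every results page render: one INCR per listing shown. In-memory, no DB hit.
async function countImpressions(listingIds: number[]): Promise<void> {
  await Promise.all(listingIds.map((id) => redis.incr(`impressions:${id}`)));
}

// Scheduled job (e.g. every few minutes): move the counters into the database.
// `bumpListingCounter` stands in for whatever DB update you already have.
async function flushImpressions(
  bumpListingCounter: (listingId: number, by: number) => Promise<void>
): Promise<void> {
  for await (const key of redis.scanIterator({ MATCH: "impressions:*" })) {
    const count = await redis.getDel(key); // atomically read and clear (Redis >= 6.2)
    if (count) {
      await bumpListingCounter(Number(key.split(":")[1]), Number(count));
    }
  }
}
```

That way the page request only ever touches Redis, and the database writes happen in batches in the background.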

Thank you for the direction here, Jon. I don't fully understand it yet, but I will dive into this to see if I can get my head around it. It feels like the 'right' way to do it, rather than my approach.

Additionally, in case you don't know, Redis commands are available in SC (but not yet from the UI).

For more info check /lib/modules/redis.js

I did not know that! I feel a lot more confident now that I can get something up and running.

Am I right in assuming this will need to be a custom module?

It looks like it could be a connection delay issue.
Have you tried inserting into the DB directly using DBeaver or similar DB manager software, and checking how much time that takes?

From what I understand, every time a query needs to run, Wappler connects to the DB mentioned in the query and executes it.
So for each insert & update it would be doing that, and if connecting to the DB takes some time, that would explain the 10-second time to run simple insert & update queries.
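
One quick way to test that theory outside of Wappler is a small script like the one below (a rough sketch, assuming MySQL via the mysql2 client; the connection details and table name are placeholders):

```ts
import mysql from "mysql2/promise";

// Compare (a) a fresh connection per statement vs (b) one connection reused for 15 statements.
async function timeConnectionOverhead(): Promise<void> {
  const config = { host: "localhost", user: "test", password: "test", database: "test" }; // placeholders
  const sql = "INSERT INTO listing_impressions (listing_id, impression_date) VALUES (?, NOW())";

  // (a) connect, execute, disconnect - 15 times
  let start = performance.now();
  for (let i = 0; i < 15; i++) {
    const conn = await mysql.createConnection(config);
    await conn.execute(sql, [i]);
    await conn.end();
  }
  console.log(`fresh connection each time: ${(performance.now() - start).toFixed(0)} ms`);

  // (b) one connection, 15 executes
  start = performance.now();
  const conn = await mysql.createConnection(config);
  for (let i = 0; i < 15; i++) {
    await conn.execute(sql, [i]);
  }
  await conn.end();
  console.log(`single reused connection: ${(performance.now() - start).toFixed(0)} ms`);
}
```

If (a) comes out drastically slower than (b), connection setup rather than the statements themselves is where the time is going.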

I could be wrong. @patrick would be the best guy to identify what's wrong here.

I believe so. @George?
