Database entries amount question

I have always just made small projects with a (very) limited amount of database entries.

This new project requires a lot more.
The one with the most entries is probably where I would like to store prices of collections a few times a day. And that for a growing number of collections. Let’s say about 100 but possibly growing to a thousand…

So that means at least 200 entries if I only record the prices twice a day. But if I would record the prices every hour, it would already be 2400 entries per day.

If the collection grows to 1000 it would mean 24000 entries… per day… (if recorded per hour)

Is this even sustainable in the long term?

Or is my worry unnecessarily?

NB: I understand that the bigger problem is what I do with the data and not the storage itself (at first)

(I would like to be able to display the price progression per collection)

Oh, I am using a mysql database…

Hi,

“If your data is inherently relational, and subject to queries that work well with SQL, you should be able to scale to hundreds of millions of records without exotic hardware requirements.

You will need to invest in indexing, query tuning, and making the occasional sacrifice to the relational model in the interests of speed. You should at least nod to performance when designing tables – preferring integers to strings for keys, for instance.”

https://stackoverflow.com/questions/5350581/how-many-rows-are-too-many-for-a-mysql-table/5351306

So it seems that you won’t have too much of a problem with those numbers

2 Likes

There are specialized databases in time-series data, such as TimescaleDB - this one’s based on PostgreSQL

For my next project I’m looking forward using Graphite

A specialized time-series database helps you handle the case where e.g. you want to “compress” old data (e.g.: transform daily price points into weekly after a year, transform weekly to a month after 5 years). Graphite also has a built-in graphing functionality

Usually you can’t run these kind of apps on regular shared hosting, so that’s something to keep in mind

Edit: I’ve actually done something similar to your project (regularly inserting data) a very long time ago in a MariaDB database. It did became slow, but I was also doing it way worse: I was inserting like 300 rows every minute. I’ve made a cron to erase old data and I think that helped keep things sane :upside_down_face:

So, naturally, for me using a time-series database is a logical evolution. For you, I think it’s ok to continue using MySQL. When you believe it doesn’t fit anymore I’m sure it won’t be difficult to migrate away

1 Like

Thanks htatd and apple.
Good to hear that for the time being I’m good :wink:

And yes time series database would be great. The high precision is only valuable for me in the short term. The data from a few months back there is no need for hourly updates. Daily or weekly would be enough.

Maybe there is an easy way to delete old data to keep only 1 datapoint per day?
Not needed right now but would be great to know for the future…

In case you want that, you could just create a scheduled action to check once a day/week/month for data that's older than x time, then perform actions to that older data. You could then delete all same day points and just keep last point of each day or something.

In any case I think you'll find the best solution for you once you get there

Since you are using mysql you can use (depending on what your hosting server techies have allowed to your level of account) the MySQL equivalent of a chron job --

Home » MySQL Triggers » Working with MySQL Scheduled Event

Working with MySQL Scheduled Event

Summary : in this tutorial, you will learn about MySQL event scheduler and how to create events to automate repetitive database tasks.

MySQL Events are tasks that execute according to a specified schedule. Therefore, sometimes MySQL events are referred to as scheduled events.

MySQL Events are named object which contains one or more SQL statement. They are stored in the database and executed at one or more intervals.

For example, you can create an event that optimizes all tables in the database that runs at 1:00 AM every Sunday.

You don't need any other database than mysql.

mysql> delete from price_collections where creationDate < now() - interval 30 DAY;

USING referenced page link you could duplicate what a chron job is usually suggested for in this forum

Make sure you have a timestamp field to record when the record was created
and make sure your key id is sufficient to handle thousands of integers in an auto_increment = 1 or something

run the delete checking the timestamp field and have the server compare its local now() time

Hi newmedia,
wouldn’t that delete all records older than one month?
I would like to keep one record (per collection) per day…

Running PHP & MySQL I made a copy of my main records table with a different name like “archive_main_records” to hold a backup table that was never called for by my php application.

I ran events or triggers or even periodic php scripts to command the mysql database and did regular backups of older records by timestamps. Then keeping only records in my online queried “main_records” table. (Names are examples) that were, say, created within the last week, 2 weeks, or 2 days – whatever.

MySQL queries & functions offer many different ways or sequence of procedures to solve needs.

But CREATE EVENT is the start of ways to get the server itself to do the job for you. Different procedures might vary according to whether you are running ISAM or INNODB schemas.

My example is just an example of possibilities. Using chron events then schedules WHEN it runs and you tell it to delete itself after running or you tell it to hang around and be ready to run again.

If your record has a timestamp on its creation named "recordtime"
SELECT * FROM price_collections WHERE recordtime <= NOW() - INTERVAL 1 DAY
would trim the table with entries older than 24 hours from whatever your server time is. Set up a Server event to run it at midnight, every midnight.
You should combine it with an export to sql or csv with a timestamp in the sql export as in "2022-Feb-21.sql" or whatever.
Your mysql is also capable of sending this nightly export to your chosen email address.

select *from price_collections where creation_date < now() - interval 1 week;

mysql understands things like "interval 1 week" or "interval 4 weeks" see the options available here --

What is the table schema you have & what do you mean with specific example by

I would like to keep one record (per collection) per day

Thanks for your help newmedia

My floor prices table schema is like this:

  • collection_name
  • date_time
  • floor_price

My current cronjob records the floor price of all collections every hour.

What I would like is to keep all this for one month and from all records older than one month I would like to keep just one record per day (so the other 23 records from that day can be deleted) per collection

If I understand you then “floor_price” is the biggie.
One record that shows the lowest price for that day or 24 hour period?
And is “date_time” a real UTC time stamp that equates to 2022-020-28 13:16:29 ?

What does this mean in very specific language?

"KEEP all this"

Exactly:

image

I would like to keep all data points (all records recorded) for one month (or so) and after that this kind of detail is no longer neccesary so I only would like to keep one record (out of the 24)

Saving this kind of data, as you already know, is not your problem.
But, are you saying that you want the existing online database being inserted into with each user submission, to, after 24 hours or something, simply contain One Lowest Floor Price PER “slug” so that each slug category has its own lowest floor price?

You want to start each new post-cleanup of the online table with maybe 5 records of each distinct slug with its Lowest value floorprice. And then let all the new days entries to commence, maybe another thousand plus new entries by users are added into this same table?

Then after 24 hours or something you strip out these thousands of new records, using the same filtering until you have the NEW 5 category records left with their new floorprices left over to start a new day’s entries?

I’m positive that mysql can easily handle what you want.

With perhaps 5 or 6 different most direct procedures which will all provide you with exactly what you need to do.

I’d like to keep this going but I have to dive out now.

I vill be baaaack.

But not until I find Sarah and protect her unborn child from the future past. Plus, my mission has added a new name – Elon Musk.

thnx newmedia!
I’ll look into it

Jim