Hi guys,
So here is my query:
select distinct `job_posts`.`id` as `job_post_id` from `job_posts` left join `job_post_job_level` on `job_post_job_level`.`job_post_id` = `job_posts`.`id` left join `job_post_job_post_tech` on `job_post_job_post_tech`.`job_post_id` = `job_posts`.`id` left join `job_post_job_post_cat_related` on `job_post_job_post_cat_related`.`job_post_id` = `job_posts`.`id` left join `job_post_job_locations` on `job_post_job_locations`.`job_post_id` = `job_posts`.`id` where (`job_posts`.`approved` = ? and ((`job_post_job_post_tech`.`tech` in ?) or (`job_post_job_level`.`level` in ?) or (`job_post_job_post_cat_related`.`job_sub_category` in ?) or (`job_post_job_locations`.`remote_location_id` in ?) or (`job_post_job_locations`.`non_remote_location_id` in ?)))
Here is the breakdown of what I’m trying to accomplish:
I’m building a matching engine, using MySQL.
The query should match the integer from each of the candidates preferences (e.g. tech, location, work_type) with the relevant integer from the jobs preferences (tech, location, work_type). The preferences are held in a number of sub tables which link to the main table (e.g. job or candidate). The match should output job_id which I then insert into a recommendatons table along with the candidate.id.
The data structure is similar for both, a main table (candidate / job) and then sub tables that hold the criteria identifier as an integer, and the ref to either the candidate or the job e.g:
Candidate (main table, with most of the text content e.g. name, position title etc.)
Candidate_tech_preference_sub_table
And then I have the job:
Job (main table, with most of the text content e.g. job_title, job_description etc. )
Job_tech_sub_table
What I’m doing in the Server Connect is first query the candidate to get all candidate data, and then I’m sing a series of conditions to check if the data exists (as not every field will always be presented) and then using a setval for this, I then is an ‘IN’ condition to match against the integers in the set val.
Here is an example of a setval used in this way:
"setval_location": [
16,
18
],
I have also built a custom module to handle the data in an way easier to match against, here is an example of the output:
"sqlprep": {
"candidate_id": 1,
"min_salary": "110000.00",
"user_role_level_join": null,
"user_location_join": [
11,
12
],
"user_industry_preferences": null,
"user_company_size_join": null,
"user_tech_preferences": null,
"user_roleType_preferences": null,
"user_role_priorities_join": null,
"user_language_join": null
}
The query does not fail, it works fine with OR conditions. But OR conditions are not desirable due to the nature of the matches. Instead I must use a AND query.
The match must get more narrow as more criteria are included and broader as there is less.
E.g.
If the candidate preference is ‘react’ for tech, and ‘sydney’ for locaton, then we must match all jobs with those criteria as a minimum.
What I have noticed is that with this query with the OR condition, it will work as long as there is at least one criteria, but it will return no results if none are present.
I’m assuming this is where the issue exists for the AND. I need the MySQL Query to be able run the query even without all the data.
I had this working with Postgres, but the syntax between the two differs and I javen’t been able to replicate the same outcome:
SELECT job_posts.id AS job_id
FROM job_posts
WHERE (
COALESCE(array_length(:P5::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_post_job_locations
WHERE job_post_job_locations.job_post_id = job_posts.id
AND job_post_job_locations.remote_location_id = ANY(:P5)
)
)
AND (
COALESCE(array_length(:P9::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_post_job_locations
WHERE job_post_job_locations.job_post_id = job_posts.id
AND job_post_job_locations.non_remote_location_id = ANY(:P9)
)
)
AND (
COALESCE(array_length(:P6::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_posts
INNER JOIN companies ON companies.id = job_posts.company_id
INNER JOIN "company_sectorTags_join" ON "company_sectorTags_join".company_id = companies.id
WHERE job_posts.id = job_posts.id
AND "company_sectorTags_join".sector_tag_id = ANY(:P6)
)
)
AND (
COALESCE(array_length(:P4::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_post_job_post_cat_related
WHERE job_post_job_post_cat_related.job_post_id = job_posts.id
AND job_post_job_post_cat_related.job_sub_category = ANY(:P4)
)
)
AND (
COALESCE(array_length(:P2::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_posts
INNER JOIN companies ON companies.id = job_posts.company_id
WHERE job_posts.id = job_posts.id
AND companies.company_staff_size_ref = ANY(:P2)
)
)
AND (
COALESCE(array_length(:P1::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_post_job_post_tech
WHERE job_post_job_post_tech.job_post_id = job_posts.id
AND job_post_job_post_tech.tech = ANY(:P1)
)
)
AND (
COALESCE(array_length(:P3::integer[], 1), 0) = 0
OR EXISTS (
SELECT 1
FROM job_post_job_level
WHERE job_post_job_level.job_post_id = job_posts.id
AND job_post_job_level.level = ANY(:P3)
)
);
I have been playing around a bit more today, using ChatGPT and here is a query we’ve come up with - but It’s still failing, it seems on the :P5 parameter that passes (16,18) to the query. SQL Query:
SELECT DISTINCT job_posts.id AS job_post_id
FROM job_posts
LEFT JOIN job_post_job_level ON job_post_job_level.job_post_id = job_posts.id
LEFT JOIN job_post_job_post_tech ON job_post_job_post_tech.job_post_id = job_posts.id
LEFT JOIN job_post_job_post_cat_related ON job_post_job_post_cat_related.job_post_id = job_posts.id
LEFT JOIN job_post_job_locations ON job_post_job_locations.job_post_id = job_posts.id
WHERE job_posts.approved = 1
AND (
EXISTS (
SELECT 1
FROM job_post_job_locations
WHERE job_post_job_locations.job_post_id = job_posts.id
AND job_post_job_locations.remote_location_id IN (:P5)
)
OR :P5 IS NULL -- Assuming :remoteLocationArray is the parameter you pass in
)
AND (
-- Your other conditions here, similar to the one above, for example:
:P1 IS NULL
OR EXISTS (
SELECT 1
FROM job_post_job_locations
WHERE job_post_job_locations.job_post_id = job_posts.id
AND job_post_job_locations.non_remote_location_id IN (:P1)
)
)
AND (
:P5 IS NULL
OR EXISTS (
SELECT 1
FROM job_post_job_post_cat_related
WHERE job_post_job_post_cat_related.job_post_id = job_posts.id
AND job_post_job_post_cat_related.job_sub_category IN (:P5)
)
)
AND (
:P5 IS NULL
OR EXISTS (
SELECT 1
FROM job_post_job_post_tech
WHERE job_post_job_post_tech.job_post_id = job_posts.id
AND job_post_job_post_tech.tech IN (:P5)
)
)
AND (
:P5 IS NULL
OR EXISTS (
SELECT 1
FROM job_post_job_level
WHERE job_post_job_level.job_post_id = job_posts.id
AND job_post_job_level.level IN (:P5)
)
);
Console Error:
`"SELECT DISTINCT job_posts.id AS job_post_id\nFROM job_posts\nLEFT JOIN job_post_job_level ON job_post_job_level.job_post_id = job_posts.id\nLEFT JOIN job_post_job_post_tech ON job_post_job_post_tech.job_post_id = job_posts.id\nLEFT JOIN job_post_job_post_cat_related ON job_post_job_post_cat_related.job_post_id = job_posts.id\nLEFT JOIN job_post_job_locations ON job_post_job_locations.job_post_id = job_posts.id\nWHERE job_posts.approved = 1\nAND (\n EXISTS (\n SELECT 1\n FROM job_post_job_locations\n WHERE job_post_job_locations.job_post_id = job_posts.id\n AND job_post_job_locations.remote_location_id IN (NULL)\n )\n OR NULL IS NULL -- Assuming :remoteLocationArray is the parameter you pass in\n)\nAND (\n -- Your other conditions here, similar to the one above, for example:\n 16, 18 IS NULL\n OR EXISTS (\n SELECT 1\n FROM job_post_job_locations\n WHERE job_post_job_locations.job_post_id = job_posts.id\n AND job_post_job_locations.non_remote_location_id IN (16, 18)\n )\n)\nAND (\n NULL IS NULL\n OR EXISTS (\n SELECT 1\n FROM job_post_job_post_cat_related\n WHERE job_post_job_post_cat_related.job_post_id = job_posts.id\n AND job_post_job_post_cat_related.job_sub_category IN (NULL)\n )\n)\nAND (\n NULL IS NULL\n OR EXISTS (\n SELECT 1\n FROM job_post_job_post_tech\n WHERE job_post_job_post_tech.job_post_id = job_posts.id\n AND job_post_job_post_tech.tech IN (NULL)\n )\n)\nAND (\n NULL IS NULL\n OR EXISTS (\n SELECT 1\n FROM job_post_job_level\n WHERE job_post_job_level.job_post_id = job_posts.id\n AND job_post_job_level.level IN (NULL)\n )\n);\n - Operand should contain 1 column(s)"`
Any help is really appreciated here and happy to pay for it.