It’s general and popular to find min/max value of a column in tables with aggregate functions such as min and max.
However, I would like to use vanila way to solve this problem, in addition to explain how does it work with join.
The explaination can help you get a abstract of join and how does it work in processing.
Issue
To help you understand how join can help you approach min and max, we need a good example. Now let start. We got
alot of people in a company, each of them has more then one assessment during their work time. The question is how to
get the latest, newest assessment and the first, oldest assessment. The big question is how to get the latest
and the first assessment answers filter by that person and assessment, remember that an assessment can be done many times. Our table will look like:
Thinking approach
To find min/max value of inserted_at value filtered by the person_id and assessment_id, first we need to find all record filtered by person_id and assessment_id, then we must find a min/value of inserted_at via a loop to comparing one to others inserted_at siblings, one which is smaller than others is the smallest and one which is greater than all others is the greatest. This also can be understand in this way.
If you can find a group named A from a fixed set filtered by some conditions, all elements in this group A can find a greater element from a fix set. On the other hands, other elements which does not belong to the group A cannot find any element which is greater than itself, which also means that those other element are the greatest.
If you can find a group named B from a fixed set filtered by some conditions, all elemments in this group B can find a smaller element from a fix set. On the other hands, other elements which does not belong to the group B cannot find any element which is smaller than itself, which also mean that those elements are the smallest.
Solution
I would like to find the latest/newest assessments of all people. Let do it step by step and analyze:
The over all data set of a person who own id is 1. As you can see, for a single person, there are many assessments has been done by him/her. We have to find out the latest assessment answers of each assessment id. For example: assessment answers with id: 1, 7, 8, 9; the person has finished this assessment_id 2 four times.
Now, let find a group of element which can find a greater one among the fixed set. The comparing value based on inserted_at column.
The fixed set we are talking about is starting from id 1 -> 12, however, our group contain 1, 7, 8, 10. 1, 7, 8, 10 can find greater element for instance:
The inserted_at(id_1) is smaller than inserted_at(id_7). For those elements which does not belong to this group cannot find greater element, because they are greatest ones. The other elements should include 2, 3, 4, 5, 6, 9, 11, 12. The previous query used join phrase as a consequence, it remove all unsatisfied element for condition aa.insrted_at < ab.inserted_at. On the other hands, left join takes satisfied and unsatisfied records.
Now let check the id of assessment answers records if the id 2, 3, 4, 5, 6, 7, 9, 11, 12 are the greatest. They cannot find any greater elements.
Let finalize our works to extract the greatest element only
And for short, here it’s final shot to find the latest/newest assessment answers of user(id: 1)
To find the smallest value, here is your solution:
Quesions
Q: Can we place the condition aa.inserted_at < ab.inserted_at under where phrase instead of under lelft join on phrase, and why?
A: No, if we put the condition aa.inserted_at < ab.inserted_at under the where phrase, we can only get a group of not-greatest elements
Q: If I use left join in this query, why dont I get a combine of satisfied and unsatisfied records:
A: This is how the query look like and its result set.
The condition I am talking here is aa.person_id = ab.person_id and aa.assessment_id = ab.assessment_id
Satisfied condition is aa.person_id == ab.person_id and aa.assessment_id == ab.assessment_id
Unsatisfied condition is aa.person_id != ab.person_id or aa.assessment_id != ab.assessment_id
Why the record that are not satisfied exist in the result set. Hmm, that’s a question I am think about. Personally, I think the developers of postgres did make an exception, hardcode perhap if people use equal in the join expression.