Joining Tables in SQL
When data is required from more than one table in a database it is necessary to join the tables using common data. To illustrate this, the ‘person’ table, used in previous examples, can be joined to a ‘person_order’ table using the ‘id’ from ‘person’. The ‘id’ must also exist against each order. In a situation like this the ‘id’ from ‘person’ would normally be prefixed with something in the ‘person_order’ table to signify what it relates to, such as ‘person_id’, so as not to be confused with its own ‘id’. The relationship between these two tables is known as a one-to-many relationship because a person can have many orders, but an order is only related to one person.
person | ||||
id | firstname | lastname | title | dob |
1 | Fred | Bloggs | Mr | 1980-05-05 |
2 | Simon | Smith | Mr | 1960-04-01 |
3 | Freida | Bloggs | Mrs | 1970-10-12 |
4 | Fiona | Jones | Miss | 1985-05-19 |
person_order | |||
id | order_date | order_total | person_id |
1 | 2016-01-19 | 101.60 | 1 |
2 | 2016-02-02 | 256.54 | 2 |
3 | 2016-03-05 | 222.10 | 4 |
4 | 2016-05-23 | 12.45 | 1 |
5 | 2016-07-29 | 325.65 | 2 |
The simplest type of join is known as an ‘inner join’, where data is returned only where common data, in this case the ‘id’ of a person, exists in both tables. This takes the following form. Note that the ‘order by’ is optional.
SELECT table_name.column_name1, table_name.column_name2 FROM table_name1 INNER JOIN table_name2 ON table_name1.column_name=table_name2.column_name ORDER BY table_name.column_name1, table_name.column_name2;
The ‘table_name.’ prefix is only absolutely necessary when columns with the same name exist in both tables, which, in the case of the above two tables would be ‘id’, however, no error is produced if the table name prefix is included for all columns.
The example below returns the ‘id’, ‘firstname’ and ‘lastname’ from the ‘person’ table, as well as ‘id’, ‘order_date’ and ‘order_total’ from the ‘person_order’ table, in order of ‘lastname’, ‘firstname’, ‘id’ and then ‘order_date’ in descending order.
SELECT person.id, person.firstname, person.lastname, person_order.id, person_order.order_date, person_order.order_total FROM person INNER JOIN person_order ON person.id=person_order.person_id ORDER BY person.lastname, person.firstname, person.id, person_order.order_date desc;
The above ‘select’ statement will return the following data.
id | firstname | lastname | id | order_date | order_total |
1 | Fred | Bloggs | 4 | 2016-05-23 | 12.45 |
1 | Fred | Bloggs | 1 | 2016-01-19 | 102.60 |
4 | Fiona | Jones | 3 | 2016-03-05 | 222.10 |
2 | Simon | Smith | 5 | 2016-07-29 | 325.65 |
2 | Simon | Smith | 2 | 2016-02-02 | 257.54 |
Notice that ‘Freida Bloggs’ does not appear in the data that has been returned. This is because she has no corresponding orders, however, ‘Fred Bloggs’ and ‘Simon Smith’ appear twice because they each have two orders attributed to them. The other thing to note is that there are two columns headed up ‘id’, the first of which is the ‘id’ from the ‘person’ table, whilst the second is the ‘id’ from ‘person_order’. To differentiate between the two, an alias could be included for each, for example ‘person_id’ and ‘order_id’ as shown below.
SELECT person.id person_id, person.firstname, person.lastname, person_order.id order_id, person_order.order_date, person_order.order_total FROM person INNER JOIN person_order ON person.id=person_order.person_id ORDER BY person.lastname, person.firstname, person.id, person_order.order_date desc;
If it were necessary to include records in the ‘person’ table where there were no corresponding ‘person_order’ records, such as ‘Freida Bloggs’, then a ‘left outer join’ can be used.
SELECT table_name.column_name1, table_name.column_name2 FROM table_name1 LEFT OUTER JOIN table_name2 ON table_name1.column_name=table_name2.column_name ORDER BY table_name.column_name1, table_name.column_name2;
Putting this into practice for the ‘person’ and ‘person_order’ scenario would look like this.
SELECT person.id person_id, person.firstname, person.lastname, person_order.id order_id, person_order.order_date, person_order.order_total FROM person LEFT OUTER JOIN person_order ON person.id=person_order.person_id ORDER BY person.lastname, person.firstname, person.id, person_order.order_date desc;
The data returned now looks like this. As ‘Freida Bloggs’ has no corresponding orders, the columns relating to orders, namely, ‘order_id’, ‘order_date’ and ‘order_total’ are blank.
person_id | firstname | lastname | order_id | order_date | order_total |
1 | Fred | Bloggs | 4 | 2016-05-23 | 12.45 |
1 | Fred | Bloggs | 1 | 2016-01-19 | 102.60 |
3 | Freida | Bloggs | |||
4 | Fiona | Jones | 3 | 2016-03-05 | 222.10 |
2 | Simon | Smith | 5 | 2016-07-29 | 325.65 |
2 | Simon | Smith | 2 | 2016-02-02 | 257.54 |
There are two more types of join, a ‘right outer join’ and a ‘full outer join’. A ‘right outer join’ is the opposite of a ‘left outer join’, where all the records from the second table are included, along with any matching records in the first. Where no corresponding records exist in table one, null values are returned.
SELECT table_name.column_name1, table_name.column_name2 FROM table_name1 RIGHT OUTER JOIN table_name2 ON table_name1.column_name=table_name2.column_name ORDER BY table_name.column_name1, table_name.column_name2;
If it were necessary to return all records from both tables, regardless of whether there are corresponding record in the other table, then a ‘full outer join’ can be used. Again, null values are returned where no corresponding records exist.
SELECT table_name.column_name1, table_name.column_name2 FROM table_name1 FULL OUTER JOIN table_name2 ON table_name1.column_name=table_name2.column_name ORDER BY table_name.column_name1, table_name.column_name2;
With the two example tables above, neither a ‘right outer join’ nor a ‘full outer join’ will be of use because a ‘person’ record always exists for a ‘person_order’ record.