Joins in SQL: A simplified guide
Introduction to joins in SQL
This guide is aimed at explaining in simple terms the different types of joins between tables in SQL
We use simplified real life examples combining data from Products, Stores, Sales and Inventory
What we will not do unlike other guides we have seen is attempt to explain joins using Venn diagrams. In my experience trying to explain joins using venn diagrams is not helpful and tends to confuse people even more than before
Joins are a vital way to combine data from multiple tables into one coherent data set for analysis
Good ‘normalised’ database design would tend to store data as efficiently as possible, for example a Sales table would likely contain separate IDs for each part of a transaction such as a ProductID, StoreID and a Customer ID. This saves a huge amount of duplication of data as the IDs can be efficiently stored as integer values and then joins can be used to look up the relevant details for each store, product or customer
Here is a simplified representation of a sales table, we can see a record for each store and product combination what the quantity sold has been. But from a reporting point of view we need to know what the products and the stores are
A very simplified store dimension table would like this, with one row representing each store
Products would also have a similar structure (although there would typically be lots more details in separate columns!
To create a report that brings these three data tables together we need to use a join
You can follow along these examples by running the SQL
SQL Joins Examples
There are four main types of join between tables, here we will start with the simplest one
The first type of join is call an inner join
instead of using a convoluted Venn diagram to confuse everyone, we will start with a typical question
What were the sales in the Aberdeen store?
To answer this we run the following query
select Store_Name, Product_Name, qty_sold
from sales
join stores on sales.store_ID = stores.store_ID
join products on sales.product_ID = products.product_ID
where sales.store_ID = 1
We now get a much more useful dataset that can be easily understood
This is an example of a standard join, also called an INNER JOIN
an inner join returns only records that have a match in both tables
To explain this clearly, we know that there are 10 products in the products table, but only 9 had sales. The Aberdeen store sold no fish, so there is no matching entry in the sales table, so the row is excluded
The next type of join is a LEFT JOIN
The next type of join we will examine is called a left join, these can also be called a left outer join
If you were asked the question “Can I see all products and what they sales they have (if any)”
Then straight away we know that some of the products had no sales in Aberdeen
In this instance we will start with all products
select * from products
This gives us a list of all of our products (including the fish!)
If we add the left join (but limited to just sales for Aberdeen)
select * from products
left join sales on sales.product_ID = products.product_ID and sales.store_ID = 1
We can see all 10 products, but with a null row for the fish
RIGHT JOIN
A right join is similar, but in reverse
A typical question here would be something like, “can I see all Aberdeen sales and any products they haven’t sold too”
select a.*, products.*
from
(
select * from Sales where sales.store_ID = 1
) a
right join products on a.product_ID = products.product_ID
Here we start with the Aberdeen sales inside the sub query we have called ‘a’, then a right join adds everything from the products table to this matching where it can and adding rows with null values for the missing sales data
FULL JOIN
A full join is the final type of join, a full join returns all records whether there is a match in either table. These are sometimes called a full outer join
“Can I see all items in Aberdeen that have either sales or stock”
We know that not all items have sales, and some may have no stock despite having had some sales
A full join provides us with this
select * from
(
select * from inventory where inventory.store_id = 1
) i
full join
(
select * from Sales where sales.store_ID = 1
) s on s.product_ID = i.product_ID
Conclusion
Understanding the four main types of join and when you might use them is a key SQL skill to learn
Seeing the examples and being able to follow along should reinforce that knowledge, download the sample SQL script to create your own tables and follow along
Much easier to understand than a Venn diagram
Subscribe to our channel to see more SQL tips and timesavers
Select Distinct YouTube Channel
Or find other useful SQL, Power BI or other business analytics timesavers in our Blog
Our Business Analytics Timesavers are selected from our day to day analytics consultancy work and are the everyday things we see that really help analysts, SQL developers, BI Developers and many more people. Our blog has something for everyone, from tips for improving your SQL skills to posts about BI tools and techniques. We hope that you find these helpful!