SQL – Union vs Join Performance

Precursor

Following on from a comment on the answer I gave for this question on StackOverflow I was asked to comment by @seebiscuit on the following answer by someone else:

a well written join will be faster than a union

So me being me, I decided to look into this issue and do some testing myself.

Enviornment

So first off I will be running this on the following:

  • Local version of SQL Server Express 2014 (12.0.2269.0)
  • Using SSMS to run my queries.
  • Windows 10 x64
  • i7 930 1st Gen CPU | 16GB Ram | Samsung Evo 950 Pro SATA3 SSD

I decided to start of with the trusted Microsft Adventure Works SQL Server 2014 Sample Database available here:

Test Method

In my SSMS session run the following to turn on statistics

SET STATISTICS TIME ON

then record the CPU time and elapsed time for each run. Run each query 10 times, discard the 1 Minimum, 1 Maximum and average the rest (8)

Results

2016-04-14 00_39_00-Book1 - Excel

Queries – PK Clustered Index <int>

Wrote the following queries on the Person.Person table which was the biggest ( ~20k Records / 30mb Data )

Query 1 – UNION ALL:

SELECT BusinessEntityID, ModifiedDate, rowguid FROM Person.Person
UNION ALL
SELECT BusinessEntityID, ModifiedDate, rowguid FROM Person.Person    

2016-04-06 21_20_25-Start

Query 2 – UNION

SELECT BusinessEntityID, ModifiedDate, rowguid FROM Person.Person
UNION
SELECT BusinessEntityID, ModifiedDate, rowguid FROM Person.Person    

2016-04-06 21_23_43-Start

Query 3 – INNER JOIN

SELECT p1.BusinessEntityID, p1.ModifiedDate, p1.rowguid
    ,  p2.BusinessEntityID, p2.ModifiedDate, p2.rowguid 
FROM Person.Person p1 
INNER JOIN Person.Person p2 
    ON p2.BusinessEntityID = p1.BusinessEntityID

2016-04-06 21_25_35-Start

Query 4 – LEFT JOIN

SELECT p1.BusinessEntityID, p1.ModifiedDate, p1.rowguid
    ,  p2.BusinessEntityID, p2.ModifiedDate, p2.rowguid 
FROM Person.Person p1 
LEFT JOIN Person.Person p2 
    ON p2.BusinessEntityID = p1.BusinessEntityID    

2016-04-06 21_26_21-Start

Query 5 – RIGHT JOIN

SELECT p1.BusinessEntityID, p1.ModifiedDate, p1.rowguid
    ,  p2.BusinessEntityID, p2.ModifiedDate, p2.rowguid 
FROM Person.Person p1 
LEFT JOIN Person.Person p2 
    ON p2.BusinessEntityID = p1.BusinessEntityID    

2016-04-06 21_26_21-Start

Query 6 – FULL OUTER JOIN

SELECT p1.BusinessEntityID, p1.ModifiedDate, p1.rowguid
    ,  p2.BusinessEntityID, p2.ModifiedDate, p2.rowguid 
FROM Person.Person p1 
FULL OUTER JOIN Person.Person p2 
    ON p2.BusinessEntityID = p1.BusinessEntityID

2016-04-06 21_30_47-Start

Results – Table

MethodMedianStdDevCPUTime(ms)
1. Union All21.8937 ms6.7085 ms18
2. Union19.1815 ms2.2459 ms16
3. Inner Join24.5083 ms4.8172 ms21
4. Left Join23.3757 ms4.2216 ms23
5. Right Join25.7348 ms5.0168 ms23
6. Full Outer Join21.8262 ms2.2162 ms20

Comments

I was planning to do more and different types of queries and compare how Unions vs Joins compare, as well as much larger datasets (close to 2gb) but this has taken longer than I hope already (a week) and if there is more interest oleave a comment and it wil motivate me to delve into the issue.

In the end, it is hard to think of a scenario where you could use 1 or the other as they serve very different purposes! And you are best of using the right tool for the job.

If someone has some interesting scenario’s where you could actually use both then let me know!

One thought on “SQL – Union vs Join Performance

  1. How many rows in each resultset? That is, did UNION DISTINCT do a lot of dedupping, thereby speed up the result delivery? Please provide the CREATE TABLE. It would be nice to know what indexes the table has. Were all the relevant caches ‘hot’ when you got the timings? The outputs are not consistent — 6 columns in the JOIN cases; 3 columns, but twice as many rows in the UNION cases. So I do not consider JOIN and UNION to be directly interchangeable. That is, the post-processing cost may eat up any savings in the query cost.

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

This site uses Akismet to reduce spam. Learn how your comment data is processed.