Set Theory in Relational Databases

Set Theory is the mathematical foundation of the relational database model. Concepts such as Union, Intersection, Difference, and Cartesian Product are directly implemented in SQL. Understanding these concepts helps database engineers write more logical, efficient, and powerful queries. This article explains the relationship between Set Theory and relational databases, the main operations, and practical SQL examples.

Set TheoryRelational DatabaseSet Theory SQL

~3 min read • Updated Apr 12, 2026

Introduction

The relational database model, introduced by Edgar F. Codd in 1970, is built upon mathematics — particularly Set Theory. Every table in a relational database is essentially a set of records (tuples). All major SQL operations such as SELECT, JOIN, UNION, INTERSECT, and EXCEPT are directly derived from set operations.

Basic Concepts of Set Theory

A set is a collection of unique elements. In relational databases:

  • Table = Set
  • Row = Element (Tuple)
  • Column = Attribute

Main Set Operations in Databases

1. Union

The union of two sets contains all unique elements from both sets.

SELECT * FROM Employees
UNION
SELECT * FROM Contractors;

2. Intersection

Returns only the elements that exist in both sets.

SELECT * FROM Employees
INTERSECT
SELECT * FROM Contractors;

3. Difference (Minus / Except)

Returns elements that are in the first set but not in the second set.

SELECT * FROM Employees
EXCEPT
SELECT * FROM Contractors;

4. Cartesian Product

Combines every member of the first set with every member of the second set. This is the foundation of JOIN without any condition.

SELECT * FROM Departments, Employees;   -- Cartesian Product

The Role of Set Theory in JOINs

Different types of JOIN in SQL are derived from set theory concepts:

  • INNER JOIN → Intersection
  • LEFT JOIN → First set + (First set minus Intersection)
  • RIGHT JOIN → Second set + (Second set minus Intersection)
  • FULL OUTER JOIN → Union

Practical Example: INNER JOIN

SELECT e.name, d.department_name
FROM Employees e
INNER JOIN Departments d ON e.dept_id = d.id;

Important Properties of Sets in Relational Databases

  • No Duplicates: Each row in a table is unique (guaranteed by Primary Key).
  • Order is Irrelevant: The order of rows in a set does not matter.
  • Atomicity: Each cell contains only one value (First Normal Form).

Practical Applications of Set Theory

Understanding set operations helps database engineers to:

  • Write simpler and more powerful queries
  • Optimize query performance
  • Use UNION instead of OR in specific cases
  • Manage duplicate data effectively

Complete Example: Combining Multiple Set Operations

-- Employees in Sales department who are not assigned to the current project
SELECT name FROM Employees
WHERE dept_id = (SELECT id FROM Departments WHERE name = 'Sales')
EXCEPT
SELECT e.name FROM Employees e
INNER JOIN Project_Assignments pa ON e.id = pa.employee_id;

Conclusion

Set Theory is the mathematical foundation of the relational model. A deep understanding of it allows you to see SQL not just as a query language, but as a powerful mathematical language. The more you work with UNION, INTERSECT, EXCEPT, and JOIN operations, the better you will become at writing efficient and logical queries.

Written & researched by Dr. Shahin Siami