Python Sets

A set is an unordered collection of unique objects. Each item must be hashable, which in practice means an immutable type such as a number, string, or tuple, and it can appear only once in a set. Because sets are collections of other objects, they share some behavior with lists and dictionaries. For example, sets are iterable, can grow and shrink on demand, and may contain a variety of object types.

Set basics

  • Sets do not have a positional ordering
  • Sets support subtraction (the - operator computes the set difference) but do not support addition, multiplication, or division
  • An item can appear only once in a set
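The three points above can be checked directly. The following is a minimal sketch; the variable names are illustrative only and are not used in the examples that follow.

```python
# An item can appear only once: duplicates collapse when the set is built
letters = set('hello')
assert letters == {'h', 'e', 'l', 'o'}

# No positional ordering: the order in which items are written is irrelevant
same = {1, 2, 3} == {3, 2, 1}

# Subtraction is supported and computes the set difference
difference = {1, 2, 3} - {2}

# Addition is not supported and raises TypeError
try:
    {1, 2} + {3}
    addable = True
except TypeError:
    addable = False

print(len(letters), same, difference, addable)
```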

To make a set object we use the built-in set() function:

# Create sets
>>> x = set('abcde')
>>> x
{'a', 'e', 'c', 'd', 'b'}
>>> y = set('defgh')
>>> y
{'g', 'h', 'f', 'd', 'e'}
>>> 
# Difference
>>> x - y
{'a', 'c', 'b'}
>>> y - x
{'h', 'f', 'g'}
>>> 
# Union
>>> x | y
{'h', 'd', 'f', 'b', 'a', 'c', 'e', 'g'}
>>> 
# Intersection
>>> x & y
{'d', 'e'}
>>> 
# Symmetric difference (XOR)
>>> x ^ y
{'h', 'f', 'b', 'a', 'c', 'g'}
>>> 
# Proper superset (>) and proper subset (<)
>>> x > y
False
>>> x < y
False
>>>
# Membership
>>> 'a' in x
True
>>> 'a' in y
False

Besides expressions, the set object provides methods that correspond to these operations.

>>> z = x.intersection(y)
>>> z
{'d', 'e'}
>>> z.add('p')
>>> z
{'p', 'd', 'e'}
>>> z.update('m', 'n')
>>> z
{'p', 'm', 'n', 'd', 'e'}
>>> z.remove('e')
>>> z
{'p', 'm', 'n', 'd'}
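As a rough summary of how the methods line up with the operators, the sketch below pairs each one and checks that both forms give the same result. It also shows discard(), a standard set method not used above, which differs from remove() only in how it treats a missing item.

```python
x = set('abcde')
y = set('defgh')

# Each operator has a method counterpart
assert x.union(y) == x | y
assert x.intersection(y) == x & y
assert x.difference(y) == x - y
assert x.symmetric_difference(y) == x ^ y

# issubset/issuperset correspond to <= and >=, not to the strict < and >
assert x.issubset(y) == (x <= y)
assert x.issuperset(y) == (x >= y)

z = x.intersection(y)   # {'d', 'e'}

# discard() silently ignores an item that is not in the set
z.discard('q')

# remove() raises KeyError for an item that is not in the set
try:
    z.remove('q')
    removed = True
except KeyError:
    removed = False

print(z, removed)
```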

Sets can be used in operations such as len, for loops, and list comprehensions. Because they are unordered, they don’t support sequence operations like indexing and slicing:

>>> for item in z: print(item * 2)
... 
pp
mm
nn
dd
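The loop output above can come out in a different order from run to run. A small sketch of the other operations mentioned, assuming z holds the same four items, uses sorted() wherever a repeatable order is needed:

```python
z = {'p', 'm', 'n', 'd'}

# len() counts the items
count = len(z)

# A list comprehension over a set; sorted() makes the order repeatable
doubled = sorted([item * 2 for item in z])

# Indexing is a sequence operation, so it fails on a set
try:
    z[0]
    error = ''
except TypeError as exc:
    error = str(exc)

print(count, doubled, error)
```

When a positional view is really needed, converting with list(z) or sorted(z) first is the usual workaround.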

More examples:

# Expressions require both operands to be sets
>>> z | set(x)
{'d', 'b', 'a', 'p', 'm', 'c', 'e', 'n'}
>>>
>>> z | x
{'d', 'b', 'a', 'p', 'm', 'c', 'e', 'n'}
>>>
>>> z | {1, 5, 2, 7}
{1, 2, 5, 'd', 7, 'p', 'm', 'n'}
>>>
>>> z | [1, 2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |: 'set' and 'list'
>>>
# Their methods allow any iterable
>>> z.union(['one', 'two'])
{'two', 'd', 'one', 'p', 'm', 'n'}

You can make sets from lists:

>>> set([1, 2, 3, 4, 5])
{1, 2, 3, 4, 5}
>>>
# issubset() method
>>> {1, 2, 3}.issubset(range(-5, 5))
True

Immutable constraints and frozen sets

Sets can only contain hashable objects, which for the built-in types means immutable ones. Lists and dictionaries can’t be embedded in sets, but tuples can (as long as their own items are hashable):

>>> s = set('1a2b3c')
>>> s
{'3', 'b', 'a', 'c', '1', '2'}
>>>
# NO lists can be embedded
>>> s.add([4, 5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>>
# NO dictionaries can be embedded
>>> s.add({'v1':4, 'v2':5})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>>
# Tuples OK - can be embedded
>>> s.add((4, 5))
>>> s
{(4, 5), '3', 'b', 'a', 'c', '1', '2'}
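The same constraint applies to sets themselves: a regular set is mutable, so it cannot be placed inside another set. The built-in frozenset type is the immutable counterpart. The sketch below is illustrative only; the distances dictionary is made-up data.

```python
s = set('abc')

# A regular set is unhashable, so it cannot be embedded in another set
try:
    s.add({1, 2})
    nested_ok = True
except TypeError:
    nested_ok = False

# A frozenset is immutable and hashable, so it can be embedded
fs = frozenset([1, 2])
s.add(fs)
assert fs in s

# It can also be used as a dictionary key (hypothetical data)
distances = {frozenset(['A', 'B']): 5}
assert distances[frozenset(['B', 'A'])] == 5

# Non-mutating operations still work; mutating methods such as add() are absent
assert fs | {3} == {1, 2, 3}
assert not hasattr(fs, 'add')
```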

Set comprehensions

Set comprehensions run a loop and collect the result of an expression on each iteration. The result is a new set:

# set comprehension
>>> {x ** 2 for x in [5, 6, 7]}
{25, 36, 49}
>>>
# Simpler example
>>> {x for x in 'Dan'}
{'a', 'D', 'n'}

Note:
x ** 2 is the collection expression
for x in [5, 6, 7] is the loop

Note:
All comprehensions support nested loops and if tests.
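A short sketch of both features in set comprehensions; the names and input values are chosen only for illustration:

```python
# if test: keep only the squares of even numbers
even_squares = {n ** 2 for n in range(10) if n % 2 == 0}

# Nested loops: the sum of every pair (a, b).
# There are 9 pairs but only 5 distinct sums, because duplicates collapse.
sums = {a + b for a in [1, 2, 3] for b in [1, 2, 3]}

print(sorted(even_squares))
print(sorted(sums))
```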

Practical applications of sets

Sets can be used to remove duplicates from a list. Note that the original order of the items is not preserved:

>>> list1 = [1, 'a', 2, 'b', 3, 'c', 'a', 'c', 2]
>>> list1
[1, 'a', 2, 'b', 3, 'c', 'a', 'c', 2]
>>> set(list1)
{1, 2, 3, 'b', 'a', 'c'}
>>> list1 = list(set(list1))
>>> list1
[1, 2, 3, 'b', 'a', 'c']
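If the order of first appearance matters, the set round trip is not enough. Two common alternatives are sketched here, assuming Python 3.7 or later, where dictionaries keep insertion order; unique_in_order is a hypothetical helper name.

```python
list1 = [1, 'a', 2, 'b', 3, 'c', 'a', 'c', 2]

# Dictionary keys are unique and keep insertion order (Python 3.7+)
deduped = list(dict.fromkeys(list1))


def unique_in_order(items):
    """Return items without duplicates, keeping first occurrences in order."""
    seen = set()          # the set gives fast membership tests
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(deduped)
print(unique_in_order(list1))
```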

Sets can be used to isolate differences in iterable objects:

# Lists – find differences
>>> set([1, 2, 3, 4]) - set([3, 4, 5, 6])
{1, 2}
>>> 
# Strings – find differences
>>> set('abcdefg') - set('abdghi')
{'c', 'f', 'e'}
>>> 
# Mixed – find differences
>>> set('Dan') - set(['C', 'a', 'n'])
{'D'}

You can also use sets to perform order-neutral equality tests: two sets are equal if and only if every element of each set is contained in the other. This is useful for comparing the outputs of programs that should produce the same results but may generate them in a different order.

>>> list1 = [1, 2, 3, 4]
>>> list2 = [2, 4, 3, 1]
>>> list1 == list2
False
>>> set(list1) == set(list2)
True
>>> sorted(list1) == sorted(list2)
True
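One caveat is that the set comparison also ignores how often each item occurs, while the sorted comparison does not. The sketch below uses made-up lists to show the difference, plus collections.Counter from the standard library as an order-neutral test that still counts duplicates.

```python
from collections import Counter

list1 = [1, 2, 2, 3]
list2 = [3, 2, 1, 1]

# Same distinct items, so the sets are equal
sets_equal = set(list1) == set(list2)

# Different duplicates, so the sorted lists differ
sorted_equal = sorted(list1) == sorted(list2)

# Counter compares items together with their counts, regardless of order
counts_equal = Counter(list1) == Counter(list2)

print(sets_equal, sorted_equal, counts_equal)
```

Which test fits depends on whether repeated results are meaningful in the outputs being compared.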

Sets can also be used to process database query results:

>>> animals = {'lion', 'cat', 'zebra', 'dog', 'snake', 'shark'}
>>> reptiles = {'lizard', 'snake', 'crocodile'}
>>> pets = {'cat', 'dog', 'snake', 'lizard'}
>>> seaanimals = {'shark', 'crocodile', 'whale'}
>>>
# one item in a category
>>> 'shark' in pets
False
>>> 'shark' in seaanimals
True
>>>
# All in either category
>>> animals | reptiles
{'snake', 'zebra', 'lion', 'dog', 'shark', 'cat', 'lizard', 'crocodile'}
>>> seaanimals | pets
{'snake', 'whale', 'lizard', 'dog', 'shark', 'cat', 'crocodile'}
>>>
# Who is in both categories
>>> seaanimals & pets
set()
>>> pets & reptiles
{'snake', 'lizard'}
>>> animals & seaanimals
{'shark'}
>>>
# Items that are in the first category but not in the second
>>> reptiles - pets
{'crocodile'}
>>>
# Are shark and whale in seaanimals?
>>> {'shark', 'whale'} < seaanimals
True
>>>
# Are shark and whale in animals?
>>> {'shark', 'whale'} < animals
False
>>>
# Are all items in pets also in animals?
>>> pets <= animals
False
>>>
# Items that are in exactly one of the two categories
>>> animals ^ reptiles
{'lizard', 'cat', 'crocodile', 'zebra', 'lion', 'dog', 'shark'}
>>>
# Combining operations
>>> a1 = animals | pets
>>> a1
{'snake', 'zebra', 'lion', 'dog', 'shark', 'cat', 'lizard'}
>>> a2 = reptiles ^ seaanimals
>>> a2
{'snake', 'whale', 'lizard', 'shark'}
>>> a1 - a2
{'zebra', 'lion', 'dog', 'cat'}
# shorter way
>>> (animals | pets) - (reptiles ^ seaanimals)
{'zebra', 'lion', 'dog', 'cat'}
