Set Members in Python

Can a Python set contain another set? 🤔 It depends on what type of set we are using.

Table of Contents

Set Types


Set types are unordered collections of unique objects.

Sets are most often used for quick membership testing, removing duplicates from a sequence, and performing mathematical operations such as intersection, union, difference, and symmetric difference.


Set and Frozenset

There are two types of sets in Python:

- Set (mutable set). A set is mutable and unhashable.

# type constructor
s = set([iterable])

# comma-separated list of elements within braces
s = {item1, item2, ...}

# set comprehension
s = {i for i in iterable if condition}

- Frozen set (immutable set). A frozenset is immutable and hashable.

# type constructor
fs = frozenset([iterable])

To preserve the uniqueness of elements in a set, we need something to distinguish one object from another.

Set members must be hashable objects. Set data structures use hash values internally.


If we have two objects to add to a set, for example, s = {x, y}:

If hash(x) == hash(y) and x == y, then only the first object will be in the set.
If hash(x) == hash(y) and x != y, then both objects will be in the set.
If hash(x) != hash(y) and x == y, then both objects will be in the set.
If hash(x) != hash(y) and x != y, then both objects will be in the set.

Set of Sets

Because a mutable set is unhashable, other sets and frozensets cannot contain it.

s = set()
print(s.__hash__)
# None
print(hash(s))
# TypeError: unhashable type: 'set'

We can succeed with a frozenset.

# set(iterable), frozenset(iterable)

fs1 = frozenset("abc")
print(fs1)
# frozenset({'a', 'b', 'c'})

print(fs1.__hash__)
# <method-wrapper '__hash__' of frozenset object at 0x000001F62C1BF060>

print(hash(fs1))
# 3556539344838112942

fs2 = frozenset([fs1])
print(fs2)
# frozenset({frozenset({'a', 'b', 'c'})})

s = {fs2}
print(s)
# {frozenset({frozenset({'a', 'b', 'c'})})}

s = set([fs2])
print(s)
# {frozenset({frozenset({'a', 'b', 'c'})})}

Can a Python set contain another set? Yes and No.

A mutable set can contain a frozenset(s). A frozenset can contain other frozenset(s).
A mutable set cannot contain another mutable set. A frozenset cannot contain a mutable set.

Set Members


Immutable Objects

Most immutable built-in objects are hashable. Strings, numbers, bytes, range, True, False, None can be set members.

s = {bytes(5), True, False, None, 12, "string", 1.5, range(5)}
print(s)
# {False, True, 1.5, 12, b'\x00\x00\x00\x00\x00', 
# range(0, 5), None, 'string'}

Some objects may have the same hash and be compared as equal but belong to different classes. In this case, the set will take the first two objects with different hashes.

print(hash(1.0))
# 1
print(hash(1))
# 1
print(hash(True))
# 1
print(1.0 == 1, 1.0 == True, 1 == True)
# True True True

s = {1.0, 1, True, 0.0, 0, False}
print(s)
# {0.0, 1.0}

A tuple can also be in a set, but only if it does not contain mutable (unhashable) objects. Tuples have non-transitive immutability.

t = (1, 2, 3)
s = {t}
print(s)
# {(1, 2, 3)}

t = (1, {"a": "A"})
s = {t}
# TypeError: unhashable type: 'dict'

Mutable Objects

If we try to create a set with some mutable object, such as a list or a dictionary, we will get an error.

s = {1, "a", []}
# TypeError: unhashable type: 'list'

d = {"a": "A", "b": "B"}
s = {d}
# TypeError: unhashable type: 'dict'

s = {bytearray()}
# TypeError: unhashable type: 'bytearray'

A set can accept mutable objects via a set()/frozenset() type constructor if all elements of the mutable object are hashable.

# set(iterable), frozenset(iterable)

list_ = [1, 2, 3]
s = set(list_)
print(s)
# {1, 2, 3}

d = {"a": "A", "b": "B"}
s = set(d.values())
print(s)
# {'A', 'B'}

s = set(bytearray(b"abc"))
print(s)
# {97, 98, 99}

Set can accept dict keys as iterable because dict requires the key to be a hashable object.

d = {"a": "A", "b": "B"}
s = set(d.keys())
print(s)
# {'a', 'b'}

# try use list as a dict key
d = {"a": "A", []: "B"}
# TypeError: unhashable type: 'list'

Slices can also be members of a set. But it depends on what object we are slicing. Subscript notation uses slice objects internally.

# slices created using subscript notation, []

seq = "Python"
print(type(seq[1:2]))
# <class 'str'>
s = {seq[1:2]}
print(s)
# {'y'}

seq = [1, 2, 3]
print(type(seq[1:2]))
# <class 'list'>
s = {seq[1:2]}
# TypeError: unhashable type: 'list'

# slice object

so = slice(1, 2)
print(type(so))
# <class 'slice'>
s = {so}
# TypeError: unhashable type: 'slice'

Since Python 3.12, slice() objects are hashable.


Other Objects

All Python objects are instances of some class. Classes in Python are hashable by default. However, this can be changed by overriding the __hash__ method.

Built-in functions and classes are hashable (except for mutable containers such as lists or dicts). Modules, generators, iterators, file objects, user-defined functions, user-defined classes, etc. are hashable.

import random

def foo():
    pass

class A:
    def method(self):
        pass
     
a = A()
g = (x for x in "abc")
i = iter([1, 2, 3, 4, 5])
f = open("test.py")

s = {sorted, Exception("Error"), random, A, a, a.method, g, i, f}

Examples of Hashable Classes


User-Defined Class

Instances of the same class usually have different hash values. If the __hash__ method (inherited from the object superclass) is not overridden, the hash value is calculated from the instance id. If the __eq__ method (inherited from the object superclass) is not overridden, a class instance is compared as equal only to itself.

class A:
    pass

a = A()
a1 = A()

a and a1 have different hash values. a and a1 are not equal.

print(hash(a))
# 87781481645

print(hash(a1))
# 87781510657

print(a == a1, a is a1)
# False False

The set will contain both objects.

s = {a, a1}
print(s)
# {<__main__.A object at 0x0000014702F8E010>,
# <__main__.A object at 0x0000014702F1CAD0>}

pathlib.Path Class

Usually, an instance of a class has no certain value as such.

The pathlib.Path class is designed to work with a specific path to a file or directory. It accepts the path as a string. It makes sense to use this string value to calculate a hash for an instance of the class and to compare instances.

The __hash__ and __eq__ methods are overridden in the superclass of the Path class.

from pathlib import Path

# use the same directory path
p = Path('.')
p1 = Path('.')

p and p1 have the same hash value. These different objects are compared as equals.

print(p == p1, p is p1)
# True False

print(hash(p))
# 5740354900026072187

print(hash(p1))
# 5740354900026072187

In this case, the set contains only the first one object.

s = {p, p1}
print(s)
# {WindowsPath('.')}

print(p in s)
# True
print(p1 in s)
# True

for i in s:
    print(i is p)
    # True
    print(i is p1)
    # False

So, sets can contain almost any Python object, as long as it is hashable (If the class method __hash__ is not None).


References:

  1. Docs: Set Types — set, frozenset
  2. Docs: Data model, Set types
  3. Docs: hashable
  4. Docs: object.__hash__
  5. Docs: hash(object)
  6. GitHub: Path __hash__ and __eq__ methods