Powersets

Table of Contents

1 What

In Set Theory, the power set or poset of a set S is the collection of all subsets of S. Wait. What? OK, maybe an example will help. Suppose we want to find all of the running groups—of any size—that can be formed from members of a group of people who like to run, David, Emily, Duncan, and Claire. Let's call this group our set.

\[ S = \{\text{David}, \text{Emily}, \text{Duncan}, \text{Claire} \} \]

We can enumerate the combinations, if we permit a person to run alone.

  1. \(\{\text{David}\}\)
  2. \(\{\text{Emily}\}\)
  3. \(\{\text{Duncan}\}\)
  4. \(\{\text{Claire}\}\)
  5. \(\{\text{David}, \text{Emily}\}\)
  6. \(\{\text{David}, \text{Duncan}\}\)
  7. \(\{\text{David}, \text{Claire}\}\)
  8. \(\{\text{Emily}, \text{Duncan}\}\)
  9. \(\{\text{Emily}, \text{Claire}\}\)
  10. \(\{\text{Duncan}, \text{Claire}\}\)
  11. \(\{\text{David}, \text{Emily}, \text{Duncan}\}\)
  12. \(\{\text{David}, \text{Emily}, \text{Claire}\}\)
  13. \(\{\text{David}, \text{Duncan}, \text{Claire}\}\)
  14. \(\{\text{Emily}, \text{Duncan}, \text{Claire}\}\)
  15. \(\{\text{David}, \text{Emily}, \text{Duncan}, \text{Claire}\}\)

Have we missed any combinations? It turns out we have! We should remember the possibility that no one wants to run (always a real possibility), a fact we'll record by adding to our list the empty set \(\{\}\), giving us a total of 16 combinations in our power set, which we'll denote with \(\mathscr{P}(S)\).

Notice that \(16 = 2^{4}\) and 4 is the number of people we're considering. It turns out this is a general fact of power sets; that is, if \(S\) is finite, then the power set of \(S\), \(\mathscr{P}(S)\) contains

\[ 2^{n(S)} \]

sets, where \(n(S)\) is just the number of elements (i.e., "people") in the set \(S\). Why? This is the case because for any given subset of \(A\) an element either is or is not a member of that subset. When considering the relationship between an element of \(A\) and a subset of \(A\) we can say that it takes one of two states, "present" or "not present," and for \(n\) elements those \(n\) outcomes (is present or is not present) can be arranged in \(2^{16}\) permutations.

2 Why

The fact that the size, or cardinality, of a power set obeys this simple relationship invites a simple algorithm for generating a power set of a given set: sequentially enumerate the binary numbers with n bits and map each number to a subset by taking the i-th element when the bit is a 1.

3 How

We'll use Python, since it's simple, easy to use, and has decent data structures and operations for collections, sequences, lists, sets, strings, etc. Sometimes it pays to solve a problem from the top down, sometimes from the bottom up, and sometimes both. With this problem, we'll start from the bottom up.

First, how do we even get a binary number in Python? Actually, do we really want a binary number? Not necessarily. It would be useful to treat a binary number not as a number but as an array of individual bits, and if we even had a binary number we probably would want to convert it to a bit array anyway since we can guess that would make it easier to filter the set \(A\) into a given subset. So, how do we get a bit array in Python? The most concise way that I've found so far is the following.

Format a given decimal number into a binary string of length \(n\).

format(7, '04b')
'0111'

Here we just picked some number (7) between 0 and 15 (i.e, from 0 to 24 - 1) and formatted it with 4 bits.

Next, split the binary string into a list.

list(format(7, '04b'))
['0', '1', '1', '1']

Notice that the result isn't exactly a bit array, but rather is an array of strings taking on either '0' or '1'. There's no need to stand on principle! This will be good enough for our filtering purposes.

But, how do we use this as a filter? Again, the most concise way that I've found so far is the following. Pair each element of the "bit array" with the corresponding element (element in the same position) of \(A\) (after all, we arranged the bit array to have the precisely same size as our set \(A\)) using Python's zip function, then using the resulting sequence of tuples as a filter within a set comprehension.

{x[1] for x in zip(list(format(7, '04b')), {'David', 'Emily', 'Duncan', 'Claire'}) if x[0]=='1'}
set(['David', 'Duncan', 'Emily'])

Note that in lieu of a set comprehension we could use the Python filter function. I find set comprehensions to be cleaner and more readable in this situation, but it's really a matter of taste.

This works, so let's promote this into a bona fide function that maps a given decimal number n to a subset taken from a set S.

def poset_entry(n, s):
    return frozenset({x[1] for x in zip(list(format(n, '0%sb' % len(s))), s) if x[0]=='1'})

Notice that what we actually return is a frozenset. A Python set is mutable and therefore unhashable, so that it cannot be added to a set. And yet, the result of poset_entry are precisely the subsets we wish to add to our power set. Consequently, the frozenset function is useful for converting our subsets into the immutable variants we need.

That gives us one subset of the power set, for a given n, but how do we get all of the subsets? Remember that at the outset we said we would enumerate all of the binary numbers with n bits. Let's just select all of the decimal numbers in the interval [0, 2n) where n is the size or length of \(S\).

{i for i in range(2**len({'David', 'Emily', 'Duncan', 'Claire'}))}
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])

Python set comprehensions are so handy! In fact, why stop just with generating a set of integers when we can use our new poset_entry function to go straight to a list of subsets?

{poset_entry(i, {'David', 'Emily', 'Duncan', 'Claire'}) for i in range(2**len({'David', 'Emily', 'Duncan', 'Claire'}))}
set([frozenset(['Duncan']), frozenset(['Claire', 'Emily', 'Duncan']), frozenset(['Claire', 'Duncan', 'Emily', 'David']), frozenset(['David', 'Duncan']), frozenset(['Claire', 'David']), frozenset(['David']), frozenset(['Duncan', 'Emily', 'David']), frozenset([]), frozenset(['Emily', 'David']), frozenset(['Claire', 'Duncan']), frozenset(['Claire', 'David', 'Duncan']), frozenset(['Emily', 'Duncan']), frozenset(['Claire', 'Emily']), frozenset(['Emily']), frozenset(['Claire', 'Emily', 'David']), frozenset(['Claire'])])

Great! Let's promote that to a function as well, and make a few improvements while we're at it.

def poset(s):
    return {poset_entry(i, s) for i in xrange(2**len(s))}

Besides the variable substitution s, notice that we've replaced range with xrange, which has a better memory footprint as it generates a sequence of integers rather than a whole list. Notice also that we use the set comprehension syntax to generate our final power set.

Now, we're done! We can use our poset function to generate the power set for any modestly sized set.

print poset({1, 2, 3})
set([frozenset([3]), frozenset([1, 2]), frozenset([]), frozenset([2, 3]), frozenset([1]), frozenset([1, 3]), frozenset([1, 2, 3]), frozenset([2])])

That's not very readable, but if we sacrifice a little rigor by converting the sets back into lists it looks better.

[list(p) for p in poset({1, 2, 3})]
[[3], [1, 2], [], [2, 3], [1], [1, 3], [1, 2, 3], [2]]

This was fun, and I maintain that we have produced a good solution. It's simple, concise, easy to understand, and has decent performance characteristics. However, there's more than one way to do it! They're also interesting, and illuminate more of the set characteristics of the problem. We'll explore those in a future exciting entry. Stay tuned!