It suffices to import only the necessary commands from module dionysus
,
but for the purposes of the tutorial, we import everything:
from dionysus import *
Read in a points file (for points_file()
the points should appear one per
line with coordinates separated with spaces; of course, one can read in the
points however one wants):
points = [p for p in points_file('points_filename')]
print "Number of points:", len(points)
If the points are in or , one can construct an alphashape filtration:
simplices = Filtration()
fill_alpha2D_complex(points, simplices) # for 2D, or
fill_alpha3D_complex(points, simplices) # for 3D
Functions fill_alpha*_complex fill the simplices
with all the simplices
of the Delaunay triangulation.
Each one has its attribute data
set to a pair: the
smallest value of the squared distance function on the dual Voronoi cell and
whether the simplex is critical or not (i.e. whether its dual cell does or
does not contain a critical point of the distance function).
See Alpha shapes for more details, and Alpha shape example for a
full example.
As a result, if one wanted only those simplices whose alpha shape value did not exceed 10, one could obtain them as follows:
simplices10 = [s for s in simplices if s.data[0] <= 10]
If the point set lies in higher dimensions, one may construct a Rips complex on
it. This complex requires only pairwise distances, which makes it very
versatile. One must first construct an instance of a class providing such
distances (e.g. PairwiseDistances
for explicit points, see
Distances for more details), and then pass it to the Rips
complex class:
distances = PairwiseDistances(points)
rips = Rips(distances)
Usually, because of space restrictions, generation of a Rips complex has to be restricted to a -skeleton of the complex and some maximal parameter . In the following example and :
simplices = Filtration()
rips.generate(3, 50, simplices.append)
Rips.generate()
takes a skeleton and a maximum distance cutoffs, and a
function which is called with each of the simplices in the complex (in this
case, Python list simplices’ append
method).
There are two ways of computing persistence in Dionysus. The first offline
way, required by StaticPersistence
(and its derivatives), is to set up
the entire filtration at once, compute its persistence in one operation, and
then examine the pairings. The second way is to feed simplices one by one in an
online manner and manually keep track of the pairs that are created.
ZigzagPersistence
and CohomologyPersistence
accept their
input this way,
For the first approach, i.e. to use StaticPersistence
, one must put the
sort the filtration with respect to some ordering
(for example, data_dim_cmp()
for alpha shapes or Rips.cmp()
for the
Rips complex):
simplices.sort(data_dim_cmp) # for the alpha shapes
simplices.sort(rips.cmp) # for the rips complex
Creating an instance of StaticPersistence
initialized with the
filtration really initializes a boundary matrix. The subsequent call to
pair_simplices()
reduces the matrix to compute the
persistence of the filtration:
p = StaticPersistence(simplices)
p.pair_simplices()
Once the simplices are paired, one may examine the pairings by iterating over
the instance of StaticPersistence
. We can use an auxilliary map smap
to remap the persistence indices into the actual simplices:
smap = p.make_simplex_map(simplices)
for i in p:
if i.sign():
birth = smap[i]
if i.unpaired():
print birth.dimension(), birth.data, "inf"
continue
death = smap[i.pair()]
print birth.dimension(), birth.data, death.data
The iterator i
over the instance p
of StaticPersistence
is of type
SPNode
, and represents individual simplices taken in the filtration
order. It knows about its own sign()
and pair()
as well as
whether it is . StaticPersistence.make_simplex_map()
creates
a map that we use to get the actual simplices: smap[i]
and smap[i.pair()]
.
The previous code snippet prints out the persistence diagrams of the given
filtration.
Class ZigzagPersistence
accepts additions and removals of the simplices
one by one, and returns an internal representation of the simplices.
(CohomologyPersistence
works in the same way, but accepts only
additions of the simplices.)
When one
adds a simplex via ZigzagPersistence.add()
, one must provide its boundary
in this internal representation together with a birth value which
ZigzagPersistence
will store in case a class is born as a result of the
addition. add()
returns a pair (i,d)
. i
is the
internal representation of the newly added simplex, which one must record for
future use in the boundaries of its cofaces (below it is recorded in the
dictionary complex
). d
is None in case of the birth, otherwise, it is the
previously recorded birth value of the class that dies as a result of the
addition. The following code adds all the simplices
to a zigzag:
simplices.sort(data_dim_cmp)
complex = {}
zz = ZigzagPersistence()
for s in simplices:
i,d = zz.add([complex[sb] for sb in s.boundary], (s.dimension(), s.data))
complex[s] = i
if d is not None: # we have a death
dimension, birth = d # we previously stored the (dimension, data) pair
print dimension, birth, s.data
Similarly, one can remove simplices by calling ZigzagPersistence.remove()
with the internal index previously returned by add()
:
for s in reversed(simplices):
d = zz.remove(complex[s], (s.dimension() - 1, s.data))
del complex[s]
if d is not None:
dimension, birth = d
print dimension, birth, s.data
Naturally, remove()
returns only d.
If one wants to add or remove an entire set of simplices “at once” (i.e. with
all births being assigned the same value), there are two helper functions:
add_simplices()
and remove_simplices()
.
See the bindings reference for more details, and
Triangle zigzag example for an example of ZigzagPersistence
.
See Python cohomology computation for an example of
CohomologyPersistence
.
Currently, when the choice comes between efficiency and flexibility, the Python
bindings err on the side of flexibility. There is hope that in the future the
choice won’t really be necessary. Meanwhile, one can use a few techniques that
speed up computation at the expense of memory. Note, however, that since the
recent switch of PairwiseDistances
to C++ rather than pure Python, it
is not clear whether these deliver a substantial speed-up:
To avoid (possibly expensive) computation of distances during Rips complex
generation, store ExplicitDistances
(see Distances):
distances = PairwiseDistances(points)
distances = ExplicitDistances(distances)
To avoid the computation of simplex sizes in the Rips complex during the
initialization of a Filtration
, store them explicitly in
Simplex.data
attribute (this is not done by default to save memory);
then use data_dim_cmp()
when sorting the
Filtration
:
rips = Rips(distances)
simplices = Filtration()
rips.generate(..., simplices.append)
for s in simplices: s.data = rips.eval(s)
simplices.sort(data_dim_cmp)