# Profiling Python scripts (4): vecops.indexate

This is the fourth in a series of articles that covers analyzing and improving
performance bottlenecks in Python scripts.
The performance of `stl2ps`

is the topic of this article.

Note

The other articles in this series will be listed under “Related Articles” at the bottom of the page.

## Why `vecops.indexate`

In the course of writing the third article, I looked into the
implementation of `vecops.indexate`

.
This function takes a list of vertices (3-tuples) and returns a tuple of
indices and a tuple of unique vertices.

In STL files the same vertex is often found in three or even more facets.
For generating a POV-ray “mesh2”, we require a list of unique vertices.
The facets are then built out of indices into that list.
That is what `vecops.indexate`

was designed for.

## How it works

To show how `vecops.indexate`

works, it will be executed line by line in IPython.

First we create a list of points (from an existing STL file) and an empty dictionary.

```
In [1]: points = [(139.3592529296875, 97.01824951171875, 69.91365814208984),
...: (138.14346313476562, 97.50768280029297, 70.0313949584961),
...: (138.79571533203125, 97.875244140625, 69.03594970703125),
...: (138.14346313476562, 97.50768280029297, 70.0313949584961),
...: (139.3592529296875, 97.01824951171875, 69.91365814208984),
...: (138.76876831054688, 96.6264419555664, 70.94448852539062),
...: (138.79571533203125, 97.875244140625, 69.03594970703125),
...: (138.14346313476562, 97.50768280029297, 70.0313949584961),
...: (137.59768676757812, 98.35258483886719, 69.14176177978516),
...: (139.3592529296875, 97.01824951171875, 69.91365814208984)];
In [2]: pd = {}
Out[2]: {}
```

Next comes the clever bit.
This generator expression employs the setdefault method of dictionaries to
both generate the indices *and* fill the `pd`

dictionary.
In this case, the vertex (a 3-tuple of `float`

) is used as the *key* and the
current size of the dictionary is the value.
Since dictionary keys are unique this removes duplicate points.

```
In [3]: indices = tuple(pd.setdefault(tuple(p), len(pd)) for p in points)
Out[3]: (0, 1, 2, 1, 0, 3, 2, 1, 4, 0)
In [4]: pd
Out[4]:
{(139.3592529296875, 97.01824951171875, 69.91365814208984): 0,
(138.14346313476562, 97.50768280029297, 70.0313949584961): 1,
(138.79571533203125, 97.875244140625, 69.03594970703125): 2,
(138.76876831054688, 96.6264419555664, 70.94448852539062): 3,
(137.59768676757812, 98.35258483886719, 69.14176177978516): 4}
```

Note that this output was generated using Python 3.9. From Python 3.7 onwards, dictionaries are guaranteed to maintain insertion order, according to the documentation:

- > Changed in version 3.7: Dictionary order is guaranteed to be insertion
- order. This behavior was implementation detail of CPython from 3.6.

That is what made the following lines necessary in the past.

```
In [5]: pt = sorted([(v, k) for k, v in pd.items()], key=lambda x: x[0])
Out[5]:
[(0, (139.3592529296875, 97.01824951171875, 69.91365814208984)),
(1, (138.14346313476562, 97.50768280029297, 70.0313949584961)),
(2, (138.79571533203125, 97.875244140625, 69.03594970703125)),
(3, (138.76876831054688, 96.6264419555664, 70.94448852539062)),
(4, (137.59768676757812, 98.35258483886719, 69.14176177978516))]
In [6]: unique = tuple(i[1] for i in pt)
Out[6]:
((139.3592529296875, 97.01824951171875, 69.91365814208984),
(138.14346313476562, 97.50768280029297, 70.0313949584961),
(138.79571533203125, 97.875244140625, 69.03594970703125),
(138.76876831054688, 96.6264419555664, 70.94448852539062),
(137.59768676757812, 98.35258483886719, 69.14176177978516))
```

The function returns a 2-tuple of `indices`

and `unique`

.

## Improvements

Since `stltools`

*requires* Python 3.6 or later anyway, we can make use of
the fact that insertion order is maintained in dictionaries.
This means that `unique`

can be trivially generated from `pd.keys()`

, and
building `pt`

is no longer needed.

Let’s see what that brings.

```
In [1]: cd src/progs/stltools/
/home/rsmith/src/progs/stltools
In [2]: from stltools import stl
In [3]: vertices, _ = stl.readstl("../testdata-stltools/headrest_mesh.stl", "ascii")
In [4]: len(vertices)
Out[4]: 832944
In [5]: def old_indexate(points):
...: pd = {}
...: indices = tuple(pd.setdefault(tuple(p), len(pd)) for p in points)
...: pt = sorted([(v, k) for k, v in pd.items()], key=lambda x: x[0])
...: unique = tuple(i[1] for i in pt)
...: return indices, unique
...:
In [6]: %timeit old_indexate(vertices)
301 ms ± 729 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [7]: def new_indexate(points):
...: pd = {}
...: indices = tuple(pd.setdefault(p, len(pd)) for p in points)
...: return indices, tuple(pd.keys())
...:
In [8]: %timeit new_indexate(vertices)
227 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [9]: indices, unique = old_indexate(vertices)
In [10]: len(indices), len(unique)
Out[10]: (832944, 139526)
In [11]: 1 - len(unique)/len(indices)
Out[11]: 0.8324905395800918
```

So for in the order of 800000 vertices, the required time is reduced by around 25%! This in itself is worthwhile.

But more important is that 832944 vertices are reduced to 139526 unique vertices, a reduction of more than 80%, in a fraction of a second. This explains the significant difference in the third article.

## Lessons learned

- A REPL like IPython can be invaluable in understanding how code works.
- It is important to know and take advantage of the features of the data structures your language offers you.
- The greatest speed-ups can often be achieved by changes to the algorithm.

For comments, please send me an e-mail.

## Related articles

- Profiling Python scripts (3): stl2ps
- Profiling Python scripts (2): stlinfo
- Profiling Python scripts (1): stl2pov
- Profiling with pyinstrument
- Profiling Python scripts (5): ent_without_numpy