## Object tracking in 75 lines of code

Tracking objects in video is a thoroughly studied problem in computer vision that has important applications in industries like sports, retail and security. There are several possible approaches to this problem, but a popular one that’s both simple to implement and effective in practice is called tracking-by-detection.

The tracking-by-detection paradigm relies heavily on high-quality object detectors. This means it can leverage the advances in deep learning that have dramatically improved the performance of these models.

In this post, we’ll walk through an implementation of a simplified tracking-by-detection algorithm that uses an off-the-shelf detector available for PyTorch. If you want to play with the code, check out the algorithm or the visualization on GitHub.

## Cross-entropy loss in PyTorch

There are three cases where you might want to use a cross-entropy loss function:

1. You have a single-label binary target
2. You have a single-label categorical target
3. You have a multi-label categorical target

You can use binary cross-entropy for single-label binary targets and multi-label categorical targets (because it treats multi-label 0/1 indicator variables the same as single-label one-hot vectors). You can use categorical cross-entropy for single-label categorical targets.

But a couple of things make it a little tricky to figure out which PyTorch loss you should reach for in each of these cases.
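As a rough sketch, here is one way the three cases could map onto PyTorch losses. It assumes the model outputs raw logits (no sigmoid or softmax applied), which is why it uses `nn.BCEWithLogitsLoss` and `nn.CrossEntropyLoss`. The tensors and their shapes are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Case 1: single-label binary target.
# One logit per example; the target is a float tensor of 0s and 1s.
binary_logits = torch.randn(4)
binary_target = torch.tensor([0.0, 1.0, 1.0, 0.0])
binary_loss = nn.BCEWithLogitsLoss()(binary_logits, binary_target)

# Case 2: single-label categorical target.
# One logit per class; the target holds integer class indices.
categorical_logits = torch.randn(4, 3)
categorical_target = torch.tensor([0, 2, 1, 2])
categorical_loss = nn.CrossEntropyLoss()(categorical_logits, categorical_target)

# Case 3: multi-label categorical target.
# One logit per class; the target is a float 0/1 indicator per class.
multi_logits = torch.randn(4, 3)
multi_target = torch.tensor(
    [[1.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
)
multi_loss = nn.BCEWithLogitsLoss()(multi_logits, multi_target)

print(binary_loss.item(), categorical_loss.item(), multi_loss.item())
```

Note the different target formats: the binary cross-entropy loss expects float targets with the same shape as the logits, while `nn.CrossEntropyLoss` is given integer class indices here.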

## Pairwise distance in NumPy

Let’s say you want to compute the pairwise distance between two sets of points, `a` and `b`. The technique works for points with any number of dimensions, but for simplicity make them 2D. Set `a` has `m` points, giving it a shape of `(m, 2)`, and `b` has `n` points, giving it a shape of `(n, 2)`. How do you generate an `(m, n)` distance matrix with pairwise distances?
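One possible answer, sketched below, uses NumPy broadcasting. It assumes Euclidean distance. The function name `pairwise_distance` and the sample points are invented for illustration.

```python
import numpy as np


def pairwise_distance(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Euclidean distance from every point in a (m, d) to every point in b (n, d)."""
    # a[:, None, :] has shape (m, 1, d) and b[None, :, :] has shape (1, n, d),
    # so the subtraction broadcasts to an (m, n, d) array of differences.
    diff = a[:, None, :] - b[None, :, :]
    # Summing the squared differences over the last axis leaves shape (m, n).
    return np.sqrt((diff ** 2).sum(axis=-1))


a = np.array([[0.0, 0.0], [3.0, 4.0]])              # m = 2
b = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])  # n = 3
dist = pairwise_distance(a, b)
print(dist.shape)  # (2, 3)
```

The intermediate array has `m * n * d` elements, so this trades memory for simplicity. For large point sets, a chunked loop or a library routine may be a better fit.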

## Everything is a pipeline

Imagine this workflow: