Public Key Cryptography: Merkle’s Puzzles

Up until the invention of public key cryptography in the 1970s, symmetric ciphers were the only encryption scheme available. A symmetric cipher refers to a cryptographic model where the communicating parties share a secret key to ensure the privacy of communications. Depending on the key, a symmetric encryption algorithm performs various permutations and substitutions to transform a given message into ciphertext. To recover the original message the decryption algorithm would perform the same operations in reverse, using the secret key. However, the key had to be sent through a secure channel, e.g. a trusted courier, registered mail, or best of all – agreed verbally in advance. There did not seem to be any other way for secure information exchange.

In 1974 Ralph Merkle, a computer science student at UC Berkeley attended a course in computer security. As part of the course he submitted a proposal addressing the issue of the distribution of a secret key [1]. The proposal was rejected several times, which led Merkle to drop the course and continue working on his idea independently [2]. Assuming that only an insecure channel is available for transmitting information, Merkle suggested that communications be based on a set of ‘puzzles’, solution to which would contain a unique puzzle ID and a unique puzzle key. If user B wishes to set up a connection with user A, they require A to generate a large number, say N=1,000,000, of puzzles and transmit them to B. After picking at random and solving one of the puzzles, B transmits only the puzzle ID to A, and keeps the puzzle key private, which can then be used for subsequent message encryption with a symmetric cipher. Since the puzzle ID is sent through an insecure channel, it is available to an adversary, however, only  A can match the puzzle ID to the puzzle key. The only way for the adversary to obtain the key is to solve every puzzle until a matching ID is found. This would on average require to solve \frac{1}{2} N puzzles.

Merkle (1)Figure: First A transmits a large number of puzzles \text{X}_1, \text{X}_2, \ldots, \text{X}_N to B. Next, B randomly selects and solves one puzzle \text{X}_j in order to obtain an (\text{ID}_j,\text{key}_j) pair. Finally, B sends only the puzzle ID=ID_j back to A.

A puzzle could be a symmetric encryption of the puzzle key and ID, solvable by brute-force, i.e. by trying all possible keys. If one puzzle takes 2 minutes to solve, with 1,000,000 puzzles it would take an adversary an average of around a year to obtain the secret key. Merkle’s proposal was a new paradigm of which the puzzles were just an example. Indeed, soon after the proposal, cryptographic systems based on hard mathematical problems such as discrete logarithm [3] and integer factorization [4] were developed. The model is now known as public key cryptography, and it revolutionised the field by eliminating the need for a secure channel to transmit the key.


[1] Merkle, R. C. Secure communications over insecure channels. Communications of the ACM 21.4 (1978): 294-299.

[2] Diffie, W. The first ten years of public-key cryptography. Proceedings of the IEEE 76.5 (1988): 560-577.

[3] Diffie, W. and Hellman, M. E. New directions in cryptography. Information Theory, IEEE Transactions on 22.6 (1976): 644-654.

[4] Rivest, R. L., Shamir, A. and Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM 21.2 (1978): 120-126.

Author: Mante Zemaityte,


On the Leja method…

…or how we replaced the heuristics of a code by a rigorous backward error analysis.
We all know the procedure of developing a numerical method/algorithm:

  1. get it running,
  2. perform experiments,
  3. optimize the algorithm/code.

The emphasis is always on step 1. Unfortunately, step 3 is often not pursued because <<the PhD student left>, <the focus shifted>, <funding ended>,… >. In 2009 there existed a code for computing the action of the matrix exponential based on Leja method, see [3], [4], [5]. However, the code was based on some heuristics and had room for optimization.

The main idea of the Leja method is to compute \exp(A)v without explicitly forming \exp(A) for possibly large (sparse) matrices. The action of the matrix exponential has various applications, e.g. evolution equations (exponential integrators), network analysis, and many other areas. For the computation a sequence of Leja points in an interval is used (see Figure 1 for an example), as this allows for an iterative selection of the degree of interpolation to reduce computation cost.

Figure 1: 15 Leja points in the interval [-2,2].

In order to allow matrices with a large norm one takes advantage of the functional equation of the exponential to introduce scaling steps. In the end the interpolation essentially depends on three parameters, the number of scaling steps, the location/sequence of the interpolation points, and the degree of interpolation. The remaining question is: How do you select these parameters for a specific input?

The code from 2009 had the following heuristics included

% three magic numbers...
maxm = 150; minm = 30;
m = max (2.6 * ceil (gamma), minm);
nsteps = ceil (m / maxm);

These magic numbers from above determined all of the necessary parameters to perform the interpolation. The gamma in the formula describes an ellipse that encloses the field of values of the input and nsteps corresponds to the number of scaling steps. The first attempt to get rid of the heuristics replaced the above selection procedure, see [1]. The selection of the parameters was then based on the interpolation behaviour of a 2-by-2 matrix, with entries reflecting a bound of the extrema of the field of values of A. Nevertheless, it still had some heuristics included.

In the preprint [2], the heuristics were finally eliminated. Yet again, the code was redesigned, based on a rigorous backward error analysis. The main idea is to interpret the result of the interpolation as the exact solution of a perturbed problem. The backward error analysis gives rise to a selection procedure of the essential parameters that can minimize the cost of the algorithm.

As a result, the code is simpler and has no longer any restrictions on the input matrix. Nevertheless, there is still room for optimization. Numerical experiments show that the method can have performance losses for nonnormal matrices. We are currently working to fix this glitch but that is for another blog entry.

The code is available at the Numerical Analysis Innsbruck project page.


[1] Caliari, M., Kandolf, P., Ostermann, A., Rainer, S., 2014. Comparison of methods for computing the action of the matrix exponential. BIT Numer. Math. 52 (1), 113-128.

[2] Caliari, M., Kandolf, P., Ostermann, A., Rainer, S., 2015. The Leja method revisited: backward error analysis for the matrix exponential. Preprint, arXiv:1506.08665.

[3] Caliari, M., Vianello, M., Bergamaschi, L., 2004. Interpolating discrete advection-diffusion propagators at Leja sequences. J. Comput. Appl. Math. 172 (1), 79-99.

[4] Huisinga, W., Pesce, L., Kosloff, R., Saalfrank, P., 1999. Faber and Newton polynomial integrators for open-system density matrix propagation. J. Chem. Phy., 110, 5538-5547.

[5] Reichel, L., 1990. Newton interpolation at Leja points. BIT Numer. Math. 30 (2), 332-346.

Author: Peter Kandolf,

Linear Algebra in Sports Analytics

Recently, due to the large amount of data available and the “Big data” buzz, there has been a surge of activity in applying maths to analyse sport. In addition to keeping tabs of the number of points scored and by whom, companies now collect player tracking data which can locate players to within a few centimetres multiple times per second! All of this new data opens many possibilities for interesting ways to analyse performance.

Sports analytics is growing fast in football.

Defensive Responsibility in Basketball

Measuring the defensive capability of players is a difficult problem. There are plenty of ways to assess offensive capability but, since defenders stop points being scored, it is harder to measure their effect on the game. With access to player tracking data from the NBA, Kirk Goldsberry has designed a model to see which defender is responsible for each attacker at any given time so that their defensive ability can be measured against the league average. Some more information on this work can be found at the following links.

The main use of linear algebra here is in fitting a linear mixed model to an extremely large dataset, which means solving a large least squares problem. This document describes how the lme4 package in R fits linear mixed models using a Cholesky decomposition of the normal equations which squares the condition number! Perhaps they should consider using the QR method instead.

Expected Goals in Football

In a low scoring game such as football (a couple of goals per game) it often makes sense to talk about how many goals a team “should” have scored based upon the quality of the shots they generated. Access to player tracking data has opened up a host of new ways in which we can calculate this quantity: One possibility is to extract features of play from the data and feed them into a machine learning framework. Such features might include the distance from the goal, the number of attackers running forwards, or the current score (losing teams might try harder).

Here are two articles on the subject.

The mathematics involved depends upon the method used but most of them involve solving a (possibly nonlinear) optimization problem which could be solved via the (nonlinear) conjugate gradient method or stochastic gradient descent. The key to solving all these problems in a reasonable amount of time is to use extremely efficient linear algebra.