Programming puzzles – Spy game – NPComplete

When you have worked in a specific area for a long time, many of the tasks begin to be repetitive. One way to keep sharp is to engage the brain with programming puzzles. A site that contains many puzzles is hackerrank.com. The name is a bit over the top for my taste, but they keep an extensive list of many different programming challenges.

Each challenge is solved in the programming language you choose. It makes a great way to both keep the skills up to date and practice using programming languages that you do not use that often. Especially the challenges that are rated medium and up are good for keeping your skills sharp.

I have tried to tackle one of the most difficult challenges; spy game revisited if you keep on reading there will be spoilers so if you want to solve the problem your self, please stop reading. I will implement the solution using C# .NET. You can find the code here.

Challenge description

The challenge consists of a simple board game where we must place spies with a few requirements.

Given an n x n board we must place n spies where:

  1. There can not be two or more spies in the same row.
  2. There can not be two or more spies in the same column.
  3. There can not be two or more spies on any diagonal line (i.e., lie in a line inclined 45° or 135° to the base of the grid).
  4. No three spies may lie in any straight line. This line need not be aligned 45° or 135° to the base of the grid.

It should be solved for all n < 1000

This problem is an NP-complete what this means is a whole topic in an on itself. Sufficient to say, the solution to this type of problems is usually not trivial, and the runtime is expected to be significant, most likely O(k^n) or similar.

In most cases finding the solution is the part that takes time, verifying if the answer is correct usually takes polynomial time.

The main challenge in this problem is the state space. If we try to enumerate every possible board combination we have (n*n)!/(n*n-n)! This gives for a 10×10 board that gives 62,815,650,955,529,500,000 different boards. Which will take a while to loop to locate a correct answer.

Some examples of boards to give you an idea of how they look.

An example of a valid 11-board from the challenge description:

* S * * * * * * * * * 
* * * S * * * * * * * 
* * * * * * S * * * * 
S * * * * * * * * * * 
* * * * * * * S * * * 
* * * * * * * * * * S 
* * * * S * * * * * * 
* * S * * * * * * * * 
* * * * * * * * S * * 
* * * * * S * * * * * 
* * * * * * * * * S *

An example of an invalid 7-board:

S * * * * * * 
* * S * * * * 
* * * * S * * 
* * * * * * S 
* S * * * * * 
* * * S * * * 
* * * * * S *

The four spies marked with bold are on the same line – causing the board to be invalid.
The other three spies marked with italic are also on the same line.

Computerize the problem

To solve the challenge we need to break the rules into individual algorithms.

Data structure

In the challenge description, the output is shown in the following way.

11
4 7 1 6 2 10 8 3 5 9 11

The first line shows that it is a configuration for a 11×11 board.
Each number gives the column number for that row. So the spy in the first row is placed in column 4. The spy in the second row is put in the 7’th column and so on.

* * * S * * * * * * * 
* * * * * * S * * * * 
S * * * * * * * * * * 
* * * * * S * * * * * 
* S * * * * * * * * * 
* * * * * * * * * S * 
* * * * * * * S * * * 
* * S * * * * * * * * 
* * * * S * * * * * * 
* * * * * * * * S * * 
* * * * * * * * * * S

The advantage of this data structure is that it implicit encodes rule 1, that no two spies can be in the same row.
The second rule can also be encoded implicitly by only taking into account the boards where all the numbers are distinct.

The two rules cut the state-space from (n*n)!/(n*n-n)! to n!
The 10×10 board from before now “only” has 3.628.800 combinations.

We can improve this further, but this is a good start.

Implementation – brute force

The first implementation uses a brute force approach to find valid board combinations. You can find this implementation in the AllBoardGenerator.cs and BruteForceValidator.cs files in the repository.

Pseudocode for this implementation

all_permutations of set {1,2...n}
foreach(board in all_permutations) {
    if(IsValid(board)){
       print(board);
       exit;
    }
}

IsValid(board) {
   if(IsInDiagonal(board)) return false;
   if(IsOnLine(board)) return false;

   return true;
}

IsInDiagonal(board) {
   calculate slope of all lines that connects any 2 points
   if(any of the slopes are 1 or -1) return true;

   return false;
}

IsOnLine(board) {
 calculate slope of all lines that connects any 2 points
 foreach(line in lines){
   foreach(point in board){
     if(point is on the line) return true;
   }
 }

 return false;
}

All permutations of the given n x n-board are looped, and when we get to a valid board it is printed, and the program exits.

On my laptop, this gives the following runtimes:

  • 11×11: 00:00:00.1550624 – with solution 4 7 1 6 2 10 8 3 5 9 11
  • 13×13: 00:00:14.7555454 – with solution 8 10 5 1 6 9 3 12 7 4 2 11 13
  • 14×14: 00:02:08.1880126 – with solution 7 4 11 9 6 13 2 12 1 8 5 3 10 14
  • 15×15: 00:20:24.6828372 – with solution 11 4 10 7 3 12 2 9 6 14 5 8 1 13 15

If we plot the runtime, it is clear that it follows an exponential growth.

If we make a rough estimate for the runtime for the 999 x 999 board, it would be around: 2.23×10^(−9)×e^(1.8×999) = 1.979895285×10⁷⁷² seconds. This runtime makes it unfeasible to calculate at all! We need to do better.

Optimizations

We need to optimize the approach beyond brute force. This is where the creativity in software engineering comes in.

Optimization 1 – prune boards

In the brute force approach, all boards are generated. For example, the following three 11-boards will all be generated

  • 4 5 1 6 2 10 8 3 7 9 11
  • 4 5 1 6 2 11 8 3 7 9 10
  • 4 5 1 6 2 10 8 3 7 11 9

But because the first two numbers (4, 5) are in a diagonal, none of the boards are valid. But we know that when we saw the first board, that any combination that starts with those numbers would not be valid. We can change the board generator to create permutations differently, validating on the fly.

Perm(currentPermutation, elementsLeft, length){
   foreach(elm in elementsLeft){
     currentPermutation[length] = elm
     
     if(IsValid(currentPermutation, length+1)){
        elementsLeft.Remove(elm)
        if(empty(elementsLeft){
           return true // permutation is complete and is valid
        }

        if(Perm(currentPermutation, elementsLeft, length+1){
           return true
        }
        // the element was not valid so it should be available for the next iteration
        elementsLeft.Add(elm)
     }
   }
   return false
}

The way this works is that we start with a list of all valid numbers on a board, for example for an 11-board this is {1,2,3,4,5,6,7,8,9,10,11}. Then it creates boards in the following order

  • 1 ( valid )
  • 1,2 ( invalid )
  • 1,3 ( valid )
  • 1,3,4 ( invalid )
  • 1,3,5 ( invalid )
  • 1,3,6 ( valid )
  • 1,3,6,7 ( invalid )
  • 1, 3, 6, 11, 7, 10, 15, 2, 4, 16, 9 ( valid )

Each new board is validated and if a board is invalid all the boards that start with this sequence are never generated. The implementation can be found in the file SmartBoardGenerator.cs

On my laptop, this gives the following runtimes:

  • 11×11: 00:00:00.0105133 – with solution 2 4 7 1 8 11 5 3 9 6 10
  • 17×17: 00:00:00.1137657 – with solution 1 3 6 11 7 10 15 2 4 16 9 17 5 13 8 12 14
  • 23×23: 00:00:03.1701188 – with solution 1 3 6 2 7 16 20 13 8 5 22 19 17 23 21 9 11 14 10 4 18 12 15
  • 25×25: 00:00:33.6615365 – with solution 1 3 6 2 7 13 22 19 25 16 24 21 17 11 8 12 5 23 18 4 10 14 9 15 20
  • 27×27: 00:05:29.9375740 – with solution 1 3 6 2 7 10 20 17 23 21 18 27 8 22 25 4 9 24 15 13 26 19 14 11 5 12 16

The previous runtimes are marked with blue, so the smart generator is a significant improvement. If we calculate expected runtime for a 999×999 board: 2.129333215×10⁴⁷⁵ seconds. Still way too much to be feasible.

Optimization 2 – validate boards smarter

If we split the runtime into the time used to generate boards and the time it takes to verify each board we get the following split. For the 25×25 board, of the 21 seconds, it took to find a valid board 20,5 seconds of that time is spend validating if boards are valid. It is not surprising, but it gives a hint that this could be an area the could improve overall runtime.

In the brute force validator when checking if two spies are on a diagonal or three spies are on a line all points are checked. First, we check if there are three spies in any line. It runs in O(n^3) because of the three nested for loops.

public bool IsOnLine(int[] board, int n)
{
    for (var x = 1; x <= n; x++)
    {
        var y = board[x - 1];
        for (var x2 = x+1; x2 <= n; x2++)
        {
            var y2 = board[x2-1];
            // solve equation for line between the two points
            double a = (double) (y - y2) / (x - x2);
            double b = y - a * x;

            for (var x3 = x2+1; x3 <= n; x3++)
            {
                var y3 = board[x3 - 1];
                // check if the third point are on the line 
                if (y3 == a * x3 + b)
                {
                    return true;
                }
            }
        }
    }
    return false;
}

Every combination of three points is tested. But because of the way that the boards are generated we know that the (n-1) board is valid and that only the final number needs to be tested. Because of this, we can change the algorithm to the following instead.

public bool IsOnLine(int[] board, int n)
{
    var x = n;
    var y = board[x - 1];
    
    // from first point to the second last point 
    for (var x2 = 1; x2 < n; x2++)
    {
        var y2 = board[x2 - 1];
        // solve equation for line between the two points
        double a = (double) (y - y2) / (x - x2);
        double b = y - a * x;

        // from the second point to the third last
        for (var x3 = x2+1; x3 < n-1; x3++)
        {
            var y3 = board[x3 - 1];
            // check if the third point are on the line 
            if (y3 == a * x3 + b)
            {
                return true;
            }
        }
    }

    return false;
}

Now only the last point is compared to the rest of the points, decreasing the function to run O(n^2)

  • 23×23: 00:00:00.6667071
  • 25×25: 00:00:05.7743496
  • 27×27: 00:00:54.5549858
  • 29×29: 00:00:40.4097154
  • 31×31: 00:00:58.8726697
  • 33×33: 00:03:22.1658578

Improvement in the runtime once again bringing the expected runtime for 999×999 “down” too: 4.018954825×10¹²⁵ seconds. Still not fast enough to be feasible to calculate.

Optimization 3 – caching computations

Right now the solution is heavy on computation but not very memory intensive. We might be able to use memory to save on the computation time. The ideas from dynamic programming might improve the runtime further.

The sub-algorithms IsOnLine and IsInDiagonal calculate many of same things again and again. When this solution is available I will update this article.

Conclusion

I will continue to improve on the algorithm until I have a solution that can find a solution to the 999×999 board. But as illustrated this is not a trivial problem to solve. Maybe the approach is wrong since the optimizations do not seem to improve the runtime enough to make it reasonable to find a solution.

But it does show that seeming, “simple” problems can provide significant challenge to solve. It also keeps the mind sharp.


Also published on Medium.