Yeah, sorry about that. Most of the codes in this program are not very pretty. The recursive version (which is not used) is the most simple one and I’ll explain it here.

Since we only want to count the number of solutions, we don’t need to store the positions of the queens, we only need to know what positions are “masked out” by the queens on the previous rows. Since each row has only one queen, a queen can have three different directions to mask out lower rows: directly downward, left downward, and right downward.

For example, consider three rows:

```
. . . . Q . . .
. . . L D R . .
. . L . D . R .
```

L is left downward, D is directly downward, R is right downward.

In the program, we use bit fields to store these positions. “mask” stores directly downward mask, “l_mask” stores left downward mask, “r_mask” stores right downward mask.

To find an available position to place a queen for this row, we mix all three masks to create a “non-available position” mask, like this

```
unsigned int m = mask | l_mask | r_mask;
```

Then we iterate all zero positions in m (i.e. all available positions a queen can be placed on this row). To find a zero bit fast, we use:

```
unsigned int index = (m+1) & ~m;
```

This is a little trick. It works like this:

Suppose that m = 10010011, We want to find the first zero (from right), so we compute m+1= 10010100. Compare the two values:

10010011 (m)

10010100 (m+1)

The left bits are still the same, and the right bits are inverted to zero. Only the bit on the position we want is inverted to one. So we “AND” it with m inverted:

01101100 (~m)

10010100 (m+1)

00000100 (m+1) & ~m

we got the first available position for a queen. Then we can “AND” this position into m, to find the next position, etc.

Now we have a good position for this row, we need to generate masks for the next row. “mask” is simple, just “OR” it with the position we found. For “l_mask”, we need to “OR” it with the position, and then shift to left by 1 to create the new l_mask for next row. “r_mask” is similar, but shift to right by 1. This corresponds to the code:

```
total += solve_nqueen_internal(mask | index, (l_mask | index) << 1, (r_mask | index) >> 1, t_mask);
```

When “mask” is completely filled (we use a “t_mask” to check that), we got a solution.

By the way, the declaration of the recursive version is wrong, it should be

```
long long solve_nqueen_internal(unsigned int mask, unsigned int l_mask, unsigned int r_mask, unsigned int t_mask)
```

The original first argument “int n” is redundant and should be removed.

Although it’s not in the original recursive version, the functions that actually run have an additional trick: because a solution for n-queen can be flipped horizontally to create another solution (an exception is when n is odd and the queen on the first row is in the middle), so we only need to enumerate half of the solutions, and multiply by 2 to save time.