## Tuesday, February 19, 2013

### The Collatz Conjecture

If your number is even, cut it in half.
Write down your new number, and repeat.

Do you always end up at 1, 4, 2, 1, ... ?

Collatz's Conjecture was that you do. No one has found a counter-example in the century or so since he proposed his conjecture. No one has a proof that it will always happen either.

Kids like playing around with this. So I like it. But I hadn't played around much with it until now.

Discrete Mathematics With Ducks mentions this problem / puzzle / conjecture in Chapter 5, on algorithms. So I made a spreadsheet, and am looking at the lengths of the series for each number up to 100. I noticed two things:
1. Quite a few consecutive pairs of numbers have the same length series. (I'm counting how long until the number 1 shows up.) It seems to always be an even number and then the number after it.
2. I haven't counted them all yet (I'm doing it by hand), but there seems to be a big gap. Most sequences have lengths under 25, and a few have lengths over 100. So far, I haven't found any sequence lengths in between 25 and 100.
I made the spreadsheet by putting the numbers 1 through 100 in the first column. Every other cell gets this (with the cell name changed automatically):    =IF(INT(B1/2)=B1/2,B1/2,3*B1+1)

What do you notice about the Collatz series? What do you wonder?

1. The Collatz Conjecture is the reason I'm not a research mathematician. I realized that I would not be a productive researcher when there are interesting and challenging problems like this out there. I knew I would not be able to solve the problems I needed to write papers and get tenure.

I think the key to understanding the Collatz Conjecture is to look at the powers of 2. If you hit 2^n, then the series will eventually reach 1 in n steps. If you can find the odd numbers that are one step away from a power of 2, and then the odd numbers that are two steps away from the previous set, and so on, you can hopefully show that is all of the whole numbers. The trick is in that last step.

So, basically, start at 1 and work backwards.

2. I was thinking about 2^n. I'm going to purposely not think much about what you've said here - I want to approach this fresh.

I'm not a research mathematician because I like to savor my math. I like being a slow learner. ;^)

3. I've looked at this problem a few times. It's one of my favorites.

Here's what I've noticed:

If you start at 1 as Christopher suggested and draw the connections as a tree with 1 at the root, you get a nice branching tree with either 1 or 2 children of each node. For example, 10 has 3 and 20 as children since 3 -> 10 and 20 -> 10. On it's own, this tree isn't very useful (at least to me).

The next step I took is to recognize that the N/2 rule effectively strips zero bits from the binary representation of N.

So if you merge together all the nodes of the tree that have the same binary "prefix", you get something interesting.

Our root is the node *1* which contains all the powers of 2.
The children of *1* are *5*, *21*, *341*, *1365*, ...

What's really interesting is if you look at the binary representations:
*5* = 101(0...)
*21* = 10101(0...)
*341* = 1010101(0...)
*1365* = 101010101(0...)

4. You mentioned "Most sequences have lengths under 25, and a few have lengths over 100. So far, I haven't found any sequence lengths in between 25 and 100."

That's because you've started with small numbers. If you look far enough you will notice that there are more sequences of length k than of length (k-1), but you won't see that with numbers up to 100.

For example, these are all the sequences that have length 17: 1536, 7, 44, 45, 272, 277, 1664, 276, 46, 280, 282, 1696, 1704, 1706, 10240, 296, 1778, 10672, 50, 302, 1816, 1818, 10912, 1820, 10922.

5. So there are lots of sequences of length 26, for example? 