UPDATE 2020: Seems the original problem has now been solved, as of a few years ago! However, the growth rate of the f and g functions below is still an interesting question, as is the interleaving behavior I noticed...
UPDATED Oct 23 with some additional info. See the explanation of the *s for new summary.
UPDATED yet again Oct 25; if n is 16 mod 64, f(n)=4, and if n is divisible by 1024, f(n)≥5. Also if n is exactly divisible by 512 and has any prime factors that are 3 mod 4, f(n)≥5. It's all added to the summary.
ONE MORE THING Oct 29: If g(k) is the first appearance of color k, then for k≥3, g(k)≥2^(k-1)+4, which proves that f(n)=O(log n).

I expect this is all already known (though I don't see any of these sequences -- whether you subtract off 1 from our convention or not -- in OEIS), but let's just write this all down anyway.

So the other day I encountered on the internet the following simple open problem I hadn't heard of before: Is there a finite coloring of the positive integers such that for all (m,n)≠(2,2), m+n gets a different color from m*n? The expected answer, of course, is no. (You could also do it as whole numbers if you add in an exception for (0,0) as well. Of course then this just means that 0 gets its own color and the rest is as before.)

Anyway I thought this was neat and pointed this out to Josh with no intention of actually thinking about this at all because I mean really, what would I do with it? I don't know anything about this sort of stuff. (Also, I have actual work to do.) But then he suggested: Obviously proving that no finite coloring works would be too hard, but what if we just considered the greedy coloring (colors are the positive integers, each number gets the smallest color that doesn't violate the constraints coming from the previously colored numbers)? Could you at least show that it ends up using infinitely many colors?

Well, after a bit of thinking and a bit of computation, I'm pretty sure the answer is "There is no way I have the time or knowledge to tackle this right now". But before I really stop thinking about this and get back to stuff I actually need to be doing, let me write down here what I have figured out so far about this greedy coloring.

Notation: f(n) will mean the color of n. I'll also just say "n gets m" to mean f(n)=m. I'm going to start my colors at 1 rather than 0, because that's what Josh did and so that's what I've been doing and I don't feel like changing it. (It also means that if you use the starting at 0 variant, and start your colors at 0 too, then 0 gets 0, and the rest is the same as without either sort of zero. However, I'll be excluding 0 as it's just inconvenient.)
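
For concreteness, here's a minimal brute-force sketch in C computing f straight from the definition (this is not the exclusion-list program described in the comments below, just an illustration, and the cap of 64 colors is far more than ever needed at this size). The point to notice is that when we color k, the only already-colored numbers constraining it are k-1 (via the pair (1,k-1)) and a+b for each factorization k=a*b with a,b≥2 other than (2,2):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int N = 1024;                        /* compute f(1), ..., f(N) */
    int *f = malloc((N + 1) * sizeof *f);
    for (int k = 1; k <= N; k++) {
        int used[64] = {0};              /* colors blocked for k */
        if (k > 1)
            used[f[k - 1]] = 1;          /* pair (1, k-1): product k-1 already colored */
        for (int a = 2; a * a <= k; a++) /* factorizations k = a*b with 2 <= a <= b */
            if (k % a == 0 && !(a == 2 && k / a == 2)) /* skip the (2,2) exception */
                used[f[a + k / a]] = 1;  /* a+b < k, so f(a+b) is already known */
        int c = 1;
        while (used[c]) c++;             /* smallest color not blocked */
        f[k] = c;
    }
    for (int k = 1; k <= 32; k++)
        printf("f(%d) = %d\n", k, f[k]);
    free(f);
    return 0;
}

Its output for small k can be checked against the facts below (e.g. f(4)=2, f(8)=3, f(16)=4, f(20)=3).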

Here then are the facts!

Firstly, unless n is 0 or 4 mod 16, f(n) is easily determined. If n is odd, f(n)=1, and conversely, if n is even, f(n)≥2. If n is 2 mod 4, or 12 mod 16, f(n) is 2. If n is 8 mod 16, f(n) is 3; more generally, if n is divisible by 8, f(n)≥3. It's only if n is 0 or 4 mod 16 that f(n) varies (and no, these do not seem to be periodic wrt some higher modulus -- these ones actually get bigger, after all!). If n is 4 mod 16, then f(n)≥2, as noted above; if n is 0 mod 16, then f(n)≥3, as also noted above. Furthermore, if n is divisible by 128, then f(n)≥4. I'll refer to the congruence classes mod 16 as "columns". The 4s column depends only on previous values of the 4s column (it can actually only "see" the 4s column, the odd columns, which are always 1, and the 12s column, which is always 2), while the 0s column depends on the previous values of both nontrivial columns (it can see all the columns).

Update 23 Oct: Also, if n=16 (mod 32) and f(n)=3, then n=48 (mod 64). If n=4 (mod 16) and f(n)=2, then n=4 (mod 32) and n has no prime factors that are 3 mod 4.

Edit: So you can see what I'm talking about, here are the columns visually (top row in hex, * means it varies, starting at 1 instead of 0 for obvious reasons)

123456789ABCDEF0
121*12131212121*

That * beneath the 4 is at least a 2. It is at least a 3 if the number is 20 mod 32, or has any prime factors that are 3 mod 4.
That * beneath the 0 is at least a 3. It is at least a 4 if the number is divisible by 128, and is exactly 4 if the number is 16 mod 64. It is at least 5 if the number is divisible by 1024, or is exactly divisible by 512 and has any prime factors that are 3 mod 4.

...that's about as far as pure reasoning gets me. But I wrote a simple C program to compute these numbers and computed up to 2^25. (Or, I suppose, 2^25+3, seeing as I can easily fill those in. Of course the program can go higher, but 2^25 was relatively quick while 2^26 was turning out not to be, so I stopped there.) So using this, let's look at when values of f first appear!

1 first occurs at 1.
2 at 2.
3 at 8.
4 at 16.
5 at 64.
6 at 1024.
7 at 4080.
8 at 32000.
9 at 9625536.
10 does not occur within the range of my computation.

What's going on here? No idea! But we can state the conclusion: The values of f really do grow slowly. Josh said f(n)=o(n^ε), which I assume is easy but have not bothered to verify. He suggested perhaps f(n)=o(log n), though I have to wonder if maybe it's even slower still (well, you know what I mean). In any case, getting lower bounds? Yikes.

Now you'll notice all the numbers above (save for the initial 1, 2, 8) are 0 mod 16. What if we look at the first occurrences in the 4s column? That gets us:

1 does not occur
2 at 4
3 at 20
4 at 36
5 at 1188
6 at 22932
7 at 389844
8 at 13197780
9 does not occur within the range of my computation.

Or, these are even slower. Note, by the way, that if we explicitly restricted attention to the 0s column instead, we'd never see 1 or 2, and we'd see 4 before 3 -- a 4 at 16 and a 3 at 32. That sort of non-monotonicity can't happen in the 4s column due to its limited "vision". Of course it can't happen in the 0s column either past that point so long as the 0s column continues to get the actual firsts.

Regarding my note that these are slower, let's actually compare the two lists above by merging them. If we do, the list goes:
First 1
First 2
First 2 in 4s column
First 3
First 4
First 3 in 4s column
[First 3 in 0s column, if we care]
First 4 in 4s column
First 5
First 6
First 5 in 4s column
First 7
First 6 in 4s column
First 8
First 7 in 4s column
First 9
First 8 in 4s column

The first few are irregular but starting with the first 6 there's a definite pattern evident in the order they mesh together. Why? What does this mean? I have no idea! And now, having written this down, I have no intention of thinking about it any further.

-Harry

Date: 2011-10-21 05:29 am (UTC)
From: [identity profile] joshuazelinsky.livejournal.com
Huh. With much more limited computations I thought that my f(n)=o(log n) was a really bold statement. But with that data it is almost conservative. Certainly I'm now strongly believing that it is O(log^K n) for some fixed K.

By the way, proving that f(n)=O(n^eps) isn't at all tough. The observation in question is that the only way you can bump up the minimum for f(n) is via pairs of divisors. That is, if f(n)=K, then for every 1≤m<K one has some a and b with a*b=n and a+b colored as m. So f(n) is at most the number of positive divisors of n which are bounded by n^(1/2), and therefore bounded by the number of divisors d(n). Since d(n)=O(n^eps) we're done. Note also that this can't be obviously extended to proving O(log^K n), since d(n) does not satisfy this bound. It appears from your data that in practice you need a lot more divisors than this sort of method would naively expect. See the following table:
1 occurs at 1 with 1 divisor.
2 at 2 with 2.
3 at 8 with 4.
4 at 16 with 5.
5 at 64 with 7.
6 at 1024 with 11.
7 at 4080 with 40.
8 at 32000 with 36.
9 at 9625536 with 84.
A few things to note: First, the minimum for 7 actually has more divisors than the minimum for 8. Second, as one would expect, one sees a lot of small primes in the factorizations of the minimum numbers. But 17 shows up twice, for 7 and for 9. Is this a coincidence or is something deeper happening? Also, 9625536 has the surprisingly large prime factor 983. So even these jumps are very irregular.
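
(For anyone who wants to double-check those divisor counts, here's a quick trial-division sketch in C; the numbers are just the first-occurrence values quoted above:)

#include <stdio.h>

/* d(n) by trial division; these n are small enough that this is instant */
static long divisor_count(long n) {
    long count = 0;
    for (long a = 1; a * a <= n; a++) {
        if (n % a == 0) {
            count += 2;                 /* divisors a and n/a */
            if (a * a == n) count--;    /* don't double-count a square root divisor */
        }
    }
    return count;
}

int main(void) {
    long firsts[] = {1, 2, 8, 16, 64, 1024, 4080, 32000, 9625536};
    for (int i = 0; i < 9; i++)
        printf("d(%ld) = %ld\n", firsts[i], divisor_count(firsts[i]));
    return 0;
}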

Date: 2011-10-21 08:40 pm (UTC)
From: [identity profile] sniffnoy.livejournal.com
Additional note: Doing a quick analysis, the running time of the algorithm I used should be about linear (in the number to compute up to, of course, not in its logarithm). Call it O(n^(1+ε)). So I guess it's not surprising that I was able to compute up to 2^25 so quickly. (I wonder if the apparently huge slowdown when I went to 2^26 was some sort of overflow bug?)

Date: 2011-10-22 04:14 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
Sorry, screwed that up, I mean O(n log n). (Times maybe some tiny factors I'm not accounting for.)

Date: 2011-10-23 02:30 am (UTC)
From: [identity profile] joshuazelinsky.livejournal.com
Hmm, I'm not sure how you are getting that? For any given n don't you need to look at all the divisors of n? That should be a lot slower than just O(n log n).

Date: 2011-10-23 03:28 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
Take a look at the code, the program doesn't work that way. :) For anyone else reading this, I'll add an actual description of the algorithm later when I don't have a headache... it's not hard, though. Or maybe I'll just post the code on my website.

Date: 2011-10-23 10:01 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
OK, for everyone else, here's how it works: User calls the program with a number which I'll call n, the number to compute up to. Program allocates an array of linked lists of length n+1 (we want it to be 1-indexed for obvious reasons), i.e., to each number from 1 to n it associates a linked list of integers, the exclusion list, which throughout will be sorted in ascending order. (Each exclusion list initially contains a sentinel value of 0; this could be eliminated with a bit of rewriting, obviously, if you really cared about optimizing for memory, but if you actually wanted to do that, I think there would be much more effective ways, though probably at the cost of speed.)

The program then loops over the numbers from 1 to n; for each number k, it does the following:
1. Determine the smallest number not in the exclusion list of k -- this is f(k), so we print out the pair (k,f(k)).
2. Add f(k) to the exclusion list of k+1.
3. Loop over pairs i+j=k, starting with i=2 and incrementing i until either i>k/2, or i*j>n. (Observe that since i≤k/2, as i increases, so does i*j, so once i*j>n we can stop the loop immediately.) For each one, add f(k) to the exclusion list of i*j. (Earlier versions of the code actually had f(k) for 1≤k≤4 hard-coded in as special cases, and started the loop at k=5; however, with things done in this order, this isn't necessary -- it doesn't matter that we're adding f(4) to the exclusion list of 4, because we've already printed out f(4)! Also note that with the earlier version, when the loop started at k=5, 5 would have an exclusion list of [] rather than [2], but since f(5)=1 is the correct value, this is irrelevant.)
4. We're now no longer using the exclusion list of k, so deallocate it. (Note that if you want to free up your memory sooner and do this before step 3, you'd have to switch the code back to having 1≤k≤4 as a special case or else you'd get a segfault at k=4.)

...and that's it. So why is that O(n log n)? Times possibly some tiny factors I'm not accounting for? Well, this is pretty easy, but let me explain anyway. :) For k≤2√n, i will get all the way up to k/2, so that part is O((√n)²)=O(n). For k>2√n... well, when is i(k-i)=n? When i=(k-√(k²-4n))/2, of course. Which can be rewritten as 2n/(k+√(k²-4n)), i.e., somewhere between n/k and 2n/k. We're summing over 2√n<k≤n, but we may as well sum over all of 1≤k≤n (ignoring our previous estimate for 1≤k≤2√n, since it's lower), giving us O(n*H_n), where H_n is the n'th harmonic number, i.e., O(n log n).
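
Since the code itself isn't posted here, here's a minimal C sketch of the steps above -- a reconstruction, not the actual program (the sentinel-0 nodes are omitted, and the array gets one extra slot so that step 2 is safe at k=n):

#include <stdio.h>
#include <stdlib.h>

typedef struct node { int color; struct node *next; } node;

/* Insert color c into the sorted exclusion list *lst, skipping duplicates. */
static void insert(node **lst, int c) {
    while (*lst && (*lst)->color < c)
        lst = &(*lst)->next;
    if (*lst && (*lst)->color == c)
        return;                               /* already excluded */
    node *nw = malloc(sizeof *nw);
    nw->color = c;
    nw->next = *lst;
    *lst = nw;
}

/* Smallest positive integer not in the sorted list. */
static int smallest_missing(const node *lst) {
    int c = 1;
    while (lst && lst->color == c) { c++; lst = lst->next; }
    return c;
}

static void free_list(node *lst) {
    while (lst) { node *next = lst->next; free(lst); lst = next; }
}

int main(int argc, char **argv) {
    long n = (argc > 1) ? atol(argv[1]) : 1000;       /* compute f(1)..f(n) */
    node **excl = calloc(n + 2, sizeof *excl);        /* exclusion lists, 1-indexed; +2 so index k+1 is safe */

    for (long k = 1; k <= n; k++) {
        int c = smallest_missing(excl[k]);            /* step 1: c = f(k) */
        printf("%ld %d\n", k, c);
        insert(&excl[k + 1], c);                      /* step 2: f(k+1) must differ from f(k) */
        for (long i = 2; i <= k / 2; i++) {           /* step 3: pairs i + j = k */
            long prod = i * (k - i);
            if (prod > n) break;                      /* i*j only grows with i from here on */
            insert(&excl[prod], c);                   /* f(i*j) must differ from f(i+j)=f(k); harmless at k=4 */
        }
        free_list(excl[k]);                           /* step 4: done with k's list */
        excl[k] = NULL;
    }
    free(excl);
    return 0;
}

Running it with argument 33554432 would reproduce the computation up to 2^25.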

Date: 2011-10-30 07:28 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
In fact, make that Θ(n log n), since what I'm counting above is a little over half of the sum over k≤n of τ(k), which is asymptotic to n log n.
(deleted comment)

Date: 2011-11-01 01:11 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
...you sound like a spambot. Please do something to verify that you're not or I'll delete your comment.

Date: 2011-11-02 03:45 am (UTC)
From: [identity profile] joshuazelinsky.livejournal.com
Hmm, their livejournal is completely empty. And it does actually note the dark color scheme. Clearly these bots are improving. I've seen claims (but have not yet seen an example myself) that some of them can take semi-scrambled phrases from posts and use them to give more of an appearance of an actual response.

Date: 2011-11-02 04:49 am (UTC)
From: [identity profile] sniffnoy.livejournal.com
May be coincidence. I just deleted a spam comment saying how informative this post was about skincare products.

Date: 2011-11-02 04:51 am (UTC)
From: [identity profile] joshuazelinsky.livejournal.com
Hmm, could both bots be working off of the repeated use of the word "color" in the post?
