Back to (Parallel) Basics: Do you really want to do that? or Why doesn't the new Parallel.For support BigInteger?
 Got an interesting question on Twitter, and got a fabulous answer in email from Stephen Toub, who happens to be my most favorite multi-threaded person. Here's the question. "why doesn't the new parallel.for support BigInteger?"
Got an interesting question on Twitter, and got a fabulous answer in email from Stephen Toub, who happens to be my most favorite multi-threaded person. Here's the question. "why doesn't the new parallel.for support BigInteger?"
It's an interesting question for a few reasons. First, it's interesting because it's a question about the new parallel stuff in .NET 4 and Visual Studio 2010 that lets the developer take parallel problems that would be very hard and makes them extremely easy. For example, you can basically take a straight for loop and often make it run in parallel on multiple-cores. This is huge.
Second, speaking of huge things, BigInteger is huge. Extremely. How big? Unbounded. Like, arbitrarily large. I'd have preferred BigAssInteger, but that's offensive, so stop asking for it. ;)
Here's Stephen's emailed answer with my added commentary.
"I know of no real scenarios that require such a large, dense iteration space and that could stay running for long enough on a single box. Parallel.For does provide built-in support for Int64, which affords up to 2^64 iterations. If you were able to run such a loop where each iteration of the loop took a nanosecond to process (i.e. a billion iterations per second), it would take something like 500 years to complete on a single core. Even with a hundred cores in a single box this would take five years to complete."
He is extremely kind. Notice how he says "why the hell would you want to do that" without making me feel bad? The man's a teacher. Not only that, he offers an alternative that would run for a half-millennium without any judgment that my problem space might be wrong. He should be a college professor.
Second, if you really do need this support, it’s achievable with Parallel.ForEach and either iterators or a custom Partitioner<T>. For example, with iterators, you can write a handy method:
static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toExclusive)What? After telling me it's not desirable and I don't really want to do it, he actually DOES it. That's hot.
{
for(BigIntegeri=fromInclusive; i<toExclusive; i++) yield return i;
}
and then consume it in a method like:
public static void ParallelFor(BigInteger fromInclusive, BigInteger toExclusive, Action<BigInteger> body)
{
Parallel.ForEach(Range(fromInclusive,toExclusive), body);
}
which you can then use like:
ParallelFor(new BigInteger(from), new BigInteger(to), i=>
{
// ... loop body here ...
});
Finally, I’d love to understand if lack of BigInteger support is actually causing this customer issues, or if they’re just curious. And if it’s actually problematic, what’s the scenario?
Then, after all this he asks if there's a scenario for this that he might not have thought of. You can't know everything, so he wants to double check to make sure the library he worked on does everything the customer wants.
My man-crush on this dude couldn't be any bigger. Now if Stephen would just return my calls. The bromance continues. ;)
Hope you enjoyed this sample, and big thanks to @amarciniec for the great question.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
 
                         
                        
About Newsletter
Thanks for the interesting post.
:-)
When you say these numbers are "suspected" primes, that implies to me that they're known and not necessarily sequential, is that correct? If so, a solution based on Parallel.ForEach instead of Parallel.For is likely the approach you'd want to take anyway, since it allows you to work on an arbitrary IEnumerable<BigInteger> values (either an existing collection, or one defined by an iterator) instead of a specific sequential sequence of values as you'd use with a Parallel.For.
Or, is the idea that you'd be using the parallel loop to iterate over a "small" (i.e. <= Int64.MaxValue), sequential range of big BigInteger values, running some algorithm that helps you to root out the potential "suspects"? If that's the case, I do see how having an overload of For that accepts BigIntegers could be natural. For that case, you could still use the approach I outlined previously, where you write a C# iterator that yields all values in the range you care about. An alternate (and potentially better performing) approach would be to simply scale down the entire range into Int64 values, use a Parallel.For loop to iterate through the Int64 values, and add back the base prior to invoking the body, e.g.
public static void ParallelFor(
BigInteger fromInclusive, BigInteger toExclusive, Action<BigInteger> body)
{
var range = (long)(toExclusive - fromInclusive);
Parallel.For(0, range, i => body(fromInclusive + i));
}
I hope that helps. Thanks, again, please keep the questions/feedback coming. Also, make sure to check out the parallel programming blog at http://blogs.msdn.com/pfxteam as well as the Parallel Computing developer center on MSDN at http://msdn.com/concurrency.
@Adam is code not available to do this or is just something you want to do yourself? The largest prime found is millions of digits long so I would think there is a pretty solid distributed program somewhere to do this.
@Stephen - I sent an email comment to your blog on the topic, I'm not sure if it found its way to you. Essentially Bell Labs has an open research project where they list suspected, very large primes that still need proven by a determinalistic algorithm like AKS or ECPP (this can take days). What I'm trying to do is translate the AKS Conjecture Algorithm to use C#. That, as Pete said, involves running the algorithm against every number greater than 2 and less than the square root of the suspected prime. So a for loop is needed and it needs to support biginteger because even the square root of the primes we're testing are too big for anything less. I think your previous suggestion will work just fine though getting the BCL team to add support for Polynomial objects and functions would be nice. =oD
I was so impressed I went back and started reading past MSDN mag articles that he wrote.
Pete, thanks for the explanation, as well.
Cory, thanks for the very nice sentiment.
Jeff, I'm very glad to hear you find tasks to be ridiculously useful :) If you care to elaborate (either here or in an email to me personally), I'd love to hear more about your experiences and use cases.
Bas, yes, that was just a silly typo on my part (I look forward to the day when Outlook has a built-in C# snippet compiler ;) Scott fixed it.
Everyone, I'm always interested in learning about how developers are using parallelism, whether it's the new .NET 4 parallelism support, your own home-grown solutions, or anything else. Any information you care to send my way will be welcome and read deeply (if you don't hear back from me within a few days, please try sending your mail again... I always make it a point to respond, even if only briefly to express my thanks.)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Text;
using System.Threading.Tasks;
namespace Calculate_Primes
{
class Program
{
private const int _NUMBER_OF_DIGITS = 100;
static void Main(string[] args)
{
BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));
Console.WindowWidth = 150;
//var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);
Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");
Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
}
static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
{
for (BigInteger i = fromInclusive; i <= toInclusive; i++)
yield return i;
}
static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
{
Parallel.ForEach(Range(fromInclusive, toInclusive), body);
}
static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
{
using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }
using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
{
ParallelFor(floor, ceiling, i =>
{
if (_isPrime(i))
{
lock (sw)
{
sw.Write(i.ToString() + System.Environment.NewLine);
sw.Flush();
}
}
});
}
}
static bool _isPrime(BigInteger number)
{
bool returnValue = true;
Console.WriteLine("Checking {0} for primality.", number.ToString());
if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
returnValue = false;
else
for (BigInteger i = 2; i * i <= number; i++)
{
if (number % i == 0)
returnValue = false;
}
if(returnValue)
Console.WriteLine(" {0} IS prime.", number.ToString());
else
Console.WriteLine(" {0} IS NOT prime.", number.ToString());
return returnValue;
}
}
}
Comments are closed.

Also, do you plan on having Stephen Toub on hanselminutes some time soon again? That should be really interesting. The last time was in 2007 and a lot has evolved since then!
Cheers