Scott Hanselman

Back to (Parallel) Basics: Do you really want to do that? or Why doesn't the new Parallel.For support BigInteger?

April 21, '10 Comments [17] Posted in Back to Basics | Programming | VS2010
Sponsored By

why doesn't the new parallel.for support BigInteger?Got an interesting question on Twitter, and got a fabulous answer in email from Stephen Toub, who happens to be my most favorite multi-threaded person. Here's the question. "why doesn't the new parallel.for support BigInteger?"

It's an interesting question for a few reasons. First, it's interesting because it's a question about the new parallel stuff in .NET 4 and Visual Studio 2010 that lets the developer take parallel problems that would be very hard and makes them extremely easy. For example, you can basically take a straight for loop and often make it run in parallel on multiple-cores. This is huge.

Second, speaking of huge things, BigInteger is huge. Extremely. How big? Unbounded. Like, arbitrarily large. I'd have preferred BigAssInteger, but that's offensive, so stop asking for it. ;)

Here's Stephen's emailed answer with my added commentary.

"I know of no real scenarios that require such a large, dense iteration space and that could stay running for long enough on a single box.  Parallel.For does provide built-in support for Int64, which affords up to 2^64 iterations.  If you were able to run such a loop where each iteration of the loop took a nanosecond to process (i.e. a billion iterations per second), it would take something like 500 years to complete on a single core.  Even with a hundred cores in a single box this would take five years to complete."

He is extremely kind. Notice how he says "why the hell would you want to do that" without making me feel bad? The man's a teacher. Not only that, he offers an alternative that would run for a half-millennium without any judgment that my problem space might be wrong. He should be a college professor.

Second, if you really do need this support, it’s achievable with Parallel.ForEach and either iterators or a custom Partitioner<T>.  For example, with iterators, you can write a handy method:

static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toExclusive)
{

for(BigIntegeri=fromInclusive; i<toExclusive; i++) yield return i;
}
What? After telling me it's not desirable and I don't really want to do it, he actually DOES it. That's hot.

and then consume it in a method like:

public static void ParallelFor(BigInteger fromInclusive, BigInteger toExclusive, Action<BigInteger> body)
{
Parallel.ForEach(Range(fromInclusive,toExclusive), body);
}

which you can then use like:

ParallelFor(new BigInteger(from), new BigInteger(to), i=>
{
// ... loop body here ...
});

Finally, I’d love to understand if lack of BigInteger support is actually causing this customer issues, or if they’re just curious. And if it’s actually problematic, what’s the scenario?

Then, after all this he asks if there's a scenario for this that he might not have thought of. You can't know everything, so he wants to double check to make sure the library he worked on does everything the customer wants.

My man-crush on this dude couldn't be any bigger. Now if Stephen would just return my calls. The bromance continues. ;)

Hope you enjoyed this sample, and big thanks to @amarciniec for the great question.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by ORCS Web
Thursday, April 22, 2010 6:23:19 AM UTC
After reading your post, I went over and read some of the stuff he wrote on the pfx team blog. Quite interesting, and if I may add, parallels in .NET 4 in general is quite big. We'll have a lot of studying to do once more!

Also, do you plan on having Stephen Toub on hanselminutes some time soon again? That should be really interesting. The last time was in 2007 and a lot has evolved since then!

Cheers
Thursday, April 22, 2010 6:44:30 AM UTC
"Bromance" isn't a cromulent word!
Einar W. Høst
Thursday, April 22, 2010 8:11:04 AM UTC
Gay! (and that's completely OK of course :o)
Thursday, April 22, 2010 9:24:43 AM UTC
The MSDN article you link to BigInteger (http://msdn.microsoft.com/en-us/magazine/cc163696.aspx) has repition throughout. Very hard to read
Scott
Thursday, April 22, 2010 10:49:57 AM UTC
After reading this post I decided to look at the Parallel stuff in .Net 4. There is quite a lot there, a lot of potential to be explorered. My industry, law enforcement, requires a lot of processing. I may be able to take advantage of paralellism to help with some of the heavy load.

Thanks for the interesting post.
Thursday, April 22, 2010 12:28:44 PM UTC
Parallel's are pretty cool, but I've been delving lately in other classes in that same namespace, namely Tasks, they're ridiculously useful!
Thursday, April 22, 2010 12:35:41 PM UTC
I'm totally stoked that this got such a great answer. And to follow up, the reason to 'why the hell I'd want to do this' is that I'm trying to help out on a project that is proving suspected exteremely large prime numbers are prime determinalisticly. It is time consuming but the programs the crunch the numbers now don't take advantage of multi cores and every little bit helps when you're doing math on number with more than 6000 digits.
Thursday, April 22, 2010 3:12:31 PM UTC
Wow...I suspect that the heat coming off the cores of Adam's machine is what is really causing Global Warming.

:-)
Ryan
Thursday, April 22, 2010 5:05:14 PM UTC
Say, doesn't that "int" i have a high risk of causing an OverflowException?
Bas
Thursday, April 22, 2010 10:46:56 PM UTC
Adam, thanks for the additional information!

When you say these numbers are "suspected" primes, that implies to me that they're known and not necessarily sequential, is that correct? If so, a solution based on Parallel.ForEach instead of Parallel.For is likely the approach you'd want to take anyway, since it allows you to work on an arbitrary IEnumerable<BigInteger> values (either an existing collection, or one defined by an iterator) instead of a specific sequential sequence of values as you'd use with a Parallel.For.

Or, is the idea that you'd be using the parallel loop to iterate over a "small" (i.e. <= Int64.MaxValue), sequential range of big BigInteger values, running some algorithm that helps you to root out the potential "suspects"? If that's the case, I do see how having an overload of For that accepts BigIntegers could be natural. For that case, you could still use the approach I outlined previously, where you write a C# iterator that yields all values in the range you care about. An alternate (and potentially better performing) approach would be to simply scale down the entire range into Int64 values, use a Parallel.For loop to iterate through the Int64 values, and add back the base prior to invoking the body, e.g.

public static void ParallelFor(
BigInteger fromInclusive, BigInteger toExclusive, Action<BigInteger> body)
{
var range = (long)(toExclusive - fromInclusive);
Parallel.For(0, range, i => body(fromInclusive + i));
}

I hope that helps. Thanks, again, please keep the questions/feedback coming. Also, make sure to check out the parallel programming blog at http://blogs.msdn.com/pfxteam as well as the Parallel Computing developer center on MSDN at http://msdn.com/concurrency.
Thursday, April 22, 2010 11:28:02 PM UTC
@Stephen I think with suspected primes Adam is talking about are numbers statistically likely to be prime numbers so yes a known list of values to test. There is a pattern to primes which leads to a statistical likelyhood of a particular number being prime. However to check it you basically have to divide it by every number below the square root of the value (there are better ways now but still involving this kind of level of computation).

@Adam is code not available to do this or is just something you want to do yourself? The largest prime found is millions of digits long so I would think there is a pretty solid distributed program somewhere to do this.
Friday, April 23, 2010 12:58:32 PM UTC
@Pete Spot on, thank you and yes there is code to do it but again, it doesn't run in parallel and I would like to both tranlate it to C# for my own learning and two make C# more friendly to math to bring in more developers and strengthen the platform. My old college roommate was a math / physics guy and had to use c++ or mathmatica and I always felt bad that he had to use such ugly languages instead of .net.

@Stephen - I sent an email comment to your blog on the topic, I'm not sure if it found its way to you. Essentially Bell Labs has an open research project where they list suspected, very large primes that still need proven by a determinalistic algorithm like AKS or ECPP (this can take days). What I'm trying to do is translate the AKS Conjecture Algorithm to use C#. That, as Pete said, involves running the algorithm against every number greater than 2 and less than the square root of the suspected prime. So a for loop is needed and it needs to support biginteger because even the square root of the primes we're testing are too big for anything less. I think your previous suggestion will work just fine though getting the BCL team to add support for Polynomial objects and functions would be nice. =oD
Saturday, April 24, 2010 1:45:04 AM UTC
I had the pleasure of meeting Steven a few months back. He gave a 90 minute introduction to the new parallel capability in .Net 4. He blew the team away.

I was so impressed I went back and started reading past MSDN mag articles that he wrote.
Cory Cissell
Tuesday, April 27, 2010 7:58:37 AM UTC
Adam, unfortunately I didn't receive the email you sent through the blog. Please feel free to resend it directly to me at stoub at microsoft dot com. Regardless, thanks for the additional information; it definitely helps me to understand what you're facing.

Pete, thanks for the explanation, as well.

Cory, thanks for the very nice sentiment.

Jeff, I'm very glad to hear you find tasks to be ridiculously useful :) If you care to elaborate (either here or in an email to me personally), I'd love to hear more about your experiences and use cases.

Bas, yes, that was just a silly typo on my part (I look forward to the day when Outlook has a built-in C# snippet compiler ;) Scott fixed it.

Everyone, I'm always interested in learning about how developers are using parallelism, whether it's the new .NET 4 parallelism support, your own home-grown solutions, or anything else. Any information you care to send my way will be welcome and read deeply (if you don't hear back from me within a few days, please try sending your mail again... I always make it a point to respond, even if only briefly to express my thanks.)
Wednesday, April 28, 2010 9:07:53 AM UTC
@Adam A couple of months back I failed miserably to scale The Sieve of Atkins with Parallel.For, maybe you'll find something in there of interest http://alicebobandmallory.com/articles/2010/01/14/prime-factorization-in-parallel
Friday, October 22, 2010 4:23:49 PM UTC
Here's my usage scenario: I need an fast, efficient way to generate random prime numbers of at least 100 digits.
Tom
Friday, October 22, 2010 4:57:50 PM UTC
Using Senior Taub's knowledge, I have crafted the following--my apologies for not knowing how to get the code to format nicely:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Text;
using System.Threading.Tasks;
namespace Calculate_Primes
{
class Program
{
private const int _NUMBER_OF_DIGITS = 100;

static void Main(string[] args)
{
BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

Console.WindowWidth = 150;

//var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
}

static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
{
for (BigInteger i = fromInclusive; i <= toInclusive; i++)
yield return i;
}

static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
{
Parallel.ForEach(Range(fromInclusive, toInclusive), body);
}

static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
{
using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
{
ParallelFor(floor, ceiling, i =>
{
if (_isPrime(i))
{
lock (sw)
{
sw.Write(i.ToString() + System.Environment.NewLine);
sw.Flush();
}
}
});
}
}

static bool _isPrime(BigInteger number)
{
bool returnValue = true;

Console.WriteLine("Checking {0} for primality.", number.ToString());

if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
returnValue = false;
else
for (BigInteger i = 2; i * i <= number; i++)
{
if (number % i == 0)
returnValue = false;
}

if(returnValue)
Console.WriteLine(" {0} IS prime.", number.ToString());
else
Console.WriteLine(" {0} IS NOT prime.", number.ToString());

return returnValue;
}
}
}
Tom
Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.