Numerical Analysis, Computer Science

A class of computational problems that is extremely amenable to parallel processing. Many papers have been written on special-purpose parallel algorithms that reduce the wall-clock time of a given computation; such algorithms are invariably constrained by the inherent seriality of the problem. But certain kinds of problems parallelize so easily that one is almost "embarrassed" to talk about how simple it is to keep many processors busy efficiently. One example is genetic algorithms, in which the fitness of every candidate in a population can be evaluated independently, and many independent runs of the whole simulation can proceed simultaneously.
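As a sketch of the genetic-algorithm case: the fitness function and population below are made-up toys (counting 1-bits in a bitstring), but they show the structure -- each evaluation touches only its own candidate, so a plain parallel map suffices. Threads are used here just to keep the example self-contained; real CPU-bound work would use processes.

```python
from concurrent.futures import ThreadPoolExecutor

def fitness(genome):
    """Toy fitness: number of 1-bits. Each evaluation is independent."""
    return sum(genome)

population = [[0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 0, 1]]

# Each candidate's fitness depends only on that candidate, so the map
# parallelizes with no communication between workers at all.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(fitness, population))
# scores == [2, 4, 1]
```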

A problem (more accurately, a solution or an algorithm) is called embarrassingly parallel if it can be parallelized merely by breaking it up into many small sub-problems and solving each separately, i.e., with no communication between the nodes solving each part. Parallelization is such a no-brainer that one ought to be embarrassed to talk about it.
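The definition above fits in a few lines of code. The per-chunk work below (summing squares) is an arbitrary stand-in: the point is that each chunk is processed with no knowledge of the others, and only a trivial combine happens at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def work(chunk):
    # Stand-in for any per-chunk computation; no chunk needs
    # to talk to any other chunk while it runs.
    return sum(x * x for x in chunk)

data = list(range(100))
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]

with ThreadPoolExecutor() as pool:
    partials = list(pool.map(work, chunks))

total = sum(partials)  # the only "coordination": a trivial final combine
# total == 328350, the sum of squares 0..99
```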

The canonical real-world example here is ditch digging. To dig a 5-metre ditch, you hire a worker for 3 days (disclaimer: I am not a dirt-construction foreman; for all I know, 3 hours or 3 weeks may be involved). To dig a 100-metre ditch, you hire 20 workers for 3 days. You can still employ almost the same ditch-digging techniques. Note, by the way, that removing the dirt by truck might not parallelize so easily for such a small amount of dirt.

The canonical non-example is well digging. To dig a 5-metre well, you hire a worker for 3 weeks. But 20 workers cannot dig a 100-metre well in 3 weeks, nor can they dig a 5-metre well in 1 day!

Often, your algorithm or program is not inherently parallelizable. For instance, in computational biology or bioinformatics, parallelizing a single run of the Smith-Waterman algorithm across 1000 CPUs requires some thought. On the other hand, if you want to run 10000 Smith-Waterman comparisons (of roughly the same size), parallelizing that is embarrassing -- just run 10 on each CPU!
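To make the contrast concrete, here is a minimal Smith-Waterman scorer (linear gap penalty, toy scoring parameters) mapped over a batch of independent sequence pairs. Parallelizing the inner dynamic-programming loop is the hard problem; farming out whole comparisons, as below, is the embarrassing one. Threads stand in for the CPUs here purely to keep the sketch runnable.

```python
from concurrent.futures import ThreadPoolExecutor

def sw_score(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local-alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            s = match if ca == cb else mismatch
            cur.append(max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

# Each comparison is independent of every other, so this is a plain map.
pairs = [("ACGT", "ACGT"), ("ACGT", "TTTT"), ("GATTACA", "GCATGCA")]
with ThreadPoolExecutor() as pool:
    scores = list(pool.map(lambda p: sw_score(*p), pairs))
```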

Some distributed processing projects have relied on exactly such parallelizations. For instance, to break DES by brute force, you simply have to try each possible key. Unfortunately, the embarrassment of doing so has not yet become common knowledge, or perhaps the coordination of such a large-scale effort is itself praiseworthy. But breaking RSA (factoring a large integer) is emphatically not embarrassing. While the various factoring sieves are embarrassingly parallel (and have been distributed in previous projects), a final step -- solving a huge sparse linear system -- requires supercomputing power and is not embarrassing in the least...
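The brute-force case can be sketched by partitioning the keyspace among workers, each scanning its own slice independently. Everything here is a made-up toy -- a 16-bit key, a stand-in XOR "cipher", and threads in place of real machines -- but the structure is the same one a DES-cracking project uses: no worker ever needs to hear from another.

```python
from concurrent.futures import ThreadPoolExecutor

SECRET_KEY = 48619  # unknown to the attacker in a real setting

def toy_encrypt(key, plaintext):
    # Hypothetical stand-in cipher: XOR each byte with a key-derived byte.
    return bytes(
        b ^ (((key >> (8 * (i % 2))) & 0xFF) ^ i)
        for i, b in enumerate(plaintext)
    )

PLAIN = b"attack at dawn"
CIPHER = toy_encrypt(SECRET_KEY, PLAIN)

def search(key_range):
    # Each worker scans its own slice of the keyspace; no communication needed.
    for k in key_range:
        if toy_encrypt(k, PLAIN) == CIPHER:
            return k
    return None

KEYSPACE = 2 ** 16
ranges = [range(i, min(i + 16384, KEYSPACE)) for i in range(0, KEYSPACE, 16384)]
with ThreadPoolExecutor(max_workers=4) as pool:
    found = next(k for k in pool.map(search, ranges) if k is not None)
# found == SECRET_KEY
```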

In certain other cases the parallelization itself is embarrassing, but breaking up the problem into sub-problems requires much thought. In such cases, the proper way to brag is to explain at great length how one breaks up the problem into sub-problems. Then, one's disparaging remark "... which we solve easily, as it's embarrassingly parallel" is a great way to finish.
