The Little Screensaver That Could

IBM is building the world's fastest supercomputer to simulate one of the great mysteries in biology: how proteins assemble themselves. But a modest screensaver running on ordinary PCs has beaten them to it. By Andy Patrizio.

IBM is spending $100 million building the world's fastest supercomputer to do cutting-edge medical research, but a distributed computing effort running on ordinary PCs may have beaten Big Blue to the punch.

IBM hopes its proposed Blue Gene, a massively parallel supercomputer, will help diagnose and treat disease by simulating the ultra-complex process of protein folding.

The monster machine will be capable of more than 1 quadrillion operations per second and will be 1,000 times faster than Deep Blue, the computer that defeated world chess champion Garry Kasparov in 1997, IBM said.

But Folding@Home, a modest distributed computing project run by Dr. Vijay Pande and a group of graduate students at Stanford University, has already managed to simulate how proteins self-assemble, something that computers, until now, have not been able to do.

Proteins, which control all cellular function in the human body, fold into highly complex, three-dimensional shapes that determine their function. Any change in shape can alter how a protein behaves, turning a beneficial protein into one that causes disease.

Like SETI@Home, Folding@Home is a volunteer program that uses the spare computing cycles of ordinary home computers running a special screensaver. But instead of looking for signs of alien life in radio signals from outer space, Folding@Home simulates the staggeringly complex process of how proteins fold.

Folding@Home has about 15,000 volunteers. SETI@Home, the most popular distributed computing effort, has nearly 3 million.

Protein folding has never been simulated because of the computational complexity of the process. Proteins typically fold in 10,000 nanoseconds, but a single computer can simulate only 1 nanosecond of the folding process per day. At that rate, simulating one complete fold would take 10,000 days, or nearly 30 years.
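The arithmetic behind that estimate can be checked in a few lines. The sketch below uses only the two figures quoted above; the function names are made up for illustration, and the second function assumes the work divides perfectly across machines, which is a best-case simplification and not a description of how Folding@Home actually schedules its simulations.

```python
# Figures quoted in the article.
FOLD_TIME_NS = 10_000       # typical folding time, in nanoseconds
SIMULATED_NS_PER_DAY = 1    # what a single computer can simulate per day
DAYS_PER_YEAR = 365.25


def single_machine_years(fold_time_ns=FOLD_TIME_NS,
                         ns_per_day=SIMULATED_NS_PER_DAY):
    """Years one computer would need to simulate one complete fold."""
    days = fold_time_ns / ns_per_day
    return days / DAYS_PER_YEAR


def ideal_split_days(machines,
                     fold_time_ns=FOLD_TIME_NS,
                     ns_per_day=SIMULATED_NS_PER_DAY):
    """Best-case days if the work split evenly across `machines`.

    Assumption for illustration only: perfect division with no
    communication or coordination overhead.
    """
    return fold_time_ns / (ns_per_day * machines)


if __name__ == "__main__":
    print(f"One computer: about {single_machine_years():.1f} years")
    print(f"15,000 computers (idealized): "
          f"about {ideal_split_days(15_000):.2f} days")
```

A single machine works out to a little over 27 years, which the estimate above rounds up to 30. The idealized figure for 15,000 machines is only an upper bound on the possible speedup; real distributed projects lose some of that to coordination and to work that cannot be split.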

But thanks to the combined computing power of its participants, the Folding@Home project has already folded one protein, a beta hairpin, at least 15 separate times to make sure the results aren't a fluke.

Several other more complex proteins have also been put through the folding process, and the results are being prepared for peer review, Pande said.

Pande, an assistant professor of chemistry at Stanford, is about to publish the first results of the project in a forthcoming issue of the Journal of Molecular Biology.

This first fold isn't significant in and of itself, Pande said.

"Because it's small and simple, this isn't the poster child for curing diseases," he said. "What we've shown is proof of concept and being able to dig into the real stuff. The broader implications are being able to apply this experiment in the future."

In the long term, Folding@Home plans to tackle the folding of more important proteins -- and, more significantly, how they misfold.

"If we can understand the mechanism of misfolding, we can start to do structure design to inhibit misfolding," Pande said. "Developing a drug isn't something you do casually. The first stage is to identify what you are going to attack. A lot of these diseases start with misfolding, so we don't know what to attack. A computer model will give us an idea of what to attack."

IBM doesn't feel threatened by Folding@Home. In fact, the leader of the Blue Gene project thinks the two efforts will complement each other.

"The things the Folding@Home team is learning could turn out to be hugely beneficial to us," said Bill Pulleyblank, director of the Deep Computing Institute at IBM Research. "If they find some approximations that enable us to reduce the size of the problem, then we could solve it much faster than we could without those calculations."

However, Pulleyblank said that distributed computing projects such as Folding@Home can simulate the folding of only fairly simple proteins. Blue Gene will be able to simulate larger, more complex proteins.

Modeling complex proteins, where a fold depends on scores of interacting variables, will require a massively parallel machine, he said.

Blue Gene will use a massively parallel design with new, high-speed communications between processors. That design is required for the refined, highly detailed simulations that Blue Gene will run but Folding@Home cannot, Pulleyblank said.

"The kind of problems we're doing is far beyond what they could hope to do on the distributed computing model," he said. "With the stuff we are doing, we are not able to split the program up independently. We have to deal with a tremendous number of interactions between the processes of the program. Everybody affects everybody else, so you need a fast way to shuttle everything around."