Molecular Matchmaking for Drug Discovery


Still from a video clip produced by the Computational Visualization Center at The University of Texas at Austin showing biomolecular machines manufacturing proteins.

For millennia, people have discovered new drugs either through educated guesswork or blind luck. But with the proliferation of advanced computing, a new paradigm has emerged whereby one can find drug targets through simulation and modeling.

Chandrajit Bajaj, professor of computer science at The University of Texas at Austin, has been integrally involved in these efforts for more than 20 years. As director of the Computational Visualization Center at the university's Institute for Computational Engineering and Sciences (ICES), Bajaj has systematically attacked each step of the computational drug discovery process and recently made dramatic improvements to the algorithms involved in sleuthing out new candidate compounds to treat diseases such as HIV and dengue fever.

The process (a combination of modeling, simulation, analysis and visualization) is accomplished by Bajaj through the expert application of biophysical algorithms and the use of the high-performance, parallel-processing supercomputers at the Texas Advanced Computing Center (TACC).


The human ribosome (a biological machine for producing proteins) comprises the various ribosomal proteins and ribosomal RNAs (ribonucleic acids). This graphic, shown through a magnifying watch dial, illustrates the atomistic complexity of the molecular machinery (akin to the co-meshing of gears in a wrist watch) and captures a snapshot of the process. [All images were created by members of the Computational Visualization Center research group.]

"Computers are a good way to accelerate the process of drug design," said Bajaj. "It takes 10 years to proof out a drug and a billion dollars or more. Hence, computational drug discovery is not only timesaving, but economics tells you this is the way we should be going."

The work of discovering a breakthrough new drug begins with a careful analysis of the virus, bacterium or mutation that causes the illness. By shooting a beam of electrons through a sample, electron microscopes create nanoscale pictures of the relevant molecules in near-natural conditions.

Combine 100,000 of these cleaned-up images and you have a three-dimensional model that can tell you about the structure of the molecule you are exploring.

Structure is an important characteristic for drug discovery because it reflects the shape of the relevant molecules. Shape complementarity (the degree to which two molecules fit together) is a major factor in whether a drug compound will bind to the target molecule. When done correctly, the 3-D model allows one to understand, identify and test potential binding sites on a virus.
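To make the idea of shape complementarity concrete, here is a minimal Python sketch. It is a toy illustration, not the method used by Bajaj's group: molecules are reduced to sets of occupied cells on a cubic grid, and the function names, the contact-counting score and the clash penalty are all assumptions made for this example. A placement scores well when the small molecule touches many receptor cells without overlapping any of them.

```python
from itertools import product

# Face-sharing neighbour offsets on a cubic grid.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def shape_score(receptor, ligand, clash_penalty=10):
    """Toy score: face contacts between the two shapes minus a penalty per overlap.

    Both arguments are sets of (x, y, z) grid cells. The penalty value is an
    arbitrary assumption chosen so that overlapping placements score poorly.
    """
    clashes = len(receptor & ligand)
    contacts = 0
    for (x, y, z) in ligand - receptor:
        # Count every receptor cell that shares a face with this ligand cell.
        contacts += sum(
            (x + dx, y + dy, z + dz) in receptor for dx, dy, dz in NEIGHBOURS
        )
    return contacts - clash_penalty * clashes


def translate(cells, offset):
    """Shift every cell of a shape by the same (dx, dy, dz) offset."""
    ox, oy, oz = offset
    return {(x + ox, y + oy, z + oz) for x, y, z in cells}


def best_placement(receptor, ligand, search_range=range(-3, 4)):
    """Try every translation in a small cube and keep the highest-scoring one."""
    best = max(
        product(search_range, repeat=3),
        key=lambda off: shape_score(receptor, translate(ligand, off)),
    )
    return best, shape_score(receptor, translate(ligand, best))


# A 3 x 3 x 2 block with one cell carved out of its top layer acts as a "pocket".
receptor = {(x, y, z) for x in range(3) for y in range(3) for z in range(2)} - {(1, 1, 1)}
ligand = {(0, 0, 0)}  # a one-cell "drug" to place
offset, score = best_placement(receptor, ligand)
print(offset, score)
```

In this toy setup the one-cell ligand scores highest when it sits inside the pocket, where it touches five receptor cells at once. Real docking also searches over rotations and molecular flexibility and uses far richer scoring, which this sketch leaves out.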

A recent paper by Bajaj, Samrat Goswami (Exa Corporation), and Qin Zhang (University of Texas at Austin) in the February 2012 edition of the Journal of Structural Biology showed that their algorithms were able to detect the secondary structures (α-helices and β-sheets) of proteins reconstructed from single particle cryo-electron microscopy. As a result of the team's improvements to the image reconstruction and modeling algorithms, Bajaj and his collaborators can now identify individual side-chains, the floppy but crucial limbs that extend from the central spine of the molecule. This level of detail is required to accurately predict binding.

"If you don't get all the factors into simulation, you get the wrong answer and your predictions suffer," said Bajaj.

Once a target molecule's structure has been determined, it is then necessary to test potential drug compounds to see whether any of them might fit a binding site. In the case of HIV, as studied by Bajaj, the goal is to find a molecule that can bind to a specific location on the virus's surface and signal to the virus that it has reached its destination. The potential drug would cause the virus to spill its contents in the extra-cellular medium where it would do no harm, rather than injecting its genetic material into a host cell.

During the past decade, computer scientists have found faster ways to search for things using computers. Call it the "Googlization" of research. Bajaj has taken these insights and applied them to the problem of drug target screening. Instead of using search algorithms to find the closest coffee shop, Bajaj and his research team use them to create an ordered list of targets based on the binding energies and biochemical dynamics of two molecules when they come into contact, which indicates the most compatible compounds.
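The ranking step can be pictured with a short Python sketch. This is an illustration only, not the team's actual pipeline: the `Candidate` type, the compound names and the energy values are hypothetical placeholders, and a real screen would compute its scores from physical models rather than hard-code them. The sketch relies on the usual convention that a more negative binding energy means tighter predicted binding.

```python
import heapq
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str               # hypothetical compound label
    binding_energy: float   # kcal/mol; more negative = tighter predicted binding


def rank_candidates(candidates, top_k=3):
    """Return the top_k candidates ordered from strongest to weakest predicted binder."""
    return heapq.nsmallest(top_k, candidates, key=lambda c: c.binding_energy)


# Placeholder values invented for this example; they are not measured data.
library = [
    Candidate("compound_A", -6.2),
    Candidate("compound_B", -9.8),
    Candidate("compound_C", -3.1),
    Candidate("compound_D", -8.4),
]

for rank, candidate in enumerate(rank_candidates(library), start=1):
    print(rank, candidate.name, candidate.binding_energy)
```

Using `heapq.nsmallest` avoids sorting the whole library when only the best few candidates are wanted, which matters once the list grows very large.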

The sheer number of possible combinations and configurations boggles the mind, and the wrong combinations can lead to serious side effects. But as computer hardware and algorithms improve, such problems are becoming tractable.

Bajaj uses practically every system in TACC's computational ecosystem, including Ranger and Lonestar (TACC's high-performance computing systems), Longhorn (TACC's remote visualization system) and Stallion (the super-high-resolution tiled display in the TACC/ACES Visualization Laboratory).

"What used to take months is now taking a few days," Bajaj said. "We are blessed at UT Austin with having all of these high-performance computers and the TACC organization, because without them we couldn't make progress."