Texas Advanced Computing Center Hires Former Hubble Scientist as New Director of Data Intensive Computing

The Texas Advanced Computing Center (TACC) at The University of Texas at Austin today announced that Niall Gaffney has joined the center in the newly created position of director of data intensive computing. Gaffney most recently served as the principal computer scientist at the Space Telescope Science Institute (STScI) for the Hubble Space Telescope data archive and the Hubble Legacy Archive Project. A University of Texas at Austin graduate, he begins his new role May 1.

Gaffney has managed some of the richest astronomical data ever recorded in terms of scientific and public impact. In his new role at TACC, Gaffney will oversee the center's "Big Data" strategy, which includes storage and storage systems, data collections, analytics (data mining and statistics), and architectures for data-driven science and data-intensive computing.


TACC Director Jay Boisseau said, "As a technologist with an astronomy background, Niall has taken some of the deepest, most sensitive cataloged data from telescopes and blended it with other available astronomical data to build a library for astronomers and the general public alike to use. He has a deep understanding of how to create tools from large data collections that researchers from all fields need now and will need in the coming years to drive fundamental research. Niall's expertise in managing and analyzing large scientific data collections will help TACC address fundamental challenges concerning core techniques and technologies, problems and cyberinfrastructure across disciplines based on national directives.

"What we're seeing is an explosion in digital data that is leading to the huge growth in data-driven science," Boisseau said. Data-driven science emphasizes how data is used, shared, valued and preserved the researcher begins with a large corpus of data, analyzes this data to look for patterns, and deduces a hypothesis. In this mode of science, most of the data is input data; it might be quite expensive to reproduce if lost (genome data, satellite-sensed data), and is necessary for reproducing and sharing work.

"Astronomy and bioinformatics are examples of two areas that fit squarely in the data-driven science paradigm any field that has data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications,"
Gaffney said. "At TACC, we want to make it easier for researchers to bring large collections of data into a system where large-scale data research can be done faster and in a more productive data-rich environment. The future outlook is very exciting."

During the past five years, TACC has been building its internal team to support big data efforts, including forming the Computational Biology Group and the Data Management and Collections Group. This growth has enabled TACC to establish leadership roles in several data-driven projects, including the iPlant project, a large-scale National Science Foundation (NSF) project in plant bioinformatics; the UT Research Cyberinfrastructure, which is advancing current and future research efforts across all UT institutions; and work with the National Archives and Records Administration, whose official records and digital collections of the U.S. government reached 12 petabytes in 2010 and which turned to TACC to develop a strategy for computationally assisted archiving.

In 2012, the O'Donnell Foundation committed $10 million in funding to TACC to augment the center's infrastructure, with a focus on the needs of data-driven science. This will result in new investments in large-scale shared file systems, clusters optimized for storing and analyzing large data sets using Hadoop and other analytics tools, and systems to host Web portals and gateways that provide access to scientific data.

"The University of Texas at Austin is proud to have TACC as one of the leading computational science institutions in the world," Juan M. Sanchez, vice president for research, said. "We look forward to Niall's contributions to help the university continue to grow its leadership position in computational science."

The planning and development of the Dell Medical School is going to be a key initiative for the university in the coming decade, and TACC will play a foundational role, Sanchez said. "TACC will be ready and able to handle the data-driven aspects that will stem from the current biomedical research on campus so we can help build a medical school for the 21st century."

Gaffney's most recent work at STScI included responsibility for creating and managing the systems for archiving, cataloging, discovering, visualizing and distributing data from the Hubble Space Telescope, the Kepler project, the Far Ultraviolet Spectroscopic Explorer and the upcoming James Webb Space Telescope. Previously, he worked for the McDonald Observatory, focusing on systems integration, remote observation planning systems, and the data systems during the construction and commissioning of the Hobby-Eberly Telescope. He has authored numerous papers and publications on data software systems.

Gaffney earned his doctoral, master's and bachelor's degrees in astronomy from The University of Texas at Austin. He was a NASA Graduate Student Research Project recipient and earned a McDonald Observatory Graduate Research Fellowship.

TACC provides world-class, high-performance computing systems and cyberinfrastructure for the U.S. open science community. Since its inception in 2001, the center has deployed and operated a total of 10 NSF-funded supercomputers and advanced visualization systems for national programs.