The National Science Foundation (NSF) has awarded a $7 million grant to the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for a three-year project that will provide a new computing resource and the largest, most comprehensive suite of visualization and data analysis (VDA) services to the open science community.
The new compute resource, “Longhorn,” will provide unprecedented VDA capabilities, enabling the national and international science communities to interactively visualize and analyze datasets of near-petabyte scale (a petabyte is a quadrillion bytes, or 1,000 terabytes) so that scientists can explore them, gain insight and develop new knowledge.
According to Kelly Gaither, principal investigator and director of data and information analysis at TACC, the rapid, widespread adoption of high performance computing (HPC), enabled by commodity clusters, and the scaling of systems to hundreds of teraflops and beyond have produced a deluge of data, making it urgent to ensure that visualization of very large datasets does not become a bottleneck.
“The capabilities of VDA resources have not kept pace with the explosive rate of data production leading to a critical juncture in computational science,” Gaither says. “Interactive visualization, data analysis and timely data assimilation are necessary for exploring important and challenging problems throughout science, engineering, medicine, national security and safety, to name a few important areas.”
Alan Blatecky, acting deputy director of NSF’s Office of Cyberinfrastructure (OCI), said the importance of providing significant funding for this effort now is immeasurable.
“Science is being driven and defined by data,” he said, “and our ability to manage, manipulate, mine and visualize the results is fundamental to the discovery process.”
“Longhorn” System Capabilities
- Total Peak Performance (CPUs): 20.7 teraflops
- Total Peak Performance (GPUs): 500 teraflops (single-precision floating point)
- Total Peak Rendering Performance: 154 billion triangles/sec
- Total Memory: 13.5 terabytes
- Total Disk: 210-terabyte global file system
System Components and Technologies
- 256 Dell R610 and R710 servers, each with two Intel Xeon 5500 processors
- 512 Intel “Nehalem” (2.53GHz) quad-core processors with 2,048 cores in total
- 128 NVIDIA Quadroplex 2200 S4 units, each with four Quadro FX 5800 GPUs, for a total of 512 GPUs with 122,880 CUDA processor cores and 2,048 gigabytes (two terabytes) of distributed graphics RAM
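The component counts above can be cross-checked against the stated totals with a little arithmetic. The sketch below is illustrative only. The server, unit, clock and per-server processor counts come from the list. The per-GPU figures (240 CUDA cores and 4 gigabytes of graphics RAM per Quadro FX 5800) and the rate of four double-precision floating point operations per core per cycle are assumptions not stated in this announcement, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the Longhorn component totals.
# Values marked "assumption" are not stated in the announcement.

SERVERS = 256
PROCESSORS_PER_SERVER = 2
CORES_PER_PROCESSOR = 4        # quad-core "Nehalem"
CLOCK_GHZ = 2.53
FLOPS_PER_CYCLE = 4            # assumption: double-precision flops per core per cycle

QUADROPLEX_UNITS = 128
GPUS_PER_UNIT = 4
CUDA_CORES_PER_GPU = 240       # assumption: Quadro FX 5800
GPU_RAM_GB = 4                 # assumption: Quadro FX 5800


def longhorn_totals():
    """Derive system-wide totals from the per-component counts."""
    processors = SERVERS * PROCESSORS_PER_SERVER
    cores = processors * CORES_PER_PROCESSOR
    gpus = QUADROPLEX_UNITS * GPUS_PER_UNIT
    return {
        "processors": processors,
        "cpu_cores": cores,
        # gigaflops -> teraflops
        "cpu_peak_teraflops": cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000,
        "gpus": gpus,
        "cuda_cores": gpus * CUDA_CORES_PER_GPU,
        "graphics_ram_gb": gpus * GPU_RAM_GB,
    }


if __name__ == "__main__":
    for name, value in longhorn_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the derived figures agree with the stated CPU peak of 20.7 teraflops, the 122,880 CUDA cores and the 2,048 gigabytes of graphics RAM.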
Visualization and Data Analysis Services
- A comprehensive collection of open source and commercial end-user VDA tools.
- Expert visualization support, including advanced interactive user support and training from a team comprising leading visualization researchers.
- A framework for rapidly integrating new visualization technologies from leading research teams to increase user capabilities throughout the project.
“TACC and the NSF have demonstrated that it’s possible to make long-term strategic investments in HPC technology, even during a challenging economic period,” said John Mullen, vice president of Dell Education, State and Local Government. “Thanks to that investment, Longhorn will allow scientists to tackle complex challenges with new advancements in CPU, GPU and networking technologies through an industry-standard HPC stack.”
Along with TACC, four of the leading VDA groups across the nation will provide leadership in offering advanced services to the national open science community: the Scientific Computing and Imaging Institute at the University of Utah, the Purdue Visual Analytics Center, the Data Analysis Services Group at the National Center for Atmospheric Research and the University of California, Davis. The Southeastern Universities Research Association will ensure that the VDA training is delivered to underrepresented communities, with a focus on minority-serving institutions.
Said Gaither, “The team has world-class expertise conducting VDA research and development, provisioning large-scale systems, providing user support, and training and educating current and future scientists. The team also has vast experience providing VDA tools to the user communities of the National Science Foundation, Department of Homeland Security, Department of Energy and the National Institutes of Health.”
The project will provide the framework for leading researchers to collaborate and analyze their terascale datasets while keeping barriers to entry low and the tools easy to use. The grant will stimulate and support new VDA research and technology transfer throughout the scientific research community.
“TACC and its partners will enable the analysis of data at the largest scales and will bring desktop capabilities to the user via a remote resource,” Gaither said. “We’re excited to be able to conduct extensive training in the use of VDA techniques and the deployment of next generation systems to overcome the current scarcity of such expertise and resources on campuses around the country.”
Longhorn and its related services will be running by Supercomputing 2009, which takes place in Portland, Ore. The system is scheduled to enter full production on Jan. 4, 2010. The grant expires in July 2012.
eXtremeDigital is the next phase in the National Science Foundation’s (NSF’s) ongoing effort to build a cyberinfrastructure that delivers high-end digital services, providing American researchers and educators with the capability to work with extremely large amounts of digitally represented information.
This award is funded under the American Recovery and Reinvestment Act of 2009 (ARRA) (Public Law 111-5) and is subject to the ARRA Terms and Conditions, dated May 2009, available on the NSF Web site.