The Texas Advanced Computing Center (TACC) at The University of Texas at Austin and a group of partners have received a $6 million National Science Foundation (NSF) grant to design, build and deploy Wrangler, a groundbreaking data analysis and management system for the national open science community.
TACC is already home to the Stampede supercomputer and hosted an earlier supercomputer called Ranger. The creation of Wrangler further enhances the university’s role as one of the nation’s top supercomputing sites.
“Wrangler advances the vision in data-centered science to tackle today’s most complex, extremely data-intensive challenges and issues,” said Bob Chadduck of the NSF Computer and Information Science and Engineering Directorate’s Division of Advanced Cyberinfrastructure. “NSF is proud to support this community-accessible, data-focused resource to advance science, engineering and education.”
The design and implementation of Wrangler respond to developments in technology and research practice that are collectively referred to as “Big Data” or the “Data Deluge,” encompassing a variety of needs related to research data storage, analysis and access in the sciences.
“Wrangler is designed from the ground up for emerging and existing applications in data-intensive science,” said Dan Stanzione, the lead principal investigator for the project and deputy director at TACC. “Wrangler will be one of the highest performance data analysis systems ever deployed and will be the most replicated, secure storage for the national open science community.”
Wrangler will be capable of executing up to 275 million input/output operations per second (IOPS). In addition, Wrangler’s 10-petabyte disk storage system will be fully replicated to Indiana University, a partner in the project, providing data access reliability and security.
“This combination of unmatched transaction performance, massive bandwidth and capacity, and full data replication far exceeds what is currently available to the open science community,” Stanzione said.
Dell Inc. and DSSD Inc. are the strategic partners providing the technology that makes up the core of Wrangler.
In addition to hosting part of the system, Indiana University will participate in operations and training, and will help users optimize their network performance between their home institutions and Wrangler. The Computation Institute, a joint initiative of the University of Chicago and Argonne National Laboratory, will integrate its Globus Online service within the Wrangler project to make transferring data to and from Wrangler simple and fast.
“Wrangler will meet critical needs for managing, moving and analyzing massive and diverse data sets in disciplines including energy, weather and the global climate, basic biology, health and medicine, and will also support citizen science from astronomy to marine biology to zoology,” said Craig Stewart, co-principal investigator and executive director of the Pervasive Technology Institute at Indiana University. “We anticipate Wrangler will support more than 1,000 researchers and students every year and will serve as a model for smaller-scale data systems on campuses that will improve U.S. research capabilities.”
“Globus is committed to facilitating open science,” said Ian Foster, director of the Computation Institute and professor of computer science at the University of Chicago and Argonne National Laboratory. “The Wrangler project demonstrates what is possible by connecting institutions and people using services like Globus Online that let researchers focus on research.”
Wrangler’s performance and storage capabilities for Big Data applications will be enhanced through tight integration with TACC’s Stampede supercomputer and with NSF Extreme Science and Engineering Discovery Environment (XSEDE) resources across the country.
“Each new large-scale system, especially ones that bring new classes of capabilities, has significant impacts on society,” Stanzione said. “Wrangler is sure to enable groundbreaking research, and many communities are ready and committed to adopt the system on Day One.”