September 21, 2001

SDSC’s Fran Berman Talks About the TeraGrid
by Christopher Rogala, managing editor, HPCwire

The NSF recently announced a plan to link computers in four major research centers with a comprehensive infrastructure called the TeraGrid. The project will create the world’s first multi-site computer facility, the Distributed Terascale Facility (DTF). SDSC director Fran Berman agreed to answer some questions for HPCwire concerning the significance of the DTF project both for SDSC and for the United States’ computing infrastructure as a whole.

HPCwire: How long has the DTF project been in development? How and by whom was the plan developed?

BERMAN: The DTF project is based on an emerging direction that has come from the scientific community and is nicely represented by NSF’s cyberinfrastructure. The principals at SDSC, NCSA, Caltech, and Argonne have been putting this vision forward together for many years in different venues, and TeraGrid gave us the opportunity to join the PACI programs in this extraordinary collaboration. When we began to put the proposal together, I had just started as Director of SDSC and NPACI, and my counterpart at NCSA and long-time colleague, Dan Reed, was instrumental in making the project a real partnership.

HPCwire: Are the TeraGrid capabilities dedicated solely toward large-scale research initiatives? What will the TeraGrid mean to the “average” scientist who uses high-performance computing?

HPCwire: Please elaborate on the announcement that “SDSC will lead the TeraGrid data and knowledge management effort.” What projects will constitute its prime KM focus? Which industrial partners will be cooperating? What will be the most concrete long-term benefits?

BERMAN: The next decade in computation is emerging as the “Data Decade.” One of the critical trends of the last decade has been the immense amount of data from sensors, instruments, experiments, etc., that is now available to large-scale applications and is fundamental to the next generation of scientific discoveries. TeraGrid will provide an aggregate capacity of over 0.5 petabytes of on-line disk, and SDSC’s node will be configured to be the most effective data-handling platform anywhere. We will also focus on developing fundamental and advanced data services for the TeraGrid that will enable application developers and users to leverage the full potential of TeraGrid’s resources.

In particular, SDSC’s node of the TeraGrid will include a 4-teraflops cluster with 2 terabytes of memory and 225 terabytes of disk storage to support the most data-intensive applications and allow researchers to use NPACI-developed data grid middleware to manage and analyze the largest-scale data sets, ranging from astronomy sky surveys and brain imaging data to collections of biological structure and function data.
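To give a concrete feel for how a researcher would interact with that data grid middleware, the short sketch below stages a data set through the SDSC Storage Resource Broker from a script. It is a minimal illustration only: it assumes the SRB client “Scommands” (Sinit, Sput, Sls, Sget, Sexit) are installed and already configured against a catalog, and the collection path and file names are invented for the example.

```python
# Minimal sketch: staging a large data file into and out of an SRB-managed
# collection from a script. Assumes the SRB Scommands are installed and a
# working ~/.srb configuration; the collection path below is hypothetical.
import subprocess

COLLECTION = "/home/survey.sdsc/sky-survey-2001"   # illustrative collection name

def srb(*args):
    """Run one Scommand and fail loudly if it returns a nonzero status."""
    subprocess.run(list(args), check=True)

srb("Sinit")                                        # authenticate to the SRB catalog
srb("Sput", "tile_0042.fits", COLLECTION)           # register a local file in the collection
srb("Sls", COLLECTION)                              # list what the collection now holds
srb("Sget", COLLECTION + "/tile_0042.fits",
    "./copy_of_tile_0042.fits")                     # pull a replica back to local disk
srb("Sexit")                                        # release the session
```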

In addition to IBM, Intel, and Qwest, SDSC will be working with Sun Microsystems to deploy a next-generation Sun server in a data-handling environment that will support a thousand transactions per second where each transaction may require moving gigabytes of data.
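A quick back-of-the-envelope calculation shows why that transaction rate demands a dedicated data-handling platform; the per-transaction size below is an assumed round number at the low end of “gigabytes,” not a quoted specification.

```python
# Illustrative arithmetic only: a thousand transactions per second, each moving
# on the order of a gigabyte, implies roughly a terabyte per second of aggregate I/O.
transactions_per_second = 1_000
bytes_per_transaction = 1 * 10**9            # assume ~1 GB per transaction
aggregate = transactions_per_second * bytes_per_transaction
print(f"aggregate I/O ~ {aggregate / 10**12:.1f} TB/s")   # -> 1.0 TB/s
```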

HPCwire: Is the DTF itself significantly scalable? To what extent? Are there currently plans to add centers to the DTF?

BERMAN: The TeraGrid will initially be deployed at four sites: SDSC, NCSA, Caltech and Argonne National Lab. The plan is to ensure that the software works at a production level and then to build TeraGrid out. The DTF was proposed as the cornerstone of a National Grid effort, so it will be important to be able to add nodes and capabilities in a smooth and effective way. The scale and success of a National Grid is critical for the science community and we will need to ensure that TeraGrid is usable. Members of PACI partnerships and many other sites have expressed interest in becoming nodes on the TeraGrid and this indicates how pervasive the need for a National Grid effort is.

HPCwire: In terms of both the computing systems being integrated and the optical network itself, how much existing hardware and technology is being used and how much is being built from the ground up?

BERMAN: Aside from existing clusters at NCSA which will be integrated into the hardware funded by the NSF award, the TeraGrid will be primarily new hardware. Our intention is to make every effort to use hardware and software that is or will be industry-standard, off-the-shelf, and open-source. The TeraGrid compute clusters will use Intel processors and run the open-source Linux operating system. The Storage-Area Network at SDSC will be built from off-the-shelf components. The middleware will include Globus and the SDSC Storage Resource Broker, both of which are deployed at many sites worldwide and have become the de facto standards in grid computing. And of course, the many other software components developed by TeraGrid partners - for example, schedulers and accounting systems - will be made available to the community.
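As one concrete illustration of what using that off-the-shelf grid middleware looks like in practice, the sketch below drives a site-to-site file transfer with the Globus Toolkit’s GridFTP client, globus-url-copy. The host names and paths are made up for the example, and it assumes a valid grid proxy and GridFTP servers at both endpoints; exact client options vary by toolkit release.

```python
# Minimal sketch of a server-to-server GridFTP transfer using the Globus
# Toolkit's globus-url-copy client. Hostnames and paths are hypothetical;
# assumes grid credentials are already in place (e.g. via grid-proxy-init).
import subprocess

SRC = "gsiftp://dtf.sdsc.edu/scratch/sky-survey/tile_0042.fits"    # hypothetical source
DST = "gsiftp://dtf.ncsa.edu/scratch/incoming/tile_0042.fits"      # hypothetical destination

subprocess.run(["globus-url-copy", SRC, DST], check=True)          # third-party transfer
```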

As for networking, the initial 40-gigabit-per-second backbone represents leading-edge technology, but it’s only a matter of time before the technology becomes widely deployed. The TeraGrid network will connect to the global research community through Abilene and STAR TAP and to the California and Illinois research communities via CalRen-2 and I-WIRE.
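For a rough sense of scale, the idealized arithmetic below shows how long a sustained 40-gigabit-per-second backbone would take to move a terabyte; it assumes full line rate with no protocol overhead, so the numbers are illustrative rather than measured.

```python
# Rough scale check for the 40 Gb/s TeraGrid backbone (idealized: sustained
# line rate, no protocol overhead). Illustrative numbers, not measurements.
backbone_bits_per_second = 40 * 10**9        # 40 Gb/s
terabyte_bits = 8 * 10**12                   # 1 TB = 8e12 bits
seconds = terabyte_bits / backbone_bits_per_second
print(f"1 TB at line rate: {seconds:.0f} s (~{seconds / 60:.1f} min)")   # -> 200 s, ~3.3 min
```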

HPCwire: The TeraGrid will use Linux clusters joined by a cross-country network backbone. What principal measures will be implemented to manage and monitor the Linux configuration at various sites?

BERMAN: Both NPACI and the Alliance have considerable experience in configuration management. The NPACI Rocks clustering toolkit, developed by SDSC’s Phil Papadopoulos and UC Berkeley’s David Culler, provides a facility to monitor, manage, update, and deploy scalable high-performance Linux clusters in minutes. Rocks is open source, already available, and currently used by GriPhyN, Compaq, universities, and more than two dozen other venues for Linux cluster configuration management. NCSA’s recently announced “in-a-box” software packaging initiatives are further examples of configuration tools that can form components of an overall deployment strategy. In addition, SDSC and NCSA are active participants in the proposed NSF GRIDS Center (led by PACI partners and PIs Kesselman and Foster), which will be leading the community in overall grid software stack integration and packaging. The TeraGrid team understands how important it will be to make the software core available to the larger scientific community through these and other technology transfer vehicles.
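The core idea behind this style of configuration management is that every node’s install configuration is generated from a single central description rather than hand-edited per machine. The toy sketch below captures that idea in miniature; the node table, template fields, and file layout are invented for illustration and are not the actual Rocks implementation.

```python
# Toy illustration of configuration-managed cluster installs in the spirit of
# NPACI Rocks: per-node install configurations are generated from one central
# node description. All names and fields here are invented for illustration.
nodes = [
    {"name": "compute-0-0", "ip": "10.1.255.254"},
    {"name": "compute-0-1", "ip": "10.1.255.253"},
]

KICKSTART_TEMPLATE = """\
# generated install configuration for {name} ({ip})
install
lang en_US
%packages
@ Base
openssh-server
"""

for node in nodes:
    ks = KICKSTART_TEMPLATE.format(name=node["name"], ip=node["ip"])
    with open(f"{node['name']}.ks", "w") as out:    # one generated config per node
        out.write(ks)
    print(f"wrote {node['name']}.ks")
```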

HPCwire: Judging by the news releases, strategic administration of the TeraGrid is as distributed as its resources. How will critical operational policy directions be determined, and what is your role in that process?

BERMAN: TeraGrid will be administered by a distributed TeraGrid Operations Center (TOC) composed of staff at SDSC, NCSA, Caltech, and Argonne. The TOC staff will maintain and administer the TeraGrid, establish operational policies, and provide 24x7 support. All of the TeraGrid principals will collaborate to coordinate the project. I will serve as Chair of the TeraGrid Executive Committee, consisting of all the TeraGrid PIs and co-PIs. This group will work as an ensemble to help ensure the implementation of the vision we set out for TeraGrid.

HPCwire: Ruzena Bajcsy, NSF assistant director for Computer and Information Science and Engineering, has stated that “the DTF can lead the way toward a ubiquitous ‘Cyber-Infrastructure’…” Do you agree that this project is the first step toward the development of such an infrastructure? What is the next step? Please describe your vision of a “ubiquitous Cyber-Infrastructure.”

BERMAN: I do agree. NSF’s cyberinfrastructure recognizes the critical need for a sustainable national infrastructure that combines computing, communication and storage technologies into an extensible software substrate fundamental to advances in science and technology. The TeraGrid forms a critical foundation for this infrastructure.

The next step involves growing the TeraGrid out into a true National Grid. This will involve a serious commitment to the sustainable, persistent human infrastructure required to ensure smooth operation, and to the development of the services and policies required to make a national grid infrastructure operational and truly usable by the science and engineering community.

I believe that the ultimate target for the next decade is not a TeraGrid but a “PetaGrid” - which extends the TeraGrid with additional grids (e.g., the IPG, the DOE Science Grid, the EU Grid) as well as the emerging infrastructure of low-level devices (sensornets, PDAs, wireless networks, etc.). The PetaGrid will enable us to go from sensor to supercomputer and will bring forward a new generation of applications, including individualized medicine, real-time disaster response, and more. We are building such a “PetaGrid” prototype between SDSC and Cal-IT2 at UCSD. Our TeraGrid efforts will be a critical part of this vision.

HPCwire: Does the DTF, in fact, constitute a de facto push by the NSF toward virtual unification of SDSC and NCSA?

BERMAN: Both PACI partnerships pursue a common goal - deploying an infrastructure to meet the ever-increasing demands of the national academic community for high-end information technology resources. To reach this goal, each partnership focuses on unique development issues and application areas, which maximizes the impact of the PACI program as a whole. Simultaneously, however, NPACI and the Alliance collaborate to ensure that, in the end, the nation will have a unified, robust, and scalable infrastructure. TeraGrid provides an opportunity for a partnering of partnerships in which each PACI partnership can play a critical role and lead in complementary areas.

HPCwire: How would you characterize your leadership of SDSC? How does it differ from that of your predecessor, Sid Karin? What are your greatest challenges at this time, and how are you dealing with them?

BERMAN: I’ve been at SDSC/NPACI a little over six months, and it has been an exciting time for all of us. My leadership style is team-oriented, and the whole center has been involved in a visioning and strategic planning process that is almost complete now. Sid has been a terrific “Director Emeritus” and has been very helpful to me. My background is more Grid-oriented than Sid’s, and I think I bring that focus to the center. In addition, I have worked for many years with multi-disciplinary application teams and am interested in reinforcing SDSC’s user-oriented focus.

It’s tempting to approach this job as a “super-PI” rather than a Director, and one of my biggest challenges has been to approach things as a Director and to work effectively with a large-scale management infrastructure. Time management is also an immense challenge as we are involved in a huge number of exciting projects. I’m having a great time though and have incredible admiration and respect for the outstanding staff and researchers at SDSC and NPACI.

Copyright 1993-2001 HPCwire. Redistribution of this article is forbidden by law without the expressed written consent of the publisher. For HPCwire subscription information send e-mail to sub@hpcwire.com. Tabor Griffin Communications’ HPCwire is also available at www.tgc.com/hpcwire.html