The Globally Interconnected Object Databases (GIOD) Project
A data thunderstorm is gathering on the horizon with the next generation of
particle physics experiments. The amount of data is overwhelming. Even
though the prime data from the CERN CMS detector will be reduced by a
factor of more than 10^7, it will still amount to over a petabyte (10^15
bytes) of data per year accumulated for scientific analysis.
The task of finding rare events resulting from the decays of massive new
particles in a dominating background is even more formidable. Particle
physicists have been at the vanguard of data-handling technology, beginning
in the 1940s with the eye-scanning of bubble-chamber photographs and
emulsions, continuing through decades of electronic data acquisition
systems employing real-time pattern recognition, filtering and formatting,
and culminating in the petabyte archives generated by modern experiments.
CMS and the other experiments now being built to run at CERN's Large
Hadron Collider (LHC) expect to accumulate on the order of 100 petabytes
within the next decade.
The scientific goals and discovery potential of the experiments will only
be realized if efficient worldwide access to the data is made possible.
Particle physicists are thus engaged in large national and international
projects that address this massive data challenge, with special emphasis on
distributed data access. There is an acute awareness that the ability to
analyze data has not kept pace with its growing volume. The traditional
approach of extracting data subsets across the Internet, storing them
locally, and processing them with home-brewed tools has reached its limits.
Something drastically different is required. Indeed, without new modes of
data access and remote collaboration, we will not be able to effectively
"mine" the intellectual resources represented in our distributed
collaborations.
Collaborators
CERN; Caltech, USA; Hewlett Packard, USA
Contact
Julian Bunn
CERN and Caltech, USA
julian@cacr.caltech.edu
http://pcbunn.cithep.caltech.edu