Data-intensive Computing

From ScenarioThinking

Description:

In high-energy physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, oceanography and many other disciplines, people encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. When the data to be accessed and processed are voluminous, we refer to the computation as data-intensive. Large-scale data-intensive problems of this kind normally need to harness geographically distributed resources and the combined processing power of multiple network-linked, heterogeneous computer architectures, exploiting the best features of each for a given problem.

Enablers:

  • Recent advances in database technology, such as data mining and data warehousing
  • Technological advances in parallel programming
  • Web services
  • .NET technologies
  • Software for interactive control of programs and instruments
  • Scientific applications in areas such as high-energy physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, and oceanography

Inhibitors:

  • Limited availability of large computational systems, data storage, and specialized experimental facilities.
  • Difficulty of scheduling in a distributed environment, e.g. balancing resource utilization, response time, and global and local allocation policies.
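
To make the scheduling inhibitor concrete, here is a minimal sketch of a greedy scheduler that places loosely coupled jobs on distributed sites by weighing data-transfer time against compute time. It is an illustration under stated assumptions, not a description of any real Grid scheduler: the names Site, Job, estimate_finish and schedule, and the units for speed and bandwidth, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A compute site (hypothetical model, not a real Grid API)."""
    name: str
    speed: float       # assumed: work units processed per second
    bandwidth: float   # assumed: GB per second for fetching remote data
    datasets: set = field(default_factory=set)  # datasets already stored here
    busy_until: float = 0.0                     # time at which the site is free


@dataclass
class Job:
    """A loosely coupled job that reads one dataset (hypothetical model)."""
    name: str
    dataset: str
    size_gb: float     # size of the input dataset
    work: float        # compute demand in work units


def estimate_finish(job, site):
    """Estimated finish time: queue wait + data transfer + compute."""
    local = job.dataset in site.datasets
    transfer = 0.0 if local else job.size_gb / site.bandwidth
    return site.busy_until + transfer + job.work / site.speed


def schedule(jobs, sites):
    """Greedily assign each job to the site with the earliest estimated finish.

    Returns (plan, makespan), where plan maps job name to site name.
    """
    plan = {}
    # Place the most expensive jobs first (a common greedy heuristic).
    for job in sorted(jobs, key=lambda j: j.work, reverse=True):
        best = min(sites, key=lambda s: estimate_finish(job, s))
        best.busy_until = estimate_finish(job, best)
        best.datasets.add(job.dataset)  # the data is now cached at that site
        plan[job.name] = best.name
    return plan, max(s.busy_until for s in sites)


if __name__ == "__main__":
    sites = [
        Site("A", speed=1.0, bandwidth=1.0, datasets={"d1"}),
        Site("B", speed=2.0, bandwidth=1.0),
    ]
    jobs = [
        Job("j1", dataset="d1", size_gb=100.0, work=10.0),
        Job("j2", dataset="d2", size_gb=1.0, work=10.0),
    ]
    print(schedule(jobs, sites))
```

In this toy setup the job with the large input stays at the slower site that already holds its data, while the job with the small input moves to the faster site. Even this simple model shows why response time and resource utilization pull against each other once data sets become voluminous.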

Paradigms:

Data-intensive computations arise in many domains of scientific and engineering research. Data-intensive computing is not in itself a driving force that would change people's view of the world. It has, however, driven the development of Grid technology: its demanding requirements for exchanging and storing large datasets, and for response time, call for a common platform shared by geographically distributed processors.
At the same time, as data-intensive computing itself develops and matures, many formidable problems in areas such as physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, and oceanography may be solved in the future, which in turn would bring about new research discoveries and, plausibly, new perspectives on the world.

Experts:

PNNL [1]
ORNL [2]

Timing:

The development of data-intensive computing is closely tied to the development of each of its application areas. It is hard to treat it as a separate discipline with clear milestones of its own.

Web Resources:

San Diego Supercomputer Center [3]
OSU Department of Biomedical Informatics [4]
Metacomputing and Data-Intensive Applications [5]
Defining Data-intensive Applications [6]
