Difference between revisions of "Data-intensive Computing"

From ScenarioThinking
Latest revision as of 21:46, 16 March 2005

Description:

In high energy physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, oceanography, and many other disciplines, people encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. When the data to be accessed and processed are voluminous, we refer to the computation as data intensive. These large-scale data-intensive problems normally need to harness geographically distributed resources and the greater processing power of multiple network-linked, heterogeneous computer architectures, exploiting the best features of each for a given problem.
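
As a rough illustration of what "numerous, loosely coupled jobs" can look like in practice, the sketch below splits a data set into partitions, runs one independent job per partition, and merges the outputs. It is a minimal sketch under stated assumptions, not a description of any particular Grid system: the function names (run_job, run_all, combine) and the summing workload are hypothetical, and local threads merely stand in for geographically distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(partition):
    """One loosely coupled job: it reads only its own partition."""
    # Hypothetical workload: summarize the values in this partition.
    return {"count": len(partition), "total": sum(partition)}


def run_all(partitions, max_workers=4):
    """Run every job independently; jobs share no state with each other."""
    # Assumption: threads stand in for remote, distributed workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, partitions))


def combine(results):
    """Merge the per-job outputs into a single summary."""
    return {
        "count": sum(r["count"] for r in results),
        "total": sum(r["total"] for r in results),
    }


if __name__ == "__main__":
    data = list(range(1_000))
    # Split the data set into four partitions of 250 values each.
    partitions = [data[i:i + 250] for i in range(0, len(data), 250)]
    print(combine(run_all(partitions)))
```

Because each job touches only its own partition, the jobs can in principle be placed on whichever resource holds the data, which is the property that makes such workloads candidates for distributed execution.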

Enablers:

  • Recent advances in database technology, such as data mining and data warehousing.
  • Technological advances in parallel programming
  • Web services
  • .NET technologies
  • Software for interactive control of programs and instruments
  • Scientific applications in areas such as high energy physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, and oceanography.

Inhibitors:

  • Limited availability of large computational systems, data storage, and specialized experimental facilities.
  • Scheduling difficulty in distributed environments, e.g. resource utilization, response time, and global and local allocation policies.

Paradigms:

Data-intensive computations arise in many domains of scientific and engineering research. Data-intensive computing is not in itself a driving force that would change people's view of the world. It does, however, drive the development of Grid technology: its demanding requirements for exchanging and storing large datasets, and for response time, force the concept of building a common platform across geographically distributed processors.
At the same time, as data-intensive computation itself advances and matures, many formidable problems in areas such as physics, bioinformatics, computational astronomy, computational biology, materials science, archeology, and oceanography may be solved in the future, which in turn would bring new research discoveries and new perspectives on the world into existence.

Experts:

PNNL [http://www.pnl.gov/news/2004/04-64.htm]
ORNL [http://www.ornl.gov/]

Timing:

The development of data-intensive computation is closely tied to the development of each of its application areas. It is therefore hard to treat it as a separate discipline or to identify clear milestones.

Web Resources:

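San Diego Supercomputer Center [http://www.npaci.edu/online/v7.6/DataStar_03-19.html]
OSU Department of Biomedical Informatics [http://bmi.osu.edu/areas_and_projects/high-level_programming_methodologies.cfm]
Metacomputing and Data-Intensive Applications [http://www.cacr.caltech.edu/Publications/techpubs/PAPERS/cacr142p.html]
Defining Data-intensive Applications [http://www.sdsc.edu/GatherScatter/GSfall95/2data-intensive.html]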

