Difference between revisions of "Data-intensive Computing"

Latest revision as of 21:46, 16 March 2005

Description:

In high energy physics, bioinformatics, computational astronomy, computational biology, material sciences, archeology, oceanography and many other disciplines, people encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. When the data to be accessed and processed are voluminous, we refer to the computation as data intensive. Such large-scale data-intensive problems normally need to harness geographically distributed resources and the greater processing power of multiple network-linked, heterogeneous computer architectures, exploiting the best features of each for a given problem.
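
As a minimal sketch of what "numerous, loosely coupled jobs" can look like in practice, the Python example below splits a data set into partitions, runs one independent job per partition, and merges the small per-job results. The function names, the partition size and the toy data are assumptions made for illustration; a real deployment would read from and write to distributed storage rather than in-memory lists.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_partition(partition):
    """One loosely coupled job: reads only its own partition, emits a small result."""
    return {"count": len(partition), "total": sum(partition)}


def run_jobs(partitions, max_workers=4):
    """Run every job independently; jobs share no state with each other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize_partition, partitions))


def merge(results):
    """Combine the per-partition outputs into one summary."""
    return {
        "count": sum(r["count"] for r in results),
        "total": sum(r["total"] for r in results),
    }


# Hypothetical data set: the integers 0..999 split into four partitions of 250.
partitions = [list(range(i, i + 250)) for i in range(0, 1000, 250)]
summary = merge(run_jobs(partitions))
```

Because each job touches only its own partition, the same structure could in principle be spread over geographically distributed machines, with the merge step being the only point where results meet.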

Enablers:

  • Recent advances in database technology, such as data mining and data warehousing.
  • Technological advances in parallel programming
  • Web services
  • .NET technologies
  • Software for interactive control of programs and instruments
  • Scientific applications in areas such as high energy physics, bioinformatics, computational astronomy, computational biology, material sciences, archeology, and oceanography.

Inhibitors:

  • Limited availability of large computational systems, data storage and specialized experimental facilities.
  • Scheduling difficulty in distributed environments, e.g. resource utilization, response time, and global and local allocation policies.
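
The tension between resource utilization and response time can be illustrated with a small sketch. The Python example below is not any particular Grid scheduler; it is a hypothetical greedy policy that hands each job, in submission order, to whichever resource becomes free first, under the simplifying assumptions that all jobs arrive at time zero and that job lengths are known in advance.

```python
import heapq


def schedule(job_lengths, n_resources):
    """Greedy policy: give each job to the resource that frees up first.

    Returns (makespan, utilization, mean_response_time).
    """
    # Heap of (time the resource becomes free, resource id).
    free_at = [(0.0, r) for r in range(n_resources)]
    heapq.heapify(free_at)
    finish_times = []
    for length in job_lengths:
        start, resource = heapq.heappop(free_at)
        end = start + length
        finish_times.append(end)
        heapq.heappush(free_at, (end, resource))
    makespan = max(finish_times)
    # Fraction of the total resource time that was spent doing work.
    utilization = sum(job_lengths) / (n_resources * makespan)
    # All jobs arrive at time 0, so response time equals finish time.
    mean_response = sum(finish_times) / len(finish_times)
    return makespan, utilization, mean_response


# Same hypothetical workload, two submission orders, two resources.
longest_first = schedule([4, 2, 2, 2, 2], n_resources=2)
shortest_first = schedule([2, 2, 2, 2, 4], n_resources=2)
```

With this toy workload, submitting the long job first keeps both resources fully busy, while submitting short jobs first lowers the mean response time at the cost of idle capacity. Even in this tiny case no single ordering is best on both measures, which is one reason allocation policy is hard to settle in a distributed environment.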

Paradigms:

Data-intensive computations arise in many domains of scientific and engineering research. Data-intensive computing is not in itself a driving force that would change people's view of the world. It does, however, drive the development of Grid technology: its demanding requirements for exchanging and storing large datasets, and for response time, motivate building a common platform across geographically distributed processors.
At the same time, as data-intensive computation itself develops and matures, many formidable problems in areas such as physics, bioinformatics, computational astronomy, computational biology, material sciences, archeology, and oceanography may be solved in the future, which in turn would bring new research discoveries and new perspectives on the world.

Experts:

PNNL: http://www.pnl.gov/news/2004/04-64.htm
ORNL: http://www.ornl.gov/

Timing:

The development of data-intensive computation is closely tied to the development of each of its application areas. It is hard to treat it as a separate discipline and to identify clear milestones.

Web Resources:

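San Diego Supercomputer Center: http://www.npaci.edu/online/v7.6/DataStar_03-19.html
OSU Department of Biomedical Informatics: http://bmi.osu.edu/areas_and_projects/high-level_programming_methodologies.cfm
Metacomputing and Data-Intensive Applications: http://www.cacr.caltech.edu/Publications/techpubs/PAPERS/cacr142p.html
Defining Data-intensive Applications: http://www.sdsc.edu/GatherScatter/GSfall95/2data-intensive.html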

