XGRID Technology
Implementing Distributed Systems
Using DRMAA with Unicluster Express
Distributed Resource Management Application API (DRMAA) is a high-level API that allows Grid applications to submit jobs to one or more DRM systems and to monitor and control them. Grid Engine comes with support for C/C++ and Java, and one can also download bindings for Ruby and Python. There is also a nice collection of HowTos that should provide a great start for anyone looking to write DRMAA applications. The latest version of Unicluster Express (UCE) bundles Grid Engine 6.1u3, which is installed under $GLOBUS_LOCATION/sge. Here $GLOBUS_LOCATION refers to the UCE installation directory (/usr/local/unicluster by default), and all of the DRMAA libraries and Java files are located in the $GLOBUS_LOCATION/sge/lib directory. In order to run DRMAA applications, one has to set $LD_LIBRARY_PATH to point to the appropriate (architecture-dependent) directory. For my development (64-bit Linux) cluster with the default UCE installation I used the following setup:
$ source /usr/local/unicluster/unicluster-user-env.sh
$ export LD_LIBRARY_PATH=/usr/local/unicluster/sge/lib/lx24-amd64
$ export JAVA_HOME=/opt/jdk
$ export PATH=$JAVA_HOME/bin:$PATH
A very simple example of a Java DRMAA application that submits a job to Grid Engine is shown below:
$ cat SimpleJob.java
import org.ggf.drmaa.DrmaaException;
import org.ggf.drmaa.JobTemplate;
import org.ggf.drmaa.Session;
import org.ggf.drmaa.SessionFactory;

public class SimpleJob {
    public static void main(String[] args) {
        SessionFactory factory = SessionFactory.getFactory();
        Session session = factory.getSession();
        try {
            // An empty contact string selects the default DRM session.
            session.init("");
            JobTemplate jt = session.createJobTemplate();
            jt.setRemoteCommand("/home/veseli/simple_job.sh");
            String id = session.runJob(jt);
            System.out.println("Your job has been submitted with id " + id);
            // Release the template and close the session after submission.
            session.deleteJobTemplate(jt);
            session.exit();
        }
        catch (DrmaaException e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
One can compile and run the above example using something like the following:
$ javac -classpath /usr/local/unicluster/sge/lib/drmaa.jar SimpleJob.java
$ java -classpath .:/usr/local/unicluster/sge/lib/drmaa.jar SimpleJob
Your job has been submitted with id 14
$ qstat -f
queuename qtype used/tot. load_avg arch states
----------------------------------------------------------------------------
all.q@horatio.psvm.univa.com BP 1/1 0.36 lx24-amd64
14 0.55500 simple_job veseli r 06/20/2008 12:24:59 1
----------------------------------------------------------------------------
all.q@romeo.psvm.univa.com BP 0/1 0.39 lx24-amd64
----------------------------------------------------------------------------
all.q@yorick.psvm.univa.com BP 0/1 0.45 lx24-amd64
----------------------------------------------------------------------------
headnodes.q@petruchio.psvm.uni IP 0/1 0.15 lx24-amd64
----------------------------------------------------------------------------
special.q@horatio.psvm.univa.c BIP 0/1 0.36 lx24-amd64
I should point out that DRMAA is designed to be independent of any particular DRM. Users who need job submission features or flags specific to Grid Engine can either use the “native specification” attribute, or they can use the “job category” attribute together with “qtask” files. To set the native specification attribute in Java, one would call the setNativeSpecification() method of the JobTemplate class (before the job submission line in the code):
jt.setNativeSpecification("-q special.q");
This method, however, makes your application dependent on the specific DRM you are working with at the moment. The above line will be interpreted correctly by Grid Engine, but may not be understood by other DRMs. In most cases a better solution is to use the job category attribute instead, and specify the DRM-dependent flags in the qtask file. For example, in order to submit your job to a particular Grid Engine queue, the Java code would contain something like
jt.setJobCategory("special");
and use the qtask file to translate the “special” job category into appropriate Grid Engine flags:
$ cat ~/.qtask
special -q special.q
The cluster global qtask file (defines cluster wide defaults) in UCE resides at $GLOBUS_LOCATION/sge/default/common/qtask. As shown above, user-specific qtask files that override and enhance cluster-wide definitions are found at ~/.qtask.
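The translation from a job category to DRM flags can be pictured with a small lookup. This is only a sketch of the idea, not Grid Engine's own qtask parser: it assumes one category per line followed by its flags, and it assumes that a user entry simply replaces the cluster-wide entry for the same category. The class name QtaskSketch and the cluster-wide entry "special -q all.q" are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: resolves a job category the way a qtask file reads. */
final class QtaskSketch {

    /** Parses lines of the assumed form "<category> <flags...>"; blank lines are skipped. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> flagsByCategory = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+", 2);
            flagsByCategory.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
        return flagsByCategory;
    }

    /** Assumption: a user entry overrides the cluster-wide entry for the same category. */
    static String resolve(String category, Map<String, String> global, Map<String, String> user) {
        if (user.containsKey(category)) {
            return user.get(category);
        }
        return global.getOrDefault(category, "");
    }

    public static void main(String[] args) {
        // Hypothetical cluster-wide default and the user file from the text.
        Map<String, String> global = parse(Arrays.asList("special -q all.q"));
        Map<String, String> user = parse(Arrays.asList("special -q special.q"));
        System.out.println("category special -> " + resolve("special", global, user));
    }
}
```

The point of the design is that the application only ever names the category; the flags can change per cluster or per user without touching the Java code.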
Source: http://gridgurus.typepad.com
Aromatic Clouds?
If you weren’t at OSGC you missed a number of interesting presentations. From my perspective, one of the most intriguing technologies was EUCALYPTUS: Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems.
Before I go on, I would like you to notice that anybody who is able to make an acronym out of eucalyptus has some time on their hands. Fortunately, they used this time to implement an open-source infrastructure for Elastic Computing. In particular, the goal of the project is to, "foster community research and development of Elastic/Utility/Cloud service implementation technologies, resource allocation strategies, service level agreement (SLA) mechanisms and policies, and usage models."
In my opinion, the most interesting facets of this project are:
My hope is that this project will not only get us all thinking about what we really need from a Cloud but also what we could improve... I plan to start working with this software as soon as it is available later this month.
Source: http://gridgurus.typepad.com
About Grid Engine Advanced Reservations
Advanced reservation (AR) capability is one of the most important new features of the upcoming Grid Engine 6.2 release. New command line utilities allow users and administrators to submit resource reservations (qrsub), view granted reservations (qrstat), or delete reservations (qrdel). Also, some of the existing commands are getting new switches. For example, the “-ar
$ qconf -sul
arusers
deadlineusers
defaultdepartment
$ qconf -su arusers
name arusers
type ACL
fshare 0
oticket 0
entries NONE
The “arusers” ACL can be modified via the “qconf -mu” command:
$ qconf -mu arusers
veseli@tolkien.ps.uud.com modified "arusers" in userset list
$ qconf -su arusers
name arusers
type ACL
fshare 0
oticket 0
entries veseli
Once designated as a member of this list, the user is allowed to submit ARs to Grid Engine:
[veseli@tolkien]$ qrsub -e 0805141450.33 -pe mpi 2
Your advance reservation 3 has been granted
[veseli@tolkien]$ qrstat
ar-id name owner state start at end at duration
-----------------------------------------------------------------------------------------
3 veseli r 05/14/2008 14:33:08 05/14/2008 14:50:33 00:17:25
[veseli@tolkien]$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@tolkien.ps.uud.com BIP 2/0/4 0.04 lx24-x86
For the sake of simplicity, the above example uses a single queue (all.q) that has 4 job slots and a parallel environment (PE) mpi assigned to it. After reserving 2 slots for the mpi PE, there are only 2 slots left for running regular jobs until the AR shown above expires. Note that the "-e" switch for qrsub designates the requested reservation end time in the format YYMMDDhhmm.ss. It is also worth pointing out that the qstat output has changed slightly with respect to previous software releases in order to accommodate the display of existing reservations. If we now submit several regular jobs, only 2 of them will be able to run:
[veseli@tolkien]$ qsub regular_job.sh
Your job 15 ("regular_job.sh") has been submitted
...
[veseli@tolkien]$ qsub regular_job.sh
Your job 19 ("regular_job.sh") has been submitted
[veseli@tolkien]$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@tolkien.ps.uud.com BIP 2/2/4 0.03 lx24-x86
15 0.55500 regular_jo veseli r 05/14/2008 14:34:32 1
16 0.55500 regular_jo veseli r 05/14/2008 14:34:32 1
############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
17 0.55500 regular_jo veseli qw 05/14/2008 14:34:22 1
18 0.55500 regular_jo veseli qw 05/14/2008 14:34:23 1
19 0.55500 regular_jo veseli qw 05/14/2008 14:34:24 1
However, if we submit jobs that are part of the existing AR, those are allowed to run, while jobs submitted earlier are still pending:
[veseli@tolkien]$ qsub -ar 3 reserved_job.sh
Your job 20 ("reserved_job.sh") has been submitted
[veseli@tolkien]$ qsub -ar 3 reserved_job.sh
Your job 21 ("reserved_job.sh") has been submitted
[veseli@tolkien]$ qstat -f
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@tolkien.ps.uud.com BIP 2/4/4 0.02 lx24-x86
15 0.55500 regular_jo veseli r 05/14/2008 14:34:32 1
16 0.55500 regular_jo veseli r 05/14/2008 14:34:32 1
20 0.55500 reserved_j veseli r 05/14/2008 14:35:02 1
21 0.55500 reserved_j veseli r 05/14/2008 14:35:02 1
############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
17 0.55500 regular_jo veseli qw 05/14/2008 14:34:22 1
18 0.55500 regular_jo veseli qw 05/14/2008 14:34:23 1
19 0.55500 regular_jo veseli qw 05/14/2008 14:34:24 1
The above example illustrates how ARs work. As long as a particular reservation is valid, only jobs that are designated as part of it can utilize resources that have been reserved. I think that AR will prove to be an extremely valuable tool for planning grid resource usage, and I’m very pleased to see it in the new Grid Engine release.
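The walkthrough can be replayed with a toy model. The sketch below is not how the Grid Engine scheduler is implemented; it only assumes that reserved slots are set aside for jobs submitted with -ar and that regular jobs compete for what is left. It also shows one way to build the YYMMDDhhmm.ss end time that qrsub -e expects. The class and method names are invented for this illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Toy model of the example above; not Grid Engine's scheduler. */
final class ReservationSketch {
    // The qrsub -e switch takes the end time as YYMMDDhhmm.ss.
    private static final DateTimeFormatter END_TIME =
            DateTimeFormatter.ofPattern("yyMMddHHmm.ss");

    private final int totalSlots;
    private final int reservedSlots;
    private int regularRunning;
    private int reservedRunning;
    final List<Integer> pending = new ArrayList<>();

    ReservationSketch(int totalSlots, int reservedSlots) {
        this.totalSlots = totalSlots;
        this.reservedSlots = reservedSlots;
    }

    static String endTimeArgument(LocalDateTime end) {
        return end.format(END_TIME);
    }

    /** Returns true if the job starts right away, false if it has to wait. */
    boolean submit(int jobId, boolean partOfReservation) {
        if (partOfReservation && reservedRunning < reservedSlots) {
            reservedRunning++;
            return true;
        }
        if (!partOfReservation && regularRunning < totalSlots - reservedSlots) {
            regularRunning++;
            return true;
        }
        pending.add(jobId);
        return false;
    }

    int usedSlots() {
        return regularRunning + reservedRunning;
    }

    public static void main(String[] args) {
        LocalDateTime end = LocalDateTime.of(2008, 5, 14, 14, 50, 33);
        System.out.println("qrsub -e " + endTimeArgument(end) + " -pe mpi 2");

        ReservationSketch queue = new ReservationSketch(4, 2);
        for (int id = 15; id <= 19; id++) {
            queue.submit(id, false);   // regular jobs
        }
        queue.submit(20, true);        // jobs submitted with -ar
        queue.submit(21, true);
        System.out.println("used slots: " + queue.usedSlots() + ", pending: " + queue.pending);
    }
}
```

With the numbers from the example (4 slots, 2 of them reserved), jobs 15 and 16 start, jobs 17 to 19 wait, and the two reserved jobs still get in, which is the 2/4/4 state shown in the last qstat listing.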
Source: http://gridgurus.typepad.com
Steaming Java
When Rich asked us to walk through a software development process, I immediately thought back to a conversation that I had with my friend Leif Wickland about building high-performance Java applications, so I emailed him asking for his best practices. We have both produced code that is as fast as, if not faster than, C compiled with optimization (for me it was using a 64-bit JRE on an x86_64 architecture with multiple cores).
That is not to say that the equivalent C code could not be made to go faster if you spent time optimizing it. Rather, the main point is that Java is a viable HPC language. On a related note, Brian Goetz of Sun has a very interesting discussion on IBM's developerWorks, "Urban performance legends, revisited", on how garbage collection allows faster raw allocation performance.
However I digress… Here is a summary of what we both came up with (in no particular order):
Source: http://gridgurus.typepad.com
Grid Interoperability and Interoperation
The high expectations raised by grid computing have favored the development and deployment of a growing number of grid infrastructures and middlewares. However, the interaction between these grids is still limited, which reduces the potential for large-scale application of grid technology in spite of the efforts made by the grid community. In this sense, the Open Grid Forum (OGF) is developing open standards for grid software interoperability, while the OGF's Grid Interoperation Now Community Group (GIN-CG) is coordinating a set of interoperation efforts among production grids. It is therefore clear that, according to OGF (as Laurence Field explains in his article entitled "Getting Grids to work together: interoperation is key to sharing"), there is a big difference between these two terms:
Since most common open standards to provide grid interoperability are still being defined and only a few have been consolidated, grid interoperation techniques, like adapters and gateways, are needed. An adapter is, according to different dictionaries of computer terms, “a device that allows one system to connect to and work with another”. A gateway, on the other hand, is conceptually similar to an adapter, but it is implemented as an independent service, acting as a bridge between two systems. The main drawback of adapters is that grid middleware or tools must be modified to insert the adapters. Gateways can be accessed without changes to grid middleware or tools, but they can become a single point of failure or a scalability bottleneck.
GridWay provides support for some of the few established standards like DRMAA, JSDL or WSRF to achieve interoperability but, in the meantime, it also provides components to allow interoperation, like Middleware Access Drivers (MADs) acting as adapters for different grid services, and the GridGateWay, which is a WSRF GRAM service encapsulating an instance of GridWay, thus providing a gateway for resource management services.
GridWay 4.0.2, coinciding with the release of Globus Toolkit 4 and its new WS GRAM service, introduced an architecture for the execution manager module based on a MAD (Middleware Access Driver) to interface several grid execution services, like pre-WS GRAM and WS GRAM, even simultaneously. That architecture was presented in the paper entitled "A modular meta-scheduling architecture for interfacing with pre-WS and WS Grid resource management services" (E. Huedo, R. S. Montero and I. M. Llorente). GridWay 5.0 took advantage of this modular architecture to implement an information manager module with a MAD to interface several grid information services, and a transfer manager module with a MAD to interface several grid data services. Moreover, the scheduling process was decoupled from the dispatch manager through the use of an external and selectable scheduler module.

The resulting architecture, which is shown above, provides direct interoperation between different middleware stacks. In fact, we demonstrated at OGF22 the interoperation of three important grid infrastructures, namely EGEE (gLite-based), TeraGrid and OSG (both Globus-based), being coordinately used through a single GridWay instance by means of the appropriate adapters. To set an example, the application was written using the DRMAA OGF standard. GridWay documentation provides a lot of information on how to integrate GridWay in the main middleware stacks, like gLite, pre-WS and WS Globus, or ARC, and provides information on how to develop new drivers for other middlewares.

Regarding the GridGateWay, it is being used for provisioning resources from several infrastructures. For example, the German Astronomy Community Grid (GACG or AstroGrid-D) uses a GridGateWay as a central resource broker, providing metascheduling functionality to Globus-based submission tools (e.g. for workflow execution) without modification. GridAustralia also uses a GridGateWay as a WSRF interface for its central GridWay Metascheduler instance, allowing reliable, remote job submission.

Picture by AstroGrid-D
More information about the GridGateWay component is provided in its web page, as well as in this blog entry, which shows how to build Utility Computing infrastructures with this Globus-based gateway technology.
Reprinted from blog.dsa-research.org
Source: http://gridgurus.typepad.com
Grid Engine 6.2 Beta Release
Grid Engine 6.2 will come with some interesting new features. In addition to advance resource reservations and array job interdependencies, this release will also contain a new Service Domain Manager (SDM) module, which will allow distributing computational resources between different services, such as different Grid Engine clusters or application servers. For example, SDM will be able to withdraw unneeded machines from one cluster (or application server) and assign them to a different one or keep them in its “spare resource pool”. It is also worth mentioning that Grid Engine (and SDM) documentation is moving to Sun’s wiki. The 6.2 beta release is available for download here.
Source: http://gridgurus.typepad.com
Support for parallel jobs in distributed resource management software is probably one of those features that most people do not use, but those who do appreciate it a lot. Grid Engine supports parallel jobs via parallel environments (PE) that can be associated with cluster queues. A new parallel environment is created using the qconf -ap
$ qconf -sp simple_pe
pe_name simple_pe
slots 4
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $round_robin
control_slaves FALSE
job_is_first_task FALSE
urgency_slots min
In the above example, “slots” defines the number of parallel tasks that can be run concurrently. The “user_lists” (“xuser_lists”) parameter should be a comma-separated list of user names that are allowed (denied) use of the given PE. If “user_lists” is set to NONE, any user that is not explicitly disallowed via the “xuser_lists” parameter may use the PE. The “start_proc_args” and “stop_proc_args” represent the command lines of the startup and shutdown procedures for the parallel environment. These commands are usually scripts customized for a specific parallel library intended for a given PE. They get executed for each parallel job, and are used, for example, to start any necessary daemons that enable parallel job execution. The standard output (error) of these commands are redirected into
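#!/bin/sh
#$ -S /bin/sh
# Start one "slave" task per granted slot on every host listed in $PE_HOSTFILE.
slaveCnt=0
while read host slots q procs; do
    slotCnt=0
    while [ $slotCnt -lt $slots ]; do
        slotCnt=`expr $slotCnt + 1`
        slaveCnt=`expr $slaveCnt + 1`
        # Run the task in the background and keep its output per slave.
        ssh $host "/bin/hostname; sleep 10" > /tmp/slave.$slaveCnt.out 2>&1 &
    done
done < $PE_HOSTFILE
# Wait for the background tasks to finish.
while [ $slaveCnt -gt 0 ]; do
    wait
    slaveCnt=`expr $slaveCnt - 1`
done
echo "All done!"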
After saving this script as "master.sh" and submitting your job using something like "qsub -pe simple_pe 3 master.sh" (where 3 is the number of parallel slots requested), you should be able to see your "slave" tasks running on the allocated machines. Note, however, that you must have password-less ssh access to the designated parallel compute hosts in order for the above script to work.
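For a master task written in Java rather than shell, the same hostfile walk could look roughly like the sketch below. It assumes the four-column layout that the script reads (host, slots, queue, processor range) and only expands the lines into one entry per granted slot; starting the remote tasks is left out. The host names and queue instances in main are hypothetical, as is the class name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: expands PE hostfile lines into one host entry per granted slot. */
final class PeHostfileSketch {

    /** Each line is assumed to read "host slots queue procs", as in the script above. */
    static List<String> expand(List<String> hostfileLines) {
        List<String> taskHosts = new ArrayList<>();
        for (String line : hostfileLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue; // skip blank or malformed lines
            }
            int slots = Integer.parseInt(fields[1]);
            for (int i = 0; i < slots; i++) {
                taskHosts.add(fields[0]);
            }
        }
        return taskHosts;
    }

    public static void main(String[] args) {
        // Hypothetical hostfile content for a 3-slot request.
        List<String> lines = Arrays.asList(
                "node1 2 all.q@node1 UNDEFINED",
                "node2 1 all.q@node2 UNDEFINED");
        List<String> hosts = expand(lines);
        for (int task = 0; task < hosts.size(); task++) {
            System.out.println("slave " + (task + 1) + " -> " + hosts.get(task));
        }
    }
}
```

In a real job the lines would be read from the file named by the PE_HOSTFILE environment variable instead of a hard-coded list.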
Source: http://gridgurus.typepad.com
The Role of Open Source in Grid Computing
Grid Guru Ian Foster has a great piece in International Science Grid This Week. He talks about the significance of choosing open source licenses in the history of Globus, leading to a field dominated by open source software.
Source: http://gridgurus.typepad.com
The MapReduce Panacea Myth?
Everywhere I go I read about how the MapReduce algorithm has changed and will continue to change the world with its pure simplicity… Parallel programming is hard but MapReduce makes it easy... MapReduce: ridiculously easy distributed programming… Perhaps one day programming tools and languages will catch up with our processing capability but until then, MapReduce will allow us all to process very large datasets on massively parallel systems without having to bother with complicated interprocess communication using MPI.
I am a skeptic, which is not to say I have anything against a generalized framework for distributing data to a large number of processors. Nor does it imply that I enjoy MPI and its coherence arising from cacophonous chatter (if all goes well). I just don’t think MapReduce is particularly "simple". The key promoters of this algorithm, such as Yahoo and Google, have serious experts MapReducing their particular problem sets, and thus they make it look easy. You and your colleagues need to understand your data in some detail as well. I can think of a number of examples of why this is so.
First, let’s say that you are tasked with processing thousands of channels of continuously recorded broadband data from a VLBI based radio-telescope (or any other processing using beam-forming techniques for that matter). You cannot simply chop the data into nice time-based sections and send it off to be processed. Any signal processing that must be done to the data will produce terrible edge effects at each of the abrupt boundaries. Your file-splits must do something to avoid this behavior such as padding additional data on either side of the cut. This in turn will complicate the append phase after the processing is done. Thus you need to properly remove the padded data – if the samples do not align in a coherent way, then you will introduce a spike filled with energy into your result.
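The edge effect and the padding remedy can be seen on a toy problem. The sketch below swaps the real signal processing for a simple centered moving average and a synthetic sine wave, both chosen only for illustration. It filters the signal once as a whole and twice in independent chunks, with and without borrowing neighbouring samples, and then drops the padding before appending the pieces.

```java
import java.util.Arrays;

/** Toy illustration of chunk boundaries: moving average with and without padding. */
final class PaddedSplitSketch {

    /** Centered moving average; the window is clipped at the ends of the array. */
    static double[] smooth(double[] x, int radius) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            int lo = Math.max(0, i - radius);
            int hi = Math.min(x.length - 1, i + radius);
            double sum = 0.0;
            for (int j = lo; j <= hi; j++) {
                sum += x[j];
            }
            y[i] = sum / (hi - lo + 1);
        }
        return y;
    }

    /** Filters x in independent chunks; each chunk borrows `pad` samples per side. */
    static double[] smoothInChunks(double[] x, int chunk, int radius, int pad) {
        double[] out = new double[x.length];
        for (int start = 0; start < x.length; start += chunk) {
            int end = Math.min(x.length, start + chunk);
            int from = Math.max(0, start - pad);
            int to = Math.min(x.length, end + pad);
            double[] filtered = smooth(Arrays.copyOfRange(x, from, to), radius);
            // Drop the padding again so that the pieces append cleanly.
            System.arraycopy(filtered, start - from, out, start, end - start);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] signal = new double[32];
        for (int i = 0; i < signal.length; i++) {
            signal[i] = Math.sin(0.4 * i);
        }
        double[] whole = smooth(signal, 2);
        double[] naive = smoothInChunks(signal, 8, 2, 0);
        double[] padded = smoothInChunks(signal, 8, 2, 2);
        System.out.println("naive split matches whole:  " + Arrays.equals(whole, naive));
        System.out.println("padded split matches whole: " + Arrays.equals(whole, padded));
    }
}
```

Here a padding equal to the filter radius is enough for the chunked result to agree with the unsplit one, while the naive split differs at every cut. Real filters with longer or infinite responses need more care, which is exactly the kind of data knowledge the post is talking about.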
Alternatively, you might have been tasked with solving a large system of linear equations. For example, say you are asked to produce a regional seismic tomography map with a resolution down to a few hundred meters using thousands of earthquakes, each with tens of observations. You could easily produce a sparse system of equations that creates a matrix with something on the order of one million columns and several tens if not hundreds of thousands of rows. Distributed algorithms for solving such a system are well known but require our cranky friend MPI. However, we can map this problem to several independent calculations as long as we are careful not to bias the input data as in the previous example. I will not bore you with the possibilities, but suffice it to say that researchers have been producing tomographic maps for many years by carefully selecting the data and model calculated at any one time.
I know what many of you are thinking, because I’ve read it before: MapReduce is meant for "non-scientific" problems. But is a sophisticated search engine any different? What makes it any less "scientific" than the examples I provided? Consider a search engine that maintains several (n) different document indexes distributed throughout the cloud. A user then issues a query which is mapped to n servers. Let’s assume that, for the sake of time, each node returns its top m results to the reduce phase. These m results are then sorted and returned to the user. The assumption here is that there is no bias in the distribution of indexed documents relevant to a user’s query. Perhaps one or more documents beyond the first m found in one particular index are far more relevant than the (n-1) * m results from the other indexes. But the user will never know. Should the search engine return every single result to the reduce phase at the expense of response time? Is there a way to distribute documents to the individual indexes to avoid well-known (but not all) biases? I suggest that these questions are the sorts of things that give one search engine an edge over another. Approaches to these sorts of issues might well be publishable in refereed journals. In other words, it sounds scientific to me.
I hope that by now you can see why I say that using MapReduce is only simple if you know how to work with (map) your data (especially if it is wonderfully wacky). There is an inherent risk of bias in any MapReduce algorithm. Sadly, this implies that processing data in parallel is still hard no matter how good a programmer you are or how sophisticated your programming language is.
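A few lines of code make the bias concrete. The sketch below is a toy, not any search engine's algorithm: relevance is reduced to a single made-up score per document, the map step keeps the m best scores of each index, and the reduce step merges and sorts them. The scores are invented so that one index holds most of the relevant documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy illustration of top-m truncation across n document indexes. */
final class TopMSketch {

    /** Map step: one index returns only its m best scores. */
    static List<Double> topM(List<Double> scores, int m) {
        List<Double> sorted = new ArrayList<>(scores);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(m, sorted.size())));
    }

    /** Reduce step: merge the per-index results and sort them, best first. */
    static List<Double> search(List<List<Double>> indexes, int m) {
        List<Double> merged = new ArrayList<>();
        for (List<Double> index : indexes) {
            merged.addAll(topM(index, m));
        }
        merged.sort(Collections.reverseOrder());
        return merged;
    }

    public static void main(String[] args) {
        // Made-up relevance scores; the first index is heavily biased.
        List<List<Double>> indexes = Arrays.asList(
                Arrays.asList(0.99, 0.98, 0.97, 0.96),
                Arrays.asList(0.50, 0.40),
                Arrays.asList(0.45, 0.30));
        System.out.println("returned with m = 2: " + search(indexes, 2));
        System.out.println("returned with m = 4: " + search(indexes, 4));
    }
}
```

With m = 2 the documents scored 0.97 and 0.96 never reach the reduce phase, although they beat everything the other two indexes contribute. Raising m fixes this particular case at the cost of moving more data, which is the trade-off raised in the questions above.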
Source: http://gridgurus.typepad.com
Open Evolution
Proprietary standards can bring success at first but cannot last. At least that is the conclusion we are forced to draw from two interesting articles in the 22 March issue of the Economist: Break down these walls and Everywhere but nowhere. I highly recommend that you read them particularly if you think that Ian’s Grid definition requiring open-standards is debatable.
The core lesson comes from the original big players in the nascent internet such as AOL, CompuServe, and Prodigy. These companies provided their users with electronic mail (not necessarily what we consider email today), chat rooms, discussion boards, and access to a wide-range of information. However these services were restricted to users of each particular service. You simply could not access information from one provider if you subscribed to another.
However, it was not long before products based upon open standards that provided these same services (and more) became more attractive to users simply because they allowed people to venture outside of the closed communities to which they subscribed. Once these users got out, they never turned back. The original content-providers became nothing more than access points to the web. Consequently these service providers quickly lost their luster and thus their valuation. Only AOL was able to (and still struggles to) survive, having redefined itself as a web-portal with paid advertising – just like the services that nearly killed it.
Today, the hottest products in the digital world are the social-networking sites like Facebook and MySpace as well as virtual worlds such as Second Life. Their popularity and usefulness to individuals have given them significant momentum in the marketplace as the “next big thing”. Consequently, these companies have been given enormous valuations despite having no business model beyond the fact that they have hordes of captive users. While these products typically come with an API so that users can add useful and interesting features, it is no substitute for true operational freedom. People want to interact with others without having to switch systems or maintain two distinct profiles.
How long will it be before social-networking products appear that are not only based upon open standards but also offer better features and more accessibility? You can bet that it will be soon given the amount of potential money involved. Then the reckoning will come and these companies, once flying high, will either be forced to adapt or perish.
What does this teach us about computing beyond the desktop, howsoever you wish to define it, be that a Grid, Cloud, or whatnot? Personally, I think it is clear: we must develop to open-standards or perish. I cannot see how the Grid market is immune to pressures of interoperability and freedom of choice. To paraphrase the Economist, why stay within a closed community when you can roam outside its walled garden, into the wilds of open computing!!!
I hope to see you all at the Open Source Grid and Cluster Conference.
Source: http://gridgurus.typepad.com
There's an Analyst Lurking in that Business
I recently read an editorial from Grid Today (GT) based upon conversations with Forrester’s Frank Gillett suggesting that interest in Grid computing is waning. I will not dispute the veracity of this claim; rather I will leave that to the people such as the HPC Today editorial staff who have access to the Forrester report. Irrespective of the actual level of interest that buyers have in the Grid, I was rather baffled by the reasons that Grid Today provided for the general "malaise".
The first reason that GT offers is that, "grid computing is, in general, beneficial to vertically specific applications." More specifically, they indicate that there are limited sets of applications that could benefit from grid computing. I am assuming that the set of applications that they are referring to are those which require high-performance parallel calculations as well as any algorithm that can use the Map-Reduce pattern to distribute the computational load across many servers.
So which classes of applications do not work well on the grid? Clearly Service Oriented Architectures (SOA) work well on the Grid. In fact the Globus Toolkit, a popular software toolkit for building grids, uses SOA at its core.
Yet I believe that any n-tier application run on a Grid has many advantages. For example, imagine a web-based application with a supporting relational database that is required to scale under significant user loads, which include not only the number of connections but also the complexity of the requested services. Also imagine that clusters of users in different regions will use this application.
First of all, it would be nice for us to provide the data-services of this application using a SOA. Doing so allows us to expose the data through a single access-layer. Thus any program can access the data using the same business rules without tying it to a single-application interface. Secondly, if users require any complex reports or other heavy-duty calculations, a single web-server might easily be overwhelmed and thus forced out of the rotation until the process completes. A better solution would be for the web-server to farm these sorts of operations out to the Grid – maybe even using a Map-Reduce pattern. Furthermore adding Grid capacity is an easy way to handle high-peak loads of the application. These resources could be used by other projects during the off-peak periods. Lastly, the grid could coordinate resources that are proximate to the regional user-clusters and thus reduce communication latency for any data that needs to be exchanged without having to keep copies of the web or data-infrastructure throughout the enterprise.
If there are advantages to running your n-tier applications on the grid, it is not much of a stretch architecturally to extend that to other classes of application. I could not imagine implementing a SaaS (Software as a Service) application on anything but a grid. Having said that, I don’t believe that an application needs to be complicated to run better on a Grid. Rather, I think any application that users rely on is a good candidate.
Many "desktop" applications not only can be run on the grid but also are better suited to it. Data-centric applications are the prime candidates that come to mind. First of all, keeping results on your desktop all but kills collaboration between users because the desktop is likely on a high-latency, low-availability network, may be in a separate security domain and thus inaccessible to many users, and could be shut down at any time. In addition, if an application reads and/or writes significant amounts of important data, it is best to keep it in the data center on reliable and, more importantly, regularly backed-up storage. Of course, the application could write across the typical high-latency, low-availability desktop network into the data center, but that is fraught with problems. Personally I believe that perhaps the most significant source of user frustration is "network drives", but I digress. If an application’s calculations take any significant resources, the user’s desktop quickly becomes a bottleneck. Even if the user’s machine is beefy enough to handle running a job while still allowing access to email, they are still hardware limited. In particular, if the application can be submitted in batch to the grid, the user could literally submit dozens if not hundreds of individual calculations and get the results in a fraction of the time it would take on their desktop. Lastly, running jobs at the data center frees users from using a single desktop. Rather, they can manage their computing from any location, which provides them significantly more freedom.
All of this brings me to GT’s second key assertion: that the term Grid has been, "bandied about so much that no one knows what it means or what business benefits they might derive from it." This is indeed the core challenge. My experience is that very few business proponents specify software architectures. Generally they could not care less whether a salesperson is pushing SOA, Grid, Cloud, SaaS, or whatnot. These are the concerns of people who support business lines: CTOs, IT support managers, etc.
Chances are you are not dealing with these sorts of technical folks when you are drafting a proposal. Rather, you are likely speaking with a business-analyst. The ones I know are not easily charmed by buzzwords (even if their bosses or peers are). They are more than aware that terms mean different things to different vendors and their staff.
Frankly they don’t care about your pet technology. Instead, they have a set of goals and a given budget. They are measured on how well the project met the user’s needs, how under-budget it came in, and how much time it took. If any one proposal that they have happens to align with other business initiatives of which they are aware, then they will consider the advantages as well as the costs of implementing it. We all know that individual business groups tend to go their own ways, particularly in large companies. We are not going to corral them with the "Grid".
Yet there is plenty of hope for us. We Grid proponents should focus on providing small group-level systems that are quick to set up, scale easily, and meet the customer’s defined business goals. These implementations do not need to fall under the traditional association that Grid has with high-performance computing (HPC): HPC is not often among the business goals. However, if the group Grid is built using open standards, has a resource manager, and allows for the provisioning of global management systems (e.g. authentication domains), it is easy for the technical types to incorporate this small Grid into an enterprise-wide effort. This is how we can sell the Grid.
Source: http://gridgurus.typepad.com
OpenNEbula and VWS
A few days ago the authors of the GridWay Metascheduler released a Technology Preview of their OpenNEbula Virtual Infrastructure Engine (ONE), which enables deployment and management of virtual machines on a pool of physical resources. The software is very similar to the Globus Virtual Workspace Service (VWS), both in architecture and functionality. Both systems provide a new service layer on top of the existing virtualization platforms (currently they support only the Xen hypervisor). This layer extends the functionality of the underlying Virtual Machine Monitors (VMMs) from a single machine to a VM provisioning cluster. Both the ONE Engine and VWS use passwordless SSH access to manage a pool of nodes running VMMs, and allow system administrators to deploy new VMs, to start/shut down and suspend/resume already deployed VMs, and to migrate VMs from one physical host to another. The most notable difference between ONE and VWS is that VWS is built on top of the GT infrastructure and runs within the GT Java container. This allows, for example, RFT stage-in/stage-out requests to be sent along with the workspace creation requests. On the other hand, the ONE Engine is a standalone service, and its installation requirements include only a few software packages that are already present in most Linux distributions.
Source: http://gridgurus.typepad.com
Ten More Reasons to go to Oakland
Rich Wellner came up with four reasons to attend the Open Source Grid and Cluster Conference, to be held in Oakland May 12-16. I outdid him and came up with 10:
1) Globus program is fantastic, including tutorials, advanced technical presentations, contributed talks, and community events on every aspect of Globus.
2) Gobs of other material on Sun Grid Engine and Rocks, and other open source grid and cluster software.
3) Gathering: A great opportunity to meet colleagues, peers, collaborators from the grid and cluster community. The only grid meeting in the US the rest of this year--the next two OGFs are in Spain (June) and Singapore (September).
4) GT4.2: You'll get to learn about the exciting new features in Globus Toolkit 4.2. New execution, data, security, information, virtualization, and core services.
5) Gratification (immediate) as you get to provide your input on future directions for Globus, Sun Grid Engine, Rocks, and other open source systems--and maybe sign up to contribute to those developments.
6) Grid solutions: You'll get to meet the people using Globus to build enterprise grid solutions in projects like caBIG, TeraGrid, Earth System Grid, MEDICUS, and LIGO, and learn about solution tools like Introduce, MPI-G, Swift, Taverna, and UniCluster.
7) Gurus: You get to grill the Globus gurus--or, if you prefer, show off your own Globus guru status.
8) Great price: $490 registration is substantially cheaper than OGF or HPDC, for example, and the hotel rate is reasonable ($149).
9) Gorgeous location: Oakland is easy to get to, with the SFO (an easy BART train ride away), Oakland, and San Jose airports all nearby. It is just a 10 minute train ride to downtown San Francisco. A lovely time to be in the Bay Area.
10) Gorilla and guerilla free: None of the corporate marketing talks that diluted the last GridWorld conference--apart from two sponsor talks, this is pure tech, and highly useful tech at that!
Source: http://gridgurus.typepad.com
A Grid OS?
I have recently been working on a test plan for a framework designed to deliver applications to grid users. The framework is useful for the specific environment in which the customer operates. However it has led me to imagine something more generic that anybody who manages a Grid intended for use by a diverse community would find useful.
You need to have a solid software infrastructure consisting of compilers, libraries, middleware, languages, and services. Your customers want to be able to run the applications that suit their goals best with as little fuss as possible. These include off-the-shelf, commercial customizations, open-source, freeware, supported in-house, and individually built software packages.
While there may be few interoperability issues within a small group or company, you can bet that not all programs will play well with others. Some applications will require very specific libraries and middleware while others will prove to be quite flexible. Some applications require supporting software for 64-bit architectures while others need 32-bit. Other software has different feature-sets on different hardware (e.g. SPARC versus x86) as well as software (e.g. Linux versus IRIX) systems. Still other applications, particularly those that are on long development cycles, tend to use older feature sets whose behavior may have changed or been eliminated from subsequent package releases. Meanwhile your in-house developers might be working on the bleeding-edge and therefore use software that is too unstable for the general user community. Face it: very few software developers expect their products to co-exist with others.
This is a big challenge for anybody who is expected to create a shared-computing environment for a big user community. Typically system administrators will create an operating-system image based upon anticipated usage patterns, security, stability, feature sets, and availability. They will have specific builds for their web farm, mail servers, storage nodes, and (most importantly for us) the Grid computation nodes. They would also like to be proactive and keep their systems up to the latest security and bug-fix patch levels. In addition, they are going to try to provide the best product they can; therefore they would like to provide the most feature-rich infrastructure with which they feel comfortable. However, and most importantly, they will use a package manager to maintain software releases on their machines. Why would any system manager want to reinvent the wheel when it comes to building software when the vendors will do it for them?
This last practice has a significant impact on the software you will find on the Grid. If the hardware vendor has a build for the software you use, chances are that is what you will get. These package managers tend to keep only one version of a particular software package on a system at a time. Consequently if a newer version of a package is desired, the older one is removed. Even if they tried to make multiple packages coexist, files would be overwritten. There are a few "compat" versions but these are exceptions.
Clearly, when your mandate is to provide a shared computing environment that has a significant number of processing nodes as well as users, you will have to provide a more substantive infrastructure. At this point you could either build specialized virtual machines for each operating environment or you can create a shared infrastructure that any image can use. Utility-computing players like Amazon have you create your own machine image (AMI) but I think it is unreasonable to expect application users to have the skills to create a proper operating environment.
The second option, creating a shared infrastructure that any image can use, could be considered a "Grid operating system from scratch", by analogy with Linux From Scratch. This type of framework would force us to place our software into a categorized structure capable of differentiating operating systems, hardware architectures, and application versions. To avoid conflicts, this infrastructure should not replace the standard installs for the operating system: providing application support for a grid is orthogonal to managing a compute node.
All of this needs to work without overtaxing your customers (i.e. application users). The typical user doesn’t care which operating environment they are provided as long as their software runs. Rather they would prefer to be able to call their application as if it were the only version using the only installed system libraries and middleware on the only supported compute node configuration. Basically if a user wishes to use an application, they simply want to call it by name: for example python and perhaps python-2.3.7 or python-2.4.5 should they require a particular version.
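As a rough illustration of calling an application by name, here is a minimal Python sketch of a resolver that maps a request such as python or python-2.4.5 onto a categorized software tree. Everything specific in it is an assumption made for the example: the root /grid/apps, the layout root/os/arch/app/version/bin/app, the INSTALLED table, and the function names are hypothetical and do not come from any existing tool.

```python
from pathlib import PurePosixPath

# Hypothetical inventory: (os, architecture, application) -> installed versions.
INSTALLED = {
    ("linux", "x86_64", "python"): ["2.3.7", "2.4.5"],
    ("linux", "x86", "python"): ["2.3.7"],
}


def parse_request(name):
    """Split 'python-2.4.5' into ('python', '2.4.5'); a bare name has no version."""
    app, sep, version = name.rpartition("-")
    if sep and version[:1].isdigit():
        return app, version
    return name, None


def version_key(version):
    """Order dotted numeric versions numerically (assumes digits only)."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, os_name, arch, installed=INSTALLED, root="/grid/apps"):
    """Return the executable path for a request on the given node type."""
    app, version = parse_request(name)
    candidates = installed.get((os_name, arch, app), [])
    if not candidates:
        raise LookupError(f"{app} is not installed for {os_name}/{arch}")
    if version is None:
        # No version requested: pick the newest installed one.
        version = max(candidates, key=version_key)
    elif version not in candidates:
        raise LookupError(f"{app} {version} is not installed for {os_name}/{arch}")
    return str(PurePosixPath(root) / os_name / arch / app / version / "bin" / app)
```

A wrapper script on each compute node could call resolve with that node's operating system and architecture, so the user only ever types the application name.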
A big component of your effort in creating the proposed framework is providing the correct versions of libraries and middleware to your customers’ frontline applications; this is a task that demands specialized configuration scripts whose job is to set-up the operating environment to match the user request and the operating environment. There are a few tools out there that are quite capable of accomplishing something like this. However there is nothing that I am aware of whose goal it is to specifically deliver applications on a grid. Instead this class of tools provides far more flexibility than what is necessary, let alone wanted.
Ultimately I think that the best thing for the industry would be to establish a standard Grid directory structure for placing software in shared environments (e.g. /
Source: http://gridgurus.typepad.com
Old Universe, Young Grid - Grid Computing and the Largest Scientific Machines Ever Built
Authors: Fabrizio Gagliardi and Francois Grey; Translation: Seyed Mostafa Nategholeslam
Shabakeh Monthly - Shahrivar 1386, Issue 79
Note:
If everything goes according to plan, next year the largest scientific machine ever built will go into operation in a labyrinthine underground complex in Switzerland, near Geneva. The Large Hadron Collider (LHC), which sits more than one hundred meters underground, will accelerate two proton beams in opposite directions around a 27-kilometer circular tunnel. Having nearly reached the speed of light, the two beams will collide head-on and produce a shower of subatomic debris, in which scientists expect to find mysterious particles that have never been observed before. This could change our fundamental understanding of the universe. At least, that is the hope. Researchers at the European Organization for Nuclear Research (CERN), where the LHC will operate, know that finding the elusive particles they are looking for will be very hard. To find them, researchers must sift through enormous masses of collision data: the LHC's data output is expected to average fifteen million gigabytes per year, which is more than the amount of data needed to fill six standard DVDs per minute. Sorting and analyzing this mountain of data is therefore beyond the capability of any supercomputer in the world. So while the LHC team works to complete the giant underground machine, above ground another group of physicists and computer scientists is solving a separate problem: providing a computing infrastructure that can cope with the flood of LHC data. The solution they have found is a vast collection of powerful computers spread across roughly two hundred research centers around the world, linked and configured to work like a single parallel processing system. This type of infrastructure is called a computing grid.
Cluster, and so Grid site, administrators have to deal with the following requirements when configuring and scaling their infrastructure:
In order to overcome these challenges, we propose a new virtualization layer between the service and the physical infrastructure layers, which seamlessly integrates with existing Grid and cluster middleware stacks. The new virtualization layer extends the benefits of VMMs (Virtual Machine Monitors) from a single physical resource to a cluster of resources, decoupling a server not only from the physical infrastructure but also from the physical location. In the particular case of computing clusters, this new layer supports the dynamic execution of computing services (working nodes) from different computer clusters on a single physical cluster.
OpenNebula is the name of a new open-source technology that transforms a physical infrastructure into a virtual infrastructure by dynamically overlaying VMs over physical resources. Computing services, such as working nodes managed by existing LRMs (Local Resource Managers) like SGE, Condor, or OpenPBS, can then be executed on top of the virtual infrastructure, allowing a physical cluster to dynamically execute multiple virtual clusters.
The separation of resource provisioning, managed by OpenNebula, from job execution management, managed by existing LRMs, provides the following benefits:
Consequently, this approach provides the flexibility required to allow Grid sites to execute on-demand VO-specific working nodes and to isolate and partition the physical resources. Additionally, the architecture offers other benefits to the administrator of the cluster, such as high availability, support for planned maintenance and changing capacity availability, performance partitioning, protection against malicious use of resources...
The idea of a virtual infrastructure which dynamically manages the execution of VMs on physical resources is not new. There exist several VM Management proprietary solutions to simplify the use of virtualization, so providing the enterprise with the potential benefits this technology may offer. Examples of products for the centralized management of the life-cycle of a VM workload on a pool of physical resources are: Platform VM Orchestrator, IBM Virtualization Manager, Novell ZENworks, VMware Virtual Center, and HP VMManager.
The OpenNebula Virtual Infrastructure Engine differs from those VM management systems in its highly modular and open architecture, designed to meet the requirements of cluster administrators. The OpenNebula Engine provides a command line interface for monitoring and controlling VMs and physical resources that is quite similar to that provided by well-known LRMs. This interface allows its integration with third-party tools, such as LRMs, service adapters, and VM image managers, to provide a complete solution for the deployment of flexible and efficient computing clusters. Decoupling the service layer from the infrastructure layer allows a straightforward extension of the previous idea to any kind of service. In this way any physical infrastructure can be transformed into a very effective provisioning platform.
A Technology Preview of OpenNebula is available for download under the terms of the Apache License, version 2.0.
Ignacio Martín Llorente
Reprinted from blog.dsa-research.org
Source: http://gridgurus.typepad.com

My esteemed Grid Gurus moderator, Rich Wellner, asked "what is the most creative use for grid technology that you've ever seen?" This is a difficult question to answer, but I will attempt to do so anyway.
I choose the work of George Karniadakis, Suchuan Dong, Nick Karonis, and their colleagues on modeling blood flow in the human body. Why I like it is the wacky (sorry, wonderful) way in which they mapped this apparently highly tightly coupled problem onto the distributed sites of the NSF TeraGrid. Quoting one of their papers:
Motivated by a grand-challenge problem in biomechanics, we are striving to simulate blood flow in the entire human arterial tree. The problem originates from the widely accepted causal relationship between blood flow and the formation of arterial disease such as atherosclerotic plaques. These disease conditions preferentially develop in separated and recirculating flow regions such as arterial branches and bifurcations. Modeling these types of interactions requires significant compute resources to calculate the three-dimensional unsteady fluid dynamics in the sites of interest. Waveform coupling between the bifurcations, however, can be reasonably modeled by a reduced set of one-dimensional equations that capture the cross-sectional area and sectional velocity properties. One can therefore simulate the entire arterial tree using a hybrid approach based on a reduced set of one-dimensional equations for the overall system and detailed 3D Navier-Stokes equations at arterial branches and bifurcations.
In other words, they mapped different parts of the human body (chest, legs, arms, head, and their arterial branches) to different TeraGrid sites, linking them by a simple, non-communication intensive 1-D problem.
The tools used to make this happen were MPICH-G2 (recently renamed as MPIG) and of course Globus.
Source: http://gridgurus.typepad.com
The emergence of cloud computing as a resource on the grid has led to a huge resurgence in interest in utility computing. Looking at the history of utility computing allows us to identify three canonical interaction models that also apply to cloud computing.
Metascheduling
Initial cloud offerings like Amazon Elastic Compute Cloud created the nomenclature around clouds. Going back before the term "cloud" was coined, we see a similar offering from Sun with their utility computing service. In both cases users submit work to the service and eventually get results returned. How the request gets prioritized, provisioned and executed is at the discretion of the service provider. In many ways this is similar to how a typical cluster works. A user selects a cluster, submits a job and waits for a response. Which node is used to execute the request is largely out of the user's control. While acknowledging there are substantial differences between a cluster and a cloud, another similarity reveals itself when thinking about how users interact with compute resources in companies that operate multiple clusters.
As companies began adding additional clusters, users quickly demanded a facility to submit their jobs to a high-level service that would manage the interactions with all the clusters that were available. Most users didn't want to use multiple monitoring tools themselves to access multiple clusters and then use the information gathered to decide where to submit their job. What they wanted was a single interface for submitting jobs and a service that would make policy-based decisions about which cluster should ultimately receive the request.
The situation today is similar. Multiple cloud and utility computing vendors exist and users don't want to spend their time gathering information about the state of each in order to decide where to submit their jobs. Further, administrators and managers need to be able to enforce policy. There are several reasons for requiring this behavior, but probably the easiest to explain is that there are costs associated with resource usage at the cloud vendors and organizations require control over how that money is spent.
The answer to all these needs is to place a metascheduler between the users and the various resources. Users can then use a single interface for all their jobs regardless of where they are ultimately going to be executed.
[A metascheduler] enables large-scale, reliable and efficient sharing of computing resources (clusters, computing farms, servers, supercomputers…), managed by different LRM (Local Resource Management) systems, such as PBS, SGE, LSF, Condor…, within a single organization (enterprise grid) or scattered across several administrative domains (partner or supply-chain grid). -- GridWay
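To make the metascheduling idea concrete, here is a minimal Python sketch of the decision step: given the state of several resources and a spending policy, pick where a job should go. It is an illustration under stated assumptions, not how GridWay or any other metascheduler is implemented; the Resource fields, the budget rule, and the tie-breaking order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical view of one cluster or cloud as seen by the metascheduler."""
    name: str
    free_slots: int
    cost_per_slot_hour: float  # 0.0 for an in-house cluster


def choose_resource(resources, slots, hours, budget_left):
    """Pick the cheapest resource that has capacity and fits the remaining budget.

    Returns None when no resource satisfies the policy, in which case a real
    service would hold the job in its own queue.
    """
    eligible = [
        r for r in resources
        if r.free_slots >= slots
        and r.cost_per_slot_hour * slots * hours <= budget_left
    ]
    if not eligible:
        return None
    # Prefer lower cost; among equally priced resources prefer the emptier one.
    return min(eligible, key=lambda r: (r.cost_per_slot_hour, -r.free_slots))


RESOURCES = [
    Resource("local-cluster", free_slots=4, cost_per_slot_hour=0.0),
    Resource("cloud-a", free_slots=100, cost_per_slot_hour=0.10),
    Resource("cloud-b", free_slots=100, cost_per_slot_hour=0.08),
]
```

With this table, a small job stays on the free in-house cluster, a job too large for it goes to the cheaper cloud, and a job that would exceed the remaining budget is not placed at all.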
Virtual machines
Clouds are only as useful as the software running in them. Therefore, the next important interaction model is that between users and virtual machines.
Users often need very specific software stacks. This includes the application they are running, support libraries and, in some instances, specific versions of operating systems. Analysts are saying that there are now at least 35 companies addressing the needs of users in managing these interactions. This includes software to implement the enactment layer, manage images, policy engines, user portals and analytics functions.
One of the questions yet to be answered in the cloud community is how to allow users to make use of several clouds on a day to day basis. As this market continues to mature, look for many of the same challenges (e.g. security, common APIs, WAN latencies) that the grid community has been tackling for over a decade to become increasingly important to cloud users.
Application virtualization
In the context of clouds, application virtualization gains significant power by being able to add or remove instances of applications on demand. This is currently being done in the context of data center management using proprietary tools. Clouds present a cool new opportunity to do the balancing act on a regional basis. As more clouds are built and standard interfaces made available, users will be able to load balance to multiple clouds operating in different countries or cities as demand grows and shrinks.
These three models represent established, powerful interaction modes that are being used in production in a variety of settings today. It will be interesting over the next year to see which cloud operators adopt which models and how many lessons they take from existing non-cloud implementations versus trying to reinvent the wheel in a new way.
Source: http://gridgurus.typepad.com
The grid.org team has just declared UniCluster 3.2 to be stable.
In particular, being able to install over an existing Grid Engine installation is super cool. This is a feature that I've been excited about for a long time as it brings globus, ganglia and the UniCluster monitoring application to the existing 10,000 or so Grid Engine clusters.
Source: http://gridgurus.typepad.com
Red Hat has introduced a new version of its operating systems in which parallel computing, grid computing, and similar topics of interest to computational practitioners have become much more attainable.
The beta version of this operating system is now available for download.
Here is part of the relevant description:
Red Hat Enterprise MRG supports the full spectrum of distributed tasks, including:
* High-speed, reliable, or large file messaging
* Parallel & cycle-stealing scheduling
* High Performance Computing (HPC) and High Throughput Computing (HTC)
* Distributed workload management
Red Hat Enterprise MRG can run across multiple platforms but also takes deep advantage of Red Hat Enterprise Linux capabilities like clustering, IO, and virtualization for optimal performance and qualities of service.
References:
http://www.redhat.com/mrg/?intcmp=70160000000HEmC
http://www.redhat.com/mrg/grid
The Information Leap of the Computer World
This article is a scientific report on Japan's national project called NAREGI, short for National Research Grid Initiative, which was put before the judgment of scientists and specialists from around the world at an international symposium. The author attended this symposium as well and presents this report for the information of fellow researchers at home and the relevant officials. The main goal of the NAREGI project is to reach a computing power of one petaflops (10 to the power of 15 floating-point operations per second). This amount of computing power is equivalent to one million Pentium 4s. The project was defined to run from 2003 to 2007 with a budget of about 20 million dollars per year. That is, a budget of 100 million dollars was foreseen for the whole 5-year project. Perhaps something like one day of our oil income!
What are they going to do with a petaflops of computing power? According to Professor H. Nakamura, head of the Institute for Molecular Science in Nara, Japan, the goals are as follows: 1- To create and institutionalize a new science called nanoscience (not nanotechnology) as one of the basic sciences. In the author's view, the modest and unassuming Japanese temperament can hardly make a claim of this kind, and if a Japanese scientist does make such a claim it should be taken very seriously! 2- To make the GRID concept the natural tool of this science. GRID is a heterogeneous computing environment that will eliminate the problem of translating computer codes between different operating systems. In fact, GRID is itself an operating system that, by means of tools called middleware, enables communication between codes developed under different operating systems. According to Japan's deputy minister of science, about 300 computer engineers are participating in this project to develop the GRID operating system and its associated middleware. So far the Japanese have successfully solved several problems on the GRID system as examples of the practicality of this computing environment. A fully quantum simulation of large molecules on the scale of proteins in vacuum is not possible with current computing power. The presence of a solvent (that is, 10 to the power of 23 water molecules!) makes the problem even more complicated. But using the current computing power of about 17 teraflops (tera being 10 to the power of 12), the Japanese have been able to simulate some chemical reactions in solution, including vital processes involving proteins. This computing power is supplied by about three thousand computers throughout Japan that are connected by an ultra-fast network.
The ultimate goal is to be able to simulate nanometer-scale electronic components (comprising about one million electrons) on the GRID system without building the components in the laboratory. Of course, the application of the GRID system is not limited to quantum simulations and chemical processes or their applications in biological matter. An interesting point is that about 40 major Japanese companies, such as Hitachi and Toyota, are also participating in this project. For example, one potential application of this computing environment could be simulating car crashes in greater and more precise detail.
I also relay some points from discussions with some of the European and American scientists who were invited to this symposium. Professor Sandro Sorella of SISSA in Italy believes that the more computers from different centers can be connected under GRID technology, the more the demand to use the network and run programs on it will grow, so that in practice there is no difference between using GRID and not using it. I asked Professor Takami Tohyama of the Institute for Materials Research at Tohoku University in Japan: "You have developed the cleanest exact diagonalization code in the world over the past 15 years; are you willing to let someone on the other side of the world compile and use your code?" He replied that this is a dream! Realizing it seems difficult. A Russian-born professor from Canada who specializes in large-scale computing held that even if they reach the petaflops goal, their power will be only 1000 times the computing power of typical 512-node clusters. Since 1000 is 10 cubed, that means only a 10-fold increase in the linear size of a 3-dimensional system, and it will whet our appetite for larger systems, because this is still not enough.
Professor G. Baskaran of the Institute of Mathematical Sciences in Madras, India, believes that big computers are not the solution to complex problems in condensed matter physics or biological matter! When a new problem with a new level of complexity confronts us, we need to invent a new "concept" to solve it. In the author's view, too, this esteemed professor's statement is sound in general. In any case, however, our country needs computing power on the order of a few tens of teraflops (equivalent to several thousand Pentium 4s) for some national projects.
If this amount of computing power can be obtained with something like a few tens of millions of dollars, it is not a difficult task for our country. It suffices to entrust the work to those who know how to do it!! Our country currently has scientists in chemistry who have been recognized by international standards as highly cited scientists. In physics too, as far as the author knows as a physics researcher, there are capable scientists in this country. In computer science, although it is not my specialty, the young people who are able to come first in the world at the toy level (the robot soccer competition) can clearly advance shoulder to shoulder with our Japanese friends at more serious and more applied levels. In other fields too, there are surely people who, if trusted, are able to do important work.
An instructive lesson from this Japanese symposium is that the spirit of accountability among Japanese scientists requires that, in return for the 20 million dollars per year they have used so far, they invite some of the most distinguished scientists in chemistry (for example, the most highly cited chemist in the world attended this symposium), physics, and large-scale computing from America and Europe, in order to practice the good tradition of "accountability" in their presence! Nothing is secret, either! Everyone has been openly invited to give an opinion. Anyone who thinks of a flaw in the system and approach of the Japanese scholars and points it out to them will be met with a gentle Japanese smile and abundant courtesy and thanks.
Source: http://www.bashgah.net
HTC and Cloud and Grid Computing
The HyperText Computing (HTC) paradigm is not a “complete solution” to the challenges and opportunities afforded by Cloud and Grid computing; however, this post argues that HTC is part of the solution. My angle into this question is via a recent blog post.
This is how Tim Foster, in a recent post at Grid Gurus, concludes his discussion of current and future trends of Cloud and Grid computing (emphasis mine):
In building this distributed “cloud” or “grid” (“groud”?), we will need to support on-demand provisioning and configuration of integrated “virtual systems” providing the precise capabilities needed by an end-user. We will need to define protocols that allow users and service providers to discover and hand off demands to other providers, to monitor and manage their reservations, and arrange payment. We will need tools for managing both the underlying resources and the resulting distributed computations. We will need the centralized scale of today’s cloud utilities, and the distribution and interoperability of today’s grid facilities.
The concepts that Tim highlights: “on-demand provisioning”, “configuring integrated virtual systems”, providing “precise capabilities” and a focus on the needs of the “end-user” are all addressed by the HyperText Computing (HTC) paradigm. HTC also addresses the need to view central resources through the same lens as localised ones.
HyperText Computing (or Request Based Distributed Computing, RBDC) is a small extension of http and of our conceptions of server, proxy and client. It creates a distributed computing platform that is built from an end-user perspective outwards, just as http does for information. It is built on a recognition of the equivalence between http resources and the code that, when executed, will return the resource. RBDC unifies programming models by applying browser-based sandboxed Virtual Machines (VMs) to our conception of proxies and servers.
Key benefits of RBDC are ultra-lightweight distributed computing, run-time code mobility, and backwards compatibility with http.
A fuller description of RBDC may be found here.
Http offers location transparency for retrieving data; a small http extension can also provide location transparency for code execution.
Source: http://www.davidpratten.com
The Globus Alliance has been selected as a Google Summer of Code 2008 mentoring organization. Google Summer of Code (GSoC) is a program that offers student developers stipends to write code for various open source projects. Google works with several open source, free software, and technology-related groups to identify and fund several projects over a three month period. Historically, the program has brought together over 1,500 students with over 130 open source projects to create millions of lines of code. The program, which kicked off in 2005, is now in its fourth year. If you are a student and would be interested in participating in GSoC with Globus as your mentoring organization, please take a look at our GSoC Ideas page. This page lists projects that Globus has proposed for GSoC, but it is not a closed list. If you have an idea for a cool project that uses or extends Globus technologies, please take a look at our list of Globus GSoC mentors and contact the one which most closely matches your interests. Take into account that student proposals must be submitted by March 31st and that you must meet Google's student eligibility criteria.
If you have any questions about our participation in GSoC, please contact the Globus GSoC administrators.
Source: http://gridgurus.typepad.com
Choosing distributed resource management (DRM) software may not be a simple task. There are a number of open source and commercial software packages available, and companies usually go through a product evaluation phase in which they consider factors like software license and support costs, maintenance issues, their own use cases and existing/planned infrastructure, etc. After following this (possibly lengthy) procedure, and finally making the decision, purchasing and installing the product, you should also make sure that the DRM software configuration fits your cluster usage and needs. In particular, designing the appropriate queue structure and configuring resources, resource management and scheduling policies are some of the most important aspects of your cluster configuration. At first glance devoting your company's resources to something like queue design might seem unnecessary. After all, how can one go wrong with the usual "short", "medium" and "long" queues? However, the bigger your organization is and the more diverse your users' computing needs are, the more likely it is that you would benefit from investing some time into designing and implementing queues more efficiently. My favorite example here involves high priority jobs that must be completed in a relatively short period of time, regardless of how busy the cluster is. Such jobs must be allowed to preempt computing resources from other lower priority jobs that are already running. Better DRMs usually allow for such a use case (e.g., by configuring "preemptive scheduling" in LSF, or using "subordinated queues" in Grid Engine), but this is clearly something that has to be well thought through before it can be implemented. In any case, when configuring DRM software, it is important to keep in mind that not all jobs (or not all users for that matter) are created equal...
Summarized from the following blog:
آدرس : http://salmancg.blogfa.com/cat-6.aspx
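The high priority use case described above can be illustrated with a toy model. The sketch below is not how LSF or Grid Engine implement preemption; the class name `PreemptionSketch`, the slot count, and the numeric priorities are all assumptions made for illustration. It only shows the basic rule: when every slot is busy, a new job may suspend the lowest priority running job if its own priority is strictly higher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical toy model of priority preemption; not a real DRM API. */
public class PreemptionSketch {
    static final class Job {
        final String name;
        final int priority;
        Job(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    private final int slots;
    // Running jobs ordered lowest priority first, so the preemption victim is at the head.
    private final PriorityQueue<Job> running =
            new PriorityQueue<>(Comparator.comparingInt((Job j) -> j.priority));
    private final List<String> suspended = new ArrayList<>();

    public PreemptionSketch(int slots) {
        this.slots = slots;
    }

    /** Returns true if the job starts running, false if it has to wait. */
    public boolean submit(String name, int priority) {
        Job job = new Job(name, priority);
        if (running.size() < slots) {
            running.add(job);
            return true;
        }
        Job victim = running.peek();
        if (victim != null && victim.priority < priority) {
            running.poll();              // suspend the lowest priority running job
            suspended.add(victim.name);
            running.add(job);
            return true;
        }
        return false;                    // no free slot and nothing to preempt
    }

    /** Names of jobs that were suspended to make room, in order of suspension. */
    public List<String> suspended() {
        return suspended;
    }

    public static void main(String[] args) {
        PreemptionSketch cluster = new PreemptionSketch(2);
        cluster.submit("long-1", 1);
        cluster.submit("long-2", 2);
        System.out.println(cluster.submit("short-3", 1));  // false: cluster is full
        System.out.println(cluster.submit("urgent", 10));  // true: preempts long-1
        System.out.println(cluster.suspended());           // [long-1]
    }
}
```

A real scheduler also has to decide what happens to the suspended job (resume, requeue, or kill), which is exactly the kind of policy question that needs to be thought through in advance.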
The first file is an introductory overview of grid technology.
The second file is a detailed examination of grid architecture. It is a translation of Part 2 of the book Introduction to Grid from IBM; the translation was done (in alphabetical order) by me, Ms. Esmaeilzadeh, Mr. Banaei, Ms. Tayarani and Mr. Fatehi.
Here I would like to thank dear Jamal for his follow-ups and dear Bijan for his work in spreading the word.
PPT file for the Presentation Methods project (Grid Technology). From the link below you can also download a very reputable book on Grid. This book is published by IBM.
My team is looking for some folks to join up and help us bring grid technology to our customers. Drop me a line!
Source: http://gridgurus.typepad.com
I once worked with this person who wrote programs that only wrote to a single file. Once this program was put into the grid environment it would routinely create files that were hundreds of gigabytes in size. Nobody considered this to be a problem because the space was available and the SAN not only supported files of that size, but also performed amazingly well considering the expectations. While this simplifies the code and data management, there are a number of reasons why this is not a good practice.
Why would anybody do such a thing? All your data are belong to us?
Source: http://gridgurus.typepad.com
Edna Nerona, in an IBM developerWorks article, recommends a list of reading material for grid developers and researchers. Some of the important links are provided here; for the rest, see the actual article.
Source: http://www.gridblog.com/comments.php?id=242_0_1_0_C
"Grid computing is the latest to join the bandwagon of managed services. It's a good way of avoiding an expensive infrastructure investment", writes Bob Violino in his article Grid Computing Comes Around in this edition of Global Services magazine.
This article focuses on grid computing as a managed service. “What differentiates grid managed services from straight hosting is that the entire technology substrate that enables grid computing [software, hardware, storage] has already been deployed by the service provider,” says Ahmar Abbas, MD, Grid Technology Partners, a consulting firm in Falls Church, Va. “The client needs to just focus on the application enablement so that it can utilize the grid infrastructure.” Also different is the concept of paying for CPU utilization rather than a monthly fee for hosting infrastructure.
Source: http://www.gridblog.com/comments.php?id=240_0_1_0_C
A few weeks back I blogged about Amazon's Simple Storage Service (S3) and how it was gaining traction in the market.
Well, it turns out that Amazon has even greater ambitions than just providing loads of hosted and managed storage! Today they announced their Elastic Compute Cloud (Amazon EC2).
The key to the technology seems to be the Amazon Machine Image (AMI). Users can create an AMI based on their particular application or system profile. These images are uploaded to the S3 service and are brought online when required.
I can see some immediate business continuity / disaster recovery applications, though I am not quite sure how load balancing occurs across multiple AMI instances that are brought live as application servers.
Another great step by Amazon to turn its technology platform into a revenue generating engine!
Source: http://www.gridblog.com/comments.php?id=239_0_1_0_C
Independent Software Vendors (ISVs) that venture into the SaaS world have taken on two distinct sets of responsibilities. First, like traditional software companies, SaaS vendors are responsible for continually delivering innovative and relevant software products. Second, SaaS vendors must also develop, manage and support the infrastructure that is used to provide the software to the end user, under a regime of demanding service level agreements and associated penalties. Here's a look at the challenges (and rewards) ahead.
Read full article in ITWorld
Source: http://www.gridblog.com/comments.php?id=238_0_1_0_C
Recently I have been busy at work, so I stopped blogging for a while. During this period of inactivity there have been numerous news items and grid-related activities that I have been starring in my reader and mail. The following are the highlights, in this somewhat longer post.
Mark Linesch, who will lead the group, said the OGF would "open new doors to scientific discovery, business value and commercial adoption worldwide."
Experts welcomed the end of the groups' prolonged sparring over definitions and semantics.
SynfiniWay proved to have the most complete and integrated Grid computing solution for aerodynamics analyses at Airbus, combining service-oriented applications with open workflow capabilities for efficient support of complex dynamic processes.
Fujitsu Systems Europe has also been contracted to develop the services around the aerodynamic applications, and to integrate SynfiniWay within the existing user desktop tools for transparent grid access.
Source: http://www.gridblog.com/comments.php?id=236_0_1_0_C
What do Microsoft and SmugMug have in common? Both rely on the Amazon Simple Storage Service (S3) for cheap and reliable web-scale storage. With Amazon S3, growing companies now have the resources to look and feel like a Fortune 500 enterprise.
Today, Amazon announced a variety of customers that together are storing more than 800 million data objects using Amazon S3. On one end of the spectrum there is Microsoft, which is utilizing S3 to dramatically reduce its storage costs without compromising scale or reliability. On the other end are small businesses such as SmugMug that are depending on the S3 benefits of scale and cost-efficiency previously only available to large companies.
Source: http://www.gridblog.com/comments.php?id=235_0_1_0_C
We're combining the best of GlobusWorld, Grid Engine Workshop and Rocks-a-Palooza into one killer event in Oakland this May. Here's why you should come to the Open Source Grid and Cluster Conference:
This should be a fantastic conference, and I look forward to meeting you there.
The term "cloud computing" seems to be attracting lots of attention these days. If you google it, you'll find more than half a million results, starting with Wikipedia definitions and news involving companies like Google, IBM, and Amazon. There is definitely no shortage of blogs and articles on the subject. While reading some of those, I've stumbled upon an excellent post by John Willis, in which he shares what he learned while researching the "clouds".
One interesting point from John's article that caught my eye was his regard of virtualization as the main distinguishing feature of "clouds" with respect to the "old Grid Computing" paradigm ("Virtualization is the secret sauce of a cloud."). While I do not disagree that virtualization software like Xen or VMware is an important part of today's commercial "cloud" providers, I also cannot help noticing that various aspects of virtualization were part of grid projects from their beginnings. For example, SAMGrid, one of the first data grid projects, which has served (and still serves!) several of Fermilab's High Energy Physics experiments since the late 1990s, allowed users to process data stored at multiple sites around the world without requiring them to know where the data would be coming from or how it would be delivered to their jobs. In a sense, from a physicist's perspective, experiment data was coming out of the "data cloud". As another example, the "Virtual Workspaces Service" has been part of the Globus Toolkit (as an incubator project) for some time now. It allows an authorized grid client to deploy an environment described by the workspace metadata on a specified resource. Types of environments that can be deployed using this service range from an atomic workspace to a cluster.
Although I disagree with John's view on the differences between the "old grid" and "new cloud" computing, I still highly recommend the above mentioned article, as well as his other posts on the same subject.
Source: http://gridgurus.typepad.com
Advancements in networking and cheaper computing technology have enabled the Internet to be used for resource sharing, instead of just document sharing. The resources can include computing, storage and network resources. The dynamic nature of the Internet in terms of node/network failures poses challenges for large scale resource sharing. Further, the resources are autonomic, implying that they may join or leave the system dynamically. Thus, solutions for Internet scale resource sharing must enable dynamic application dependability and middleware reconfigurability. This means that the underlying resource sharing middleware must ensure dependability of applications in spite of resource/network dynamics. Further, the middleware components themselves must adapt to these dynamics (middleware reconfigurability).
Peer-to-Peer (P2P) systems such as Gnutella, Freenet, Pastry etc. provide reconfigurability and scalability. However, these file sharing P2P systems may not be directly usable for sharing compute resources on the Internet, because they do not consider the proximity of resources or their capabilities. We explore the use of P2P system concepts to build a reconfigurable and scalable middleware for Internet scale resource sharing. In unstructured P2P systems such as Gnutella and Freenet, the overlay is built in an uncontrolled fashion, possibly with self-organizing behaviour. They provide flexibility for finding resources by supporting arbitrary search queries. They may be inefficient, however, as they use flooding for the search. Further, unstructured P2P systems cannot provide guarantees about finding the data. In contrast, structured P2P systems assign static identifiers to peers and impose an overlay structure based on the node identifier. A routing structure based on distributed data structures (a Distributed Hash Table, as in Chord and Pastry) is also imposed. Structured P2P systems can provide data location guarantees and are efficient for searching (O(log(n)) time, for n nodes). However, they support only limited, exact-match queries.
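To make the structured case concrete, here is a minimal single-process sketch of an identifier ring. It is an illustration only: the class name `RingSketch` and the small integer identifiers are assumptions, and a sorted map stands in for the distributed routing that Chord or Pastry perform across nodes. What it does show is why lookups are exact-match only: a key is owned by the first node identifier at or after it on the ring, so there is nothing to search by capability or proximity.

```java
import java.util.TreeMap;

/** Hypothetical single-process sketch of a DHT-style identifier ring. */
public class RingSketch {
    // Node identifier -> node name, kept sorted so a successor lookup is O(log n).
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    /** Places a node on the ring at its static identifier. */
    public void join(int nodeId, String name) {
        ring.put(nodeId, name);
    }

    /** Removes a node; its keys fall to the next node clockwise. */
    public void leave(int nodeId) {
        ring.remove(nodeId);
    }

    /** Returns the node responsible for a key: the first node id >= key, wrapping around. */
    public String lookup(int keyId) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes on the ring");
        }
        Integer owner = ring.ceilingKey(keyId);
        if (owner == null) {
            owner = ring.firstKey(); // wrap past the largest identifier
        }
        return ring.get(owner);
    }

    public static void main(String[] args) {
        RingSketch r = new RingSketch();
        r.join(10, "A");
        r.join(40, "B");
        r.join(90, "C");
        System.out.println(r.lookup(25)); // prints B
        System.out.println(r.lookup(95)); // prints A (wraps around)
        r.leave(40);
        System.out.println(r.lookup(25)); // prints C
    }
}
```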
We propose Vishwa, a two layered P2P middleware for resource sharing in the Internet. It is a scalable and dynamically reconfigurable middleware. It provides a dependable execution environment for grid applications. The task management layer of the middleware is responsible for initial task deployment on the best available under-utilized nodes as well as the runtime migration of tasks to handle load dynamics. The task management layer is realized as an unstructured P2P layer and allows logical resource clustering based on proximity. The unstructured overlay allows neighbour lists to be constructed based on application specific criteria, whereas in structured overlay, the neighbour lists are only based on node identifiers. Thus, the task management layer of Vishwa constructs the neighbour list based on resource capabilities. If you want to know more about Vishwa, please follow the link below for the presentation or try the Vishwa technical report on the publications page.
Slides:
Vishwa; Vishwa Compared with Globus Toolkit
We have extended Vishwa to a data management platform named Virat. Large amounts of scientific data are being produced; for instance, see the Grid Physics Project or the Compact Muon Solenoid (CMS). Distributed computations on this data must be scheduled. Hundreds to thousands of geographically distributed users need access to data for performing computation. So, there is a need to replicate the data at appropriate locations to handle node/network failures and minimize computation time and/or bandwidth. There must be ways of describing the data in the form of meta-data to allow geographically distributed access to the data. The meta-data must also be replicated for fault-tolerance. Thus, replica management of data as well as meta-data is important. There must also be an efficient mechanism to search/query the data. Another important requirement in a data grid is the discovery of data/compute resources based on proximity and node capabilities. We have designed and developed Virat to address the above issues and the orthogonal non-functional properties of scalability and fault-tolerance.
A platform that can be used for building such generic services must address key issues such as scalability, middleware reconfigurability, dependability, replication and resource/data discovery mechanisms. Existing shared object spaces cannot be used directly as such a platform because they do not scale up. Inefficient mechanisms for handling failures and object lookups and the use of centralized components inhibit their scalability. Virat focuses on the integration of shared object spaces with Peer-to-Peer systems. Virat provides a shared object space abstraction over a wide area distributed system. It is built using a unique two-layered P2P architecture that combines the advantages of structured and unstructured P2P systems. The unstructured layer facilitates capability based neighbourhood formation and allows cluster-level replication of data (and meta-data) to handle failures and to maintain consistency. The structured layer allows failure data to be recovered even across zones/clusters in O(log(N)). Performance studies (over Intranet and WAN testbeds) using a prototype implementation suggest that Virat can scale to millions of objects. For more details on Virat please check the following slides or follow the publications link for Virat papers.
I think that one of the most exciting consequences of the rise of multicore is the possibility of overcoming the limitations of the WAN by processing where you collect your data. It is exceptionally difficult and/or expensive to move large amounts of data from one distant site to another regardless of the processing capability you might gain. Paul Wallis has an excellent discussion about the economics and other key issues that the business community faces with computing on "The Cloud" in his blog Keystones and Rivets.
So how do cores help us get past the relatively high costs of the WAN? The first signs of this trend will be wherever significant amounts of data are collected out in the field. Currently you have a number of options, none of them great, for retrieving your data for processing. These include:
There never really was much consideration given to processing the data in situ because the computational power just was not there. Multicore processors have allowed us to rethink this.
For example, consider one of the most sought after goals in a hot industry: near-real time monitoring of a reservoir for oil-production and/or for CO2 sequestration. (see the Intelligent Oilfield, IPCC Special Report on Carbon dioxide Capture and Storage) The areas where this is most desired tend to be fairly remote such as offshore or in the middle of inhospitable deserts. There is no network connectivity to speak of to these areas let alone enough to move data from a large multi-component ocean-bottom seismic array like those found in the North Sea.
Consequently, a colleague of mine and I were tasked with working out how we might implement the company’s processing pipelines in the field. Instead of processing the data using hundreds of processors and an equivalent number of terabytes of storage, everything needed to fit on ***maybe*** as much as a single computer rack. Our proposal had to include power conditioning and backup, storage, processing nodes, management nodes (e.g. resource managers), as well as nodes for user interaction. Electrical circuit size limitations also limited our choices. Needless to say, 30-60 processors were just not enough capacity to seamlessly transition the algorithms from our primary data center. The only way it could be done was by developing highly specialized processing techniques: a task which could take years.
Now that we are looking at 8 cores per processor with 16 just around the corner everything has changed. Soon, it will be possible to provision anywhere from 160-320 processors under the same constraints as before. It is easy to imagine another doubling of this shortly thereafter. Throw in some virtualization for a more nimble environment and we will be able to do sophisticated processing of data in the field. In fact, high-quality and timely results could alleviate much of the demand for more intensive processing after the fact.
Who needs the WAN and all of its inherent costs and risks? Why pay for expensive connectivity when you could have small clusters with hundreds of processors available in every LAN? If remote processing becomes commonplace because of multicore, we might see the business community gravitate towards the original vision of the Grid.
Source: http://gridgurus.typepad.com
This is a repost of a reply I wrote to a LinkedIn question
Mark Mathson gave a great answer and blog link in his reply, but it's worth going down one additional level of detail.
A cloud is operated by something. That something is software and people need to be able to interoperate with that software. So the question is twofold.
1) What does that software do?
2) What does the interaction model look like?
Part one is mostly undefined. The term cloud computing is only a few months old at this point and there is no definition that I've seen that describes in detail what the services are and how they work. Since cloud computing is a subset of grid computing we can make some educated guesses as to how this will turn out.
o There will have to be a security model. This model will be complex enough that I'm calling out additional specifics. Currently there is no model specified in any definition of cloud computing.
o That model includes delegation. In the early development of the grid we had a security model without delegation and it was a non-starter. Anytime you need to request something of a service you need to delegate authority to that service.
o That model will have to be multi-institutional. By this I mean that the model must allow people from different communities to be able to access the resources within the cloud without having to join a common security domain. The owner of the resources will have to be able to make local decisions about who is allowed to use his resources.
o Monitoring will be complex, but must run on a common backplane. In the grid community we have hierarchical, distributed monitoring that allows canonical services and a variety of applications to push monitoring information upstream to consumers. No definition of cloud computing currently has any monitoring specification.
o Data handling will be a challenge. In the grid community we discovered early on that moving data between facilities was a bottleneck due to some decisions made in developing TCP decades ago. We worked around these to develop protocols that move data at near theoretical maximum rates even in WAN environments. We also found that people who want to move a lot of data find it cumbersome to manage the processes to do that themselves. We developed 'fire and forget' mechanisms for moving data. A user can make a request, walk away and check the results the next day. As a side note, this behavior requires delegation to work in a secure fashion.
All of the above have to be dealt with before one even begins to contemplate the VM issues that seem to dominate the cloud computing discussions.
The second part is about how the user will interact. That one is much more trivial to answer. Our users already interact in a variety of ways. Some examples include browsers, native applications, java applications, remote desktops and display technologies like x-windows.
All of those will continue to be in play in a cloud based architecture because each has significant structural, administrative and performance advantages that have led to their survival for a long time.
The cloud won't be about what window a user interacts with, it will be about the plumbing that makes that window useful.
Source: http://gridgurus.typepad.com
The open source justification is no longer the new path that few organizations have walked. I remember in the mid-90's when I switched from Solaris x86 to BSD and then to linux trying to explain what I was doing to co-workers. At that point I wasn't even trying to justify a decision to migrate some production machines, I was just exploring alternatives on my workstations. Still, I got far more confusion and skepticism than nods of understanding.
Today the world is different. People use open source for a wide variety of things. Most folks understand the landscape and regularly use total cost of ownership and risk mitigation as important parts of their final decision. What's still missing, in some cases, is the ability to take advantage of a unique opportunity that open source gives you at an infrastructure layer.
Grid software is fundamentally concerned with managing very complex business needs in a manner that allows humans to understand what is going on with their systems. As such one of the most important aspects is the ability to integrate that infrastructure with applications in a manner that allows developers and system integrators to present simpler interfaces to their users.
With proprietary systems there are often APIs that allow this to be done. However, in no instance that I've seen are these APIs on the 'critical path' for the company making the software. They are always offered essentially as a patch that some powerful customer needed and now is slowly leaking out to the rest of the customer base. These systems also tend to be highly unstable and each version carries changes in the API. These changes are frequently radical and nearly always undocumented until a customer comes across something that has stopped working and raises a stink with the vendor.
Open source software tends to work differently, especially at an infrastructure layer. The components are built by folks who are 'eating their own home cooking' and understand the implications of a change in interface. As such, interface changes tend to be infrequent and, when they do occur, highly justifiable. The reduction in the quantity of changes is helpful, and because there is no vendor forcing an upgrade, the fact that you can adopt a new version when the timing is right for your organization is also a big plus.
The world has changed. And it's changed for the better for data center managers globally.
Source: http://gridgurus.typepad.com
Quick links to GridLab Workpackages:
TB WP1 WP2 WP3 WP4 WP5 WP6 WP7 WP8 WP9 WP10 WP11 WP12 WP13
Source: http://www.gridlab.org
Access for Mobile Users
Small and flexible mobile devices are increasingly used for web access to various remote resources. This work package aims to provide grid access mechanisms for such devices. This requires adapting existing access technologies, such as portals, to low bandwidth connectivity and low-end user hardware. The mobile nature of such devices also requires flexible session management and data synchronization. This work package will extend the scope of present grid environments to the emerging mobile domain. Using the new higher bandwidth mobile interconnects, very useful and previously impossible scenarios of distributed and collaborative computing can be realized.

The main goal of our efforts is to give Grid users the ability to access their applications and resources from any place using mobile devices. In our approach the devices are incorporated only as clients of Grid services (not peers). Moreover, because of the limitations of mobile devices, this approach assumes a gateway between the client and the Grid. These limitations also forced us to pay special attention to building flexible user interfaces.
We also developed (or co-developed) several specialized mobile-oriented Grid services. In some cases we provided only a mobile wrapper around the heavyweight Grid services placed in the gateway. Others were built from scratch as specialised mobile services.
Our mobile client is tightly coupled with the gateway. This "connection" means that features in the mobile client are mapped to corresponding plugins in the gateway. Those plugins are responsible for interacting with Grid services on behalf of the mobile client.
The schema below presents our approach: we give mobile users access to Grid services via the gateway. The mobile client running on a mobile device, together with the gateway, makes up the Minimal Grid Interface.

German radio: the fastest supercomputer for non-military purposes goes online
The website of Deutsche Welle radio wrote: the fastest supercomputer for non-military purposes, worth 15 million euros, has officially gone online in Germany.
According to the website, Thomas Lippert, head of the Jülich computing centre, said: in the list of top supercomputers, JUGENE is the fastest supercomputer used for non-military purposes.
The world's fastest supercomputer built for non-military purposes, named JUGENE, officially began operating last Thursday at the Jülich research centre near the city of Cologne, Germany.
The source reported: "The supercomputer of the United States nuclear research centre in Florida, which has military applications, is the only supercomputer faster than JUGENE.
JUGENE's case is too big to fit under a desk. This supercomputer, whose dimensions befit its name, consists of 65,000 processors housed in 16 cabinets, each the size of a public telephone booth.
JUGENE sits in a large hall together with JUMP and JUBL, the Jülich research centre's two other supercomputers.
Visitors to the hall must wear ear protection, because a large number of air-conditioning units and blowers ventilate the racks.
Each cabinet produces about 30 kilowatts of heat, and the cooling units run constantly to keep JUGENE at 16 degrees Celsius, the temperature at which the supercomputer performs best.
JUGENE can perform 223 thousand billion calculations per second, an almost unimaginable figure that is equivalent to the computing speed of 20,000 computers.
One way to picture it: each of the seven billion people on Earth performing 30,000 mathematical calculations, all at the same time, in one second.
Of course, these calculations would certainly not be 1+1, but more complex floating-point calculations or something similar. Such performance is without doubt beyond the capability of a single processor.
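As a quick check of the comparison above, using only the article's own figures (223 thousand billion calculations per second and a population of seven billion):

$$
\frac{223 \times 10^{12}\ \text{calculations/s}}{7 \times 10^{9}\ \text{people}} \approx 3.2 \times 10^{4}\ \text{calculations per person per second}
$$

This is consistent with the rounded figure of 30,000 calculations per person.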
In JUGENE each processor does one part of the work, so the most important point in a supercomputer is that a coherent network can combine the results of the processors' calculations into a single output.
The communication network between the processors must be stable and very fast so that it can continually transfer data from one processor to another.
From now on, about 200 German and European research groups can count on JUGENE's help in their projects.
An independent oversight board decides which project has priority.
Computer simulations have long formed the third pillar of science, alongside theory and experiment.
One of the specialists, who is writing a doctoral thesis at the Jülich computing centre, says: when a drug is produced, countless tests must be carried out on it before it reaches the market.
Carrying out these tests in the real world would be enormously expensive, so computer simulations at least show in which direction the work is heading.
The same specialist continues with another example: when an astronomer needs an experiment, bringing a star onto the bench of a physics laboratory would be very, very expensive; moreover, it would take billions of years.
That is why computer experiments are of particular importance.
Another specialist is carrying out an experiment in quantum physics, studying, together with colleagues, the strongest force in nature.
This is the strong nuclear force, which binds quarks into neutrons or protons and holds them in the atomic nucleus.
Experiments in this area cannot be carried out in the laboratory with existing methods. Where physics has run out of answers, the computer appears to be the only available solution.
The director of the Jülich institute sees the main advantage of this supercomputer over its peers as its considerable energy savings relative to its computing capability. In this respect he considers JUGENE a green supercomputer.
Whether JUGENE will keep its privileged position among other supercomputers for long is open to question.
At present JUGENE's main rival is "Ranger", the supercomputer of the University of Texas in the United States, which will start operating soon.
According to one of the directors of the Jülich institute, the list of the world's top 500 supercomputers is published twice a year, and in recent years 250 to 300 new computers have entered the list each time.
So one should not be surprised by the arrival of a faster computer."
SRB – The SDSC Storage Resource Broker – supports shared collections that can be distributed across multiple organizations and heterogeneous storage systems. The SRB can be used as a Data Grid Management System (DGMS) that provides a hierarchical logical namespace to manage the organization of data (usually files).
The SRB software infrastructure can be used to enable Distributed Logical File Systems, Distributed Digital Libraries, Distributed Persistent Archives, and Virtual Object Ring Buffers. The most common usage of SRB is as a Distributed Logical File System (a synergy of database system concepts and file systems concepts) that provides a powerful solution to manage multi-organizational file system namespaces.
SRB presents the user with a single file hierarchy for data distributed across multiple storage systems. It has features to support the management, collaboration, controlled sharing, publication, replication, transfer, and preservation of distributed data. The SRB system is middleware in the sense that it is built on top of other major software packages (file systems, archives, real-time data sources, relational database management systems, etc). The SRB has callable library functions that can be utilized by higher level software. However, it is more complete than many middleware software systems as it implements a comprehensive distributed data management environment, including end-user client applications ranging from Web browsers to Java class libraries to Perl and Python load libraries.
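The idea of a single logical hierarchy over several storage systems can be sketched with a small catalog. This is not the SRB client API: the class `LogicalNamespaceSketch`, its methods, and the storage system names used below are hypothetical. The sketch only shows that users address data by logical path, while the catalog records where each replica physically lives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical catalog mapping logical paths to physical replicas; not the SRB API. */
public class LogicalNamespaceSketch {
    // Logical path -> replica locations written as "storageSystem:physicalPath".
    private final Map<String, List<String>> catalog = new TreeMap<>();

    /** Records that a logical file has a replica on the given storage system. */
    public void register(String logicalPath, String storageSystem, String physicalPath) {
        catalog.computeIfAbsent(logicalPath, k -> new ArrayList<>())
               .add(storageSystem + ":" + physicalPath);
    }

    /** Returns all known replicas of a logical file (empty if unknown). */
    public List<String> replicas(String logicalPath) {
        return catalog.getOrDefault(logicalPath, Collections.emptyList());
    }

    /** Lists logical paths under a collection, regardless of where the data is stored. */
    public List<String> list(String collection) {
        String prefix = collection.endsWith("/") ? collection : collection + "/";
        List<String> result = new ArrayList<>();
        for (String path : catalog.keySet()) {
            if (path.startsWith(prefix)) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LogicalNamespaceSketch ns = new LogicalNamespaceSketch();
        // Storage system names and physical paths are made up for the example.
        ns.register("/home/project/run1.dat", "unix-fs", "/data/a/001");
        ns.register("/home/project/run1.dat", "tape-archive", "/tape/x9");
        ns.register("/home/project/run2.dat", "unix-fs", "/data/b/002");
        System.out.println(ns.list("/home/project"));
        System.out.println(ns.replicas("/home/project/run1.dat"));
    }
}
```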
Source: http://www.sdsc.edu/srb/index.php/Main_Page
OGSA-DAI components are either data access components or data integration components. A Distributed Query Processing (DQP) system is an example of a data integration component and can potentially provide effective declarative support for service orchestration as well as data integration. The service-based DQP framework described in [1],[2] provides an approach that:
The service-based DQP framework consists of the following two services:
Grid Distributed Query Service (Coordinator). The Grid Distributed Query Service (GDQS), or coordinator, is the main interaction point for the clients. When a coordinator is set up, it obtains the metadata and computational resource information that it needs to compile, optimise, partition and schedule distributed query execution plans over multiple execution nodes in the Grid. The implementation of the coordinator builds on previous work on the Polar* distributed query processor for the Grid [3],[4] by encapsulating its compilation and optimisation functionality. The coordinator is currently implemented as a set of OGSA-DAI data service resources and activities.
As well as using the services provided by OGSA-DAI data services, the coordinator is itself implemented as an OGSA-DAI data service, and thus can be discovered and invoked in the same way as other OGSA-DAI data services. Consequently, the Grid stands to benefit from OGSA-DQP, through the provision of facilities for declarative request formulation that complement existing approaches to service orchestration, via uniform interfaces and interaction semantics.
Figure 1 provides an overview of the interactions during the instantiation and set-up of an OGSA-DQP coordinator as well as those that take place when a query is received and processed via a set of evaluators. The components in this figure and the numbered interactions between each component are now described. The 3-dot sequence in this figure can, as usual, be read as `and so on, up to'. This description of OGSA-DQP is intended to give a high level overview of the system.

1: An OGSA-DQP coordinator consists of two types of OGSA-DAI data service resources: GDQS factory data service resources and GDQS data service resources. Initially, an installed coordinator service will expose only a GDQS factory data service resource. This data service resource is then used to create GDQS data service resources which can be used by a client to execute queries.
In this first step in the interaction between a client and OGSA-DQP, the client uses a deployed GDQS factory data service resource to create a configured GDQS data service resource. The client interacts with the GDQS factory data service resource by sending an OGSA-DAI perform document which specifies that a DQPFactory activity should be executed. The DQPFactory activity is able to interact with a GDQS factory data service resource in order to dynamically deploy a GDQS data service resource. The DQPFactory activity is parameterised by an XML document which specifies exactly how the deployed GDQS data service resource should be configured. Configuration parameters include the databases and evaluators which can be utilised by the data service resource which is to be created. The result of this interaction is that a GDQS data service resource is created and initialised. The coordinator service now exposes this dynamically deployed GDQS data service resource and it is automatically assigned a resource ID by OGSA-DAI.
2: During the initialisation of the GDQS data service resource, the schemas of the databases it will use are imported by contacting the OGSA-DAI data services which wrap these databases.
3: The client receives the result of the perform document submitted in step 1. This result contains the resource ID needed by the client to identify the created GDQS data service resource in subsequent interactions with this data service resource.
[Note] Steps 1-3 need not take place if a GDQS data service resource already exists which imports the databases and analysis services required by a client (in that case, the client should contact the existing GDQS data service resource directly). Each GDQS data service resource is able to process multiple concurrent queries, and it is not terminated by a client following a query session. Steps 1-3 represent a setup process which is necessary to configure a GDQS data service resource for use by one or more clients.
4: The client submits a perform document containing a query. Queries are written in OQL and are executed by the OQLQueryStatement activity. The GDQS data service resource uses the Polar* query compiler to parse, optimise and schedule the query. A query plan is created, consisting of a number of partitions. Each partition specifies an individual evaluator's role in the query plan.
5: Query partitions are sent to the relevant evaluator services.
6: Some evaluators interact directly with OGSA-DAI data services to obtain data.
7: Other evaluators may interact with other evaluators to implement their role in the execution of the query.
8 - 9: Results propagate back from the evaluators to the coordinator and eventually back to the client.
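The lifecycle in steps 1-9 can be pictured with a small in-memory toy model. This is only an illustrative sketch: the class names (`GDQSFactory`, `GDQSResource`, `Evaluator`), the `gdqs-N` resource ID format and the use of Python predicates in place of OQL queries and perform documents are all hypothetical and do not correspond to the OGSA-DAI or OGSA-DQP APIs.

```python
import itertools


class Evaluator:
    """Hypothetical evaluator that reads from one wrapped data source (step 6)."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def evaluate(self, predicate):
        # Execute this evaluator's partition of the plan over its own data.
        return [row for row in self.rows if predicate(row)]


class GDQSResource:
    """Hypothetical stand-in for a configured GDQS data service resource."""

    def __init__(self, resource_id, evaluators):
        self.resource_id = resource_id
        self.evaluators = evaluators

    def query(self, predicate):
        # Steps 4-5: one partition per evaluator (no real optimisation here).
        partials = [ev.evaluate(predicate) for ev in self.evaluators]
        # Steps 8-9: results flow back and are merged for the client.
        return [row for part in partials for row in part]


class GDQSFactory:
    """Hypothetical stand-in for the GDQS factory data service resource."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.resources = {}

    def create(self, evaluators):
        # Steps 1-3: create a configured resource and hand back its ID.
        resource_id = f"gdqs-{next(self._counter)}"
        self.resources[resource_id] = GDQSResource(resource_id, evaluators)
        return resource_id
```

The point of the sketch is the separation the text describes: the factory call happens once, and the returned resource ID can then be reused for any number of queries, possibly by several clients.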
[Note] OGSA-DQP is also able to invoke Web services from within queries. This is not illustrated in Figure 1 in order to preserve the clarity of the figure and its associated description. Also omitted from the figure are the resource properties made available by the GDQS data service resource. Following initialisation, the GDQS data service resource provides a resource property enabling the client to obtain a description of the database schemas imported by OGSA-DQP.
[1] M. N. Alpdemir, A. Mukherjee, N. W. Paton, P. Watson, A. A. Fernandes, A. Gounaris, and J. Smith. Service-based distributed querying on the grid. In Proceedings of the First International Conference on Service Oriented Computing, pages 467-482. Springer, 15-18 December 2003.
[2] M. Nedim Alpdemir, Arijit Mukherjee, Norman W. Paton, Paul Watson, Alvaro A. A. Fernandes, Anastasios Gounaris, and Jim Smith. OGSA-DQP: A service-based distributed query processor for the Grid. In Simon J. Cox, editor, Proceedings of UK e-Science All Hands Meeting Nottingham. EPSRC, 24 September 2003.
[3] J. Smith, A. Gounaris, P. Watson, N. W. Paton, A. A. A. Fernandes, and R. Sakellariou. Distributed Query Processing on the Grid. In Proc. Grid Computing 2002, pages 279-290. Springer, LNCS 2536, 2002.
[4] J. Smith, A. Gounaris, P. Watson, N. W. Paton, A. A. A. Fernandes, R. Sakellariou, Distributed Query Processing on the Grid, Intl. J. High Performance Computing Applications, Vol 17, No 4, 353-368, 2003 (Extended Version of Grid 2002 paper selected for publication in special issue).
Source: http://www.ogsadai.org.uk/about/ogsa-dqp
Sun Expands Grid Application Offerings
February 19, 2008
By Paul Shread
Sun Microsystems has added 14 new applications to its Network.com Application Catalog of online grid-enabled applications available from the Sun Grid compute utility service on a pay-per-use basis.
Sun also launched a new partner program, Sun Network.com Connection, for independent software vendors (ISVs) to create on-demand service offerings at lower risk and cost, with access to new sales channels. Sun also added the Netherlands to the list of 25 countries where the services can be utilized.
The latest additions bring the grid service's total number of "Click and Run" applications to 39.
Mark Herring, Sun's senior director of software marketing, said Network.com "is evolving into a virtual on-demand data center that allows businesses of any size to leverage compute infrastructure without the cost of ownership and with the flexibility of scaling up or down compute resources in real time as business demands change."
The latest applications include Blender, open source tools for modeling, rendering, animation, post-production, creation and playback of interactive 3D content. Sun is sponsoring Blender Foundation's open movie "Peach," a short 3D animation by artists and developers in the Blender community, with grants of CPU hours in Network.com.
Other new open source applications include Zeus (a life sciences application), GAP (a computational mathematics application) and OOFEM (a computer aided engineering application).
In addition to the Solaris 10-based grid platform, Network.com provides developers and open source communities with tools, resources and an active grid developer community that helps them build and test on-demand applications.
Sun's Network.com provides access to compute infrastructure on a pay-per-use basis via its Sun Grid compute utility at $1 per CPU hour.
Source: http://www.gridcomputingplanet.com
TOP500 lists computers ranked by their performance on the LINPACK Benchmark. It is clear that no single number can reflect the performance of a computer. LINPACK is, however, a representative benchmark for evaluating computing platforms as High Performance Computing (HPC) environments, that is, in the dedicated execution of a single tightly coupled parallel application. An HTC application, on the other hand, comprises the execution of a set of independent tasks, each of which usually performs the same calculation over a subset of parameter values. Although the HTC model is widely used in science, engineering and business, there is no representative benchmark or model for evaluating the performance of computing platforms as HTC environments. At first sight, it could be argued that there is no need for such a performance model. We agree on this for static and homogeneous systems. However, how can we evaluate a system consisting of heterogeneous and/or dynamic components?
Benchmarking of Grid infrastructures has always been a highly polemic area. The heterogeneity of the components and the high number of layers in the middleware stack make it difficult even to define the aim and scope of the benchmark. A couple of years ago we wrote a paper entitled "Benchmarking of High Throughput Computing Applications on Grids" (R. S. Montero, E. Huedo and I. M. Llorente) for the Parallel Computing Journal, presenting a pragmatic approach to evaluating the performance of a Grid infrastructure when running High Throughput Computing (HTC) applications. We demonstrated that the complexity of a whole Grid infrastructure can be represented by only two performance parameters, which can be used to compare infrastructures. The proposed performance model is independent of the middleware stack and valid for any computing infrastructure, so it is also applicable to the evaluation of clusters and HPC servers.
Our proposal is to follow an approach similar to that used by Hockney and Jesshope to characterize the performance of homogeneous array architectures on vector computations. A first-order description of a Grid can be made by using the following formula for the number of tasks completed as a function of time:
n(t)=R*t-N
Note that given the heterogeneous nature of a Grid, the execution time of each task can differ greatly. So the following analysis is valid for general HTC applications, where each task may require distinct instruction streams. The coefficients of the line are called:
The above linear relation can be used to define the performance of the system (tasks completed per second) on actual applications with a finite number of tasks:
r(n)=R/(1+N/n)
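As a quick numerical illustration of the two formulas above, here is a minimal sketch. The function names and the sample values of R and N are made up for the example, and clamping n(t) at zero for small t is an assumption added here, since the straight line is negative before t = N/R.

```python
def tasks_completed(t, R, N):
    """First-order model n(t) = R*t - N.

    Clamped at zero (an assumption of this sketch): the line is
    negative for t < N/R, where no task has completed yet.
    """
    return max(0.0, R * t - N)


def performance(n, R, N):
    """Throughput r(n) = R / (1 + N/n) for a run of n tasks."""
    if n <= 0:
        raise ValueError("n must be a positive number of tasks")
    return R / (1.0 + N / n)


# Hypothetical parameters: R = 0.5 tasks per second, N = 20 tasks.
R_EXAMPLE, N_EXAMPLE = 0.5, 20.0
half = performance(N_EXAMPLE, R_EXAMPLE, N_EXAMPLE)  # run of exactly N tasks
large = performance(2000, R_EXAMPLE, N_EXAMPLE)      # much longer run
```

With n = N the formula gives exactly R/2, and for n much larger than N the throughput approaches R.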
This linear model can be interpreted as an idealized representation of a heterogeneous Grid, equivalent to a homogeneous array of 2N processors with an execution time per task of 2N/R.
The half-performance length (N), on the other hand, provides a quantitative measure of the heterogeneity in a Grid. This result can be understood as follows: faster processors contribute to a higher degree to the performance obtained by the system. Therefore the apparent number of processors (2N), from the application's point of view, will in general be lower than the total number of processors in the Grid (P). We can define the degree of heterogeneity (m) as 2N/P. This parameter varies from m = 1 in the homogeneous case to m = 0 when the actual number of processors in the Grid is much greater than the apparent number of processors (highly heterogeneous).
N is a useful characterization parameter for Grid infrastructures in the execution of HTC applications. For example, let us consider two different Grids with a similar asymptotic performance. In this case, by analogy with the homogeneous array, a lower N parameter reflects a better performance (in terms of wall time) per Grid resource, since the same performance (in terms of throughput) is delivered by a smaller "number of processors".
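One way to estimate R and N from a run is to fit the straight line n(t) = R*t - N to the observed task completion times. The sketch below uses ordinary least squares, which is an assumption made for illustration and not necessarily the procedure used in the paper; the function names and the synthetic data are likewise hypothetical.

```python
def fit_model(completion_times):
    """Least-squares fit of n = R*t - N, where the i-th completion
    (in time order) is paired with n = i. Returns (R, N)."""
    ts = sorted(completion_times)
    k = len(ts)
    if k < 2:
        raise ValueError("need at least two completion times")
    ns = list(range(1, k + 1))
    mean_t = sum(ts) / k
    mean_n = sum(ns) / k
    sxx = sum((t - mean_t) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("completion times must not all be equal")
    sxy = sum((t - mean_t) * (n - mean_n) for t, n in zip(ts, ns))
    R = sxy / sxx                # slope of the fitted line
    N = R * mean_t - mean_n      # minus the intercept
    return R, N


def degree_of_heterogeneity(N, P):
    """m = 2N / P, with P the total number of processors in the Grid."""
    return 2.0 * N / P


# Synthetic run that follows the model exactly with R = 0.5 and N = 20:
# the i-th task completes at t = (i + N) / R = 2*i + 40 seconds.
times = [2 * i + 40 for i in range(1, 51)]
R_fit, N_fit = fit_model(times)
m = degree_of_heterogeneity(N_fit, P=100)
```

For this synthetic data the apparent number of processors is 2N = 40, so on a hypothetical Grid of P = 100 processors the degree of heterogeneity comes out at 0.4.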
We propose the OGF DRMAA implementation of the ED benchmark in the NAS Grid Benchmark suite, with an appropriate scaling to stress the computational capabilities of the infrastructure, as the benchmark to which the performance model is applied. The ED benchmark comprises the execution of several independent tasks. Each one consists of the execution of the SP flow solver with a different initialization parameter for the flow field. This kind of HTC application can be directly expressed with the DRMAA interface as bulk jobs.
DRMAA represents a suitable and portable API to express distributed communicating jobs, like the NGB. In this sense, the use of standard interfaces allows the comparison between different Grid implementations, since neither NGB nor DRMAA is tied to any specific Grid infrastructure, middleware or tool. DRMAA implementations are available for the following resource management systems: Condor, LSF, Globus GridWay, Grid Engine and PBS.
In the paper we present both intrusive and non-intrusive methods to obtain the performance parameters. The lightweight non-intrusive probes provide continual information on the health of the Grid environment, and so a way to measure the dynamic capacity of the Grid, which could eventually be used to generate global meta-scheduler strategies.
We have demonstrated in several publications how the first-order model reflects the performance of complex infrastructures running HTC applications. So, why don't we create a TOP500-like ranking of infrastructures? The ranking could be dynamic, obtaining the parameters with the non-intrusive probes. We have all the ingredients:
Ignacio Martín Llorente
Reprinted from blog.dsa-research.org
Source: http://gridgurus.typepad.com
You have built and installed your shiny new cluster, installed the Grid Engine software, configured the queues, and announced to the world that your new system is ready to be used. What next? Well, think about your monitoring options…
As users start submitting jobs and hammering the system in every possible way, things will inevitably break on occasion. When something goes wrong in the system, you will want to know about the problem before you start receiving help desk calls and user emails.
The first step in developing an effective strategy for monitoring Grid Engine is learning how to use the available command line tools and how to look for possible issues in the system. Some of the things that you should always pay attention to include:
• queues in the unknown state; a queue instance in the unknown state usually means that the execution daemon is down on that particular host
• queues and jobs in the error state
• configuration inconsistencies
• load alarms
All of the above information can be easily obtained using the qstat command (e.g., try something like “qstat -f -qs uaAcE -explain aAcE”). It is also not difficult to script basic GE monitoring tasks and come up with a simple infrastructure that is able to alert system administrators to any new or outstanding problems in the system.
As your user base grows, so will your monitoring needs, and you will likely want to extend your monitoring tools. You should consider looking into existing software packages like xml-qstat, which uses XSLT transformations to render Grid Engine command line XML output into different output formats. Alternatively, you can develop a set of your own XSL stylesheets that are customized to your needs, and use widely available command line tools such as xsltproc to generate monitoring web pages from the “qstat -xml” output.
Another interesting Grid Engine monitoring option is the Monitoring Console that comes with Cluster Express (CE). Its main advantage is that it integrates monitoring data from several different sources: Ganglia (system data), Grid Engine Qmaster and ARCo database (job data). However, even though Cluster Express itself is easy to install, at the moment integrating the CE Monitoring Console with an existing Grid Engine installation requires a little bit of work. I am told that this will be much simplified in the upcoming CE release. In the meantime, if you are really anxious to try the CE Monitoring GUI on your Grid Engine cluster, do not hesitate to send me an email…
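As a rough idea of what a home-grown check might look like, the sketch below scans qstat XML output for queue instances whose state contains one of the problem flags mentioned earlier (u, a, A, c, E). The element names (`Queue-List`, `name`, `state`) and the embedded sample document are assumptions about the shape of the output and should be checked against what your Grid Engine version actually prints; the function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# State letters of interest, matching the "uaAcE" selection used above.
PROBLEM_STATES = {
    "u": "unknown",
    "a": "load alarm",
    "A": "suspend alarm",
    "c": "configuration ambiguous",
    "E": "error",
}

# Hand-written sample; real output may differ (assumed structure).
SAMPLE = """
<job_info>
  <queue_info>
    <Queue-List>
      <name>all.q@node01</name>
    </Queue-List>
    <Queue-List>
      <name>all.q@node02</name>
      <state>au</state>
    </Queue-List>
  </queue_info>
</job_info>
"""


def find_problem_queues(qstat_xml):
    """Return (queue instance name, [problem descriptions]) pairs."""
    root = ET.fromstring(qstat_xml)
    problems = []
    for queue in root.iter("Queue-List"):
        name = queue.findtext("name", default="?")
        state = queue.findtext("state", default="")
        flags = [PROBLEM_STATES[ch] for ch in state if ch in PROBLEM_STATES]
        if flags:
            problems.append((name, flags))
    return problems
```

In a real deployment the XML string would come from running qstat with XML output enabled (for instance from a cron job), and the returned list could be mailed to the administrators whenever it is not empty.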
Source: http://gridgurus.typepad.com
Up until I read Ian Foster’s Cloud Computing post, I had paid little attention to what the term meant to people. Personally, I had already chalked up the idea as a rebranding of Grid computing. So I asked a number of friends what they thought the differences between the two were. Of course many people not actively involved in the community are not familiar with either concept. (I find the fact that computer professionals know what the latest buzz is around SaaS, and SOA but do not seem to consider how and where they might land these systems peculiar.) In any event, here is a summary of the answers I received:
I found it rather interesting that, while there was some overlap of technical perspectives, I did not get any answers that were identical. I believe that the last description explains this situation nicely while also offering the most interesting take on the topic. A Cloud is a nuanced term that evokes the idea of something beautiful which also evolves rapidly, contains a lot of power, and then is gone. Meanwhile the grid, like the utilities it was conceived from, is known for its reliability, its ability to tap into reserve power at a moment's notice, and its accommodating levels of service. Notice how the first two answers both adhere to this concept? I don’t think this is an accident, nor do I think that this escaped the attention of the marketing departments of industry leaders like Amazon and Google, both of whom operate in what they term Cloud space. While it is distinctly possible that the term evolved organically, it is interesting that they chose to stick with it.
What is more, I found it particularly noteworthy that not one person I queried mentioned the amount of data to be processed. Foster and the Business Week article he references, as well as many others, suggest that we need to think in terms of a great deal more data than we have before. For example, Google wants their people to think in terms of a thousand times more data than that to which they are accustomed.
Heck, I was thinking about writing about the so-called “Data Tsunami” myself – but not in terms of thinking about significantly larger datasets. The datasets we were working with a decade ago were suitably massive for what we were trying to accomplish. Like today, it was not economically feasible to keep it all online at once. The fact is that the incredible leaps in computational capacity have led us to build more complicated problems that demand still more data. As such, a thousand times more data is probably still not enough. If only the networking and storage companies had kept up with the leaps in processing capacity (I was on a gigabit network five years ago and I am using a gigabit network today >sigh<).
Consequently we still have to use the tried and true standard operating procedure of:
For example, the Large Hadron Collider will be examining billions of collisions per second but will only store a few hundred per second for later processing (see recent article on Dr. Heuer). We have always needed to think in terms of a thousand times more data than we can possibly process or than we are accustomed to. Basically, the scope of what is economically feasible has changed dramatically over the last few decades, while we continue to be quite resource constrained. Which brings us back to the concept of capacity computing, whether in the form of a transitory Cloud, a steadfast Grid, or even the comfortable @home project. The key here is that people are continuing to push past the boundaries of what is feasible.
Source: http://gridgurus.typepad.com
Fortune 400 businesses, and many smaller ones, run clusters in many different locations around the world.
As facility, management and other costs continue to take larger and larger shares of corporate IT budgets, networking costs continue to fall. The result is that data center consolidation becomes a more reasonable goal.
I've seen this with a few of my customers. Beginning last year, people started looking more toward grid technology to help them manage this. As the economy has tightened, more people have considered it, particularly as part of a plan for cost reduction by moving to open source tools.
The general pattern is that the IT group decides they need to find ways to more effectively manage large and disconnected sets of resources. They turn to grid computing to help them manage that cloud and in the process realize that they have a lot of special purpose machines that are being quite underutilized and that they have enormous duplication of effort in the management of those data centers.
As we've entered a bear market, many companies are taking a second look at their IT costs and looking for ways to tighten their belts. The combination of open source and grid/cloud computing models offers the ability to do that, with open source offering a lower-cost software acquisition model and grid computing allowing a reduction in IT staff through centralization.
I've also been working with folks on the lost art of environment management. But more on that in a future blog...
Source: http://gridgurus.typepad.com
Gridbus/GRIDS Lab Annual Report
The GRIDS Lab and the Gridbus Project are pleased to release the Annual Report of their key activities and outcomes during the academic year 2007.
Please browse:
http://www.gridbus.org/reports/GRIDS-Lab-AnnualReport2007.pdf
Source: http://www.gridbus.org
TeraGrid '08
TeraGrid '08 Welcomes Papers, Posters, Demos, and Visualizations!
The 3rd annual conference will showcase TeraGrid's impact in research and education through presented papers, demos, posters, and visualizations. TG08 will foster collaborations among leading researchers, developers, and educators that build on the growing TeraGrid infrastructure. TG08 will also provide information and training to enable current and future users to achieve maximum impact using TeraGrid resources and services. All interested individuals and organizations are invited to participate. Visit the Call for Participation.
For complete conference details, visit http://www.tacc.utexas.edu/tg08/index.php
Source: http://www.teragrid.org
About TeraGrid
TeraGrid is an open scientific discovery infrastructure combining leadership class resources at eleven partner sites to create an integrated, persistent computational resource.
Using high-performance network connections, the TeraGrid integrates high-performance computers, data resources and tools, and high-end experimental facilities around the country. Currently, TeraGrid resources include more than 750 teraflops of computing capability and more than 30 petabytes of online and archival data storage, with rapid access and retrieval over high-performance networks. Researchers can also access more than 100 discipline-specific databases. With this combination of resources, the TeraGrid is the world's largest, most comprehensive distributed cyberinfrastructure for open scientific research.
TeraGrid is coordinated through the Grid Infrastructure Group (GIG) at the University of Chicago, working in partnership with the Resource Provider sites: Indiana University, Oak Ridge National Laboratory, National Center for Supercomputing Applications, Pittsburgh Supercomputing Center, Purdue University, San Diego Supercomputer Center, Texas Advanced Computing Center, University of Chicago/Argonne National Laboratory, the National Institute for Computational Sciences, the Louisiana Optical Network Initiative, and the National Center for Atmospheric Research.
Source: http://teragrid.org/about
Grid, more Grid
I’m a bit behind some of the other early movers…
3tera. Taking grid and virtualization in a different direction. They provide services for entire virtual clusters, virtual data centers, and more.
If implementing massive super computers and data centers becomes little more than filling in a sales web form, watch out hardware, hosting, and desktop sellers.
Perhaps Google will get some competition now that massive CPU resources are being made available to anyone with an idea.
January 14, 2008 (Reuters) -- A supercomputer that could help answer some of science's biggest questions is being unveiled today.
With the power of 12,000 desktop PCs, the mammoth machine called Hector is the U.K.'s fastest computer and one of the most powerful in Europe. It can make 63 trillion calculations each second, allowing scientists to conduct research into everything from climate change to new medicines.
The machine is housed in 60 wardrobe-sized cabinets in the University of Edinburgh's advanced computing center near the Scottish capital. After years of development, Chancellor Alistair Darling is due to attend the official launch ceremony for the machine, which cost £113 million.
Hector, which stands for High-End Computing Terascale Resource, was made by U.S. manufacturer Cray Inc.
"Hector will enable us to do research that we simply could not do in any other way," said Jane Nicholson, a researcher at the Engineering and Physical Sciences Research Council, the public body that acts as the project's managing agent. "We want to push forward the boundaries of knowledge."
Researchers plan to tap into the computer's power to study ocean currents, build tiny parts for advanced computers and make warplanes less visible to radar. Other projects include research into superconductors, combustion engines and new materials. Scientists working in fields ranging from cosmology and atomic physics to disaster simulation and health care will also use the computer.
Despite its vast power, Hector falls short of the power produced by the world's biggest computer: Blue Gene/L. Housed at the Lawrence Livermore National Laboratory in California, Blue Gene is used to study nuclear weapons without the need for underground testing.
Editing by Steve Addison.
(IDG News Service) -- The One Laptop Per Child project suffered a blow this week, with Chief Technology Officer Mary Lou Jepsen quitting the nonprofit to start a for-profit company to commercialize technology she invented with OLPC.
Jepsen, who joined OLPC as its first employee in 2005 after Nicholas Negroponte started the effort, will pursue an opportunity to chase after "her next miracle in display technology," OLPC said in an e-mail sent on Sunday.
Jepsen was responsible for hardware and display development for the rugged and power-saving XO laptop, designed for use by children in developing countries. Though the laptop has struggled to find buyers, it has been praised for its innovative hardware features and environmentally friendly design.
Her last day with the organization is Dec. 31, though she will continue consulting with OLPC, according to the e-mail. Dec. 31 is also the end of OLPC's Give One Get One program, in which two XO laptops can be purchased for about US$400, with a user getting one laptop and the other being donated.
Satisfied that XO laptops were shipping in volume, Jepsen noted in an e-mail that she was starting a for-profit company to commercialize some of the technologies she invented at OLPC.
"I will continue to give OLPC product at cost, while providing commercial entities products they would like at a profit," Jepsen wrote in an e-mail.
"I believe that the work I led in the design of the XO laptop is just the first step in changing computing," she wrote.
Powered by solar power, foot pedal or pull-string, the laptop doesn't rely on an electrical outlet to run, making it useful for situations where power is unreliable or unavailable. The laptop consumes between 2 and 8 watts of electricity from a specially designed lithium-ferro phosphate battery depending on usage, compared to 40 watts on commercial laptops.
The laptop's battery lasts up to 21 hours because of custom-designed, efficient power-saving features implemented at the hardware and software level. Batteries in commercial laptops may explode at high temperatures, while XO's batteries can run and recharge in temperatures around 100 degrees Fahrenheit (38 degrees Celsius), Jepsen said in an earlier interview.
OLPC is also designing a cow-powered generator that works by hooking cattle up to a system of belts and pulleys.
For connectivity, the laptop has mesh-networking features for Internet access.
An earlier version of this article incorrectly stated the OLPC's power consumption features, and the eighth paragraph was updated on 1/2/08 to reflect the accurate information.
November 29, 2007 (IDG News Service) -- SAN FRANCISCO -- A former employee of a small California canal system has been charged with installing unauthorized software and damaging the computer used to divert water from the Sacramento River.
Michael Keehn, 61, former electrical supervisor at the Tehama Colusa Canal Authority (TCAA) in Willows, Calif., faces 10 years in prison on charges that he "intentionally caused damage without authorization to a protected computer," according to Keehn's Nov. 15 indictment. He did this by installing unauthorized software on the TCAA's Supervisory Control and Data Acquisition (SCADA) system, the indictment states.
Keehn accessed the system on or about Aug. 15, according to the indictment. He is set to appear in federal court on Dec. 4 to face charges of computer fraud.
As an electrical supervisor with the authority, he was responsible for computer systems and is still listed as the contact for the organization's Web site.
With a staff of 16, the TCAA operates two canals, the Tehama Colusa Canal and the Corning Canal, that provide water for agriculture in central California, near the city of Chico. Both systems are owned by the federal government.
The security of SCADA systems, which are used to control heavy machinery in industry, has become a hot-button topic in recent years. In September, video of an Idaho National Laboratory demonstration of a SCADA attack was aired on CNN, showing how a software bug could be exploited to destroy a power generator.
In the video, the turbine was gradually worn out and left shuddering and smoking. Sources familiar with the hack say this was done by turning the generator off and on while it was out of phase with the power grid, putting excessive stress on the turbine and causing its components to wear out.
It's not clear how much damage the attack on the authority's SCADA system could have caused, but in 2000 a disgruntled former employee was able to access the SCADA system at Maroochy Water Services in Nambour, Australia, and spill raw sewage into waterways, hotel grounds and canals in the area. That man, Vitek Boden, was eventually sentenced to two years in prison.
Even if an attack were to knock the TCAA's SCADA system offline, the canals could continue to operate, said Robin Taylor, assistant U.S. attorney with the U.S. Department of Justice, which is prosecuting the Keehn case. "When the computer doesn't work, they have to go to manual operation," she said.
The intrusion cost the TCAA more than $5,000 in damages, Taylor said.
November 06, 2007 (Computerworld) -- Harnessing the power of more than 795,000 computers around the world, a new research project that will analyze human proteins in the fight against cancer begins today using the World Community Grid, which was built and is maintained by IBM.
By using the combined computing power of the grid, the Help Conquer Cancer project will allow cancer researchers to drastically shorten the amount of time it would take to analyze 90 million images of crystallized proteins, from 162 years using existing computing systems to between one and two years using the harnessed power of the grid.
"Even with the largest computers we have, it would not be possible to finish this task," said Igor Jurisica, who leads the research team at the Ontario Cancer Institute in Canada, where the work is being done. Also participating in the work are scientists at Princess Margaret Hospital and the University Health Network.
The researchers will analyze the results of experiments on proteins using data collected by other scientists at the Hauptman-Woodward Medical Research Institute in Buffalo, N.Y.
The World Community Grid was created by IBM about three years ago as a way to harness unused global computing power to help solve a variety of health and scientific issues. The project calls on home and corporate PC users to register with the grid, then download and install a small software program that allows their unused computer cycles to work on critical scientific research.
Robin Willner, vice president of global community initiatives at IBM, said the total number of grid participants so far is about 795,000 around the world and grows daily. The combined computer power so far would create a supercomputer that would be the fifth most powerful in the world if it were in one place, she said. The grid uses participants' computers when the systems are idle.
The results of the research will go into the public domain and will be used by cancer researchers around the world, she said.
Three levels of security are part of the grid system and security audits are done constantly, Willner said.
By using the grid to better understand the structure of human proteins, researchers are trying to understand disease-related proteins and how they function, Jurisica said.
Once the 90 million images of some 9,400 different proteins are analyzed, data mining techniques will be used to go through the results, he said. Previous experiments have looked at smaller groups of samples because the means didn't exist to analyze them all, he said.
"This will be important for future research," Jurisica said. "Hopefully, it will shed light on the principles or mechanisms of the proteins."
"We know that most cancers are caused by defective proteins in our bodies, but we need to better understand the specific function of those proteins and how they interact in the body," he said. "We also have to find proteins that will enable us to diagnose cancer earlier, before symptoms appear, to have the best chance of treating the disease -- or potentially stopping it completely."
Eight other projects have been run so far on the World Community Grid, including protein folding and FightAIDS@Home, which completed five years of HIV/AIDS research in six months. Additional projects are also being scheduled.
October 09, 2007 (Computerworld) -- Sun Microsystems Inc. today released its next generation of multicore chip technology, the Niagara 2 processor, which it says more than doubles the performance of its predecessor chip. Sun also disclosed that the next version of the chip, the 16-core Rock processor, will ship next year.
The UltraSparc T2, which is shipping in rack-mounted and blade server models, doubles the threads on an eight-core chip to 64, Sun said.
John Fowler, Sun's executive vice president of systems, said the latest release is part of a Sun effort to "move to systems that are designed for very high core and thread count." He also noted that the new system is ideally suited for virtualization, a direction that will envelop "basically our entire product line over time."
Sun described the T2 as an attractive virtualization platform, with logical partitions or LDoms as Sun calls them, that can support up to 64 copies of Solaris.
Fowler also noted that the development team has put cryptographic security technology directly on the chip instead of on a separate card, which helps boost performance.
The new chip also offers improved floating-point capability, and it consumes 15% to 20% more power than the predecessor T1 processor, he said.
The initial product release will include a blade server, the T6320, which is priced from $9,995, and two rack systems, the T5120 and T5220, which start at $13,995.
Nathan Brookwood, an analyst at research firm Insight 64 in Saratoga, Calif., said the virtualization capabilities included with the system, as well as its performance per watt, will appeal to users. He believes the new systems "will be a compelling story" to Solaris users, and to companies running Linux with applications that have a Solaris equivalent.
Set to purchase assets of Cluster File Systems for undisclosed sum
(Computerworld) -- Sun Microsystems Inc. Wednesday agreed to purchase most of Cluster File Systems Inc.'s business assets and intellectual property, including Lustre, an open-source distributed file system for clusters.
Terms of the deal, expected to close on Oct. 1, were not disclosed.
In a statement, Sun said that it plans to port the Lustre file system to Solaris and to step up efforts to augment Lustre on the Linux-based systems of multiple vendors. When contacted, Sun officials refused to elaborate on their plans for the technology.
Sun and Cluster File Systems in July had agreed to jointly integrate Lustre and the OpenSolaris ZFS file system.
The Lustre file system is typically used to power large-scale server applications running in high-performance computing environments, because of its ability to support massive amounts of storage capacity and server clusters without severe performance impact.
The acquisition comes amid questions surrounding Sun's legal ownership of ZFS, which emerged last week when Network Appliance Inc. contended in a lawsuit that the technology infringes on patents it owns. The lawsuit was filed in federal court in Lufkin, Texas.
Earlier this year, Sun donated its ZFS code to the open-source community. That effort prompted analysts to fear that the Network Appliance lawsuit could have a far-reaching effect -- potentially adverse -- on the future of open-source technology.
(Network World) -- Grid middleware vendor Appistry Inc. Monday launched a software module that automatically powers down servers when they are not needed by applications, thus saving on energy consumption.
The company's Enterprise Application Fabric (EAF) virtualizes applications enabled with Appistry middleware across x86 servers. The new EnergySaver module lets administrators define policies that establish acceptable workload levels and turn off computers when application use is low. When additional capacity is required, EnergySaver policies reactivate the servers.
EAF is used best in power-hungry, transaction-intensive environments. Because applications are decoupled from the grid of servers on which they run, energy can be saved by powering off servers when they are not needed. Additionally, EAF contains load-balancing and workload management features. The software provides high availability by replicating the state of a request to multiple places, so if a machine goes down, the request can be executed on another machine in the grid.
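The policy idea described above (power servers down when load is low, bring them back when capacity is needed) can be sketched in a few lines. This is only an illustration of a threshold policy: the `PowerPolicy` class, the watermark values and the `plan_active_servers` function are hypothetical names invented here, not part of Appistry's EAF or EnergySaver.

```python
from dataclasses import dataclass


@dataclass
class PowerPolicy:
    """Hypothetical policy: thresholds on average load, expressed as 0.0..1.0."""
    low_watermark: float    # below this load, one server may be powered down
    high_watermark: float   # above this load, one server is powered back up
    min_active: int = 1     # never drop below this many running servers


def plan_active_servers(policy, active, total, avg_load):
    """Return how many servers should be powered on for the next interval."""
    if avg_load > policy.high_watermark and active < total:
        return active + 1          # demand is high: reactivate a server
    if avg_load < policy.low_watermark and active > policy.min_active:
        return active - 1          # demand is low: power a server down
    return active                  # load is within the acceptable band


policy = PowerPolicy(low_watermark=0.3, high_watermark=0.8)
print(plan_active_servers(policy, active=10, total=50, avg_load=0.1))  # 9
print(plan_active_servers(policy, active=10, total=50, avg_load=0.9))  # 11
```

Changing one server per interval keeps the sketch simple; a real policy would presumably also account for boot time and for work that is still running on a server before it is shut down.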
One customer, GeoEye Inc. in St. Louis, is getting ready to deploy EnergySaver. GeoEye collects satellite imagery for the Department of Defense and other customers. Ray Helmering, vice president of product engineering at GeoEye, said that with EnergySaver, he can set policies to shut down servers when the output of the satellites varies because of geographical position or weather conditions.
"We have variations in our processing schedule depending on the operations of our satellites," he said. "As imagery comes in, we need the processing power, but as there are slower times, we'll be able to save on energy. We don't know the actual impact yet of energy savings, but initial review says that this feature could be very important to us."
GeoEye develops its imaging application in-house and grid-enables it with an Appistry wrapper that allows its operations to be parallelized across the grid. This application requires huge amounts of computations and a large number of processors to run. Helmering's Appistry implementation, for instance, requires 50 dual-core x86 servers.
Analysts are encouraged by Appistry's efforts to consume less power in the data center. "The principle that Appistry is addressing is going to be really important," said Simon Mingay, an analyst at Gartner Inc. in Egham, England. "Most data centers have the opportunity to alter the power status of the storage and servers in their infrastructure when that capacity is not required. In data centers, you run everything 24/7 and everyone is incented to keep things that way, which in a world where energy costs are not important, is perfectly fine. In a more energy-conscious world, that becomes more questionable."
Mingay said that many organizations have approached the idea of energy consumption by using job-scheduling software, such as Sun Microsystems Inc.'s N1 Grid Engine or CA Inc.'s Unicenter Autosys Job Manager, which allows applications to run when conditions are optimal for them.
The downside of EnergySaver, according to Mingay, is that it has to be deployed on Appistry-enabled applications. "We are going to see more of this technology, but right now applications need to be modified to work in the Appistry environment. That renders it generally unapplicable."
Appistry was founded in 2001 and is focused on data-intensive intelligence agencies and on oil and gas and logistics organizations.
Sun's grid computing service goes global
(IDG News Service) -- Sun Microsystems Inc. is expanding its Network.com utility computing service from the U.S. to 23 countries in Europe and Asia, the company said Thursday.
The utility computing service, in which customers pay an hourly rate for access to a Sun data center, began as a U.S.-only pilot in March but is now ready for a large geographic expansion, said Rohit Valia, group product manager for the Sun Grid Compute Utility.
Sun charges $1 per CPU per hour to access a network of Sun x64 hardware running the Solaris 10 operating system. End users can now access the utility from Australia, Austria, Belgium, Canada, China, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, India, Ireland, Italy, Japan, New Zealand, Poland, Portugal, Singapore, Spain, Sweden and the U.K.
IBM, Hewlett-Packard Co. and other computer vendors provide similar services. Utility computing, also called on-demand computing or, more informally, computing "in the cloud," is for organizations that have a short-term need for extra computing capacity but don't want to incur the expense of adding onto their own data centers. By taking advantage of utility computing services, they only have to build out their own IT infrastructures to handle an average level of usage, not the occasional peak usage, said Valia.
"Our business model is around charging for CPU cycles, not idle CPUs. We only charge when your CPU is actually processing data," he said.
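The pricing model quoted above comes down to simple arithmetic. The sketch below assumes the $1 per CPU-hour rate from the article and bills only busy hours, as Valia describes; the function name and the example figures (64 CPUs, 3.5 busy hours) are made up for illustration.

```python
def utility_cost(cpus, busy_hours, rate_per_cpu_hour=1.0):
    """Cost in dollars when only hours spent processing are billed."""
    return cpus * busy_hours * rate_per_cpu_hour


# Hypothetical job: 64 CPUs that are each busy for 3.5 hours.
print(utility_cost(64, 3.5))  # 224.0
```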
Sun is also adding a feature called Network.com Internet Access that enables customers to interact, through Sun's utility data center and the Internet, with other companies that have resources the customer might want to use for a particular project. The company will also offer a limited beta program for developers called Job Management Application Programming Interfaces. This offering allows users to perform production-scale tests when they're building software applications using Network.com.
The Gridbus Project is engaged in the design and development of grid middleware technologies to support eScience and eBusiness applications. These include visual Grid application development tools for rapid creation of distributed applications, a competitive economy-based Grid scheduler, a cooperative economy-based cluster scheduler, a Web-services based Grid market directory (GMD), Grid accounting services, Gridscape for creation of dynamic and interactive testbed portals, the G-monitor portal for web-based management of Grid application execution, and the widely used GridSim toolkit for performance evaluation. Recently, the Gridbus Project has developed Windows/.NET-based desktop clustering software and Grid job web services to support the integration of both Windows and Unix-class resources for Grid computing. A layered architecture for realisation of low-level and high-level Grid technologies is shown in the figure below. Some of the Gridbus technologies discussed below have been developed by making use of Web Services technologies and services provided by low-level Grid middleware, particularly Globus Toolkit and Alchemi. A summary and status of various Gridbus technologies is listed below.

Source: http://www.gridbus.org
Fortune 400 businesses, and many smaller ones, run clusters in many different locations around the world.
As facility, management and other costs take up larger and larger shares of corporate IT budgets, networking costs continue to fall. The result is that data center consolidation becomes a more reasonable goal.
I've seen this in a few of my customers. Beginning last year, people started looking more toward grid technology to help them manage this. As the economy has tightened, more people have considered it, particularly as part of a plan to reduce costs by moving to open source tools.
The general pattern is that the IT group decides they need to find ways to more effectively manage large and disconnected sets of resources. They turn to grid computing to help them manage that cloud and in the process realize that they have a lot of special purpose machines that are being quite underutilized and that they have enormous duplication of effort in the management of those data centers.
As we've entered into a bear market, many companies are taking a second look at their IT costs and looking for ways to tighten their belts. The combination of open source and grid/cloud computing models offers the ability to do that with open source offering a lower cost software acquisition model and grid computing allowing reduction in IT staff through centralization.
I've also been working with folks on the lost art of environment management. But more on that in a future blog...
From the blog: http://gridgurus.typepad.com
How to Build Utility Computing Infrastructures with Globus
This is a guest post by Ignacio Martín Llorente, Professor of Distributed Systems Architectures at Universidad Complutense de Madrid.
While research institutions are interested in Partner Grids, which provide access to higher computing performance to satisfy peak demands and to support collaborative projects, enterprises understand grid computing as a way to address the changing service needs in an organization. They are interested in in-house resource sharing, to achieve a better return from their information technology investment, supplemented by outsourced resources, to satisfy peak or unusual demands. An Outsourced/Utility Grid would provide pay-per-use computational power when Enterprise Grid resources are overloaded. Such a hierarchical grid organization may be extended recursively to federate a higher number of Partner or Outsourced Grid infrastructures with consumer/provider relationships. This would allow supplying resources on demand, making resource provision more agile and adaptive. It would therefore offer access to a potentially unlimited computational capacity, transforming IT costs from fixed to variable.
In the context of the GridWay project we have developed a Grid Gateway that exposes a WSRF interface to a metascheduling instance, so enabling the creation of hierarchical grid structures. GridGateWay consists of a set of Globus services hosting a GridWay Metascheduler, thus providing a uniform, standard interface for the secure and reliable submission, monitoring and control of jobs. Most functionality is provided through GRAM (Grid Resource Allocation and Management), while scheduling information is provided through MDS (Monitoring and Discovery Service). The security requirement at the user level is addressed by GSI (Globus Security Infrastructure).
The new technology allows different layers of metaschedulers to be arranged in a hierarchical structure. In this arrangement, each target grid is handled as another resource, that is, the underlying grid is characterized as a single resource in the source grid, by means of grid gateways. This strategy encourages companies to federate their grids in order to have a better return of IT investment, and also satisfy peak demands of computation. Furthermore, this solution allows for gradual deployment (from fully in-house to fully outsourced), in order to deal with the obstacles for grid technology adoption, such as enterprise scepticism and IT staff resistance.
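The key idea here, that a whole grid is presented to the layer above as just another resource, can be sketched as a small recursive structure. The classes below are hypothetical and written only for illustration; they are not the GridWay or GridGateWay API, and the "first resource with free capacity" rule is an assumed placeholder for a real scheduling policy.

```python
class Cluster:
    """A leaf resource with a fixed number of free job slots."""

    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots

    def capacity(self):
        return self.free_slots

    def submit(self, job):
        if self.free_slots <= 0:
            raise RuntimeError(f"{self.name}: no free slots")
        self.free_slots -= 1
        return self.name


class Metascheduler:
    """A grid that schedules over its resources and itself looks like one resource."""

    def __init__(self, name, resources):
        self.name = name
        self.resources = list(resources)  # clusters or other metaschedulers

    def capacity(self):
        return sum(r.capacity() for r in self.resources)

    def submit(self, job):
        # Resources are tried in order, so listing in-house resources first
        # sends work to the outsourced grid only when they are full.
        for resource in self.resources:
            if resource.capacity() > 0:
                return f"{self.name}/{resource.submit(job)}"
        raise RuntimeError(f"{self.name}: no free capacity")


utility = Metascheduler("utility", [Cluster("provider-a", 2), Cluster("provider-b", 1)])
enterprise = Metascheduler("enterprise", [Cluster("in-house", 1), utility])
print(enterprise.submit("job-1"))  # enterprise/in-house
print(enterprise.submit("job-2"))  # enterprise/utility/provider-a
```

Because `Metascheduler` exposes the same `capacity` and `submit` methods as `Cluster`, the hierarchy can be nested to any depth, which is the recursive federation the post describes.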
This approach also provides the components required for interoperability between existing Grid infrastructures. It is clear that we can’t wait for a single global grid to arise or to become predominant. Instead, we should work to build a seamless integration of the existing grids, which may eventually constitute the ultimate, capital-letter Grid, Grid of grids, or InterGrid, in the same way that the Internet was born. Grid interoperability can be achieved by means of common, ideally standard, grid interfaces, whose existence is an important (if not essential) characteristic of grid systems. Unfortunately, common interfaces (and even less standard ones) are not always available for given services. Then, the use of grid adapters and gateways becomes necessary. In particular, an interoperability solution based on grid gateways provides the infrastructures with significant benefits in terms of autonomy, scalability, deployment and security.
Well, what are you waiting for? The components are open source, the license is Apache v2.0, and we are willing to collaborate with you.
From the blog: http://gridgurus.typepad.com
To install Globus, we need to take the following points into account.
Depending on the Globus package we downloaded, we must install the matching Linux distribution.
Before building, we need to check the Java and gcc versions on the Linux system and install supported versions if the current ones are not supported. The Java version must be 1.6.
To check the Java version, type the following command in the terminal:
java -version
To update Java, follow these steps.
This command extracts Java 1.6 from its archive:
tar xzvf (name of the Java archive)
Tip: to type the file name faster, type its first letter and press Tab; the shell completes the rest of the name automatically.
Then copy the contents of Java 1.6 over the HOME\USER directory we are in, overwriting what is there.
This updates the Java version.
Next, check the gcc version with the following command:
gcc -v
The gcc version must not be 4.1, because it has a bug. Versions
3.2, 3.2.1 and 2.95.x can be used.
gcc cannot be updated in place, so we install a Linux release that ships with a suitable version.
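The two version checks above can be automated. The sketch below applies the rules exactly as stated in this guide (Java must be 1.6, gcc must not be 4.1) to text captured from `java -version` and `gcc -v`; both tools print this information on standard error. The function names are hypothetical, and the sample strings are only examples of the usual output format.

```python
import re


def parse_version(output):
    """Return the version after the word 'version' as a tuple of ints."""
    match = re.search(r'version\s+"?(\d+(?:\.\d+)+)', output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.group(1).split("."))


def java_is_supported(output):
    # Rule from the guide: the Java version must be 1.6.
    return parse_version(output)[:2] == (1, 6)


def gcc_is_supported(output):
    # Rule from the guide: gcc 4.1 must be avoided because of a bug.
    return parse_version(output)[:2] != (4, 1)


print(java_is_supported('java version "1.6.0_20"'))   # True
print(gcc_is_supported("gcc version 4.1.2 20080704"))  # False
```

To check a live system, the text could be captured with `subprocess.run(["gcc", "-v"], capture_output=True, text=True).stderr` and passed to the same functions.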
Tomcat must also be installed; it is not needed at compile time, only at runtime.
If you are using SUSE Linux, no additional software is required.
Create a user named globus with the following command:
root# useradd globus
The user can also be created from the System \ Group and User... menu.
Copy the downloaded Globus files into the /usr/local/globus directory, press F4 in that directory to open a terminal, and type the commands below.
Use the name of the folder that actually exists under /usr/local as the last path component. For example, if the folder is /usr/local/globus-4.0.1,
type the following:
# mkdir /usr/local/globus-4.0.1
# chown globus:globus /usr/local/globus-4.0.1
Now leave the root account and switch to the globus user,
then type the following commands in the console.
The globus$ prefix is the shell prompt: it means the command is run as the globus user from the current directory.
For example, to run ./configure we do not type the leading globus$, only what follows it. Here globus$ means we are in /usr/local/globus-4.0.1 and
type ./configure --prefix=$GLOBUS_LOCATION.
globus$ export GLOBUS_LOCATION=/usr/local/globus-4.0.1
globus$ ./configure --prefix=$GLOBUS_LOCATION
Running ./configure as root would have produced an error.
Now, however, it should print the following:
Optional Features:
  --enable-prewsmds        Build pre-webservices mds. Default is disabled.
  --enable-wsgram-condor   Build GRAM Condor scheduler interface. Default is disabled.
  --enable-wsgram-lsf      Build GRAM LSF scheduler interface. Default is disabled.
  --enable-wsgram-pbs      Build GRAM PBS scheduler interface. Default is disabled.
  --enable-i18n            Enable internationalization. Default is disabled.
  --enable-drs             Enable Data Replication Service. Default is disabled.
[...]
Optional Packages:
[...]
  --with-iodbc=dir         Use the iodbc library in dir/lib/libiodbc.so.
                           Required for RLS builds.
  --with-gsiopensshargs="args"
                           Arguments to pass to the build of GSI-OpenSSH, like
                           --with-tcp-wrappers
In the fourth step:
globus$ make
Tip: if you want to keep a log file, type this instead:
globus$ make 2>&1 | tee build.log
In the fifth and final step:
globus$ make install
The installation is now complete, and you need to configure the components described below.
We recommend that you install all of the security pieces.
Set up security by following these steps,
which include obtaining host certificates and user certificates and creating the
grid-mapfile, covered on the following pages.
With security set up, you can start the GridFTP server,
configure the database for RFT and configure WS-GRAM.
You can also start a GSI-OpenSSH daemon,
set up a MyProxy server, run RLS and use CAS.
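The build sequence described in this guide can be collected in one place. The sketch below only assembles the command strings given above for a chosen installation prefix; it does not run them, and the function name `globus_build_commands` is a hypothetical helper, not part of Globus.

```python
def globus_build_commands(prefix, log_file=None):
    """Return the shell commands from the guide, to be run as the globus user."""
    make = "make"
    if log_file:
        # Variant from the guide that also writes the build output to a log file.
        make = f"make 2>&1 | tee {log_file}"
    return [
        f"export GLOBUS_LOCATION={prefix}",
        "./configure --prefix=$GLOBUS_LOCATION",
        make,
        "make install",
    ]


for command in globus_build_commands("/usr/local/globus-4.0.1", log_file="build.log"):
    print("globus$", command)
```

The directory itself still has to be created and handed to the globus user as root beforehand, as shown in the mkdir and chown step.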
IT uses Linux more every day, and grid options are seeing growing use too
Author: Carol Sliwa
Computerworld
Persian translation: زهره چكنی
Framingham -- Enterprise users who have so far been hesitant about adopting open source will next week have ready-made options put in front of them by the world's leading vendors, which are trying to make it easier to buy, use and manage Linux-based systems.
Dell, Hewlett-Packard and IBM are among the many vendors using the LinuxWorld Conference & Expo in San Francisco to introduce products and services designed to make it as easy as possible for users to choose Linux and other open-source software.
For example, Dell plans to introduce dual-core Intel processors in its PowerEdge 850 and 830 servers and to give customers the chance to get an open-source software stack and the hardware in one place.
Users can have Red Hat or SuSE along with the MySQL database and the JBoss application server. They can also buy support subscriptions for MySQL Network and JBoss Network directly from Dell.
Judy Charis, director of business development and global alliances for Linux and open source at Dell, says the goal is to help users get up and running quickly with a tested, supported system that works out of the box, like a Windows server.
Easier adoption
The availability of bundled products was not very important to many early Linux users, because they had the in-house skills needed to configure and install the systems themselves.
Joseph Foran, IT director at FSW Inc. in Bridgeport, Kentucky, says that for this nonprofit service provider, installing Linux and the rest of the so-called LAMP stack, which comprises the Apache web server, MySQL, and the Perl, PHP or Python programming languages, has never been a serious problem. An advanced LAMP stack with an application server preconfigured with business applications could be very useful, he says, but only if you have the necessary skills; otherwise it is of no use.
In any case, as Linux gains momentum in mainstream IT use, more companies will eventually turn to vendors that make the technology easier to use.
Dan Kusnetzky, an analyst in Framingham, Massachusetts, says the lack of needed application software and the lack of sufficient skills at customer sites have been the main obstacles to Linux adoption.
HP has promoted the use of open-source software by opening four Linux Expertise centers in the United States for software vendors, developers and systems integrators, and in this way has been able to make sure their products work with its hardware.
HP plans to offer more than 200 open-source software packages for its Integrity NonStop servers.
IBM is trying to encourage more users to adopt grid computing by offering a "Grid and Grow" bundle that includes a choice of BladeCenter server with an expansion-ready chassis, an operating system, grid middleware and services. The package is priced at $49,000.
Al Bunshaft, vice president of grid computing at IBM, says more than two-thirds of the grid tests the company has been involved in were based on Linux. Grid has an aura of complexity, he says, and IBM wants to take the complexity out of it.
One sign that software vendors are trying to draw more attention to their Linux support could be seen at LinuxWorld in the SAP AG booth, which aimed to familiarize users with the applications that run on the operating system. The share of SAP users on Linux is small but growing quickly, says Torsten Geers, a vice president at SAP.

At Univa UD we deal with a variety of different customers. These folks are trying to solve business problems in the semiconductor, life science, financial services, big science and lots of other sectors. What they have in common is that they have existing infrastructure and are hyper-concerned about business disruption while moving in a new direction. They should be.
There are a lot of ways to approach grid computing that require you to replace what you have with something new. This is particularly the case for vendors of proprietary tools. These tools are built on proprietary protocols that make it difficult to integrate other services or applications. Combine these two issues and it can be tough to get anything bigger than a cluster up and running. If you already have a cluster, or more, up and running, this disruption will have a real impact on your ability to accomplish your goals.
To borrow an old saying, you want your approach to be evolutionary rather than revolutionary. This means moving in a new direction using a phased delivery that allows existing work or research to continue without interruption.
With Globus, this is achieved by creating an additional layer atop existing resources. A common security platform is built on local security layers. A common job submission mechanism replaces product specific ones. A monitoring system that can aggregate information from multiple sources replaces those that only report data from their specific resource.
With these steps in place, new users, applications and clusters can be provisioned in ways that allow flexible cluster usage, better aggregate throughput and higher cluster utilization rates. Then, as time permits, existing applications -- and particularly scripts and workflows -- can be ported from their existing platform to interfaces that will allow them to utilize all the bandwidth available in the organization. Dig?
From the blog: http://gridgurus.typepad.com
Grid computing is celebrating 11 years next month, and is poised to become increasingly mainstream in the coming years. There are a number of reasons that this is true, and most of them are the time tested ideas that have been proving themselves in your research institutions and businesses for years. The grid is about allowing your organization to run more efficiently and more effectively than can be done with more conventional technology solutions. It's about bringing many machines together in coordination around a task. It's about bringing data storage and movement to bear in a coordinated fashion with your application. It's about allowing people from different parts of your organization to work together more easily.
From the blog: http://gridgurus.typepad.com
IDC recently published a finding that 92% of users have applications that are I/O constrained. That's a shockingly large number given the options that exist for reducing this pain. Let's break this down into three major categories:
The nature of each of these domains is very different and the options available to reduce the problems are similarly different.
Administrators unfamiliar with the demands of certain classes of applications will sometimes mount their scratch disk as (oh, the horror!) NFS. Over the past five years the average knowledge level has crept up as the community has grown and gained experience, but I have seen in the last couple of months that there are still clusters out there that are doing significant I/O to slow network scratch.
Working storage typically consists of disk, NAS or SAN devices. People have a tendency to not buy enough bandwidth (either in the form of network I/O or aggregate disk I/O), or to view the purchase of these facilities as one-time expenses, failing to keep up with their users' expanding demand as time progresses.
Migrating data to tape is the only game in town for long term, high capacity storage (so far the decade old promise of using disk for long term storage still seems to be a decade away). The problem is that with drives and automated libraries costing enormous amounts of money, coupled with the latencies inherent in this type of storage, applications are left sitting tapping their feet waiting for data to stream in.
The services in the grid must be programmed to be aware of the data they require. An early example of this is the DDM system in incubation in dev.globus. This system knows what data is located in different resources on the grid and can thus be integrated with workflow and scheduling systems to pre-stage data to working storage before the application is started. This completely eliminates two of the three I/O constraints, and the last one is the easiest and cheapest. Just stop mounting your scratch on NFS...
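The pre-staging idea can be sketched as a lookup against a catalog of where each data set lives. This is a minimal illustration under assumed names: the `catalog` dictionary, the site names and the `plan_prestage` function are hypothetical and are not the interface of the DDM system mentioned above.

```python
def plan_prestage(required, catalog, target):
    """List the (dataset, source) transfers needed before the job can start.

    required: data set names the application will read
    catalog:  mapping of data set name -> set of sites that hold a copy
    target:   the working-storage site where the job will run
    """
    transfers = []
    for dataset in required:
        sites = catalog.get(dataset, set())
        if not sites:
            raise LookupError(f"no known copy of {dataset}")
        if target not in sites:
            # Pick a source deterministically; a real system would weigh
            # bandwidth and latency (for example, tape versus disk).
            transfers.append((dataset, sorted(sites)[0]))
    return transfers


catalog = {
    "images-2007": {"tape-archive"},
    "reference-set": {"tape-archive", "work-storage"},
}
print(plan_prestage(["images-2007", "reference-set"], catalog, "work-storage"))
# [('images-2007', 'tape-archive')]
```

A scheduler that runs this plan before dispatching the job can start the transfers early, so the application does not sit waiting for data to stream in.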
From the blog: http://gridgurus.typepad.com
Seth wrote:
The traffic engineers in New York think nothing of wasting two minutes of each person's time as they approach a gated toll booth. Multiply that two minutes times 12,000 people and it's a lot of hours every day, isn't it?
The truth in this is obvious, and it applies to the grid also. I've talked with folks who have hundreds of engineers each spending a third of their time managing jobs and data on their clusters. That's a lot of wasted time that could be spent advancing their business. Even if just on an expense basis, that's $3M in labor costs. And that's on the low end.
There were significant wins in moving from SMP boxes to clusters. There were significant wins in moving from clusters to grids. Now it's time to realize the next win by managing your grid effectively.
From the blog: http://gridgurus.typepad.com
One of my favorite quotes is from E.B. White:
No one can write decently who is distrustful of the reader's intelligence, or whose attitude is patronizing
Pawel Plaszczak and I certainly took this sort of goal seriously when we wrote our Savvy Manager's Guide. You should take it seriously when you design your grid.
The single biggest mistake people make is to not trust their users to provide reasonable requirements. Designers and architects go out and talk to users, then write off the feedback they get as general guidance rather than hard requirements.
Google, as an example, took their users seriously from day one. They could have created yet another site so littered with ads that it was unreadable, but instead created a user experience that is now the subject of design classes. You can do the same. Talk to your users. Spend a day understanding how they interact with their system. Get a bit deeper into the business issues that justify the IT expenses that feed your children and pay your mortgage.
Take your users seriously, feel their pain and be their hero.
From the blog: http://gridgurus.typepad.com
Attending the working sessions at various conferences I hear a theme over and over again, "how can grid computing help us meet our goal of 80% utilization"? People post graphs showing how they went from 20% utilization to 50% and finally 80%. People celebrate achievement of this number as an axiom. The 80% utilized cluster is the well managed cluster. This is the wrong goal.
The way to illustrate this is to ask how 80% utilization brings a new drug to market more quickly? How does 80% create a new chip? How does 80% get financial results or insurance calculations done more quickly?
Of course, it does none of those things. 80% isn't even a measure of IT efficiency, though most people use it as such. It's only a statistic that deals with a cluster itself. It is, however, measurable, so it's easy to stand up as an objective that the organization can meet. The question to ask is, does an 80% target actually hurt the business of the company?
That target has three problems:
If your clusters are running at 80% that means that you have a lot of periods when work is being queued up and waiting. Think about the utilization pattern of your cluster. Almost every cluster out there is in one of two patterns. They are busy starting at nine in the morning when people start running work and the queue empties overnight. Or, they are busy starting at three in the afternoon when people have finished thinking about what they need to run overnight and the queue empties the next morning.
During the times when the queues are backed up, you are losing time. These waiting jobs represent people who are waiting: scientists who aren't making progress, portfolio analysts who are trailing the competition and semiconductor designers who are spending time managing workflow instead of designing new hardware.
For most businesses it's queue time and latency that matters more than utilization rates. Latency is the time that your most expensive resources, your scientists, designers, engineers, economists and other researchers are waiting for results from the system. Data centers are expensive. Don't get me wrong, I'm not arguing that it's time to start throwing money at clusters without consideration. It's just that understanding the way the business operates is critical to determining what the budget should be. Is the incremental cost of having another 100 or 1000 nodes really more than the cost of delaying the results that your business needs to remain viable?
Don't be willing to be the manager that measures what is convenient rather than what is valuable to the future of your business. Be 'savvy' in your approach. Find ways to understand the behavior of your drug discovery processes on your clusters, even if you are an IT guy instead of a computational chemist. Find ways to demonstrate how your approach of reducing cluster latency is turning up the heat on the next chip design. Find ways to measure what keeps your business around so that you can be part of the process of creating value instead of viewed by that CFO as nothing more than a cost center to be optimized away.
The message is that cost is only one part of the equation. Likely, it's even a minor part of the equation. Don't get yourself lost measuring the price of your stationery when it's the invoices you're putting in the envelopes that matter.
From the blog: http://gridgurus.typepad.com
Jeremy Sherwood from opus:interactive has a good write up of HostingCon 2007.
My experience with Grid Computing goes back to the late 1990s with distributed.net in helping making encryption that much secure. With the technology originally designed to harness unused CPU cycles to solve complex problems, to now being used to hosting an infinite number of hosting environments. It is amazing the level of reliability and scalability options that are available with the system. The ability to grow in resources at an unlimited rate -on the fly- with little to no exposure to change, is outstanding. The other great aspect of this system of technology is the ability to contribute to a sustainable mindset. If done properly, you can reuse old servers and hardware that in a normal life cycle would be recycled, now can be reprovisioned back into a production environment with little concern of impact of hardware failure. This rejuvenation of hardware opens up a great opportunity to get that-much-more out of your initial investment as well as being able to pass those saving onto the customer.
From the blog: http://gridgurus.typepad.com

I am not a GridFTP developer but I use GridFTP. A lot.
Often I find myself helping others optimize GridFTP transfers across networks and between machines about which I know little. When I sit down with an engineer or scientist trying to move files fast from one location to another, the system administrators (fighting more important fires) and the network engineers (hiding from users) are often unavailable.
So it happens a lot that, as I begin trying to optimize a GridFTP transfer, I have no idea whether the disk I/O for the machines will even support faster transfers. Moving data at near wire speeds doesn't help if the disks can only read and write at half the wire speed.
One can find lots of good tools for measuring disk I/O but before I grab for those I like to try some GridFTP transfers without disk I/O on either end to get a feel for what role disk I/O might (or might not) be playing.
The globus-url-copy command client in the latest versions of the Globus Toolkit makes it easy to transfer some bits using GridFTP without any disk I/O. You simply have to "pull" or "read" data from the "file" /dev/zero and "write" data to the "file" /dev/null. The syntax is straightforward:
globus-url-copy -vb -p 4 gsiftp://one.machine.com/dev/zero file:/dev/null
Try using that syntax the next time you sit down to optimize a GridFTP transfer and you want to get a feel for the network infrastructure without being hindered by disk I/O on either end of the transfer.
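To compare several stream counts in one sitting, a small helper that only prints the commands can be handy. This is a minimal sketch, not a tuned benchmark: the function name zero_to_null_cmd and the list of stream counts are made up for illustration, and the host is the placeholder from the example above.

```shell
#!/bin/sh
# Sketch: print memory-to-memory GridFTP test commands for several
# parallel stream counts. Nothing is executed here; the commands are
# only echoed so they can be reviewed and run by hand.

# Build the globus-url-copy command line for one host and stream count.
# (zero_to_null_cmd is a hypothetical helper name.)
zero_to_null_cmd() {
    host=$1
    streams=$2
    echo "globus-url-copy -vb -p $streams gsiftp://$host/dev/zero file:/dev/null"
}

# The stream counts below are arbitrary starting points.
for p in 1 2 4 8; do
    zero_to_null_cmd one.machine.com "$p"
done
```

Since /dev/zero never runs out of data, each printed command keeps transferring until it is interrupted, so run them one at a time and stop each once the rate reported by -vb settles.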
From the blog: http://gridgurus.typepad.com
Grid Engine (GE) is becoming increasingly popular software for distributed resource management. Although it comes with a GUI that can be used for various administrative and configuration tasks, the fact that all of those tasks can be scripted is very appealing. The GE Scripting HOWTO document already contains a few examples to get one started, but I wanted to further illustrate the usefulness of this GE feature with a simple example of a utility that modifies shell start mode for all queues in the system:
#!/bin/sh
# Utility to modify shell start mode for all GE queues.
# Usage: modify_shell_start_mode.sh NEW_MODE
# NEW_MODE can be one of unix_behavior, posix_compliant or script_from_stdin

# Temporary config file.
tmpFile=/tmp/sge_q.$$

# Get new mode.
newMode=$1

# Modify all known queues.
for q in `qconf -sql`; do
    # Prepare queue modification: dump the current configuration and
    # rewrite the shell_start_mode line.
    echo "Modifying queue: $q"
    qconf -sq $q | sed "s?shell_start_mode.*?shell_start_mode $newMode?" > $tmpFile
    # Modify queue.
    qconf -Mq $tmpFile
    # Cleanup.
    rm -f $tmpFile
done
Using the above script one can quickly modify the variable for all queues without having to go through the manual configuration steps.
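The substitution at the heart of that script can be tried without a Grid Engine installation. The sketch below feeds a few lines of made-up queue configuration through the same kind of sed expression; the function name set_shell_start_mode and the sample text are illustrative assumptions, not real qconf output.

```shell
#!/bin/sh
# Sketch: rewrite the shell_start_mode line of a queue configuration
# read from standard input. The new mode is passed as the first argument.
# (set_shell_start_mode is a hypothetical helper name.)
set_shell_start_mode() {
    sed "s?^shell_start_mode.*?shell_start_mode $1?"
}

# Illustrative sample only; real "qconf -sq" output has many more fields.
sample='qname all.q
shell /bin/csh
shell_start_mode posix_compliant'

printf '%s\n' "$sample" | set_shell_start_mode unix_behavior
```

Anchoring the pattern with ^ is a small precaution so that only a line starting with the attribute name is rewritten; the "shell" line is left alone either way.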
The basic approach of 1) preparing a new configuration file by modifying the current object configuration, and 2) reconfiguring GE using the prepared file, works for a wide variety of tasks. There are cases, however, in which the desired object does not exist and has to be added. Those cases can be handled by modifying the EDITOR environment variable and invoking the appropriate qconf command. For example, here is a simple script that creates a set of new queues from the command line:
#!/bin/sh
# Utility to add new queues automatically.
# Usage: add_queue.sh…

# Force non-interactive mode.
EDITOR=/bin/cat; export EDITOR

# Get new queue names.
newQueues=$@

# Add new queues.
for q in $newQueues; do
    echo "Adding queue: $q"
    qconf -aq $q
done
Utilities like the ones shown here get written once and usually quickly become indispensable tools for experienced GE administrators.
From the blog: http://gridgurus.typepad.com

Every day we wake up to a new barrage of virtualization articles. I can't even read them all anymore, instead scanning headlines guided by statistical sampling (or is that stochastic?).
The hype is thick in the air, but it's not entirely unfounded. Somewhere in there we can see grid computing's going to be affected long term by OS virtualization in one way or another.
In this series we'll look at what's happening with various grid-VM efforts, often through a Globus lens (I work on the Globus Virtual Workspaces project so it's almost going to be impossible to avoid that).
There's a tradeoff between application performance improvements and developer time. Developers are expensive, development is time consuming. Perhaps it's worth waiting an extra few hours for results if it means you can start right now and stop paying those fine people. Obviously any particular calculation is going to be more nuanced than this, but I just wanted to set up an analogy.
In a similar vein, with virtualization you can take your prepared application+environment and get going on a new platform in minutes, not months. Cycles can be acquired and the exact compute environments can be provisioned out to the provider site's nodes. Resource consumption can be quantified well by the site (and even enforced at a fine grain). Less of the client's and site's administrators' time (someone's money) needs to be spent on setup, environment conflicts, etc.
For all this you may take a small performance hit, but sometimes that's just worth it.
It sounds perfect, maybe. It's not quite, and we will look at a few problems, many of which only look temporary. A lot of progress is being made to get rid of the complexity, encapsulate it better, or factor it in such a way that the person/role who should be handling that complexity actually does (instead of it being unnecessarily multiplied or divided across many people/roles).
Part 2? I'd like to talk about coordinating many VMs to work together, something being called contextualization. The fightin' Contextualization!
(Apologies to Stephen Colbert)
From the blog: http://gridgurus.typepad.com
Next week is OGF21, where grid gurus from around the world assemble to discuss technologies, applications, standards, and how gray the weather is in Seattle.
We have organized a full day of Globus material on Wednesday October 17. We'll have overviews of old favorites such as GridFTP, RLS, OGSA-DAI and the GT4 distribution, as well as introductions to some of our many new Incubator projects: Shannon Hastings, OSU, discussing the service authoring tool Introduce, Steve Tuecke of UnivaUD discussing Data Catalyst, their open source higher level data solution, and Stephan Erberich who will overview the Internet2 IDEA Award-winning MEDICUS medical data tool, among others. Come hear about the latest updates and where Globus is going to next, and/or to talk to Globus architects and developers about things like:
If you'd like to meet with someone from the Globus team in Seattle, please email us: we'll see you there!
From the blog: http://gridgurus.typepad.com
Lately I have been putting a lot of thought into the challenges that grid managers face in building an enterprise grid. Primarily they must support the various stakeholders throughout the enterprise, each of whom has their own sets of application workflows used to meet their business needs.
The software packages that each interested group uses may have a significant overlap with one another, but the similarity stops there. Because each group ostensibly has a different goal, the usage patterns are almost guaranteed to be unique. This implies that the community as a whole will demand any of the following:
When you consider users’ needs in more detail, you will recognize that a number of implications further complicate things:
Meanwhile, grid managers will most likely be focused on providing a stable, secure, and easy to maintain infrastructure that is both cost-effective and capable of meeting the users’ core requirements. Clearly the priorities between the individual groups and the support team will be at odds much of the time.
The most elegant solution to these issues is to build a grid whose execution environments are all virtualized. In this situation, each usage pattern would have its own environment tailored to its own unique needs while the core OS would be under the complete control of the infrastructure staff. Clearly there would be a stakeholder driven set of virtual servers available for use on each node in the grid.
It seems simple enough: rather than creating a complicated infrastructure that will not accommodate all of the situations your users will require, you simply will give them their own isolated operating environments. As you might expect, nothing is that straightforward. The standard tools that you use for grid and virtualization management do not work well in this architecture.
In future posts, we will explore the challenges and possible solutions in detail. In particular we will focus on:
- Networking
- Virtual Server Management
- Job Scheduling
- Performance Monitoring
- Security
- Data Lifecycle
From the blog: http://gridgurus.typepad.com
Ask a user why they use a grid, a cluster, or any other type of distributed system and you’ll hear, “Why, to get my work done faster, of course.” But that’s an ambiguous statement at best, since it can mean two things: faster runtimes or higher throughput. And although they might seem similar, they’re really not.
Runtime is defined as the wallclock time it takes to complete one task. If you parallelize a task, for instance with MPI, or by taking advantage of the data splitting capabilities of Grid MP, you can get your job back in less time. If you can parallelize your job into 10 parallel sub-jobs and run it on 10 nodes, you can expect that job to complete on average in 1/10th of the time. Plus a bit of overhead of course, but let’s keep it simple for now. In Volvo’s innovative Uddevalla plant, groups of workers assemble entire automobiles in less time than it takes for one worker to complete a whole car. So with 10 workers in a group, you could potentially make a car in 1/10th of the time.
However, sometimes your task cannot be parallelized any further, but you might have lots of them pending. Grids can still help, since they can increase the throughput of your jobs. Queuing theory states that with 10 nodes and 10 jobs, you can still expect one job to complete, on average, every 1/10th of the runtime of a single job, without using any parallelism. In a traditional American automotive plant, the car advances on the assembly line and at no point is more than one operator working on one car, so there’s no parallelism involved. It might take up to a day before one car is completed from start to finish, but a new car rolls off the end of the line every few minutes.
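The difference between the two approaches is easy to see with a bit of arithmetic. The numbers below are hypothetical (10 jobs, 10 nodes, 60 minutes per job on a single node), and overhead is ignored to keep it simple.

```shell
#!/bin/sh
# Worked example with hypothetical numbers; overhead is ignored.
njobs=10
nodes=10
runtime=60   # minutes for one job on one node

# Runtime approach: every job is split across all nodes, one job at a time.
per_job_parallel=$((runtime / nodes))
all_done_parallel=$((per_job_parallel * njobs))
echo "parallel:   each job takes $per_job_parallel min, all done after $all_done_parallel min"

# Throughput approach: each job runs unsplit on its own node.
all_done_serial=$((runtime * njobs / nodes))
avg_interval=$((all_done_serial / njobs))
echo "throughput: each job takes $runtime min, all done after $all_done_serial min"
echo "throughput: one finished job per $avg_interval min on average"
```

Both approaches finish the whole batch after 60 minutes. What differs is how long any single job takes: 6 minutes in the first case, the full 60 in the second.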
So next time when a user brags about his fancy new cluster, ask him whether he’s producing Fords or Volvos.
From the blog: http://gridgurus.typepad.com
Today I read about GridWay winning the “Best Demo Prize” at the EGEE 2007 Conference in Budapest (Congratulations to the GridWay Team!), and this reminded me about the problem of building applications against the binary Globus Toolkit (GT) releases. Namely, building software like GridWay against the binary GT install usually fails with link errors. The problem is that the .la files in the $GLOBUS_LOCATION/lib directory have hardcoded the original build path for the dependency libraries. This issue has been known for some time (see, e.g. GT bug #174), and it persists in the 4.0.x releases of the Toolkit. The easiest solution is to build and install your GT from sources. However, if this is not an option, one can use a script that modifies the hardcoded paths in the binary GT install (do not worry, the script does not modify binary files :-)):
#!/bin/sh
# fix_paths.sh
# Script for modifying hardcoded library dependency paths in the binary
# Globus Toolkit installation.
# Usage.
usage() {
echo "Usage: $0 [oldPath] [newPath]"
}
oldPath=$1
newPath=$2
if [ $# -ne 2 ]; then
usage
exit 1
fi
if [ "$GLOBUS_LOCATION" = "" ]; then
echo "\$GLOBUS_LOCATION is not defined."
exit 1
fi
echo "Replacing $oldPath by $newPath in various ASCII files."
cd $GLOBUS_LOCATION
# Try to avoid header files, *.gar and *.jar files, config xml files, etc.
fileList=`find . -type f ! -name '*.h' -a ! -name '*.gar' -a ! -name '*.xml' -.
cnt=0
for f in $fileList; do
isAscii=`file $f | grep ASCII`
if [ "$isAscii" != "" ]; then
cmd="cat $f | sed 's?$oldPath?$newPath?g' > $f.tmp"
eval $cmd
diffPath=`diff $f.tmp $f`
if [ "$diffPath" != "" ]; then
echo "Fixing: $f"
mv $f.tmp $f
cnt=`expr $cnt + 1`
else
rm -f $f.tmp
fi
fi
done
echo "Fixed $cnt files."
exit 0
In order to use the above script, one has to determine the hardcoded paths by looking into one of the .la files in the $GLOBUS_LOCATION/lib directory. For example:
$ export GLOBUS_LOCATION=/scratch/veseli/devel/lib/globus-4.0.5/
$ cd $GLOBUS_LOCATION/lib
$ pwd
/scratch/veseli/devel/lib/globus-4.0.5/lib
$ grep dependency_libs libxmlsec1_openssl_gcc32.la
dependency_libs=' -L/home/condor/execute/dir_22100/userdir/install/lib'
$ ~/fix_paths.sh /home/condor/execute/dir_22100/userdir/install/lib /scratch/veseli/devel/lib/globus-4.0.5/lib
Replacing /home/condor/execute/dir_22100/userdir/install/lib by /scratch/veseli/devel/lib/globus-4.0.5/lib in various ASCII files.
…
Fixed 330 files.
$ grep dependency_libs libxmlsec1_openssl_gcc32.la
dependency_libs=' -L/scratch/veseli/devel/lib/globus-4.0.5/lib'
Once you correct the library dependency paths using this script, you should be able to compile and link external software packages against your binary GT installation.
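If you would rather not read the .la file by eye, the hardcoded directories can be pulled out with a short pipeline. This is a rough sketch under the assumption that the paths appear as -L entries on the dependency_libs line, as in the example above; the function name list_dependency_dirs is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the distinct -L directories named on the dependency_libs
# line of a libtool .la file, one per line.
# (list_dependency_dirs is a hypothetical helper name.)
list_dependency_dirs() {
    # Split the line on spaces and single quotes, keep only the -L
    # entries, strip the -L prefix, and drop duplicates.
    grep '^dependency_libs=' "$1" | tr " '" '\n\n' | sed -n 's/^-L//p' | sort -u
}
```

For instance, running list_dependency_dirs libxmlsec1_openssl_gcc32.la in $GLOBUS_LOCATION/lib before the fix should print the old build path that needs to be passed to fix_paths.sh as its first argument.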
From the blog: http://gridgurus.typepad.com

For some time now, I've been really interested in the potential applications of grid computing in higher education and, possibly, in secondary education. So, I was really intrigued when I read about Google and IBM's computing cloud for students. Just looking at the headline, my first impression was that students anywhere would be able to have their own computing cloud to use as a playground for learning and experimentation. As it turns out, Google and IBM's computing cloud will be initially used by only five universities, with the goal of giving students a platform in which to learn about parallel programming and Internet-scale applications. Although still a very cool project, I thought this would be a good opportunity to share some ideas of how grid computing could end up benefiting education. Like fellow gridguru Tim Freeman, I'm a part of the Globus Virtual Workspaces project, so my ideas are biased towards how grid computing and workspaces could benefit education.
I have talked with many Computer Science and Engineering lecturers and professors at small colleges and universities who cannot teach certain courses for lack of computing resources. For example, while teaching an introductory programming course requires minimal computing resources (such as a computer lab), teaching a course on parallel programming or distributed systems may require more expensive resources. To get students to practice parallel programming in a somewhat realistic setting, you would like them to have access to a properly configured and maintained cluster. If, furthermore, you wanted to teach students how to set up a cluster, you would need a couple of clusters (ideally, one cluster per student) that the students could have unfettered access to.
There are two main issues with the above scenario. First of all, clusters aren't generally cheap, and some institutions can't afford one. Of course, you can easily build a cluster out of commodity hardware, but you also need someone to actually set it up and jiggle the handle whenever something goes awry. In one specific case, a department built a cluster with off-the-shelf PCs, and used it successfully... until the grad student charged with keeping the cluster running graduated. Apparently, that cluster has been sitting idly in a room for years now. Second, even if the institution can afford a cluster and a sysadmin, no sysadmin in his right mind is going to give root access to that cluster to undergrads, especially if that cluster is also used by researchers.
Enter virtual workspaces. In a nutshell, a virtual workspace is an execution environment that you can dynamically and securely deploy on the grid with exactly the hardware and software you need. You need a 32-node dual CPU Linux cluster for a couple of hours to teach a parallel programming lab, with a very specific version of libfoobar installed on it? Just request a workspace for it, and that hardware will be allocated somewhere on the grid for you, and the software will be set up thanks to software contextualization, which Tim will discuss in his posts. There's no need for the institution to keep a cluster running 24/7, or even spend any time configuring a cluster (requiring a sysadmin, or burdening the lecturer or a grad student with this task). From a repository of ready-made workspaces, simply choose the one you want (or pay a one-time fee to have someone configure a workspace exactly the way you want it), deploy it on the grid every Monday from 2pm to 4pm, and start teaching.
Unfortunately, we're not quite there yet, but virtual workspaces are being actively researched (yes, right now, even as you read this blog post!). Currently, virtual machines are the most promising vehicle to automagically stand up these custom execution environments on a grid. The Globus Virtual Workspaces Service, which uses the Xen VMM to instantiate workspaces, is still in a Technology Preview phase so, although you can still do a number of very cool things with it, you can't deploy arbitrary workspaces on arbitrary grids... yet. However, we're getting much closer, and in future blog posts I'll explain what progress we're making towards that goal.
When we do get there, I believe that workspaces stand to make really exciting contributions to Computer Science and Engineering education. Not only can they facilitate access to computational resources by underprivileged institutions, they can also enhance existing curriculums by enabling students to gain more practical experience than before (e.g., by giving each student their own cluster). In fact, workspaces will enable the creation of more complex "playgrounds", from virtual clusters to virtual grids, that students can use to learn and experiment.
From the blog: http://gridgurus.typepad.com
Modeling is a very effective means of accurately measuring the advantages of one scheduling policy over another in specific environments. High-level abstraction models can be developed rapidly in order to observe efficiency benefits. In this type of environment the most meaningful measurements are the queue wait times of the jobs that have been submitted to the system, as well as the expansion factors that are partially derived from queue wait times. Utilization is another measurement to observe, but in a fully loaded system high utilization is already a known fact; squeezing efficiency out of the system is more important, and this is done by reducing queue wait times.
A prerequisite to accurate modeling is retrieving accurate job accounting data for the past year or more. This data is good for a number of reasons but the following two are most important. First, a modeler does not have to develop a distribution dataset of what is thought of as an accurate job data flow. Secondly, the data that is used is accurate as to job submission and run times, priority, and resources utilized. Expansion factor data can also be derived from part of this accounting data as well. All jobs are bounded in this environment and would eliminate any reservation slipping. In this modeling environment, you are attempting to improve on numbers that have already been produced in order to implement more efficient policies for the future.
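As a rough illustration of the two measurements, the sketch below computes the average queue wait and the average expansion factor from a handful of job records. Several things here are assumptions rather than anything from a real accounting log: the three-column "submit start end" input format (in seconds), the function name summarize_jobs, and the definition of expansion factor as (wait + run) / run, which is one common form.

```shell
#!/bin/sh
# Sketch: average queue wait and expansion factor from job records.
# Input (hypothetical format): one job per line, "submit start end",
# all in seconds. Expansion factor is taken here as (wait + run) / run.
# (summarize_jobs is a hypothetical helper name.)
summarize_jobs() {
    awk '
        $3 > $2 {                      # skip records with no run time
            w = $2 - $1                # queue wait
            r = $3 - $2                # run time
            total_wait += w
            total_xf += (w + r) / r
            n++
        }
        END {
            if (n > 0)
                printf "jobs=%d avg_wait=%.1f avg_xfactor=%.2f\n", n, total_wait / n, total_xf / n
        }'
}

# Three made-up jobs, each running 100 s after waiting 0, 100 and 300 s.
printf '%s\n' '0 0 100' '10 110 210' '20 320 420' | summarize_jobs
```

Running the same summary over the historical data and over the output of a modeled policy gives two sets of numbers that can be compared directly.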
Future segment: Developing an architecture for modeling a scheduling process that utilizes a priority queue policy with normal backfill algorithms.
From the blog: http://gridgurus.typepad.com
Last time we talked about two similar yet different benefits of using grids. Today we will expand on that list with other benefits you might not have thought about yet. Just to be clear, we’re purely talking about technical benefits here; the business benefits are left for a whole other column.
Let’s first review what we found last time. The obvious benefits revolve around speedup of your parallel applications and higher throughput of your batch jobs. A typical example of the former is a crash-simulation with PAM-CRASH and MPI, a typical example of the latter is doing virtual high-throughput screening with applications such as LigandFit from Accelrys, where many potential drug targets are screened against a single protein target. But there are other less obvious use-cases for grid that can benefit you.
Imagine running a simulation that has many tweakable parameters that you’ve always set to a pre-set value. When you now move your computations to a grid, you might not need to get your results back any faster, so you could now opt to increase the accuracy of your computation by running the same simulation with different parameter sweeps on different nodes. Further expansion of your grid will suddenly increase the validity and accuracy of your results, rather than decrease runtime. An example of such computation can be found in the Oil and Gas industry where a more refined and accurate computational model of an oil-field can prevent costly dry holes.
One could assert that Monte Carlo simulations are in fact also "accuracy-increasing" applications of grid, but there are two subtle differences. First, Monte Carlo simulations usually run on a much more massive scale, with thousands of very short simulations, whereas parameter sweep modeling typically uses larger models over a limited number (fewer than a hundred) of iterations. Second, typical Monte Carlo simulations only end once a certain pre-set resolution has been achieved, regardless of the number of grid nodes at your disposal. As such, it is better to place Monte Carlo simulations in the "throughput" category.
Once you understand these three basic benefits (speed-up, throughput and accuracy), there’s really no limit to what your imagination can come up with in terms of new applications of grid. Take the LigandFit example that I mentioned earlier. United Devices' recently retired grid.org looked at the throughput use-case and took it to the extreme by simply taking a protein crucial to the internal workings of cancer cells and running every single possible potential drug target in the library against that protein. It took a leap of imagination to dream up six years of running billions of drug targets against multiple proteins.
The most rewarding moment during a consulting engagement is when I see that users "get" the basic use-cases and start dreaming big. Can you dream big? What can the grid do for you?
From the blog: http://gridgurus.typepad.com

The grid infrastructure I work with daily is deploying more and more services based on the Globus Toolkit, and in particular on the Globus Toolkit Java Web services. Each of these services requires users to be authorized to invoke the service operations.
Most often the authorization is managed using our old Globus friend the static grid-mapfile. These grid-mapfiles work fine during development but as we scale out during production we hear the moans from the site administrators of "not another grid-mapfile!"
You can easily Google and find an entire zoo of projects aimed at helping production grids manage authorization for services. Each community seems to have its own effort and we can only hope at some point for a clear winner (I didn't say standard...and yes interoperability is nice but I still would like just a few "best of breed" tools that interoperate. I am naive in that way.)
What if, however, you are a grid architect or developer and you need to tie authorization to grid services into an existing authorization infrastructure? Does the solution necessarily have to involve pulling out authorization details from the legacy infrastructure, creating grid-mapfiles, and then having to manage all those grid-mapfiles?
No. A better approach might be to write your own authorization plugin for your Globus Toolkit Java Web services. It is surprisingly simple to do. Your approach might be as simple as writing one or two Java classes representing a Policy Decision Point (PDP) and/or a Policy Information Point (PIP).
Tim Freeman and Rachana Ananthakrishnan have written a great tutorial on how to do just that. If you are wondering how you can tie together Globus grid services and a legacy authorization infrastructure, do give it a read before you add one more grid-mapfile to your grid fabric.
From the blog: http://gridgurus.typepad.com
In Part I of this article I discussed the meanings of the various queue states that one might see after invoking the Grid Engine qstat command. The list of possible job states is just as long as the list of queue states:
• d (deletion) — Indicates that a job has been deleted using qdel.
• r (running) — Indicates that a job is executing.
• R (restarted) — Indicates that the job was restarted. This state can be caused by a job migration or because of one of the reasons described in the -r section of the qsub man page.
• s (suspended) — Shows that an already running job has been suspended using qmod.
• S (suspended) — Shows that an already running job has been suspended because the queue that it belongs to has been suspended.
• t (transferring) — Indicates that a job is being transferred to an execution host and is about to be executed.
• T (threshold) — Shows that an already running job has been suspended because at least one suspend threshold of the corresponding queue was exceeded.
• w (waiting) — Indicates that the job is suspended pending the availability of a critical resource or specified condition.
• q (queued) — Indicates that the job has been queued.
• E (error) — Indicates that the job is in the error state. You can find the reason for this state using the qstat command with “-explain E” option.
• h (hold) — Indicates that the job is not eligible for execution due to a hold state assigned to it via qhold, qalter, or qsub -h command.
Just like with queue states, one also frequently encounters various combinations of the above job states.
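Since combined states are simply the individual letters written one after another, decoding them can be sketched with a small lookup table. The Python below is only an illustration built from the list above; the function name `decode_job_state` and the short labels are my own, not anything Grid Engine provides.

```python
# Job state letters as described in the list above.
JOB_STATES = {
    "d": "deletion",
    "r": "running",
    "R": "restarted",
    "s": "suspended",
    "S": "suspended (queue suspended)",
    "t": "transferring",
    "T": "threshold",
    "w": "waiting",
    "q": "queued",
    "E": "error",
    "h": "hold",
}


def decode_job_state(state):
    """Expand a combined job state string into one label per letter."""
    return [JOB_STATES.get(letter, f"unknown ({letter})") for letter in state]


if __name__ == "__main__":
    for state in ("qw", "hqw", "Eqw", "dr"):
        print(state, "->", ", ".join(decode_job_state(state)))
```

Letters are case-sensitive, which is why the table keeps both `s` and `S`, and anything not in the table is reported as unknown rather than silently dropped.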
From the blog: http://gridgurus.typepad.com
In all likelihood most Grid Engine (GE) end users and administrators have at some point invoked the qstat command and found themselves wondering what some of the resulting queue and job status letters mean. While some of those letters are pretty intuitive (e.g., ‘E’ stands for error), others are not entirely trivial to decipher. Unfortunately, it is not very easy to find an explanation for these statuses. One usually has to resort to digging through the qstat man pages or through the various GE software manuals that one can find on the web. So, I’ve compiled below information about the possible queue statuses:
• a (alarm) – At least one of the load thresholds defined in the load_thresholds list of the queue configuration is currently exceeded. This state prevents GE from scheduling further jobs to that queue. You can find the reason for the alarm state using the qstat command with “-explain a” option.
• A (Alarm) – At least one of the suspend thresholds of the queue is currently exceeded. This state causes jobs running in that queue to be successively suspended until no threshold is violated. You can see the reason for this state using the qstat command with “-explain A” option.
• c (configuration ambiguous) – The queue instance configuration (specified in GE configuration files) is ambiguous. The state resolves when the configuration becomes unambiguous again. This state prevents you from scheduling further jobs to that queue instance. You can find detailed reasons why a queue instance entered this state in the sge_qmaster messages file, or by using the qstat command with “-explain c” option. For queue instances in this state, the cluster queue's default settings are used for the ambiguous attribute.
• C (Calendar suspended) – The queue has been suspended automatically using the GE calendar facility.
• d (disabled) – Queues are disabled and released using the qmod command. Disabling a queue prevents new jobs from being scheduled for execution in that queue, but it does not affect jobs that are already running there.
• D (Disabled) – The queue has been disabled automatically using the GE calendar facility.
• E (Error) – The queue is in the error state. You can find the reason for this state using the qstat command with “-explain E” option. Check the execution daemon's error log for information on how to resolve the problem, and clear the queue state afterwards using the qmod command with the -cq option.
• o (orphaned) – The current cluster queue's configuration and host group configuration no longer need this queue instance. The queue instance is kept because unfinished jobs are still associated with it. The orphaned state prevents you from scheduling further jobs to that queue instance. It disappears from qstat output when these jobs finish. To help resolve an orphaned queue instance associated with a job, you can use the qdel command. You can revive an orphaned queue instance by changing the cluster queue configuration so that the configuration covers that queue instance.
• s (suspended) – Queues are suspended and un-suspended using the qmod command. Suspending a queue suspends all jobs executing in that queue.
• S (Subordinate) – The queue has been suspended due to subordination to another queue. When a queue is suspended, regardless of the cause, all jobs executing in that queue are suspended too.
• u (unknown) – The corresponding GE execution daemon (sge_execd) cannot be contacted.
I hope that those who are new to Grid Engine find the above descriptions useful. In Part II of this article I will cover possible job statuses.
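Four of the queue states above (a, A, c and E) can be investigated further with qstat's “-explain” option. As a rough sketch, the Python below takes a queue state string and builds the matching command lines without running them. The helper name `explain_commands` is hypothetical, and the arguments are exactly the ones quoted in the descriptions above.

```python
# Queue state letters for which the descriptions above mention "-explain".
EXPLAINABLE = ("a", "A", "c", "E")


def explain_commands(queue_state):
    """Return one qstat argument list per explainable letter in the state.

    The commands are only constructed here, not executed, so this works
    on a machine without Grid Engine installed.
    """
    return [
        ["qstat", "-explain", letter]
        for letter in EXPLAINABLE
        if letter in queue_state
    ]


if __name__ == "__main__":
    for state in ("au", "AE", "d"):
        print(state, "->", explain_commands(state))
```

On a real cluster, each argument list could be handed to `subprocess.run` to get the actual explanation text.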
From the blog: http://gridgurus.typepad.com
Dan Ciruli at West Coast Grid writes
Europe is years ahead of the US in terms of large grids...
Is Europe years ahead of the US?
Open questions that come to mind include:
Certainly the US and Europe both have some very large grids, so the question is: what was Dan taking into account when making his claim?
From the blog: http://gridgurus.typepad.com

Previously we discussed the tension that grid managers face when supporting various stakeholders on an enterprise grid. In particular we concluded that providing isolated virtual operating environments to each of the business units operating in your environment would be the easiest way to meet their competing and divergent needs. In this post we will explore the networking challenges that a grid of virtualized systems poses.
The primary challenge you face in this architecture is how to connect it all together. At first glance it seems simple enough: take your current grid, install a hypervisor on each of its nodes, and then start implementing your users’ specific environments. Sadly, this will probably not work.
In a typical grid you already have to consider the challenges of connecting several hundred compute nodes to one another and to a storage network while keeping network latency low.
In order to illustrate the networking problems you would have in a virtualized grid, consider a system with a significant number of nodes used by several operational units. For example, imagine a large financial services company that provides banking, brokerage services, insurance, mortgage, and financing. Each of these business lines, while related, has its own distinct set of business application workflows. While there may be some overlap of the specific applications used by each of the units, there is little guarantee that each group will use those applications in the same way, let alone use the same versions. Worse yet, a business unit may have multiple operational workflows that do not operate in similar environments (e.g., Windows-specific versus Linux-specific application suites). Finally, we grid managers would like to have development, test, and production instances segregated but running on the same hardware.
It is easy to project having to support at least ten times more virtual than physical operating environments. The actual number should be proportional to the number of unique operating environments required by the users. In a standard grid you have a fixed set of computational resources that are reasonably static; in other words systems do not appear and disappear on a regular basis. However in the virtualized grid, operating environments are going to appear and disappear as a function of the business workflows scheduled by your users. You can imagine how quickly this can become complicated.
What is the best way to deliver these operating environments to the physical hardware? If we keep all of the images on local disk then we need to guarantee that there is sufficient disk space on each node; a practice which not only can be costly but does not scale well. If we choose to keep no more than the maximum number of nodes supported by any application in each operating environment, we can reduce the number of virtual machines we require. Of course this implies that these images are either stored on a SAN or are transported to the individual physical nodes before booting the virtualized environment. Sadly, both of these approaches significantly increase network loads. We will discuss scheduling and managing individual virtual machines in subsequent posts.
How do we connect these virtual environments? If these systems were on segregated physical hardware (think Microsoft Windows versus Linux) we would likely keep them on their own network and/or VLANs. After all, these environments generally should not interact with one another. Consequently, shouldn’t we also do this for the virtualized grid? If we chose not to and instead used DHCP based upon physical topology to provide addresses to the virtualized environments, we could quickly run into trouble. Specifically, a single job executed on n nodes could conceivably land on n distinct networks and/or VLANs. This would significantly increase the size of the broadcast domain as well as require more work from your network switches. Therefore it would add significant latency to all communications between the nodes. Clearly this is a poor choice unless you are always using most of your nodes for each job.
Thus my preferred solution is to segregate operational environments, so that every physical node bridges traffic for several distinct networks over the same interface. Addresses would be assigned by virtual MAC address rather than by physical location. This is needed because, as in the counter-example, we cannot guarantee where in the physical network topology a particular job will be scheduled. In fact, we probably want to use VLAN tags on our packets so that our switches can operate more efficiently. Additionally, if your grid nodes have secondary interfaces, all communication with the hypervisor should be segregated onto its own management network.
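One way to picture this scheme is the following Python sketch. It is only an illustration under assumptions of my own: the environment names, VLAN IDs and node names are invented, and packing the VLAN ID into a locally administered MAC address is just one possible convention, not something prescribed above.

```python
# Hypothetical operating environments and the VLAN each one lives on.
ENV_VLANS = {
    "banking-prod": 110,
    "brokerage-prod": 120,
    "insurance-test": 230,
}


def virtual_mac(vlan_id, vm_index):
    """Build a locally administered unicast MAC from VLAN ID and VM index.

    The leading 0x02 octet marks the address as locally administered.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= vm_index <= 0xFFFF:
        raise ValueError("VM index must fit in 16 bits")
    octets = [
        0x02, 0x00,
        vlan_id >> 8, vlan_id & 0xFF,
        vm_index >> 8, vm_index & 0xFF,
    ]
    return ":".join(f"{octet:02x}" for octet in octets)


def place_job(environment, physical_nodes):
    """Give each VM of a job a VLAN and MAC based on its environment only."""
    vlan = ENV_VLANS[environment]
    return [
        {"node": node, "vlan": vlan, "mac": virtual_mac(vlan, index)}
        for index, node in enumerate(physical_nodes)
    ]


if __name__ == "__main__":
    # The job lands on nodes in three different racks, yet stays on one VLAN.
    for vm in place_job("banking-prod", ["rack1-n07", "rack4-n02", "rack9-n31"]):
        print(vm)
```

Wherever the scheduler places the job, all of its virtual machines end up in the same broadcast domain, and the physical node only has to bridge the tagged traffic.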
If this has not scared you away from the concept of the virtualized grid (I hope it hasn’t), we will continue to explore other hurdles inherent with this architecture in future posts.
From the blog: http://gridgurus.typepad.com
For a number of years United Devices operated grid.org as a philanthropic site for cancer research. That mission was completed earlier this year, and Univa UD has relaunched the domain to expand the scope of the project to open source cluster and grid management. This will allow many people who want to do large-scale computing, but haven't been able to use existing tools, to download an easy-to-use cluster management suite that will let them run a variety of applications.
The press release associated with the launch follows:
RENO, Nev. (Nov. 13, 2007) – An online community for open source grid and cluster users, administrators and developers debuted today on the Internet at http://grid.org
Grid.org provides the single aggregation point for information and interaction by the community of users, administrators, and developers interested in a complete open source grid and cluster stack.
The site sponsor, Univa UD, unveiled Grid.org during the Supercomputing ’07 conference.
“The site has been built to support the needs of users of the open source Cluster Express release from Univa UD, which includes many open source components including Grid Engine, Globus and Ganglia,” said Steve Tuecke, co-founder and chief technology officer at Univa UD and a primary architect of the Grid.org community. “By aggregating information from many distinct open source grid and cluster efforts and facilitating interaction between users who have historically been left on their own to struggle with the integration of these components, Grid.org should be a valuable additional resource not just for new grid and cluster users but also for members of the current open source communities.”
Grid.org is designed as a destination for community members who want to connect easily and productively with those who have similar interests and who want to engage in a vibrant, functioning community of active participants. At Grid.org, community members can engage with others to discuss issues as well as give and receive help and contribute to the Cluster Express open source software project.
It also will be a resource for professionals who want to learn more about open source grid and cluster computing in general.
Besides providing links to other open source grid and cluster sites, Grid.org will include areas for participants to build their personal professional networks, participate in forums and blogs, access white papers and case studies, explore upcoming events and download free Univa UD Cluster Express open source software for integrated cluster management.
“We are hosting Grid.org to promote the broad adoption of open source grid and cluster technologies,” said Dr. Ian Foster, co-founder and chief open source strategist at Univa UD. “There are many very good resources relating to open source today, and we want to provide a single site that lets the community navigate this wealth of information and build on it.”
Grid and cluster pioneers will recognize Grid.org
About Grid.org
Established in 2001, Grid.org is an online community for open source grid and cluster software users, administrators and developers. The site’s current mission is to work with community members to broaden the reach of the site and encourage use of open source technologies for grid and cluster computing at large. The site provides a single location where open source grid and cluster information can be aggregated so that people with a similar range of interests can easily exchange information, experiences and ideas related to Univa UD’s complete open source grid and cluster software stack.
About Univa UD
Univa UD is the leading provider of open source products for grid and cluster computing environments. The company’s industrial-strength offerings range from departmental and HPC cluster management to enterprise-wide grids, and represent the proven and cost-effective alternative to traditional proprietary products that customers have been waiting for. Based on a combination of open source and proprietary components, Univa UD offerings include a downloadable open source cluster management product, a proprietary cluster product with rich functionality, and a comprehensive enterprise grid product based on award-winning technology. All Univa UD products are run by Fortune 1000 companies in large-scale, production environments. Univa UD is headquartered in Lisle, Ill. with offices in Austin, Texas. For more information, contact Univa UD at 1-800-370-5320 or visit us at www.univaud.com
From the blog: http://gridgurus.typepad.com
The opportunities to apply grid computing methods in health care are, simply put, enormous. (Irving Wladawsky-Berger refers to it as the "ASCI of Grid" to imply that the challenges are comparable in their extreme scale to those tackled by the DOE ASCI program in simulation. That is an understatement.) There is an urgent need for community, best practices, standards, and the like.
These considerations motivated the formation of the HealthGrid.US Alliance (HG.US), a partnership of scientific, medical and technology professionals from academia, industry and government, whose shared mission is to promote the application of advanced information technology to solve cutting-edge problems in Biomedical Science and Healthcare. HG.US is an affiliate of the international HealthGrid Association.
As a first action, HG.US is sponsoring the first HealthGrid Annual Meeting to be held outside of Europe, in Chicago, Illinois, USA, June 2-4, 2008. See the announcement (pdf). The previous five meetings (2003-2007), held in Europe, have formal published proceedings that are also available from the website.
Many biomedical and health related problems are characterized by diverse collaborators needing access to great quantities of complex heterogeneous data, which is distributed across multiple computing systems, maintained by loosely connected institutions, often across international boundaries. Example projects addressing these challenges include sharing datasets to enable a cure for cancer (caBIG, ACGT) and science portals that enable neuroscientists to better visualize the morphology of the brain (BIRN). These and other projects have begun to demonstrate the power and potential of the Grid approach in biomedicine.
Initially, Grid technology development was driven by computing needs of the particle physics research community and enabled by the availability of high-performance networks. The term "grid" rapidly evolved toward a concept of ubiquitous and transparent computing to support a wide variety of applications, and builds on the well-known metaphor of the pervasive "electricity grid". Today, the HealthGrid space represents some of the most interesting drivers for progress in knowledge-based ubiquitous and transparent computing.
The international HealthGrid Association, based in Europe, provides a firm conceptual foundation for efforts in the US and is fully supportive of the HealthGrid.US Alliance. A HealthGrid white paper articulates the broad scope of the concept. US government agencies have begun to develop complementary strategies. These have been captured in TATRC's Integrated Research Team strategic report on HealthGrid: Grid Technologies for Biomedicine and by the US Government interagency HealthGrid Core Strategic Planning Group.
From the blog: http://gridgurus.typepad.com