UNIT IV PROGRAMMING MODEL
Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus
Globus: One of the most influential Grid middleware projects is the Globus project mentioned earlier. Globus has gone through several versions and bug fixes, as illustrated in the figure. Version 1.0.0 was released in 1998 and was subsequently updated with version 2.0, released in early 2002. Version 2 became widely adopted, especially versions 2.2 and 2.3, which were released later.
Figure: Some Globus toolkit versions (approximate time line).
The Globus toolkit has five major parts:
- Security: components to provide a security envelope and secure access.
- Information: monitoring and discovery of resources and services.
- Data management: access and transfer of data.
- Execution management: executing, monitoring, and management of jobs.
- Common run time: libraries and core services.
Security: First, security is required. The distributed resources must be protected from unauthorized access. The Globus component for creating the security envelope is called GSI (Grid Security Infrastructure), which uses public key cryptography. It requires each user to be authenticated (their identity vouched for). This is done by each user having a digital certificate signed by a trusted certificate authority, in a manner analogous to a passport or driving license being identified as genuine by the marks placed on it by the issuing authority. This technique is the basis of Internet security. Users also need to be able to give their authority to Grid components to act on their behalf. Grid computing addresses this with special certificates called proxy certificates, which give the resources holding them authority to act on the issuer's behalf in a chain of trust. This is equivalent to giving a proxy to another person to vote on one's behalf at a meeting.
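The chain of trust described above can be sketched as a toy model. This is not GSI code and performs no cryptography: the Certificate class, the chain_is_trusted function, and all names and lifetimes are hypothetical, and a real implementation would also verify each digital signature. The sketch only shows the order of checks: nothing has expired, each certificate was issued by the next one in the chain, and the chain ends at a trusted certificate authority.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Certificate:
    subject: str       # identity the certificate vouches for
    issuer: str        # identity that signed the certificate
    expires: datetime  # end of the validity period


def chain_is_trusted(chain, trusted_cas, now):
    """Return True if every link in the chain checks out.

    chain[0] is the certificate being presented (for example a proxy);
    each later entry is the certificate of the previous entry's issuer.
    Real GSI also verifies cryptographic signatures, which is omitted here.
    """
    if not chain:
        return False
    # No certificate in the chain may have expired.
    if any(now >= cert.expires for cert in chain):
        return False
    # Each certificate must have been issued by the next one in the chain.
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    # The last certificate must be signed by a trusted certificate authority.
    return chain[-1].issuer in trusted_cas


# Hypothetical names and lifetimes, chosen only for illustration.
NOW = datetime(2024, 1, 1, 12, 0)
user_cert = Certificate("/O=Grid/CN=Alice", "Example CA",
                        NOW + timedelta(days=365))
proxy_cert = Certificate("/O=Grid/CN=Alice/CN=proxy", "/O=Grid/CN=Alice",
                         NOW + timedelta(hours=12))

print(chain_is_trusted([proxy_cert, user_cert], {"Example CA"}, NOW))
```

The proxy is given a much shorter lifetime than the user certificate in this sketch, on the assumption that a delegated credential should be limited in how long it can be misused if it leaks.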
Information: Next, the user often needs information about the available Grid resources. The basic Globus component for this is historically called MDS (Monitoring and Discovery System), or simply information services. Users might access MDS to discover the status of the compute resources. Resource discovery is still very primitive and in the research domain, but the ideal is to be able to submit a job and have the system find the best resources for that job, based upon the job description and the resource descriptions across the whole Grid. A Grid portal often interacts with MDS and other information services to display the current state of remote resources. It is also possible to use Globus APIs and higher-level APIs within an application to make decisions based upon the availability of resources. The topic of Grid-enabling applications touches upon this, although its full treatment is beyond the scope of this book.
Executing a Job: Next, the user typically wants to submit a job. The basic Globus component for running a job is GRAM (Globus or Grid Resource Allocation Management). It may be necessary beforehand to transfer files to the resources, and afterwards to transfer files to other locations, including back to the user. The user might use the data management component called GridFTP for that. These activities are illustrated in the figure.
Figure: User employing Globus services and facilities.
It is important to note that Globus is a toolkit of components and not a complete solution for Grid computing infrastructure, nor was it ever intended to be. Other higher-level components are needed in a sophisticated Grid computing infrastructure. Issues not addressed in the basic Globus toolkit include account management, job scheduling, and advanced features of security across multiple domains. Job scheduling either uses existing local job schedulers such as Condor, or higher-level global Grid meta-schedulers working in concert with local schedulers.
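The idea of matching a job description against resource descriptions can be illustrated with a minimal sketch. It does not use MDS or any Globus API: the Resource and Job classes, their fields, and the "least loaded eligible resource" rule are all assumptions made for illustration, standing in for whatever a real meta-scheduler would apply.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Resource:
    name: str
    free_cpus: int   # processors currently available
    memory_gb: int   # memory available to a job
    load: float      # 0.0 (idle) to 1.0 (fully busy)


@dataclass(frozen=True)
class Job:
    cpus: int        # processors the job needs
    memory_gb: int   # memory the job needs


def best_resource(job: Job, resources: Iterable[Resource]) -> Optional[Resource]:
    """Pick the least loaded resource that satisfies the job's requirements."""
    eligible = [
        r for r in resources
        if r.free_cpus >= job.cpus and r.memory_gb >= job.memory_gb
    ]
    # min() with default=None returns None when no resource qualifies.
    return min(eligible, key=lambda r: r.load, default=None)


# Hypothetical resource descriptions, as an information service might report them.
grid = [
    Resource("cluster-a", free_cpus=8, memory_gb=32, load=0.9),
    Resource("cluster-b", free_cpus=2, memory_gb=16, load=0.2),
    Resource("cluster-c", free_cpus=16, memory_gb=64, load=0.5),
]

choice = best_resource(Job(cpus=4, memory_gb=16), grid)
print(choice.name if choice else "no suitable resource")
```

In this sketch cluster-b is the least loaded machine but has too few free processors, so it is filtered out before the load comparison is made.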
User Interfaces: Grid computing environments are mostly Linux-based and were originally accessed through the command line. Once we have established our security credentials, to run a job we might issue the GRAM command:
    globusrun-ws -submit -c prog1
where prog1 is the executable of the job. The executable needs to be present on the compute resource that is to execute it. This particular command does not specify the compute resource, and hence the computer executing the globusrun-ws command will execute the program prog1. If needed, files could be transferred to or from compute resources with a GridFTP command such as:
    globus-url-copy \
        gsiftp://www.coit-grid02.uncc.edu/~abw/proglout \
        file:///home/abw/
Note: A backslash (\) indicates that the command continues on the next line. If used in practice, it must immediately precede a newline character. See Appendix B for more details on the Linux command-line interface.
The first argument of globus-url-copy is the source location and the second argument is the destination location. In the above case, the file www.coit-grid02.uncc.edu/~abw/proglout is transferred to /home/abw/ on the local computer. (Files can also be transferred by specifying the file transfer in a job description language document.) As one can see, the command-line interface is a very primitive way of interacting with Grid resources. A more desirable way is to have a Web-based interface, called a Grid portal or gateway, or other graphical user interfaces.
The Grid portal used for the UNC-Charlotte/UNC-Wilmington course is based upon the GridSphere Grid portal toolkit. The login page is shown in the figure. This Grid portal is hosted on the server coit-grid02.uncc.edu:8080/gridsphere, which can be reached from anywhere on the Internet. GridSphere adheres to the JSR 168 portlet standard and can interface to the de facto standard Globus toolkit. It allows customized portlets to be created and deployed within the portal.
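Returning to the command-line examples earlier in this section, the two commands can also be assembled from a script, which is roughly what a portal or other front end has to do on the user's behalf. The sketch below only builds argument lists and does not run anything. The helper names gram_submit_command and gridftp_copy_command are hypothetical, and only the flags that appear in the text (-submit and -c) are used. Actually executing the commands would require a Globus installation and valid security credentials.

```python
import shlex
from urllib.parse import urlparse


def gram_submit_command(executable):
    """Argument list for submitting a job with GRAM, as in the text."""
    return ["globusrun-ws", "-submit", "-c", executable]


def gridftp_copy_command(source_url, destination_url):
    """Argument list for a GridFTP transfer: source first, then destination."""
    for url in (source_url, destination_url):
        # Both locations are given as URLs, e.g. gsiftp://... or file:///...
        if not urlparse(url).scheme:
            raise ValueError(f"not a URL: {url!r}")
    return ["globus-url-copy", source_url, destination_url]


# The file locations below are the ones used in the text.
submit = gram_submit_command("prog1")
fetch = gridftp_copy_command(
    "gsiftp://www.coit-grid02.uncc.edu/~abw/proglout",
    "file:///home/abw/",
)

# shlex.join quotes arguments where needed, giving a line safe to paste
# into a shell.
print(shlex.join(submit))
print(shlex.join(fetch))
```

Keeping the commands as lists rather than strings means they could be passed to subprocess.run without a shell, which avoids quoting problems with characters such as ~ in the URL.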
Portlets are software components with an associated display area within the portal. Customized portlets can be created as a front end to Grid-enabled applications. A tab on a GridSphere portal selects a window within which there can be one or more portlet areas. The layout of a portlet is defined using HTML and JSP (Java Server Pages) or similar technologies. Before users can log on, they need a user name and password for the portal. Before they do anything on a Grid platform, they must have user credentials and accounts on the resources they wish to access. In our course portal, the PURSe (Portal-based User Registration Service) portlet is incorporated into the portal to facilitate the user setup procedures. It can be reached by selecting the Register tab on the main course portal page. The figure shows the PURSe registration portlet once the Register tab is selected. The user then submits the required information (name, email address, institution, etc.). This information is forwarded to the Grid system administrator, who sets up accounts and credentials. A series of email exchanges with the user then confirms their intentions, as shown in the figure. Note that communication is also required with the system administrators of remote resources. It is difficult to automate the process fully, without communication between the user and administrators, because apart from the technical matters that need to be set up, approval is needed to use resources owned by others. A number of software projects and tools have focused on the very important matter of account management. However, account management is still often a human-centered process.
Figure: UNC-Charlotte/UNC-Wilmington Grid computing course portal (GridSphere).
Figure: PURSe registration portlet.
Figure: Registration activities.
In some Grid projects, it may be necessary to have a face-to-face meeting with a system administrator and present a photo ID to establish identity. Finally, once everything is in place, the user will be able to log in to the Grid portal and see a number of tabs across the top, which enable the user to perform many basic tasks. Depending upon the installed portlets, tabs typically would be for Grid information, proxy management, file management, job submission, Condor job submission, and others such as Sakai for virtual organization member communications.