Risk and Security Management for Distributed Supercomputing with Grids
Urpo Kaila <urpo.kaila@csc.fi>, Funet CERT & CSC
2006-09-22, 19th TF-CSIRT Meeting, Espoo, Finland
Agenda
- Grids and supercomputing
  - Some definitions
  - How do they work?
  - Examples of Grids
- Grids and Security
  - Risk management and Security domains
  - Creating baselines for Security
- Case M-grid revisited
  - Organisation and setup
  - Security Working Group
  - Risk analysis, Security Policy & Acceptable Use Policy
  - User Security Guide, Administrator Security Guide
- Grid Security and CSIRTs
  - Making Grid Security compatible
  - Incident handling
Some definitions
- Supercomputers: the most efficient systems worldwide at a given time for massively parallel processing of advanced research tasks
- Distributed computing: several interconnected computers share the computing tasks assigned to the system [IEEE]
- Cluster: similar efficient computers coupled closely together
- Grid computing: affordable high-performance distributed computing with interconnected clusters
- Moore's law as seen on the Top500 list
- Pentium 4 = ~2-4 GFlops
What is the Grid?
- A Grid according to Ian Foster (2002) in "What is the Grid? A Three Point Checklist":
  - Computing resources are not administered centrally.
  - Open standards are used.
  - Non-trivial quality of service is achieved.
- Different types of grids
  - Info-grid - WWW
  - Data-grid - Databases
  - Compu-grid - Computing
- A Grid must have:
  - Virtual organisations
  - Middleware
- Truly distributed
- Evolved from the computational needs of "big science"
How do they work?
  $ grid-proxy-init
  Your identity: /O=Grid/O=NorduGrid/OU=csc.fi/CN=Urpo Kaila
  Enter GRID pass phrase for this identity:
  $ ngsub -d 1 -f mygridjob.xrsl
The Role of Grid Middleware NorduGrid ARC Tutorial / Arto Teräs and Juha Lento 2005-09-20
Examples of Grids and Grid resources
- TeraGrid - Open scientific discovery infrastructure financed by the US National Science Foundation
- DEISA - Distributed European Infrastructure for Supercomputing Applications
- EGEE - Enabling Grids for E-sciencE
- LCG - LHC Computing Grid (CERN)
- e-irg - The e-Infrastructure Reflection Group
- NorduGrid - a Grid research and development collaboration
- The Globus Alliance - an international collaboration that conducts research and development to create fundamental Grid technologies
Grids and Security
Threats
- "WARNING! When working on the Grid, you must accept that some information on your jobs and on your Grid identity is made public. This includes your name, your affiliation, IP address of your client computer, job names and duration, used runtime environment names and other less sensitive information (see the Grid monitor for example)." (NorduGrid)
- What excites hackers? (A. Cormack, 2002)
  - High-profile targets to enhance their reputation
  - Powerful CPUs for password cracking etc.
  - Large disks to distribute illegal material
  - High bandwidth for denial-of-service attacks
Security matrix
                      Reactive Security    Proactive Security
Technical Security    Forensics            Firewalls, Cryptography, Patching vulnerabilities, IPS
Security Management   Incident handling    Security policies and guides, Training and awareness building
Risk and (proactive) security

Risk management (à la Wikipedia)
1.1 Establish the context
1.2 Identification
1.3 Assessment
1.4 Potential Risk Treatments
  1.4.1 Risk avoidance
  1.4.2 Risk reduction
  1.4.3 Risk retention
  1.4.4 Risk transfer
1.5 Create the plan
1.6 Implementation
1.7 Review of the plan, residual risks

Security Domains [à la (ISC)² CISSP CBK]
1. Access Control
2. Application Security
3. Business Continuity and Disaster Recovery Planning
4. Cryptography
5. Information Security and Risk Management
6. Legal, Regulations, Compliance and Investigations
7. Operations Security
8. Physical (Environmental) Security
9. Security Architecture and Design
10. Telecommunications and Network Security
What has already been done (examples)
- Joint Security Policy Group LCG/EGEE:
  - The LCG Security and Availability Policy
  - The Grid Acceptable Usage Policy
  - The Virtual Organisation Security Policy
- PlanetLab Acceptable Use Policy (AUP)
- E-Infrastructure Reflection Group (e-irg)
  - Authentication and authorisation policies
  - Usage policies
- etc.
Case M-grid revisited
M-Grid - Material Sciences National Grid Infrastructure in Finland
- Joint project between CSC, seven universities and the Helsinki Institute of Physics (HIP)
- Connected to the Nordic NorduGrid network, but access is currently limited to M-grid partners and CSC customers
- The systems are particularly suitable for high-throughput running of sequential and easily parallelised programs
- The theoretical computing capacity of the system is approximately 2.5 Tflops
- M-grid is based on HP ProLiant DL145, DL385 and DL585 servers equipped with 64-bit AMD Opteron processors (642 altogether)
The M-Grid Security Working Group
- Organisation
  - Started in January 2006; meets once a month, except in summer
  - Members: CSC staff, visiting experts and M-Grid administrators:
    - Juha Jäykkä (UTU)
    - Michael Gindonis, Kalle Happonen (HIP)
    - Ivan Degtyarenko (HUT)
    - Vera Hansper (JYU)
  - Reports to the M-Grid Administrators meeting
  - Collaboration via the HIP Wiki
- Tasks
  - Risk analysis
  - Creating a set of security policies and guidelines
  - Technical planning, implementation and supervision
  - Incident handling
The M-Grid Risk analysis 2006
- Over 50 threats identified and analysed!
- Risk = likelihood x impact
[Figure: risk matrix plotting likelihood against impact for threats grouped by source (Internal - Intentional, Internal - Accidental, External - Intentional, External - Accidental). Labels in the picture: Disaster, High, Medium, Problematic, Low, Mitigate, Residual. Picture by Vera Hansper]
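The formula on this slide, risk = likelihood x impact, can be sketched in a few lines of Python. This is only an illustration: the 1-5 scales, the threshold separating "mitigate" from "residual", and the three example threats are assumptions, not data from the M-grid risk analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str        # hypothetical threat description
    source: str      # source category as named on the slide
    likelihood: int  # assumed scale: 1 (rare) .. 5 (almost certain)
    impact: int      # assumed scale: 1 (low) .. 5 (disaster)

    @property
    def risk(self) -> int:
        # Risk = likelihood x impact, as stated on the slide
        return self.likelihood * self.impact


def treatment(threat: Threat, mitigate_at: int = 10) -> str:
    """Classify a threat; the threshold of 10 is an assumed cut-off."""
    return "mitigate" if threat.risk >= mitigate_at else "residual"


# Hypothetical entries, for illustration only
threats = [
    Threat("Stolen proxy certificate", "External - Intentional", 3, 4),
    Threat("Misconfigured firewall rule", "Internal - Accidental", 2, 3),
    Threat("Disk failure on frontend", "Internal - Accidental", 2, 2),
]

# List the threats from highest to lowest risk score
for t in sorted(threats, key=lambda t: t.risk, reverse=True):
    print(f"{t.risk:>2}  {treatment(t):<8}  {t.source:<22}  {t.name}")
```

Sorting by the product gives a simple priority list; a real analysis would also record who owns each threat and which treatment was chosen.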
M-grid Security Policy (Reviewed)
1. Introduction (scope, objectives)
2. Participants, roles and responsibilities
3. Physical security
4. User accounts and access control
   - Local accounts
   - Grid accounts
   - Virtual Organization management
   - Certificate Authorities
5. Network security
   - Network access and services
   - Additional services
   - Firewalls
4. Network security (contd.)
   - Firewalls
5. Operational security
   - Patches
   - Monitoring
6. Confidentiality and privacy
   - Grid users
   - Local users and administrators
7. Incident response
8. Compliance
   - Exceptions
9. Approval and review
10. Comments
M-grid Security Policy (examples)
- Accounts must be protected by a good password or other method providing equivalent security.
- Sites are allowed to create time-limited accounts for persons working in documented collaboration projects outside the site's organization.
- Sites may offer additional services which are open to a large user base, but these must be approved by the M-grid administration.
A node
- Sites must not offer any additional services running on the administration server without approval of the M-grid administration.
M-grid Acceptable Use Policy
- Short and intended for the user; the full security policy is to be read when needed
- Examples of content:
  - By using the M-grid resources you automatically agree to comply with this Acceptable Use Policy.
  - You must act in a responsible manner and must not cause harm to other users, to M-grid or to other systems.
  - You may not use M-grid for illegal activities.
  - The M-grid services and systems are intended for professional, academic research or education.
  - Your account is personal and may not be shared with other people.
Security Guides
- M-grid User Security Guide
  - A short technical howto
  - Example: Your proxy certificate is not protected by a password; therefore it should not be valid for longer than necessary, as proxy certificates can be easily renewed.
- M-grid Administrator Security Guide
  - A longer howto
  - Under construction
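The advice to keep proxy certificates short-lived can be illustrated with a small helper that picks a lifetime just long enough for a job. This is a sketch under assumptions: the function name, the 25 % safety margin and the 12-hour cap are invented for illustration, and the `-valid H:M` option of `grid-proxy-init` should be checked against the locally installed Globus version before relying on it.

```python
import math


def proxy_valid_arg(job_minutes: int, margin: float = 0.25, cap_hours: int = 12) -> str:
    """Return an H:MM lifetime covering the job plus a margin, never above the cap.

    All parameters are hypothetical defaults, not values from the M-grid guide.
    """
    if job_minutes <= 0:
        raise ValueError("job_minutes must be positive")
    wanted = math.ceil(job_minutes * (1 + margin))  # job time plus safety margin
    total = min(wanted, cap_hours * 60)             # never exceed the cap
    hours, minutes = divmod(total, 60)
    return f"{hours}:{minutes:02d}"


# Example: a job expected to run for 90 minutes
print("grid-proxy-init -valid", proxy_valid_arg(90))
```

If a job turns out to need longer than the cap, the proxy can be renewed, which is the point the user guide makes.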
Examples of Technical security tasks
Implemented and on the wish list:
- Firewall-rpm
- Log management and monitoring
- Integrity check
- Package signing
- Availability monitoring
- Automatic alerting
- Backup of frontend
- SSH key management
- Security audits
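As an illustration of the "Integrity check" item, the following sketch records SHA-256 digests of a set of files and later reports which ones have changed or disappeared. It is a minimal example using only the Python standard library; the function names and the demo file are hypothetical, and it says nothing about which tool M-grid actually uses.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths) -> dict:
    """Map each file path to its current digest."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(baseline: dict) -> list:
    """Return paths that are missing or whose digest differs from the baseline."""
    changed = []
    for path, digest in baseline.items():
        if not Path(path).is_file() or sha256_of(path) != digest:
            changed.append(path)
    return sorted(changed)


# Demo on a throwaway file (hypothetical content)
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "sshd_config"
    cfg.write_text("PermitRootLogin no\n")
    baseline = build_baseline([cfg])
    unchanged = changed_files(baseline)      # nothing modified yet
    cfg.write_text("PermitRootLogin yes\n")  # simulate tampering
    tampered = changed_files(baseline)

print(unchanged, len(tampered))
```

For such a check to be meaningful, the baseline itself has to be stored where an intruder on the monitored host cannot rewrite it.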
Grid Security and CSIRTs
Making Grid Security compatible
- Grids tend to interconnect, so we need compatible security
- Complex new technologies and fuzzy virtual organisations in our hosts and networks
- International cooperation needed
  - Technical level
  - Management level
  - Reactive security - Proactive security
- The risks haven't materialized yet
Grid Incident handling
- Existing CSIRTs should be used as professional incident handling hubs
- Constant and proactive knowledge transfer is needed between Grid administration, CSIRTs and site administrators
- The M-Grid Security Policy already contains a paragraph on this:
  "The administrator, in consultation with CSC, should also inform Funet CERT (cert@certdontspam.funet.fi, tel. +358-9-4572038) if the incident affects other M-grid sites"
Finally - Finnish security terminology :)
  Information               tieto
  Security                  turvallisuus
  Incident                  poikkeama
  Many incidents            poikkeamia
  The interrogative form    ~ko
  Also                      ~kin
  Have there been           oliko
  Have there also been any security incidents?
                            Oliko tietoturvapoikkeamiakinko?