Software, hardware and personnel requirements

Author(s): Bartosz Oudekerk and Ashley Chacuto
Version: 1.0
Status: Final
The Hague, 08-09-2009
Document information

Colophon
Author(s): Bartosz Oudekerk and Ashley Chacuto
Status: Final
Project: Toolbox PSC
Date: 08-09-2009
Version: 1.0
Title: Software, hardware and personnel requirements

History
Date | Version | Changes | Status | Processed by
02-09-09 | 0.3 | Order of paragraphs | draft | Stephanie
04-09-09 | 0.4 | Translated into English | draft | Josee
07-09-09 | 0.4 | Added missing pictures | draft | Stephanie

Distribution list
Date | Distribution | Version
02-09-09 | Lara van Riet, Stephanie Solingen | 0.3

Approval
Date | Name | Version
08-09-09 | Ashley Chacuto, Barry van de Graaf | final
Preface

This document describes the Toolbox PSC architecture. It names the components that make up the Toolbox, describes how they interact, and lists their requirements and the possibilities for scaling up or out.
Table of contents

1 Antwoord voor Bedrijven
  1.1 Components
    1.1.1 Hippo 1.2.x
    1.1.2 Hippo CMS 6.05.x
    1.1.3 Front-end
  1.2 Requirements
    1.2.1 Software requirements
    1.2.2 Hardware requirements
    1.2.3 Personnel requirements
  1.3 Scalability
    1.3.1 Scaling up
    1.3.2 Scaling out
  1.4 References
2 Cooperating Catalogues Import Tool
  2.1 Components
    2.1.1 Cooperating Catalogues Import Tool
  2.2 Requirements
    2.2.1 Software requirements
    2.2.2 Hardware requirements
    2.2.3 Personnel requirements
3 Message box
  3.1 Components
    3.1.1 Web application
  3.2 Requirements
    3.2.1 Software requirements
    3.2.2 Hardware requirements
    3.2.3 Personnel requirements
1 Antwoord voor Bedrijven

1.1 Components

Antwoord voor Bedrijven (Avb) has a component-based architecture. This part of the Toolbox consists of three separate components, allowing for a flexible, scalable architecture: Hippo, Hippo CMS and the front-end. They are described below.

1.1.1 Hippo 1.2.x

Hippo is based on Apache Slide. It stores and delivers content exposed through WebDAV [1]. It provides fast storage and retrieval of content, binary and meta information, users, permissions and workflow. Avb uses LDAP [2] to manage its users and groups. Hippo provides a transparent access layer over different content stores and has connectors for a wide range of databases, for example MySQL, MS SQL, PostgreSQL and Oracle. Avb has chosen MySQL.

Searching the repository

The repository is searched using DASL [3] queries as described in the standard DASL specification. An indexing engine based on Apache Lucene is built in. XML documents are automatically indexed, as are PDF, Word and several other proprietary formats. Slide supports custom extractors [4] that also allow indexing of images or any other kind of data, leaving as much room for extension as possible. Hippo sends out JMS messages upon changes, in order to facilitate event-based caching [5] in applications.

1.1.2 Hippo CMS 6.05.x

Hippo CMS is an extensible Hippo Cocoon application that communicates with a Hippo repository over the WebDAV protocol to manage the content independently of the front-end.

[1] http://www.webdav.org/
[2] http://wiki.onehippo.com/display/cms/5.+hippo++configure+ldap+authentication+and+authorization
[3] http://www.webdav.org/dasl/ and http://wiki.onehippo.com/display/cms/06.+using+dasl+queries
[4] http://wiki.onehippo.com/display/cms/4.+hippo++configure+extractors
[5] http://wiki.onehippo.com/display/cms/using+event+caching
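To make the search mechanism described in 1.1.1 concrete, a minimal DASL basicsearch request body could look like the following sketch. The collection path /slide/files, the returned properties and the search term are purely illustrative, not the actual Avb configuration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Body of a WebDAV SEARCH request against the Slide-based repository -->
<D:searchrequest xmlns:D="DAV:">
  <D:basicsearch>
    <!-- Properties to return for each matching resource -->
    <D:select>
      <D:prop>
        <D:displayname/>
        <D:getlastmodified/>
      </D:prop>
    </D:select>
    <!-- Where to search: one collection, searched recursively -->
    <D:from>
      <D:scope>
        <D:href>/slide/files</D:href>
        <D:depth>infinity</D:depth>
      </D:scope>
    </D:from>
    <!-- Full-text condition, answered from the built-in Lucene index -->
    <D:where>
      <D:contains>toolbox</D:contains>
    </D:where>
  </D:basicsearch>
</D:searchrequest>
```

Such a body is sent with the WebDAV SEARCH method; the response is a multi-status document listing the matching resources.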
1.1.3 Front-end

The front-end itself is a Hippo Cocoon application with an event-based cache. Depending on the domain used in the request, either the published (www) or the unpublished (preview) content is shown.

1.2 Requirements

The following paragraphs list Avb's software, hardware and personnel requirements.

1.2.1 Software requirements

OS: Linux
JVM: a Sun JVM, version 1.6
Container: Tomcat 6.x
RDBMS: MySQL
Other: LDAP (OpenLDAP)
Web server: Apache HTTPD

1.2.2 Hardware requirements

For a basic set-up we suggest one server for the application and one server for the back end (database). It is possible to deploy the application and database on the same server, but this is not recommended.

Single-core Intel Xeon @ 2.5 GHz
1024 MB RAM
17 GB disk
Single Ethernet adapter

Actual requirements may differ depending on the size of the deployment.

1.2.3 Personnel requirements

Senior system administrator
Medium-level Cocoon developer
Authors (for constructing the Meta-Feed index)
Editors (for producing the content in the CMS)

1.3 Scalability

Below we make some suggestions and give examples regarding scalability. This pertains to the Avb part of the project.
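The domain-based choice between www and preview content described in 1.1.3 is typically arranged in the Apache HTTPD layer in front of the application. A hypothetical sketch; the hostnames, port and context paths are assumptions, not the actual Avb configuration:

```apache
# Published (www) content: route the public domain to the live pipeline
<VirtualHost *:80>
    ServerName www.example.org
    ProxyPass        / http://localhost:8080/site/
    ProxyPassReverse / http://localhost:8080/site/
</VirtualHost>

# Unpublished (preview) content: a separate domain for editors
<VirtualHost *:80>
    ServerName preview.example.org
    ProxyPass        / http://localhost:8080/preview/
    ProxyPassReverse / http://localhost:8080/preview/
</VirtualHost>
```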
1.3.1 Scaling up

When scaling up by adding more memory, Cocoon's internal memory management (for both the CMS and the front-end) needs to be configured to use the extra memory efficiently. For both the front-end and the CMS, adjust in WEB-INF/cocoon.xconf:

<eventaware-store logger="core.eventaware-store">
  <parameter name="maxobjects" value="15000"/>
</eventaware-store>

maxobjects should be 15000 for every 256 MB of memory (so 30000 for 512 MB). The freememory and heap size of the store janitor also need to be updated; the comment above that section in WEB-INF/cocoon.xconf explains how.

For the front-end (not the CMS!), the size of the stores should also be changed. WEB-INF/cocoon.xconf contains three stores: store, store-repository-doc and store-repository-binary. The options for their size are small, medium, large or huge: small for sites running below 128 MB, medium below 256 MB, large below 512 MB, and huge for even more than that.

The default Avb setup is:

[Diagram: default Avb set-up with Apache in front of Hippo and the CMS]

Depending on which component is the bottleneck, different solutions can be chosen.

Front-end and CMS

The front-end and CMS are most easily scaled out by load balancing several deployments. Note that all CMSs deployed in this fashion must share a single workflow. Example:
[Diagram: load balancer in front of two Apache instances]

For the repository two elements are important:

replication
clustering

Replication

Replication [6] can be used to create a master-slave set-up for all or part of the content. Thus, a front-end that impacts the CMS by placing a heavy load on the repository can be reconfigured to read from a slave. Alternatively, you can place the website and the published (live) content in a DMZ, with the CMS on an internal LAN. Example:

[Diagram: replicated set-up with Hippo and the CMS]

[6] http://wiki.onehippo.com/display/cms/6.+hippo++configure+replication
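The load-balanced deployments sketched in the diagrams above can be expressed with Apache HTTPD's mod_proxy_balancer. A hypothetical fragment; the member hostnames, ports, routes and context path are assumptions, not the actual Avb configuration:

```apache
# Distribute requests across two identical application deployments
<Proxy balancer://avb>
    BalancerMember http://app1.internal:8080 route=node1
    BalancerMember http://app2.internal:8080 route=node2
</Proxy>

# Keep a user's session on one Tomcat instance
# (the routes must match the jvmRoute setting of each Tomcat)
ProxyPass /site balancer://avb/site stickysession=JSESSIONID
```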
Clustering

Any number of Hippo repositories can be clustered, provided they all access a single database. Hippo repository clustering can be configured [7] using either JGroups or JMS. Example:

[Diagram: load balancer in front of two Apache instances and the CMS]

1.3.2 Scaling out

Since the components are relatively independent of each other, there are numerous ways to scale out, ranging from simply adding dedicated servers for some components to replicating between clusters of repositories.

1.4 References

Hippo CMS 6.x, Hippo 1.2.x and Hippo Cocoon: http://wiki.onehippo.com/display/cms

[7] http://wiki.onehippo.com/display/cms/7.+hippo++configure+clustering
2 Cooperating Catalogues Import Tool

2.1 Components

Before specifying the hardware and software requirements, we give a brief explanation of the Cooperating Catalogues Import Tool.

2.1.1 Cooperating Catalogues Import Tool

The Cooperating Catalogues Import Tool (CCET) is an application built with Spring and Wicket on top of Apache Jackrabbit. It implements a standard to which the Avb site connects. With it, municipalities can log in to the application and specify the location of their data.

2.2 Requirements

The following paragraphs list the CCET's software, hardware and personnel requirements.

2.2.1 Software requirements

OS: Linux
JVM: a Sun JVM, version 1.6
Container: Tomcat 6.x
RDBMS: MySQL

2.2.2 Hardware requirements

For a basic set-up, we suggest one server for the application and one server for the back end (database).

Single-core Intel Xeon @ 2.5 GHz
1024 MB RAM
17 GB disk
Single Ethernet adapter

2.2.3 Personnel requirements

System administrator
Medium-level Java developer

3 Message box
3.1 Components

Before specifying the hardware and software requirements, we give an overview and brief explanation of the message box.

3.1.1 Web application

The web application is based on the Spring and Wicket frameworks and stores messages on the database server. It can be deployed on a server running Tomcat.

3.2 Requirements

3.2.1 Software requirements

OS: Linux
JVM: a Sun JVM, version 1.6
Container: Tomcat 6.x
RDBMS: MySQL

Furthermore, the following services are required:

SMTP service
This service notifies the client of a new message via e-mail.

SMS service
This service notifies the client of a new message via SMS.

File storage service
The file storage is used for storing message attachments.

Virus scanning service
The open-source virus scanner ClamAV is used for scanning attachments uploaded by the user.

Authentication service
This is a pluggable component providing the login and authentication mechanism for each user. This means that the host of the Message box can choose to:
1. use the Toolbox-provided authentication service, or
2. use an authentication service that is already implemented.

The Toolbox-provided authentication is based on the ASelectFilter. This filter is delivered by the web authentication (in Dutch: eHerkenning) project. The ASelectFilter implements the authentication mechanism: it checks the user's credentials, and when these are not (or no longer) valid, it redirects the user to an external authentication
application. When the user is properly authenticated, he or she is redirected back to the message box application.

3.2.2 Hardware requirements

Single-core Intel Xeon @ 2.5 GHz
1024 MB RAM
17 GB disk
Single Ethernet adapter

3.2.3 Personnel requirements

System administrator
Medium/senior-level Java developer