IBM Cúram Social Program Management

Determining dependencies in Cúram data
In support of data archiving and purging requirements

Document version 1.0

Paddy Fagan, Chief Architect, IBM Cúram Platform Group
PaddyFagan@ie.ibm.com

Paddy Fagan is the chief architect of the IBM Cúram platform group. Paddy has architectural responsibility for all the projects in the Cúram platform. He has worked on the Cúram product since its inception 15 years ago and has been involved in almost all aspects of the development of the Cúram product.

Contents

Introduction
Using the tool
    Prerequisites
    Installation
    Configuration
    Run
    Analyze output
    Troubleshoot
Configuring
    How the tool works
    Inputs used by the tool
    Providing additional metadata
    Sample XML
Analyzing the output
    How to limit analysis
    Further analysis
Resources
Notices
Trademarks

Introduction

To remove data from a Cúram application, the first step is to identify all of the data that may need to be removed to support the removal of a top-level row (e.g. Person). This article includes a tool that can automate this process and provide a starting point for developing the business logic required to implement data removal.

The tool, called the Foreign Key Walker, produces a table (as a comma-separated values, or CSV, text file) listing all the entities that may contain rows related to the top-level row. Each row in this table lists the entity, the other entity it is related to, and the fields (on each entity) that form the basis of this relationship, along with the mandatory indicator and the cardinality for the fields in this relationship.

For central entities in the Cúram application (like Person), the tool will produce a long list of tables. This article includes some tips on how to limit the amount of analysis required. However, it is important to note that the tool provided with this article is just a starting point for further analysis.

The tool is shipped with a starter set of metadata content based on the 6.0.5 release and driven by some local analysis; this content is provided AS IS. See Providing additional metadata for more details.

This article first describes how to use the tool, then provides further details on the configuration options available and some initial guidance on analyzing the output.

Using the tool

The tool is attached to this article. The Foreign Key Walker tool requires access to files created during the model extract and database build steps of the Cúram application build, so it must be run where the files from a Cúram Development install are available.

Prerequisites

1. Make sure that your Cúram server and database builds have completed.
2. Ensure that the CURAM_DIR and CURAMSDEJ environment variables are set correctly.

Installation

Unzip the tool to a convenient location.
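
Before running the tool, the second prerequisite above can be checked with a short script. The following is a minimal sketch in Python, not part of the tool: the check_prerequisites function is a hypothetical helper, and it only confirms that CURAM_DIR and CURAMSDEJ are set and point to existing directories. It does not verify that the server and database builds have completed.

```python
import os
from pathlib import Path

# Environment variables named in the Prerequisites section.
REQUIRED_VARS = ("CURAM_DIR", "CURAMSDEJ")


def check_prerequisites(env=None):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration only. It checks that each required
    variable is set and refers to an existing directory.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name} points to a missing directory: {value}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Running the script prints one line per problem and prints nothing when both variables refer to existing directories.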

Configuration

The tool will run without any further configuration. However, you can configure it to limit the scope of the analysis or to match your Cúram deployment. Details of how to configure the tool are included in the Configuring section.

Run

1. Open a command prompt.
2. From the directory where the tool was unzipped, run the tool with the following command: ant -Drootentity=<entity name>, e.g. ant -Drootentity=Person.
3. A subdirectory called output is created, containing a file called ForeignKeyMap_<entity name>.csv.

Analyze output

The analysis of the output of the tool is covered in detail in the Analyzing the output section.

Troubleshoot

The tool relies on datamanager_config.xml (located in EJBServer/project/config) and the database schema files generated by the Cúram build. If the tool reports an error, the most likely cause is that these files are missing or invalid; building the database for the application will highlight or address most issues. If the configuration files for the tool have been modified (see Providing additional metadata), then invalid syntax in these files can also trigger errors. The reported errors will highlight where there is a problem.

Configuring

The configuration of the tool centers on options to do two key things:

1. Limit the relationships processed by the tool, using admin and firewall entities.
2. Capture additional relationships that are not captured in the Cúram application development environment, using generic links and missing foreign keys.

To make effective use of these configuration options, it is also important to understand how the tool works.

How the tool works

The tool works by building up a model of all the relationships between entities in the Cúram application database. Starting from the root entity, it then navigates out along the relationships, adding each related entity to the output list. It will stop processing at a given relationship if:

- The relationship (both entity and field) is already in the output list.
- The relationship is to an entity that is in the firewall list.
- The relationship is to an entity that is flagged as an administration entity.

The tool exits when there are no further relationships to be processed.
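
The traversal described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's actual implementation. The Relationship type and the walk function are hypothetical names. The sketch also assumes that relationships are followed in both directions, and that relationships to firewall or administration entities are neither recorded nor followed.

```python
from collections import deque
from typing import NamedTuple


class Relationship(NamedTuple):
    """One foreign key style link between two entities (hypothetical type)."""
    from_entity: str
    from_field: str
    to_entity: str
    to_field: str


def walk(root, relationships, firewall=(), admin=()):
    """Return the relationships reachable from root, in discovery order."""
    blocked = set(firewall) | set(admin)

    # Index every relationship under both of its entities so it can be
    # followed in either direction (an assumption of this sketch).
    by_entity = {}
    for rel in relationships:
        by_entity.setdefault(rel.from_entity, []).append(rel)
        by_entity.setdefault(rel.to_entity, []).append(rel)

    output = []          # relationships in the order they were found
    seen = set()         # relationships already in the output list
    visited = {root}     # entities already queued for processing
    queue = deque([root])

    while queue:         # exits when no relationships remain to process
        entity = queue.popleft()
        for rel in by_entity.get(entity, []):
            other = rel.to_entity if rel.from_entity == entity else rel.from_entity
            # Stop at this relationship if it is already recorded, or if it
            # leads to a firewall or administration entity.
            if rel in seen or other in blocked:
                continue
            seen.add(rel)
            output.append(rel)
            if other not in visited:
                visited.add(other)
                queue.append(other)
    return output
```

Called as walk("Person", relationships, firewall=["ConcernRole"]), the sketch would not proceed past ConcernRole, which mirrors how a firewall entity limits the analysis.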

Inputs used by the tool

The tool uses three key sources of data:

- The Cúram development configuration:
  o The datamanager_config.properties file is used to find the entity definitions, foreign keys and unique constraints that are part of the Cúram model. Note that the tool extracts only the modeled content; any content added via hand-crafted SQL files (included in datamanager_config.properties) is ignored by the tool.
- Metadata packaged with the tool:
  o The AdminTables.txt file: a text document with a list of administration/configuration tables that are excluded from the analysis.
  o The GenericEntity_ForeignKeys.xml file: an XML document, following the format of the generated <component>_foreignkeys.xml files, which lists known generic link entities in the OOTB content and their potential target entities. This allows these relationships to be included in the analysis.
  o The Missing_ForeignKeys.xml file: an XML document, following the format of the generated <component>_foreignkeys.xml files, which lists foreign keys known to be absent from the OOTB content. This allows these relationships to be included in the analysis.
- Command line parameters:
  o The tool also allows one or more firewall entities to be specified; the analysis stops at these entities. This can be useful where it is known in advance that certain related information will not be removed. For example, Participants on a case should not be removed when removing the case. To use this feature, pass an additional parameter to the tool at the command line: -Dfirewallentities=<comma separated list of entities>, e.g. -Dfirewallentities=CaseParticipantRole,ConcernRole

Providing additional metadata

The metadata shipped with the tool is a starter set based on the Cúram 6.0.5.0 release and driven by some analysis conducted when the tooling was developed. It is provided AS IS. It may be necessary to update this metadata to match custom development or OOTB content that was not included in the analysis when the tool was developed.

The metadata used by the tool is located in the config directory under the tool install location.
This directory contains three files:

- AdminTables.txt: a text file listing administration entities, one per line. Entries can be added to this file as needed. The ordering of entries in this file is not significant.
- GenericEntity_ForeignKeys.xml: an XML document, following the format of the generated <component>_foreignkeys.xml files. This lists entities that use a style of generic link, normally a combination of an ID field and a type field, to allow references to a number of entities to be stored in shared fields. The supplied content lists the OOTB relationships. These may need to be extended where additional relationships are stored in the existing fields or where new fields of this type have been added. Entries can be added to this file as needed. The ordering of entries in this file is not significant.
- Missing_ForeignKeys.xml: an XML document, following the format of the generated <component>_foreignkeys.xml files. The supplied content lists known
relationships not captured in the Cúram model. Entries can be added to this file as needed. The ordering of entries in this file is not significant.

The format for adding foreign key entries is common to the two XML files listed above. An example is included below for reference:

Sample XML

    <key>
      <association constraintname="add_fk" tablename="intake" othertablename="caseheader">
        <foreignkeypair localfield="intakecaseid" remotefield="caseid" />
      </association>
    </key>

Analyzing the output

Each row in the output lists the entity, the other entity it is related to, and the fields (on each entity) that form the basis of this relationship, along with the mandatory indicator and the cardinality for the fields in this relationship (see Table 1 below). The primary aim of this information is to support the analysis needed to determine whether this data must be removed when rows from the root entity are removed.

Table 1. Sample Output

From Entity Name                To Entity Name   From Entity Field  Mand   Ord   To Entity Field    Mand   Ord
------------------------------  ---------------  -----------------  -----  ----  -----------------  -----  ----
Person                                                              FALSE  MANY                     FALSE  MANY
Person                          ConcernRole      concernroleid      TRUE   ONE   concernroleid      TRUE   ONE
AbsencePeriod                   ConcernRole      concernroleid      FALSE  MANY  concernroleid      TRUE   ONE
AbsencePeriodCorrectionHistory  AbsencePeriod    absenceperiodid    FALSE  MANY  absenceperiodid    TRUE   ONE
AbsencePeriodCorrectionHistory  PRLICorrection   prlicorrectionid   TRUE   MANY  prlicorrectionid   TRUE   ONE
AbsencePeriodCorrection         PRLICorrection   prlicorrectionid   TRUE   MANY  prlicorrectionid   TRUE   ONE
AbsencePeriodCorrection         AbsencePeriod    absenceperiodid    FALSE  MANY  absenceperiodid    TRUE   ONE
AbsencePeriodHistory            AbsencePeriod    absenceperiodid    TRUE   MANY  absenceperiodid    TRUE   ONE
AbsencePeriod                   RosterLineItem   rosterlineitemid   FALSE  MANY  rosterlineitemid   TRUE   ONE
DailyAttendance                 RosterLineItem   rosterlineitemid   FALSE  MANY  rosterlineitemid   TRUE   ONE
DailyAttendanceCorrection       DailyAttendance  dailyattendanceid  FALSE  MANY  dailyattendanceid  TRUE   ONE
DailyAttendanceCorrection       PRLICorrection   prlicorrectionid   TRUE   MANY  prlicorrectionid   TRUE   ONE
PRLICorrectionClient            PRLICorrection   prlicorrectionid   TRUE   MANY  prlicorrectionid   TRUE   ONE
PRLICorrectionHistory           PRLICorrection   prlicorrectionid   TRUE   MANY  prlicorrectionid   TRUE   ONE

How to limit analysis

For most central entities this tool will produce a very long list of related entities, so a key activity is to limit the analysis required. There are a number of suggested techniques for this:

- Using firewalls: firewalls limit the analysis by causing the tool to stop processing relationships when it encounters one of the listed entities. These are specified as a command line option when running the tool; see Inputs used by the tool.
- Using row counts from your database: if you have a production (or similar) database, extracting the row counts for all the tables may help to identify the relationships that are unused in your Cúram deployment. Any unused tables can be excluded from the
analysis, but it is important to distinguish between tables that are unused in the current dataset and truly unused tables.
- Excluding administration data: it is critical to ensure that no administration/configuration data is removed as part of any processing. To support this, the tool has a built-in list of administration entities, and these will be excluded from the output of the tool. The section Providing additional metadata describes how to extend this list if required.

Further analysis

The output of the tool is a starting point for your analysis. The next important step is to identify any possible shared rows, where a row in a table may also be referenced from another location that will not be removed as part of your processing. For example, where a case is being removed, some payments may be linked (via deductions) to the overpayment on another case. Here a decision needs to be made about where this relationship will be broken when the record is removed, and whether any additional information needs to be recorded to bridge the gap created by the removal of the record.

Resources

Cúram Information Centre - http://pic.dhe.ibm.com/infocenter/curam/6.0.5/index.jsp
Cúram product page - http://www-03.ibm.com/software/products/id/en/socialprogram-management-platform/

Notices

This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information about the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement might not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this document to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 78758
U.S.A.

Such information may be available, subject to appropriate terms and conditions, including in some cases payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information is for planning purposes only. The information herein is subject to change before the products described become available.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

If you are viewing this information in softcopy form, the photographs and color illustrations might not appear.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.