Images and Oracle Database 11g
BioGrid Australia - Health Through Information
PRANABH JAIN and NAOMI RAFAEL
Presented by Susan Mavris, Oracle Multimedia

Agenda
- Purpose and Description of BioGrid
- Oracle Database 11g Advantages
- Oracle Database 11g Examples and Details
- Best Practices from the Oracle Multimedia Team
- References

If you wanted to compare the volume and shape of the brains of individual epilepsy patients, where would you start?

What is BioGrid Australia?
- The platform and infrastructure that provides researchers access to data
  - in many disease types
  - from disparate existing databases
  - at many institutions
  - with privacy and intellectual property protection
  - co-located in a virtual repository
  - linked with public research & genetic profiling data
- Provides a flexible and secure method for authorized researchers to interrogate the multiple data sources

What is the Australian Cancer Grid?
- The flagship of the expansion of the BioGrid Australia platform
- Focus: provide cancer researchers with data on clinical and surgical outcomes, as well as access to tumour biospecimens
- The depth and breadth of the data will provide a huge resource, initially covering the following tumour streams: Colorectal, Brain, Breast, Lung, Sarcoma, Gynaecology, Prostate, Head & Neck, Upper Gastro-intestinal, Melanoma, Renal.

BioGrid Australia Vision
- Facilitate multi-disciplinary medical research
- Leverage research collaboration
- Link heterogeneous data from multiple institutions
- Confer value, retain and re-use health data
- Enforce system security
- Respect patient privacy
- Select pragmatic technology

Federating the Data in BioGrid
The Images Sub-project
- Take 7 million proprietary Magnetic Resonance Images (MRIs) held on over 1000 DAT format tapes (out-of-date, inaccessible media)
- Convert to Digital Imaging and Communications in Medicine (DICOM) format
- Store and index images on-line
- Extract DICOM header information
- Link into BioGrid Australia and issue record linking ID
- Retrieve de-identified images on demand
- Be economical and sustainable

BioGrid Architecture
Database Architecture
- Windows Server 2003 (32bit)
- Moving to Windows Server 2003 (64bit)
- Oracle Database 11g Release 1 Version 11.1.0.6
- Single Instance Database
- 957GB Storage comprised of 8 drives in a RAID0 configuration
- Separate physical drive contains flashback recovery area

Oracle Database 11g: Why Chosen (1)
- Oracle Database 11g stores the images on line
- The images can be retrieved on demand
- Oracle Database 11g indexes and partitions for fast query
- Oracle Multimedia 11g has a dedicated DICOM data type with rich feature set
- SQL*Loader can be tuned for fast image load

Oracle Database 11g: Why Chosen (2)
- Security features are available
- Compression is available at the LOB level, on backup, and on Data Pump export
- Application Express is available for rapid application development
- Oracle Multimedia is a no-cost feature of Oracle Database
- Oracle Multimedia DICOM is a no-cost feature of Oracle Database Enterprise Edition

Oracle Database 11g: Oracle Multimedia DICOM Features Used
- Metadata extraction
- Selection and viewing of DICOM attributes
- Conversion of DICOM into other image formats (e.g. JPEG, GIF, PNG, and TIFF)
- Making DICOM data anonymous
- Importing and exporting images on other servers using mapped drives

Oracle Database 11g Examples and Details
- Use ORDDICOM object type in create table
- Use setproperties to extract metadata into ORDDICOM object
- Select and view DICOM attributes
- Create view for patient details
- Convert image from DICOM to JPEG and make anonymous
- Import and export images
- Use of SecureFiles compression for DICOM objects
- Compression in backup using RMAN
- Maximize SQL*Loader performance
- Compression and parallelisation with Data Pump export

Example: Use ORDDICOM Object Type in Create Table

    create table medical_image_table (
      id      varchar(50),
      TAPE_ID number,
      dicom   orddicom,
      USI     varchar(50)
    )
    LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
    PARTITION BY range (TAPE_ID) (
      PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
    );

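Example: Use setproperties to Extract Metadata into ORDDICOM Object

    -- Set Data Model Repository. This procedure must be called at the
    -- beginning of each database session.
    execute ordsys.ord_dicom.setdatamodel();

    declare
      obj orddicom;
    begin
      -- lock the row and fetch the ORDDICOM object
      select dicom into obj from medical_image_table
        where id = 'E11200S001I001.dcm' for update;
      -- extract the DICOM metadata into the object attributes
      obj.setproperties;
      -- write the populated object back to the row
      update medical_image_table set dicom = obj
        where id = 'E11200S001I001.dcm';
      commit;
    end;
    /
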
Example: Select and View DICOM Attributes

    select t.dicom.getattributebytag('00200010') as STUDY_ID,
           t.dicom.getattributebytag('00100010') as PATIENT_NAME,
           t.dicom.getattributebytag('00100020') as PATIENT_ID,
           TO_DATE(t.dicom.getattributebytag('00100030'), 'YYYY-DD-MM') as PATIENT_DOB
    from medical_image_table t
    where t.id = 'E11200S001I001.dcm';

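The eight-character strings passed to getattributebytag are DICOM tags: four hexadecimal digits for the group followed by four for the element. The following is a minimal Python sketch of that encoding, not part of the BioGrid system or of Oracle Multimedia. The helper names split_tag and describe and the TAG_NAMES table are illustrative assumptions; the attribute names come from the DICOM data dictionary.

```python
# Illustrative only: decode the 8-hex-digit DICOM tag strings used in the slides.
# split_tag, describe and TAG_NAMES are hypothetical helpers, not Oracle APIs.

TAG_NAMES = {
    (0x0008, 0x0030): "Study Time",
    (0x0010, 0x0010): "Patient's Name",
    (0x0010, 0x0020): "Patient ID",
    (0x0010, 0x0030): "Patient's Birth Date",
    (0x0020, 0x0010): "Study ID",
}


def split_tag(tag: str) -> tuple:
    """Split a tag such as '00100030' into (group, element) integers."""
    if len(tag) != 8:
        raise ValueError(f"expected 8 hex digits, got {tag!r}")
    # int(..., 16) raises ValueError if a half is not valid hexadecimal
    return int(tag[:4], 16), int(tag[4:], 16)


def describe(tag: str) -> str:
    """Render a tag in the conventional (gggg,eeee) form with its name."""
    group, element = split_tag(tag)
    name = TAG_NAMES.get((group, element), "unknown")
    return f"({group:04X},{element:04X}) {name}"


if __name__ == "__main__":
    for t in ("00200010", "00100010", "00100020", "00100030", "00080030"):
        print(t, "->", describe(t))
```

A tag string with a stray space or a missing digit fails the length check, which is a cheap way to catch transcription errors before a tag reaches a query.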
Example: Create View for Patient Details

    create or replace view patient_details as
    select t.id, t.tape_id, t.usi,
           t.dicom.getattributebytag('00080030') as STUDY_TIME
    from medical_image_table t;

Note: before querying the view, execute ordsys.ord_dicom.setdatamodel() in the session to load the data model repository, which is needed to get attributes by tag.

Example: Convert Image from DICOM to JPEG and Make Anonymous

    declare
      dcm ordsys.orddicom;
    begin
      ord_dicom.setdatamodel;
      for rec in (select * from medical_image_table for update) loop
        rec.dicom.setproperties();
        -- create a JPEG thumbnail
        rec.dicom.processcopy('fileformat=jpeg fixedscale=75,100', rec.imagethumb);
        -- make a new anonymous version of the ORDDicom object
        rec.dicom.makeanonymous(genuid(rec.id), rec.anondicom);
        -- write the objects back to the row..
      end loop;
      commit;
    end;
    /

Note: this example assumes that the table also has imagethumb and anondicom columns and that a genuid function is available; none of these appear in the create table example.

Example: Import and Export Images

    CONNECT / AS SYSDBA

    -- Directory IMAGEDIR for export/import DICOM
    create or replace directory imagedir as 'O:\ORACLE_DICOM_IMAGES';
    grant read,write on directory IMAGEDIR to Administrator;

    -- import() method can be used to import (where
    -- ORDDICOM source attributes contain FILE,
    -- IMAGEDIR, and filename)
    dcm.import();

    -- export() method can be used to export
    dcmsrc.export('file', 'IMAGEDIR', filename);

Example: Use of SecureFiles Compression for DICOM Objects

Use SecureFiles compression on LOB columns:

    create table medical_image_table (
      id      varchar(50),
      TAPE_ID number,
      dicom   orddicom,
      USI     varchar(50)
    )
    LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
    PARTITION BY range (TAPE_ID) (
      PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
    );

Impact of Compression
- DICOM images are stored as SECUREFILE (COMPRESS HIGH), the highest compression level
- DICOM image size on file system: 1.48 TB = 1515.52 GB
- Database size: 820 GB, of which 3-4 GB is other tables, leaving about 816 GB of image data
- Overall compression: (1515 - 816) / 1515 = approx. 46%

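The compression figure above can be reproduced with a few lines of arithmetic. This is a minimal Python sketch using only the numbers quoted on the slide. The function name space_saving is an illustrative assumption, and the 4 GB allowance for other tables is the upper end of the quoted 3-4 GB range.

```python
# Worked check of the SecureFiles compression figures quoted on the slide.
# space_saving is a hypothetical helper name used only for this illustration.

GB_PER_TB = 1024


def space_saving(original_gb: float, compressed_gb: float) -> float:
    """Fraction of the original size saved by compression."""
    return (original_gb - compressed_gb) / original_gb


file_system_gb = 1.48 * GB_PER_TB  # DICOM files on the file system
database_gb = 820                  # total database size
other_tables_gb = 4                # assumption: upper end of the 3-4 GB range

image_gb = database_gb - other_tables_gb
saving = space_saving(file_system_gb, image_gb)

if __name__ == "__main__":
    print(f"file system: {file_system_gb:.2f} GB")
    print(f"image data in database: {image_gb} GB")
    print(f"space saving: {saving:.1%}")
```

Using 3 GB instead of 4 GB for the other tables shifts the result by well under one percentage point, so the approximate 46% figure holds across the quoted range.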
Example: Compression in Backup Using RMAN

    RMAN> configure device type disk backup type to compressed backupset;
    RMAN> configure channel device type disk maxpiecesize 50g;
    RMAN> show compression algorithm;
    RMAN configuration parameters for database with db_unique_name RMHIMG are:
    CONFIGURE COMPRESSION ALGORITHM 'BZIP2';
    RMAN> backup database;

Note: ZLIB is faster but gives a lower compression ratio. BZIP2 gives a better compression ratio but is slower.

Maximize SQL*Loader Performance
- Use direct path loads (direct=true). The conventional path uses standard insert statements, whereas the direct path loader writes directly into the Oracle data files, creating blocks in Oracle database block format.
- Disable or drop indexes and constraints
- Disable archiving during the load
- Use unrecoverable. This disables the writing of the data to the redo logs.
- The parallel load option is not allowed when loading LOB columns with direct path
- Remember to recreate or re-enable the indexes after the direct load; otherwise query performance will be affected.

SQL*Loader Performance Results
- Using these options, BioGrid was able to reduce the time for loading 50 tapes from 13 hours to approximately 5 hours.

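To put the load-time figures in perspective, here is a short Python sketch of the arithmetic. It is not part of the BioGrid tooling and the function names are illustrative. The projection to 1000 tapes assumes that load time scales linearly with the number of tapes, which the slides do not state.

```python
# Worked check of the SQL*Loader timing figures quoted on the slide.
# All function names are hypothetical and used only for this illustration.


def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the tuned load is."""
    return before_hours / after_hours


def time_saved_fraction(before_hours: float, after_hours: float) -> float:
    """Fraction of the original load time that was eliminated."""
    return (before_hours - after_hours) / before_hours


def projected_hours(batch_hours: float, batch_tapes: int, total_tapes: int) -> float:
    """Linear projection of load time (assumption: time scales with tape count)."""
    return batch_hours * total_tapes / batch_tapes


if __name__ == "__main__":
    before, after, tapes = 13, 5, 50
    print(f"speedup: {speedup(before, after):.1f}x")
    print(f"time saved: {time_saved_fraction(before, after):.0%}")
    print(f"minutes per tape: {before * 60 / tapes:.1f} -> {after * 60 / tapes:.1f}")
    print(f"1000 tapes: {projected_hours(before, tapes, 1000):.0f} h -> "
          f"{projected_hours(after, tapes, 1000):.0f} h")
```

Under the linear assumption, the same tuning that saved about 8 hours on a 50-tape batch would save on the order of 160 hours across 1000 tapes.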
Compression and Parallelisation with Data Pump Export

    expdp Images_admin/WELCOME DIRECTORY=BACKUP_64BIT JOB_NAME=IMAGES_ADMIN_EXP_JOB dumpfile=images_admin%u.dmp PARALLEL=3 COMPRESSION=all

With PARALLEL=3, three dump files matching IMAGES_ADMIN%u.DMP are created, which makes the export process much faster. After export, each partition is further compressed to 21-30 GB (from 40-50 GB after SecureFiles COMPRESS HIGH).

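The substitution variable in the dumpfile template is what lets a parallel export write several files. The Python sketch below imitates that expansion purely for illustration; it does not call Data Pump. It assumes the two-digit counter starting at 01 that the Data Pump documentation describes for %U, and that the lowercase %u on the slide behaves the same way. The function name expand_dumpfile is hypothetical.

```python
# Illustration of how a Data Pump dumpfile template expands into file names.
# expand_dumpfile is a hypothetical helper; it only mimics the naming scheme.


def expand_dumpfile(template: str, count: int) -> list:
    """Return the first `count` file names for a template containing %U or %u."""
    for marker in ("%U", "%u"):
        if marker in template:
            # assumption: fixed-width two-digit counter starting at 01
            return [template.replace(marker, f"{n:02d}") for n in range(1, count + 1)]
    raise ValueError("template has no %U substitution variable")


if __name__ == "__main__":
    # PARALLEL=3 on the slide produced three dump files
    for name in expand_dumpfile("images_admin%u.dmp", 3):
        print(name)
```

The actual file names, including their case, depend on the platform and on how many files the export job ends up writing.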
References (1)
- http://www.oracle.com/technology/products/database/application_express/howtos/howtos.html
- http://www.oracle.com/technology/obe/11gr1_db/index.htm
- http://download.oracle.com/docs/cd/b28359_01/appdev.111/b28416/ch_dev_apps.htm#ciheigbc

References (2)
- http://www.remotedba.net/teas_rem_util18.htm
- http://medical.nema.org/
- http://www.oracle.com/pls/db111/homepage

Thank you!
BioGrid Australia: Pranabh Jain, Naomi Rafael
Oracle Multimedia Development: Susan Mavris