Persistent Identifiers (PIDs)
Workshop, 28th/29th April 2015
Christine Staiger
Persistent Identifiers (PIDs)
- Pointers to data resources
  - Digital resources: data, metadata, documents
  - Real-world objects: species, patient, cell line
- Globally unique
- Exist indefinitely
- Used to identify and retrieve resources
- Examples: ISBNs, BSNs, DOIs, EPIC PIDs, URIs
Digital Object (DO)
- Components: PID, data, metadata
- Synchronise PID, data and metadata during creation, maintenance and deletion of a digital object!
PIDs are static
[Diagram: PID 1 to PID 4 point to Data 1 to Data 4 within the world of data infrastructure]
Workflow 1: Change storage environment
[Diagram: PID1 and PID2, storage site A and storage site B]
Use Case 1: Digital repositories
- PIDs point to the landing page of the digital repository, which shows the metadata
- The actual data can be downloaded from this page via another link
- E.g. B2SHARE, 3TU.Datacentrum
- The PID http://hdl.handle.net/11304/3265434c-4b34-11e4-81ac-dcbd1b51435e resolves to https://b2share.eudat.eu/record/139
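The landing-page example above can be tried out with a few lines of Python. This is a minimal sketch, assuming the public proxy at hdl.handle.net answers with a redirect to the registered location; the function names `split_handle`, `proxy_url` and `resolve` are made up for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen

PROXY = "http://hdl.handle.net/"  # public Handle proxy, as used on the slide


def split_handle(handle: str) -> tuple[str, str]:
    """Split 'prefix/suffix' at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a handle: {handle!r}")
    return prefix, suffix


def proxy_url(handle: str) -> str:
    """Build the proxy URL under which the handle can be resolved."""
    prefix, suffix = split_handle(handle)
    return f"{PROXY}{prefix}/{quote(suffix, safe='/')}"


def resolve(handle: str, timeout: float = 10.0) -> str:
    """Follow the proxy's redirect and return the final URL (needs network access)."""
    with urlopen(proxy_url(handle), timeout=timeout) as response:
        return response.geturl()
```

Calling `resolve("11304/3265434c-4b34-11e4-81ac-dcbd1b51435e")` should, if the record is still registered, end up at the B2SHARE landing page named on the slide.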
Use Case 2: Enabling data flows
- PIDs point to the data directly
- If needed, create another field specifying the data type so that a suitable application can be chosen
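How an extra data-type field could be used to pick an application is sketched below. The record layout, the field name `DATA_TYPE`, the example URLs and the handler functions are all assumptions for illustration, not part of any PID service's API.

```python
# Hypothetical handlers standing in for real applications.
def open_table(url: str) -> str:
    return f"table viewer <- {url}"


def open_image(url: str) -> str:
    return f"image viewer <- {url}"


def download_only(url: str) -> str:
    return f"download <- {url}"


# Assumed mapping from the value of the extra field to an application.
HANDLERS = {"text/csv": open_table, "image/png": open_image}


def choose_application(record: dict) -> str:
    """Pick a handler from the record's DATA_TYPE field, falling back to a plain download."""
    handler = HANDLERS.get(record.get("DATA_TYPE"), download_only)
    return handler(record["URL"])


# A PID record that points to the data directly and carries a type hint.
record = {"URL": "https://example.org/data/sample.csv", "DATA_TYPE": "text/csv"}
print(choose_application(record))
```

Records without the extra field still work; they simply fall through to the default handler.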
Use Case 2a: Retrieving information
- Use data in a workflow via its PID, NOT via its actual location!
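Why a workflow should reference the PID rather than the location can be illustrated with a toy registry. This is only a sketch: the in-memory dictionary stands in for a real PID service, and the PID, URLs and function names are invented.

```python
# Toy stand-in for a PID service: maps a PID to the current data location.
pid_registry = {"prefix/suffix1": "https://site-a.example.org/data/1"}


def locate(pid: str) -> str:
    """Look up the current location of the data behind a PID."""
    return pid_registry[pid]


def run_workflow(pid: str) -> str:
    """The workflow only knows the PID and asks for the location when it runs."""
    return f"processing {locate(pid)}"


def migrate(pid: str, new_location: str) -> None:
    """Moving data to another storage site only requires updating the PID record."""
    pid_registry[pid] = new_location


print(run_workflow("prefix/suffix1"))  # uses storage site A
migrate("prefix/suffix1", "https://site-b.example.org/data/1")
print(run_workflow("prefix/suffix1"))  # same PID, now served from storage site B
```

The workflow definition stays unchanged across the migration, which is the point of the slide.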
Use Case 2b: Enabling workflows
- Execute a program hidden behind a PID
PID resolution: Example Handle
Handle Resolution
- Collection of handle services
- Services consist of several sites
- Sites contain several servers
PID resolution: Example Handle
Handle Resolution
[Diagram: a client, the Global Handle Service (Global HS) and several Local Handle Services (Local HS); each service consists of sites (Site 1 to Site n) and each site of servers (#1 to #n). The handle 123.456/abc holds the values URL 4 http://www.acme.com/ and URL 8 http://www.ideal.com/]
Resolving PIDs
1. Client sends a request to the Global Handle Registry to resolve 0.NA/123 (the prefix handle for 123/456)
2. Global responds with the service information for 123
3. Client sends the request to resolve hdl:123/456 to a server (IP address) of the local handle service named in the service information
4. Server responds with the handle data
[Diagram: Global Registry (e.g. Handle system) returning service information; Local Handle Service with a primary site and secondary sites A and B, each with several servers (#1, #2, #3)]
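The four resolution steps can be mimicked with two lookup tables. This is a simplified sketch, not the Handle wire protocol: the dictionaries, the service name `local-service-123` and the example record are assumptions, and real services are spread over several sites and servers.

```python
# Step 1/2: the global registry maps a prefix handle to service information.
GLOBAL_REGISTRY = {"0.NA/123": "local-service-123"}

# Step 3/4: each local handle service stores the handle records for its prefix.
LOCAL_SERVICES = {
    "local-service-123": {
        "123/456": [{"index": 1, "type": "URL", "data": "https://example.org/object"}],
    }
}


def resolve(handle: str) -> list:
    """Resolve a handle in two hops: global registry first, then the local service."""
    prefix, _, suffix = handle.partition("/")
    if not prefix or not suffix:
        raise ValueError(f"not a handle: {handle!r}")
    service = GLOBAL_REGISTRY[f"0.NA/{prefix}"]  # steps 1 and 2
    return LOCAL_SERVICES[service][handle]       # steps 3 and 4


for value in resolve("123/456"):
    print(value["index"], value["type"], value["data"])
```

A real client would typically cache the service information from step 2, so later handles with the same prefix need only steps 3 and 4.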
Example: Relationships between DOs
- PID: prefix1/suffix1
  Metadata: key1: ..., key2: prefix2/suffix2, key3: prefix3/suffix3
- PID: prefix2/suffix2
  Metadata: key1: ..., key2: prefix1/suffix1
- PID: prefix3/suffix3
  Metadata: key1: ..., key2: prefix1/suffix1
- Part of/has part relationships
  - Model cohort-patient relationship
  - Model patient-samples relationship
- Which metadata to store with the PID and which in an extra catalogue?
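The part of/has part pattern above can be written down as linked records. In this sketch the key names `has_part` and `part_of` and the cohort/patient labels are assumptions chosen for readability; the slide only shows generic keys.

```python
# Each record is keyed by its PID; relationships are stored as PIDs, not as locations.
records = {
    "prefix1/suffix1": {"label": "cohort", "has_part": ["prefix2/suffix2", "prefix3/suffix3"]},
    "prefix2/suffix2": {"label": "patient A", "part_of": "prefix1/suffix1"},
    "prefix3/suffix3": {"label": "patient B", "part_of": "prefix1/suffix1"},
}


def parts(pid: str) -> list:
    """Return the PIDs listed as parts of the given digital object."""
    return records[pid].get("has_part", [])


def is_consistent(pid: str) -> bool:
    """Check that every part points back to its parent via part_of."""
    return all(records[child].get("part_of") == pid for child in parts(pid))


print(parts("prefix1/suffix1"))
print(is_consistent("prefix1/suffix1"))
```

Storing both directions makes traversal easy but means two records must be updated together, which ties back to keeping PID, data and metadata synchronised.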
Guidelines: Characteristics of PIDs
- What should be identifiable by a PID?
- Define what is data and what is metadata
- Information contained in PID entries:
  - Location
  - Checksums
  - System-specific information
  - No information on context or contents!
- Don't mix PIDs with other IDs, e.g. database IDs
- Opacity: no assumptions about data context in the PID
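A PID entry restricted to location and checksum might look like the sketch below. The field names `URL` and `CHECKSUM`, the choice of SHA-256 and the example location are assumptions; the slide does not prescribe a particular layout or algorithm.

```python
import hashlib


def make_entry(location: str, data: bytes) -> dict:
    """Build a PID entry holding only the location and a checksum of the data."""
    return {"URL": location, "CHECKSUM": hashlib.sha256(data).hexdigest()}


def verify(entry: dict, data: bytes) -> bool:
    """Compare retrieved data against the checksum stored in the PID entry."""
    return hashlib.sha256(data).hexdigest() == entry["CHECKSUM"]


payload = b"measurement,value\n1,42\n"
entry = make_entry("https://example.org/data/1", payload)

print(verify(entry, payload))            # unchanged data passes
print(verify(entry, payload + b"oops"))  # modified data is detected
```

The entry deliberately says nothing about what the data means; descriptive metadata belongs in the digital object's metadata or a separate catalogue.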
The Handle System
- Offers a resolution service for PIDs
- Gives a lot of freedom for implementation, e.g. PID information types
- Software architecture designed for high availability and scalability
- Basis for several PID providers
  - European Persistent Identifier Consortium (EPIC) PIDs and Digital Object Identifiers (DOIs)
  - Employ the handle service
  - Provide extended APIs
Thank you!