Big Data 11. Data Models
|
|
- Victor Wilcox
- 5 years ago
- Views:
Transcription
1 Ghislain Fourny Big Data 11. Data Models pinkyone / 123RF Stock Photo
2 CSV (Comma separated values) This is syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special Relativity" 2,Gödel,Kurt,"""Incompleteness"" Theorem" 2
3 CSV (Comma separated values) This is syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special Relativity" 2,Gödel,Kurt,"""Incompleteness"" Theorem" ID Last name First name Theory 1 Einstein Albert General, Special Relativity 2 Gödel Kurt "Incompleteness" Theorem This is a data model 3
4 Syntax vs. Data Models Physical view Syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special Relativity" 2,Gödel,Kurt,"""Incompleteness"" Theorem" 4
5 Syntax vs. Data Models Logical view Data Model ID Last name First name Theory 1 Einstein Albert General, Special Relativity 2 Gödel Kurt "Incompleteness" Theorem Physical view Syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special Relativity" 2,Gödel,Kurt,"""Incompleteness"" Theorem" 5
6 Syntax vs. Data Models Physical view Syntax <a> <d e="f"/> <c>this is <b>text</b>.</c> </a> 6
7 Syntax vs. Data Models a Logical view Data Model e = f d This is c b. text Physical view Syntax <a> <d e="f"/> <c>this is <b>text</b>.</c> </a> 7
8 Edge vs. Node labeling foo bar Labels are on the edges 8
9 Edge vs. Node labeling foo foo foobar bar bar Labels are on the edges Labels are on the nodes 9
10 XML Data models Information Set (Infoset) 10
11 XML Data models Information Set (Infoset) Post Schema-Validation Infoset (PSVI) 11
12 XML Data models Information Set (Infoset) Post Schema-Validation Infoset (PSVI) XQuery and XPath Data Model (XDM) 12
13 JSON Data Models "original" (implicit) JSON Data Model 13
14 JSON Data Models "original" (implicit) JSON Data Model JSON Schema Data Model 14
15 JSON Data Models "original" (implicit) JSON Data Model JSON Schema Data Model JSONiq Data Model (JDM) y/html/section-jsoniq-data-model.html 15
16 HTML/XML Data model Document Object Model (DOM) 16
17 grigory_bruev / 123RF Stock Photo XML Information Set 17
18 Information Set <?xml version="1.0" encoding="utf-8"?> <dc:metadata xmlns:dc=" <title xml:lang="en" year="2008" >Systems Group</title> <publisher>eth Zurich</publisher> </dc:metadata> 18
19 Information Set 19
20 The 11 XML Information Items Document Element Attribute Processing Instruction Character Comment Namespace Unexpanded Entity Reference DTD Unparsed Entity Notation 20 20
21 The 11 XML Information Items Document Element Attribute Processing Instruction Character Comment Namespace Unexpanded Entity Reference DTD Unparsed Entity Notation 21 21
22 Document Information Items Document Information Item doc [children] Element Information Item metadata [document element] Element Information Item metadata [notations] <empty> [unparsed entities] <empty> [base URI ] file:///users/bigdata/documents/info.xml [character encoding scheme] UTF-8 [standalone] <no value> [version]
23 Element Information Items Element Information Item metadata metadata [namespace name] [local name] metadata [prefix] dc [children] Element Information Items title, publisher [attributes] <empty> [namespace attributes] Attribute Information Item xmlns:dc [in-scope namespaces] Namespace Information Items dc->systems, xml->ns [base URI] file:///users/bigdata/documents/info.xml [parent] Document Information Item doc 23
24 Attribute Information Items Attribute Information Item xmlns:dc [namespace name] [local name] dc [prefix] xmlns [normalized value] [specified] true [attribute type] <no value> [references] unknown [owner element] Element Information Item metadata xmlns:dc 24
25 Namespace Information Items Namespace Information Item dc->systems [prefix] dc [namespace name] dc->systems Namespace Information Item xml->ns xml->ns [prefix] xml [namespace name] 25
26 XML Infoset - the tree doc xmlns:dc metadata xml->ns dc->systems xml->ns dc->systems title publisher dc->systems xml->ns lang year ETH Zurich Systems Group 26
27 Post-Schema-Validation Infoset Infoset + Types Post-Schema-Validation Infoset (PSVI) 27
28 Weerapat Kiatdumrong / 123RF Stock Photo XPath and XQuery Data Model 28
29 XDM: Sequences of Items (,,,,, ) 29
30 XDM: Sequence of one item 30
31 XDM: Sequence of one item = ( ) 31
32 XDM: Sequences are flat ((, ), ) 32
33 XDM: Sequences are flat ((, ), )=(,, ) 33
34 XDM: Items Atomic Node 34
35 XDM: Seven Kinds of XML Nodes Document node Element node Attribute node Text node Comment node Processing instruction node Namespace node 35
36 XDM: Seven Kinds of XML Nodes Infoset XDM 36
37 XDM vs. Infoset Infoset XDM xs:untyped 37
38 XDM: New Items in 3.0 and 3.1 Functions 38
39 XDM: New Items in 3.0 and 3.1 lorem ipsum dolor sit amet Functions Maps 39
40 XDM: New Items in 3.0 and 3.1 lorem ipsum dolor sit amet Functions Maps Arrays 40
41 XDM and Querying for let order by if + any else = then every while where return exit with Expression 41
42 Types 42
43 Type Systems Almost all type systems (Java, SQL, PSVI, JDM, Protocol buffers, Avro, Parquet, and so on) share the following properties: 43
44 Type Systems Almost all type systems (Java, SQL, PSVI, JDM, Protocol buffers, Avro, Parquet, and so on) share the following properties: - Distinction between atomic types and structured types 44
45 Type Systems Almost all type systems (Java, SQL, PSVI, JDM, Protocol buffers, Avro, Parquet, and so on) share the following properties: - Distinction between atomic types and structured types - Same categories of atomic types 45
46 Type Systems Almost all type systems (Java, SQL, PSVI, JDM, Protocol buffers, Avro, Parquet, and so on) share the following properties: - Distinction between atomic types and structured types - Same categories of atomic types - Lists and maps as structured types 46
47 Type Systems Almost all type systems (Java, SQL, PSVI, JDM, Protocol buffers, Avro, Parquet, and so on) share the following properties: - Distinction between atomic types and structured types - Same categories of atomic types - Lists and maps as structured types - Sequence type cardinalities 47
48 Types (General) Atomic Types vs. Structured Types 48
49 Atomic Types 49
50 Atomic Types Strings 50
51 Strings (Character sequences with monoid structure) "foo" "Zurich" "Ilsebill salzte nach." f o o Z u r i c h I l s e b i l l s a l z t e n a c h. 51
52 Atomic Types Strings Numbers 52
53 Interval-based integer types (exist as signed and unsigned) 8-bit (Java's byte) 16-bit (Java's short) 32-bit (Java's int) 64-bit (Java's long) 53
54 Arbitrary precision decimals (and integers) Any precision and scale
55 Float and Double IEEE 754 standard 32 bits 64 bits ca. 7 digits ca. 15 digits to to single precision double precision 55
56 Atomic Types Strings Numbers Booleans 56
57 Booleans TRUE t true y yes on 1 FALSE f false n no off 0 57
58 Atomic Types Strings Numbers Booleans Dates and Times 58
59 Dates and times Date Time Timestamp Duration 59
60 Dates (Gregorian calendar) Year + Month + Day 2017 August 1st (AD) 60
61 Times Hours + Minutes + Seconds 10 : 31 :
62 Timestamps Year + Month + Day + Hours + Minutes + Seconds 2017 August 1 st 10 : 31 : (AD) 62
63 Atomic Types Strings Numbers Booleans Dates and Times Time Intervals 63
64 Duration kinds Year Month Day Hour Minute Second Example: 2 years and 4 months 64
65 Duration kinds Year Month Day Hour Minute Second Example: 2 years and 4 months Example: 3 hours and 14 minutes 65
66 Atomic Types Strings Numbers Booleans Dates and Times Time Intervals Binaries 66
67 Atomic Types Strings Numbers Booleans Dates and Times Time Intervals Binaries Null 67
68 Lexical space vs. value space Value space Lexical space 68
69 Lexical space vs. value space "1" "01"... Value space Lexical space 69
70 Lexical space vs. value space "4" "04" "100b"... Value space Lexical space 70
71 Subtypes Supertype's value space Subtype's value space 71
72 Structured Types Data Structure Associative Arrays (a.k.a. maps) Ordered Lists Examples JSON Object, Protobuf Message, Set of XML Attributes JSON Array, XML Element, Protobuf repeated field 72
73 Structured Types Data Structure Associative Arrays (a.k.a. maps) Ordered Lists Examples JSON Object, Protobuf Message, Set of XML Attributes JSON Array, XML Element, Protobuf repeated field 73
74 Cardinality How many? Common sign Common adjective 74
75 Cardinality One How many? Common sign Common adjective required 75
76 Cardinality One How many? Common sign Zero or more * Common adjective required 76
77 Cardinality One How many? Common sign Zero or more * Common adjective required Zero or one? optional 77
78 Cardinality One How many? Common sign Zero or more * Common adjective required Zero or one? optional One or more + 78
79 wklzzz / 123RF Stock Photo JSON Data Model 79
80 JSON Values Strings Objects Numbers Booleans Arrays Null Atomic values Structured values 80
81 JSON Values Strings Numbers Booleans Null Objects String-to-Value map Arrays List of values Atomic values Structured values 81
82 JSON Values Strings Numbers Booleans Null Objects String-to-Value map Recursion Arrays List of values Atomic values Structured values 82
83 Tree-based visual model { } "foo" : true, "bar" : [ { "foobar" : "foo" }, null ] 83
84 Tree-based visual model { } "foo" : true, "bar" : [ { "foobar" : "foo" }, null ] object 84
85 Tree-based visual model { } "foo" : true, "bar" : [ { "foobar" : "foo" }, null ] foo true object bar array 85
86 Tree-based visual model { } "foo" : true, "bar" : [ { "foobar" : "foo" }, null ] foo true object object bar array null foobar foo 86
87 wklzzz / 123RF Stock Photo Protocol Buffers 87
88 Messages message Person { required string last_name = 1; repeated string first_name = 2; optional Title title = 3; optional Person boss = 4; } 88
89 Scalar types double, float int32, int64 and variants bool string bytes 89
90 Enums enum Title { MR = 1; MS = 2; MRS = 3; } 90
91 In C++ person.boss().first_name() 91
92 Validation Burak Cakmak / 123RF Stock Photo 92
93 Validation: The Pipeline Document 93
94 Validation: The Pipeline Document Well- Formedness 94
95 Validation: The Pipeline Document Well- Formedness Validation 95
96 On the oxygen Cheat Sheet Validity Well- Formedness 96
97 Validation vs. Annotation Validation 97
98 Validation vs. Annotation Validation Annotation 98
99 Validation 99
100 Validation 100
101 Validation 101
102 Validation 102
103 Validation 103
104 DTD Validation 104
105 Document Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a>& <a>& &&<d&e="f"/>& &&<c>this&is&<b>text</b>.</c>& </a>& & 105
106 Document Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a>& <a>& &&<d&e="f"/>& &&<c>this&is&<b>text</b>.</c>& </a>& & 106
107 Document Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& ]>& <a>& &&<d&e="f"/>& &&<c>this&is&<b>text</b>.</c>& </a>& & 107
108 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& ]>& <a>& &&<d&e="f"/>& &&<c>this&is&<b>text</b>.</c>& </a>& & 108
109 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& ]>& <a/>& & Empty Content 109
110 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&(#PCDATA)>& ]>& <a>& &&This&is&text.& </a>& & Simple Content 110
111 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&(foo,&bar)>& <!ELEMENT&foo&EMPTY>& <!ELEMENT&bar&EMPTY>& ]>& <a>& &&<foo/>& &&<bar/>& </a>& & Complex Content 111
112 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&(foo+,&bar*,&foobar?)>& <!ELEMENT&foo&EMPTY>& <!ELEMENT&bar&EMPTY>& <!ELEMENT&foobar&EMPTY>& ]>& <a>& &&<foo/>& &&<foo/>& &&<foo/>& &&<foobar/>& </a>& Complex Content 112
113 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&(&bar*& &(foo foobar)+)?>& <!ELEMENT&foo&EMPTY>& <!ELEMENT&foobar&EMPTY>& ]>& <a>& &&<foo/>& &&<foobar/>& &&<foo/>& &&<foobar/>& </a>& & Complex Content 113
114 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&(#PCDATA& &foo)*>& <!ELEMENT&foo&EMPTY>& ]>& <a>& &&<foo/>lorum<foo/>ipsum<foo/>& </a>& & Mixed Content 114
115 Element Type Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&ANY>& <!ELEMENT&foo&EMPTY>& <!ELEMENT&bar&EMPTY>& ]>& <a>& &&<foo/>& &&Lorem& &&<bar/>& &&Ipsum& &&<bar/>& &&<bar/>& </a>& & Mixed Content 115
116 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&CDATA&#REQUIRED>& ]>& <a&foo="this&is&a&"value""></a>& & 116
117 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&CDATA&#IMPLIED>& ]>& <a&foo="this&is&a&"value""></a>& & 117
118 Attribute-List Declaration <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE a [ <!ELEMENT a EMPTY> <!ATTLIST a foo CDATA #IMPLIED> ]> <a></a> 118
119 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&CDATA&"bar">& ]>& <a&foo="this&is&a&"value""></a>& & 119
120 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&CDATA&"bar">& ]>& <a></a>& & 120
121 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&CDATA&#FIXED&"bar">& ]>& <a&foo="bar"></a>& & 121
122 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&NMTOKEN&#REQUIRED>& ]>& <a&foo="123atoken456"></a>& & 122
123 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&a&[& <!ELEMENT&a&EMPTY>& <!ATTLIST&a&foo&NMTOKENS&#REQUIRED>& ]>& <a&foo="123atoken&456"></a>& & 123
124 Attribute-List Declaration <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&root&[& <!ELEMENT&root&(foo+,&bar,&barlist)>& <!ELEMENT&foo&EMPTY>& <!ELEMENT&bar&EMPTY>& <!ELEMENT&barlist&EMPTY>& <!ATTLIST&foo&myid&ID&#REQUIRED>& <!ATTLIST&bar&ref&IDREF&#REQUIRED>& <!ATTLIST&barlist&ref&IDREFS&#REQUIRED>& ]>& <root>& &&<foo&myid="foobar"/>& &&<foo&myid="foobar2"/>& &&<bar&ref="foobar"/>& &&<barlist&ref="foobar&foobar2"/>& </root>& & 124
125 DTD Example: External Subset <?xml&version="1.0"&encoding="utf98"?>& <!DOCTYPE&root&SYSTEM&"schema.dtd">& <a&foo="bar"/>& 125
126 Warning: DTDs and Namespaces <!ELEMENT(eth((date,(president,(Rektor)>( <!ATTLIST(eth(xmlns(CDATA(#FIXED( " ((((((((((((((xmlns:xmldb(cdata(#fixed( " ((((((((((((((date(cdata(#implied( ((((((((((((((xmldb:date(cdata(#implied>( <!ELEMENT(date((#PCDATA)>( <!ELEMENT(president((#PCDATA)>( <!ATTLIST(president(number(CDATA(#IMPLIED>( <!ELEMENT(Rektor((#PCDATA)>( ( 126
127 Notations and Unparsed Entities <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE foo [ <!ENTITY presentation SYSTEM "/Users/bigdata/desktop/presentation.pptx" NDATA pptx> <!NOTATION pptx PUBLIC "powerpoint"> <!ELEMENT foo EMPTY> <!ATTLIST foo foo ENTITY #REQUIRED> ]> <foo foo="presentation"></foo> 127
128 XML Schema 128
129 Empty Schema <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" </xs:schema>& & 129
130 Simple Scenario <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" &&<xs:element&name="foo"&type="xs:string"/>& </xs:schema>& & & schema.xsd <?xml&version="1.0"&encoding="utf98"?>& <foo& &&xmlns:xsi=" &&xsi:nonamespaceschemalocation="schema.xsd">& &&This&is&text.& </foo>& & file.xml 130
131 Simple Scenario <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" &&<xs:element&name="foo"&type="xs:integer"/>& </xs:schema>& & & schema.xsd <?xml&version="1.0"&encoding="utf98"?>& <foo& &&xmlns:xsi=" &&xsi:nonamespaceschemalocation="schema.xsd">& &&142857& </foo>& & file.xml 131
132 Simple Types: Built-in Strings Numbers Booleans string anyuri QName decimal integer float double long int short byte positiveinteger nonnegativeinteger... unsignedlong unsignedint... boolean 132
133 Simple Types: Built-in Dates and Times Time Intervals Binaries Null - datetime time date gyearmonth gmonthday gyear gmonth gday datetimestamp duration yearmonthduration daytimeduration hexbinary base64binary 133
134 Dates T10:15:00Z 01:15:00-08:00 134
135 Durations P1Y2MT3H 135
136 User-defined types 136
137 User-defined types Restriction 137
138 User-defined types Restriction Union Not atomic 138
139 User-defined types Restriction Union Not atomic List Not atomic 139
140 Restriction <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" &&<xs:simpletype&name="myfixedlengthstring">& &&&&<xs:restriction&base="xs:string">& &&&&&&<xs:length&value="3"/>& &&&&</xs:restriction>& &&</xs:simpletype>& &&<xs:element&name="foo"&type="myfixedlengthstring"/>& </xs:schema>& schema.xsd <?xml&version="1.0"&encoding="utf98"?>& <foo& &&xmlns:xsi=" &&xsi:nonamespaceschemalocation="schema.xsd">zrh</foo>& & & file.xml 140
141 Restriction <xs:simpletype,name="myfixedlengthstring">,,,<xs:restriction,base="xs:string">,,,,,<xs:length,value="3"/>,,,</xs:restriction>, </xs:simpletype>,, <foo>zrh</foo>,,, 141
142 List <xs:simpletype,name="mylist">,,,<xs:list,itemtype="xs:string"/>, </xs:simpletype>,, <foo>foo,bar,foobar</foo>,, 142
143 Union <xs:simpletype,name="myunion">,,,<xs:union,membertypes="xs:integer,xs:boolean"/>, </xs:simpletype>,, <foo>true</foo>,,,, 143
144 Complex Types 144
145 Complex Types Empty <foo/> 145
146 Complex Types Empty <foo/> Simple Content <foo>text</foo> 146
147 Complex Types Empty <foo/> Simple Content Complex Content <foo>text</foo> <foo> <a/> <b/> </foo> 147
148 Complex Types Empty <foo/> Simple Content Complex Content Mixed Content <foo>text</foo> <foo> <a/> <b/> </foo> <foo> Text<a/>Text<b/> </foo> 148
149 Complex content <xs:complextype-name="complexcontent">- --<xs:sequence>- ----<xs:element-name="twotofour"-type="xs:string"-minoccurs="2"-maxoccurs="4"/>- ----<xs:element-name="zeroorone"-type="xs:boolean"-minoccurs="0"-maxoccurs="1"/>- --</xs:sequence>- </xs:complextype>- - <foo>- --<twotofour>foobar</twotofour>- --<twotofour>foobar</twotofour>- --<twotofour>foobar</twotofour>- --<zeroorone>true</zeroorone>- </foo>
150 Complex content <xs:complextype-name="complexcontent">- --<xs:sequence>- ----<xs:element-name="twotofour"-type="xs:string"-minoccurs="2"-maxoccurs="4"/>- ----<xs:element-name="zeroorone"-type="xs:boolean"-minoccurs="0"-maxoccurs="1"/>- --</xs:sequence>- </xs:complextype>- - <foo>- --<twotofour>foobar</twotofour>- --<twotofour>foobar</twotofour>- --<twotofour>foobar</twotofour>- --<zeroorone>true</zeroorone>- </foo>
151 Empty content <xs:complextype-name="emptytype">- --<xs:sequence/>- </xs:complextype>- - <foo/>
152 Simple content <xs:complextype-name="datecountry">- --<xs:simplecontent>- ----<xs:extension-base="xs:date"> <xs:attribute-name="country"-type="xs:string"/>- ----</xs:extension>- --</xs:simplecontent>- </xs:complextype>- - <foo-country="switzerland">2014d12d02</foo>
153 Mixed content <xs:complextype-name="mixedcontent"-mixed="true">- --<xs:sequence>- ----<xs:element-name="b"-type="xs:string"-minoccurs="0"-maxoccurs="unbounded"/>- --</xs:sequence>- </xs:complextype>- - <foo>some-text-and-some-<b>bold</b>-text.</foo>
154 Simple type on attributes <xs:complextype-name="withattribute">- --<xs:sequence/>- --<xs:attribute-name="country" type="xs:string" default="switzerland"/>- </xs:complextype>- - <foo-country="switzerland"/>
155 Named Types <xs:complextype-name="empty">---- <xs:sequence/>- </xs:complextype>- - <xs:element-name="c"-type="empty">- </xs:element>
156 Anonymous Types <xs:element*name="c">* **<xs:complextype>* ****<xs:sequence/>* **</xs:complextype>* </xs:element>* * 156
157 No namespaces <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" &&<xs:element&name="foo"&type="xs:string"/>& </xs:schema>& & & <?xml&version="1.0"&encoding="utf98"?>& <foo& &&xmlns:xsi=" &&xsi:nonamespaceschemalocation="schema.xsd">& &&This&is&text.& </foo>& & 157
158 With namespaces <?xml&version="1.0"&encoding="utf98"?>& <xs:schema& &&xmlns:xs=" &&<xs:element&name="foo"&type="xs:string"/>& </xs:schema>& & <?xml&version="1.0"&encoding="utf98"?>& <big:foo& &&xmlns:xsi=" &&xsi:schemalocation="! &&xmlns:big=" &&This&is&text.& </big:foo>& & & 158
159 Keys <xs:schema* ****xmlns:xs=" **<xs:element*name="root">* ****<xs:complextype>* ******..* ****</xs:complextype>* ****<xs:key*name="foodid">* ******<xs:selector*xpath="foo"/>* ****</xs:key>* **</xs:element>* </xs:schema>* * <?xml*version="1.0"*encoding="utfd8"?>* <root* **xmlns:xsi=" **xsi:nonamespaceschemalocation="schema.xsd">* **<foo*id="foo"/>* **<foo*id="bar"/>* **<foo*id="foobar"/>* </root>* * 159
160 Keys <xs:schema* ****xmlns:xs=" **<xs:element*name="root">* ****<xs:complextype>* ******..* ****</xs:complextype>* ****<xs:key*name="foodid">* What must be unique ******<xs:selector*xpath="foo"/>* ****</xs:key>* **</xs:element>* </xs:schema>* * <?xml*version="1.0"*encoding="utfd8"?>* <root* **xmlns:xsi=" **xsi:nonamespaceschemalocation="schema.xsd">* **<foo*id="foo"/>* **<foo*id="bar"/>* **<foo*id="foobar"/>* </root>* * 160
161 Keys <xs:schema* ****xmlns:xs=" **<xs:element*name="root">* ****<xs:complextype>* ******..* ****</xs:complextype>* ****<xs:key*name="foodid">* ******<xs:selector*xpath="foo"/>* ****</xs:key>* **</xs:element>* </xs:schema>* * <?xml*version="1.0"*encoding="utfd8"?>* <root* What must be unique What makes it unique **xmlns:xsi=" **xsi:nonamespaceschemalocation="schema.xsd">* **<foo*id="foo"/>* **<foo*id="bar"/>* **<foo*id="foobar"/>* </root>* * 161
162 Bonus material: The Schema of Schemas!!<xs:schema!!!!!xmlns:xs=" 162
163 Avro 163
164 Avro is language neutral 164
165 Interoperability 165
166 Interoperability Avro DataFile 166
167 Interoperability Avro DataFile 167
168 Unlike Protocol Buffer and Thrift... Schema (Language independent) 168
169 Unlike Protocol Buffer and Thrift... Schema (Language independent) Compilation 169
170 Unlike Protocol Buffer and Thrift... Schema (Language independent) Compilation Code 170
171 Unlike Protocol Buffer and Thrift... Schema (Language independent) Compilation More Code More Code Code More Code More Code 171
172 Unlike Protocol Buffer and Thrift... Schema (Language independent) Compilation More Code More Code Code More Code More Code Data 172
173 Avro processes schemas not known at compile time More Code More Code Code More Code More Code Schema Data + Known at runtime 173
174 Avro Schema Schema This is actually JSON 174
175 Avro Schema Schema Avro IDL But there is also another syntax for users that like C 175
176 Avro DataFile Avro DataFile An Avro DataFile is encoded in binary format 176
177 Avro DataFile DataFile JSONbased encoder But there is also another debuggingfriendly syntax 177
178 Avro types Atomic Types 178
179 Avro types Atomic Types vs. Structured Types 179
180 Avro atomic types Category Absence of value Boolean Number String Binary Types null boolean int long float double string enum bytes fixed (list of 8-bit unsigned bytes) 180
181 Avro atomic types { "type" : "null" } { "type" : "string" } { "type" : "boolean" } Primitive types 181
182 Avro atomic types { "type" : "null" } { "type" : "string" } { "type" : "boolean" } { } { } "type" : "enum", "name" : "cuttlery", "symbols" : [ "KNIFE", "FORK", "SPOON" ] "type" : "fixed", "name" : "MD5", "size" : 16 Primitive types Non-primitive types 182
183 Avro structured types Category Arrays Objects Types array map record 183
184 Avro arrays 1 foo 2 bar 3 foobar 4 foo 5 foobar 6 bar {a: 1, b: "foo"} 2 {a: 2, b: "foo"} 3 {a: 3, b: "bar"} 184
185 Avro arrays 1 foo 2 bar 3 foobar 4 foo 5 foobar 6 bar {a: 1, b: "foo"} 2 {a: 2, b: "foo"} 3 {a: 3, b: "bar"} { } "type" : "array", "items" : "string" 185
186 Avro arrays 1 foo 2 bar 3 foobar 4 foo 5 foobar 6 bar {a: 1, b: "foo"} 2 {a: 2, b: "foo"} 3 {a: 3, b: "bar"} { } "type" : "array", "items" : "string" All items in the array must have the same type. 186
187 Avro maps keys foo bar foobar values true false true 187
188 Avro maps keys foo bar foobar values true false true Must be strings 188
189 Avro maps keys foo bar foobar values true false true Must be same type 189
190 Avro maps keys foo bar foobar values true false true { } "type" : "map", "name" : "flags", "values" : "boolean" Must be same type 190
191 Avro records fields values name ETH canton ZH students 20,
192 Avro records { } fields name canton values ETH ZH students 20,000 "type" : "map", "name" : "university", "fields" : [ { "name" : "name", "type" : "string" }, { "name" : "canton", "type" : "cantonal-code" }, { "name" : "students", "type" : "long" } ] 192
193 Mappings Generic 193
194 Mappings Generic Specific 194
195 Mappings Generic Specific Reflexive 195
196 Avro DataFile Metadata (including schema) 196
197 Avro DataFile Metadata (including schema) Block 197
198 Avro DataFile Metadata (including schema) Block Block 198
199 Avro DataFile Metadata (including schema) Block Block Block 199
200 Avro DataFile Metadata (including schema) Block Block Block Block 200
201 Avro DataFile Metadata (including schema) Block Block Block Block Block 201
202 Avro DataFile Metadata (including schema) Specifies sync marker Block sync marker Block Block Block Block 202
203 Avro DataFiles are splittable 203
204 Avro DataFiles are splittable 204
205 Avro DataFiles are splittable 205
206 Schema resolution Old Schema 206
207 Schema resolution Old Schema Avro DataFile 207
208 Schema resolution Old Schema New Schema Avro DataFile 208
209 Schema resolution Old Schema New Schema Avro DataFile New field 209
210 Schema resolution Old Schema New Schema Avro DataFile New field Default value is taken 210
211 Schema resolution Old Schema New Schema Avro DataFile Removed field 211
212 Schema resolution Old Schema New Schema Avro DataFile Removed field Ignored (projection) 212
213 Schema resolution New Schema Old Schema Avro DataFile New field 213
214 Schema resolution New Schema Old Schema Avro DataFile New field Ignored (projection) 214
215 Schema resolution New Schema Old Schema Avro DataFile Removed field 215
216 Schema resolution New Schema Old Schema Avro DataFile Removed field Default value (error if none) 216
217 Parquet 217
218 YET ANOTHER format?
219 YET ANOTHER format? Don't we already have enough?
220 But this time, it's different
221 Row storage 221
222 Columnar storage 222
223 But... But I thought we were talking about heterogeneous collections of trees?
224 Mapping trees to tables... If our heterogeneous collection is really homogeneous and our trees are flat, this is easy 224
225 Mapping trees to tables... Otherwise, there are ways. Edge shredding Ad-hoc mappings Tree-encoding Dremel 225
226 Parquet atomic types Category Boolean Number String Binary Date Types boolean int32 int64 int96 double DECIMAL(precision, scale) UTF8 ENUM binary fixed_len_byte_array DATE 226
227 Parquet structured types Category Arrays Objects Types LIST MAP 227
228 Parquet schema definition Protocol buffer schemas Avro schemas Thrift schemas 228
Big Data for Engineers Spring Data Models
Ghislain Fourny Big Data for Engineers Spring 2018 11. Data Models pinkyone / 123RF Stock Photo CSV (Comma separated values) This is syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special
More informationBig Data 9. Data Models
Ghislain Fourny Big Data 9. Data Models pinkyone / 123RF Stock Photo 1 Syntax vs. Data Models Physical view Syntax this is text. 2 Syntax vs. Data Models a Logical view
More informationBig Data Fall Data Models
Ghislain Fourny Big Data Fall 2018 11. Data Models pinkyone / 123RF Stock Photo CSV (Comma separated values) This is syntax ID,Last name,first name,theory, 1,Einstein,Albert,"General, Special Relativity"
More informationSolution Sheet 5 XML Data Models and XQuery
The Systems Group at ETH Zurich Big Data Fall Semester 2012 Prof. Dr. Donald Kossmann Prof. Dr. Nesime Tatbul Assistants: Martin Kaufmann Besmira Nushi 07.12.2012 Solution Sheet 5 XML Data Models and XQuery
More informationIntroduction Syntax and Usage XML Databases Java Tutorial XML. November 5, 2008 XML
Introduction Syntax and Usage Databases Java Tutorial November 5, 2008 Introduction Syntax and Usage Databases Java Tutorial Outline 1 Introduction 2 Syntax and Usage Syntax Well Formed and Valid Displaying
More informationBig Data Exercises. Fall 2018 Week 8 ETH Zurich. XML validation
Big Data Exercises Fall 2018 Week 8 ETH Zurich XML validation Reading: (optional, but useful) XML in a Nutshell, Elliotte Rusty Harold, W. Scott Means, 3rd edition, 2005: Online via ETH Library 1. XML
More informationInformation Systems for Engineers Fall Data Definition with SQL
Ghislain Fourny Information Systems for Engineers Fall 2018 3. Data Definition with SQL Rare Book and Manuscript Library, Columbia University. What does data look like? Relations 2 Reminder: relation 0
More informationChapter 5: XPath/XQuery Data Model
5. XPath/XQuery Data Model 5-1 Chapter 5: XPath/XQuery Data Model References: Mary Fernández, Ashok Malhotra, Jonathan Marsh, Marton Nagy, Norman Walsh (Ed.): XQuery 1.0 and XPath 2.0 Data Model (XDM).
More informationJava EE 7: Back-end Server Application Development 4-2
Java EE 7: Back-end Server Application Development 4-2 XML describes data objects called XML documents that: Are composed of markup language for structuring the document data Support custom tags for data
More informationW3C XML Schemas For Publishing
W3C XML Schemas For Publishing 208 5.8.xml: Getting Started
More informationMarker s feedback version
Two hours Special instructions: This paper will be taken on-line and this is the paper format which will be available as a back-up UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Semi-structured Data
More informationAvro Specification
Table of contents 1 Introduction...2 2 Schema Declaration... 2 2.1 Primitive Types... 2 2.2 Complex Types...2 2.3 Names... 5 2.4 Aliases... 6 3 Data Serialization...6 3.1 Encodings... 7 3.2 Binary Encoding...7
More information2006 Martin v. Löwis. Data-centric XML. XML Schema (Part 1)
Data-centric XML XML Schema (Part 1) Schema and DTD Disadvantages of DTD: separate, non-xml syntax very limited constraints on data types (just ID, IDREF, ) no support for sets (i.e. each element type
More informationAvro Specification
Table of contents 1 Introduction...2 2 Schema Declaration... 2 2.1 Primitive Types... 2 2.2 Complex Types...2 2.3 Names... 5 3 Data Serialization...6 3.1 Encodings... 6 3.2 Binary Encoding...6 3.3 JSON
More information[MS-SSISPARAMS-Diff]: Integration Services Project Parameter File Format. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-SSISPARAMS-Diff]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for
More informationBig Data 10. Querying
Ghislain Fourny Big Data 10. Querying pinkyone / 123RF Stock Photo 1 Declarative Languages What vs. How 2 Functional Languages for let order by if + any else = then every while where return exit with Expression
More informationXML and Web Services
XML and Web Services Lecture 8 1 XML (Section 17) Outline XML syntax, semistructured data Document Type Definitions (DTDs) XML Schema Introduction to XML based Web Services 2 Additional Readings on XML
More informationBig Data 12. Querying
Ghislain Fourny Big Data 12. Querying pinkyone / 123RF Stock Photo Declarative Languages What vs. How 2 Functional Languages for let order by if + any else = then every while where return exit with Expression
More informationBEAWebLogic. Integration. Transforming Data Using XQuery Mapper
BEAWebLogic Integration Transforming Data Using XQuery Mapper Version: 10.2 Document Revised: March 2008 Contents Introduction Overview of XQuery Mapper.............................................. 1-1
More information[MS-DPAD]: Alert Definition Data Portability Overview. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-DPAD]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for protocols,
More informationXML. Part II DTD (cont.) and XML Schema
XML Part II DTD (cont.) and XML Schema Attribute Declarations Declare a list of allowable attributes for each element These lists are called ATTLIST declarations Consists of 3 basic parts The ATTLIST keyword
More informationNo Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.
[MS-DPAD]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats, languages,
More informationIntellectual Property Rights Notice for Open Specifications Documentation
[MS-SSISPARAMS-Diff]: Intellectual Property Rights tice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats,
More informationModelling XML Applications (part 2)
Modelling XML Applications (part 2) Patryk Czarnik XML and Applications 2014/2015 Lecture 3 20.10.2014 Common design decisions Natural language Which natural language to use? It would be a nonsense not
More informationThis Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to
This Documentation, which includes embedded help systems and electronically distributed materials, (hereinafter referred to as the Documentation ) is for your informational purposes only and is subject
More informationUsing and defining new simple types. Basic structuring of definitions. Copyright , Sosnoski Software Solutions, Inc. All rights reserved.
IRIS Web Services Workshop II Session 1 Wednesday AM, September 21 Dennis M. Sosnoski What is XML? Extensible Markup Language XML is metalanguage for markup Enables markup languages for particular domains
More informationFaster XML data validation in a programming language with XML datatypes
Faster XML data validation in a programming language with XML datatypes Kurt Svensson Inobiz AB Kornhamnstorg 61, 103 12 Stockholm, Sweden kurt.svensson@inobiz.se Abstract EDI-C is a programming language
More informationThe XQuery Data Model
The XQuery Data Model 9. XQuery Data Model XQuery Type System Like for any other database query language, before we talk about the operators of the language, we have to specify exactly what it is that
More informationOracle Berkeley DB XML. API Reference for C++ 12c Release 1
Oracle Berkeley DB XML API Reference for C++ 12c Release 1 Library Version 12.1.6.0 Legal Notice This documentation is distributed under an open source license. You may review the terms of this license
More informationIntroduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington
Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 1 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT
More informationXML Format Plug-in User s Guide. Version 10g Release 3 (10.3)
XML Format Plug-in User s Guide Version 10g Release 3 (10.3) XML... 4 TERMINOLOGY... 4 CREATING AN XML FORMAT... 5 CREATING AN XML FORMAT BASED ON AN EXISTING XML MESSAGE FORMAT... 5 CREATING AN EMPTY
More informationXML extensible Markup Language
extensible Markup Language Eshcar Hillel Sources: http://www.w3schools.com http://java.sun.com/webservices/jaxp/ learning/tutorial/index.html Tutorial Outline What is? syntax rules Schema Document Object
More informationSoftware Engineering Methods, XML extensible Markup Language. Tutorial Outline. An Example File: Note.xml XML 1
extensible Markup Language Eshcar Hillel Sources: http://www.w3schools.com http://java.sun.com/webservices/jaxp/ learning/tutorial/index.html Tutorial Outline What is? syntax rules Schema Document Object
More informationXML. Document Type Definitions XML Schema. Database Systems and Concepts, CSCI 3030U, UOIT, Course Instructor: Jarek Szlichta
XML Document Type Definitions XML Schema 1 XML XML stands for extensible Markup Language. XML was designed to describe data. XML has come into common use for the interchange of data over the Internet.
More informationAll About <xml> CS193D, 2/22/06
CS193D Handout 17 Winter 2005/2006 February 21, 2006 XML See also: Chapter 24 (709-728) All About CS193D, 2/22/06 XML is A markup language, but not really a language General purpose Cross-platform
More informationPart VII. Querying XML The XQuery Data Model. Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153
Part VII Querying XML The XQuery Data Model Marc H. Scholl (DBIS, Uni KN) XML and Databases Winter 2005/06 153 Outline of this part 1 Querying XML Documents Overview 2 The XQuery Data Model The XQuery
More informationXML Schema. Mario Alviano A.Y. 2017/2018. University of Calabria, Italy 1 / 28
1 / 28 XML Schema Mario Alviano University of Calabria, Italy A.Y. 2017/2018 Outline 2 / 28 1 Introduction 2 Elements 3 Simple and complex types 4 Attributes 5 Groups and built-in 6 Import of other schemes
More informationAbout the Data-Flow Complexity of Web Processes
Cardoso, J., "About the Data-Flow Complexity of Web Processes", 6th International Workshop on Business Process Modeling, Development, and Support: Business Processes and Support Systems: Design for Flexibility.
More informationLast week we saw how to use the DOM parser to read an XML document. The DOM parser can also be used to create and modify nodes.
Distributed Software Development XML Schema Chris Brooks Department of Computer Science University of San Francisco 7-2: Modifying XML programmatically Last week we saw how to use the DOM parser to read
More informationRestricting complextypes that have mixed content
Restricting complextypes that have mixed content Roger L. Costello October 2012 complextype with mixed content (no attributes) Here is a complextype with mixed content:
More information[MS-QDEFF]: Query Definition File Format. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-QDEFF]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats, languages,
More informationCSC Web Technologies, Spring Web Data Exchange Formats
CSC 342 - Web Technologies, Spring 2017 Web Data Exchange Formats Web Data Exchange Data exchange is the process of transforming structured data from one format to another to facilitate data sharing between
More informationFraming how values are extracted from the data stream. Includes properties for alignment, length, and delimiters.
DFDL Introduction For Beginners Lesson 3: DFDL properties In lesson 2 we learned that in the DFDL language, XML Schema conveys the basic structure of the data format being modeled, and DFDL properties
More informationXML Information Set. Working Draft of May 17, 1999
XML Information Set Working Draft of May 17, 1999 This version: http://www.w3.org/tr/1999/wd-xml-infoset-19990517 Latest version: http://www.w3.org/tr/xml-infoset Editors: John Cowan David Megginson Copyright
More informationXML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11
!important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... 7:4 @import Directive... 9:11 A Absolute Units of Length... 9:14 Addressing the First Line... 9:6 Assigning Meaning to XML Tags...
More informationSyntax XML Schema XML Techniques for E-Commerce, Budapest 2004
Mag. iur. Dr. techn. Michael Sonntag Syntax XML Schema XML Techniques for E-Commerce, Budapest 2004 E-Mail: sonntag@fim.uni-linz.ac.at http://www.fim.uni-linz.ac.at/staff/sonntag.htm Michael Sonntag 2004
More information[MS-QDEFF]: Query Definition File Format. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-QDEFF]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for protocols,
More information[MS-MSL]: Mapping Specification Language File Format. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-MSL]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for protocols,
More informationPESC Compliant JSON Version /19/2018. A publication of the Technical Advisory Board Postsecondary Electronic Standards Council
Version 0.5.0 10/19/2018 A publication of the Technical Advisory Board Postsecondary Electronic Standards Council 2018. All Rights Reserved. This document may be copied and furnished to others, and derivative
More informationCS 378 Big Data Programming
CS 378 Big Data Programming Lecture 9 Complex Writable Types AVRO, Protobuf CS 378 - Fall 2017 Big Data Programming 1 Review Assignment 4 WordStaQsQcs using Avro QuesQons/issues? MRUnit and AVRO AVRO keys,
More informationFraming how values are extracted from the data stream. Includes properties for alignment, length, and delimiters.
DFDL Introduction For Beginners Lesson 3: DFDL properties Version Author Date Change 1 S Hanson 2011-01-24 Created 2 S Hanson 2011-03-30 Updated 3 S Hanson 2012-09-21 Corrections for errata 4 S Hanson
More informationMCS-274 Final Exam Serial #:
MCS-274 Final Exam Serial #: This exam is closed-book and mostly closed-notes. You may, however, use a single 8 1/2 by 11 sheet of paper with hand-written notes for reference. (Both sides of the sheet
More informationChapter 1: Getting Started. You will learn:
Chapter 1: Getting Started SGML and SGML document components. What XML is. XML as compared to SGML and HTML. XML format. XML specifications. XML architecture. Data structure namespaces. Data delivery,
More informationAccessing Arbitrary Hierarchical Data
D.G.Muir February 2010 Accessing Arbitrary Hierarchical Data Accessing experimental data is relatively straightforward when data are regular and can be modelled using fixed size arrays of an atomic data
More informationData Presentation and Markup Languages
Data Presentation and Markup Languages MIE456 Tutorial Acknowledgements Some contents of this presentation are borrowed from a tutorial given at VLDB 2000, Cairo, Agypte (www.vldb.org) by D. Florescu &.
More informationService Modeling Language (SML) Pratul Dublish Principal Program Manager Microsoft Corporation
Service Modeling Language (SML) Pratul Dublish Principal Program Manager Microsoft Corporation 1 Outline Introduction to SML SML Model Inter-Document References URI scheme deref() extension function Schema-based
More informationCopyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 27-1
Slide 27-1 Chapter 27 XML: Extensible Markup Language Chapter Outline Introduction Structured, Semi structured, and Unstructured Data. XML Hierarchical (Tree) Data Model. XML Documents, DTD, and XML Schema.
More informationIntroduction to Semistructured Data and XML
Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of Washington Database Management Systems, R. Ramakrishnan 1 How the Web is Today HTML documents often
More information7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML
7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML is a markup language,
More informationCopyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML
Chapter 7 XML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML
More informationSDPL : XML Basics 2. SDPL : XML Basics 1. SDPL : XML Basics 4. SDPL : XML Basics 3. SDPL : XML Basics 5
2 Basics of XML and XML documents 2.1 XML and XML documents Survivor's Guide to XML, or XML for Computer Scientists / Dummies 2.1 XML and XML documents 2.2 Basics of XML DTDs 2.3 XML Namespaces XML 1.0
More informationSemistructured Content
On our first day Semistructured Content 1 Structured data : database system tagged, typed well-defined semantic interpretation Semi-structured data: tagged - XML (HTML?) some help with semantic interpretation
More informationCS A1150 Databases, 2017 Exercise Round 6 (( ), Solutions
CS A1150 Databases, 2017 Exercise Round 6 ((8.5. 12.5.2017), Solutions 1. (6 p.) Consider a database schema which consists of three relations, whose schemas are: Students(ID, name, program, year) Courses(code,
More informationDescribing Document Types: The Schema Languages of XML Part 2
Describing Document Types: The Schema Languages of XML Part 2 John Cowan 1 Copyright Copyright 2005 John Cowan Licensed under the GNU General Public License ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK
More informationTwo hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. M.Sc. in Advanced Computer Science. Date: Tuesday 20 th May 2008.
COMP60370 Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE M.Sc. in Advanced Computer Science Semi-Structured Data and the Web Date: Tuesday 20 th May 2008 Time: 09:45 11:45 Please answer
More informationModule 4. Implementation of XQuery. Part 2: Data Storage
Module 4 Implementation of XQuery Part 2: Data Storage Aspects of XQuery Implementation Compile Time + Optimizations Operator Models Query Rewrite Runtime + Query Execution XML Data Representation XML
More informationSo far, we've discussed the use of XML in creating web services. How does this work? What other things can we do with it?
XML Page 1 XML and web services Monday, March 14, 2011 2:50 PM So far, we've discussed the use of XML in creating web services. How does this work? What other things can we do with it? XML Page 2 Where
More informationXML Schema Element and Attribute Reference
E XML Schema Element and Attribute Reference all This appendix provides a full listing of all elements within the XML Schema Structures Recommendation (found at http://www.w3.org/tr/xmlschema-1/). The
More informationXML DTDs and Namespaces. CS174 Chris Pollett Oct 3, 2007.
XML DTDs and Namespaces CS174 Chris Pollett Oct 3, 2007. Outline Internal versus External DTDs Namespaces XML Schemas Internal versus External DTDs There are two ways to associate a DTD with an XML document:
More informationIntroduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 14-15: XML CSE 414 - Spring 2013 1 Announcements Homework 4 solution will be posted tomorrow Midterm: Monday in class Open books, no notes beyond one hand-written
More informationThrift specification - Remote Procedure Call
Erik van Oosten Revision History Revision 1.0 2016-09-27 EVO Initial version v1.1, 2016-10-05: Corrected integer type names. Small changes to section headers. Table of Contents 1.
More informationTutorial 2: Validating Documents with DTDs
1. One way to create a valid document is to design a document type definition, or DTD, for the document. 2. As shown in the accompanying figure, the external subset would define some basic rules for all
More informationW3c Xml Schema 1.0 Data Type Datetime
W3c Xml Schema 1.0 Data Type Datetime The XML Schema recommendations define features, such as structures ((Schema Part 1)) and simple data types ((Schema Part 2)), that extend the XML Information Set with
More informationSemantic Web. XML and XML Schema. Morteza Amini. Sharif University of Technology Fall 94-95
ه عا ی Semantic Web XML and XML Schema Morteza Amini Sharif University of Technology Fall 94-95 Outline Markup Languages XML Building Blocks XML Applications Namespaces XML Schema 2 Outline Markup Languages
More informationMarkup Languages. Lecture 4. XML Schema
Markup Languages Lecture 4. XML Schema Introduction to XML Schema XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an XML document. The XML Schema language is also
More informationPART. Oracle and the XML Standards
PART I Oracle and the XML Standards CHAPTER 1 Introducing XML 4 Oracle Database 10g XML & SQL E xtensible Markup Language (XML) is a meta-markup language, meaning that the language, as specified by the
More informationModelling XML Applications
Modelling XML Applications Patryk Czarnik XML and Applications 2013/2014 Lecture 2 14.10.2013 XML application (recall) XML application (zastosowanie XML) A concrete language with XML syntax Typically defined
More informationB4M36DS2, BE4M36DS2: Database Systems 2
B4M36DS2, BE4M36DS2: Database Systems 2 h p://www.ksi.mff.cuni.cz/~svoboda/courses/171-b4m36ds2/ Lecture 2 Data Formats Mar n Svoboda mar n.svoboda@fel.cvut.cz 9. 10. 2017 Charles University in Prague,
More informationRequest for Comments: Tail-f Systems December Partial Lock Remote Procedure Call (RPC) for NETCONF
Network Working Group Request for Comments: 5717 Category: Standards Track B. Lengyel Ericsson M. Bjorklund Tail-f Systems December 2009 Partial Lock Remote Procedure Call (RPC) for NETCONF Abstract The
More informationChapter 3 Brief Overview of XML
Slide 3.1 Web Serv vices: Princ ciples & Te echno ology Chapter 3 Brief Overview of XML Mike P. Papazoglou & mikep@uvt.nl Slide 3.2 Topics XML document structure XML schemas reuse Document navigation and
More informationApache Avro# Specification
Table of contents 1 Introduction...3 2 Schema Declaration... 3 2.1 Primitive Types... 3 2.2 Complex Types...3 2.3 Names... 6 2.4 Aliases... 7 3 Data Serialization...8 3.1 Encodings... 8 3.2 Binary Encoding...8
More informationBig Data Exercises. Fall 2016 Week 9 ETH Zurich
Big Data Exercises Fall 2016 Week 9 ETH Zurich Introduction This exercise will cover XQuery. You will be using oxygen (https://www.oxygenxml.com/xml_editor/software_archive_editor.html), AN XML/JSON development
More informationComplex type. This subset is enough to model the logical structure of all kinds of non-xml data.
DFDL Introduction For Beginners Lesson 2: DFDL language basics We have seen in lesson 1 how DFDL is not an entirely new language. Its foundation is XML Schema 1.0. Although XML Schema was created as a
More informationQVX File Format and QlikView Custom Connector
QVX File Format and QlikView Custom Connector Contents 1 QVX File Format... 2 1.1 QvxTableHeader XML Schema... 2 1.1.1 QvxTableHeader Element... 4 1.1.2 QvxFieldHeader Element... 5 1.1.3 QvxFieldType Type...
More informationBig Data Exam ETH Zürich February 9, 2017 Answer sheet
Big Data Exam ETH Zürich February 9, 2017 Answer sheet Name: Legi-Number: 1. A B C D 2. A B C D 3. A B C D 4. A B C D 5. A B C D 6. A B C D 7. A B C D 8. A B C D 9. A B C D 10. A B C D 11. A B C D 12.
More informationAppendix H XML Quick Reference
HTML Appendix H XML Quick Reference What Is XML? Extensible Markup Language (XML) is a subset of the Standard Generalized Markup Language (SGML). XML allows developers to create their own document elements
More informationIntroduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 13: XML and XPath 1 Announcements Current assignments: Web quiz 4 due tonight, 11 pm Homework 4 due Wednesday night, 11 pm Midterm: next Monday, May 4,
More informationJSON - Overview JSon Terminology
Announcements Introduction to Database Systems CSE 414 Lecture 12: Json and SQL++ Office hours changes this week Check schedule HW 4 due next Tuesday Start early WQ 4 due tomorrow 1 2 JSON - Overview JSon
More informationCLIENT-SIDE XML SCHEMA VALIDATION
Factonomy Ltd The University of Edinburgh Aleksejs Goremikins Henry S. Thompson CLIENT-SIDE XML SCHEMA VALIDATION Edinburgh 2011 Motivation Key gap in the integration of XML into the global Web infrastructure
More informationXML. Rodrigo García Carmona Universidad San Pablo-CEU Escuela Politécnica Superior
XML Rodrigo García Carmona Universidad San Pablo-CEU Escuela Politécnica Superior XML INTRODUCTION 2 THE XML LANGUAGE XML: Extensible Markup Language Standard for the presentation and transmission of information.
More informationADT 2005 Lecture 7 Chapter 10: XML
ADT 2005 Lecture 7 Chapter 10: XML Stefan Manegold Stefan.Manegold@cwi.nl http://www.cwi.nl/~manegold/ Database System Concepts Silberschatz, Korth and Sudarshan The Challenge: Comic Strip Finder The Challenge:
More informationAuthor: Irena Holubová Lecturer: Martin Svoboda
NPRG036 XML Technologies Lecture 1 Introduction, XML, DTD 19. 2. 2018 Author: Irena Holubová Lecturer: Martin Svoboda http://www.ksi.mff.cuni.cz/~svoboda/courses/172-nprg036/ Lecture Outline Introduction
More information[MS-WORDSSP]: Word Automation Services Stored Procedures Protocol Specification
[MS-WORDSSP]: Word Automation Services Stored Procedures Protocol Specification Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open
More informationCSI 3140 WWW Structures, Techniques and Standards. Representing Web Data: XML
CSI 3140 WWW Structures, Techniques and Standards Representing Web Data: XML XML Example XML document: An XML document is one that follows certain syntax rules (most of which we followed for XHTML) Guy-Vincent
More informationAnnouncements. JSon Data Structures. JSon Syntax. JSon Semantics: a Tree! JSon Primitive Datatypes. Introduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 13: Json and SQL++ Announcements HW5 + WQ5 will be out tomorrow Both due in 1 week Midterm in class on Friday, 5/4 Covers everything (HW, WQ, lectures,
More informationFall, 2005 CIS 550. Database and Information Systems Homework 5 Solutions
Fall, 2005 CIS 550 Database and Information Systems Homework 5 Solutions November 15, 2005; Due November 22, 2005 at 1:30 pm For this homework, you should test your answers using Galax., the same XQuery
More informationMANAGING INFORMATION (CSCU9T4) LECTURE 2: XML STRUCTURE
MANAGING INFORMATION (CSCU9T4) LECTURE 2: XML STRUCTURE Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE XML Elements vs. Attributes Well-formed vs. Valid XML documents Document Type Definitions (DTDs)
More informationWork/Studies History. Programming XML / XSD. Database
Work/Studies History 1. What was your emphasis in your bachelor s work at XXX? 2. What was the most interesting project you worked on there? 3. What is your emphasis in your master s work here at UF? 4.
More information[MS-ASNOTE]: Exchange ActiveSync: Notes Class Protocol. Intellectual Property Rights Notice for Open Specifications Documentation
[MS-ASNOTE]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation ( this documentation ) for protocols,
More informationNo Trade Secrets. Microsoft does not claim any trade secret rights in this documentation.
[MS-MSL]: Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open Specifications documentation for protocols, file formats, languages,
More information