EXTRACTING DATA FROM THE WEB
|
|
- Joseph Cameron Houston
- 6 years ago
- Views:
Transcription
1 EXTRACTING DATA FROM THE WEB Georg Gottlob Oxford University.
2 Talk Outline Motivation: need of information extraction Logical foundations of information extraction The Lixto Visual Wrapper The Diadem Project
3 Traditional data-based decision making in enterprises. Enterprise Data ETL Data Warehouse (entrepot de données) Analytics DECISION (e.g. pricing)
4 Traditional data-based decision making in enterprises. Enterprise Data ETL Data Warehouse (entrepot de données) Analytics DECISION (e.g. pricing) But often the most relevant data are outside the company, on the Web! Online data intelligence, online market intelligence, automatic web data extraction.
5 Online Market Intelligence (OMI) (surveillance du marché) - Electronics Retailer (détaillant d électronique - composants) : market overview, 20 competitors, 200,000 products/prices - Supermarket Chain: Price comparison; must quickly react to special offers (offres spéciales), new products, - Internet Travel Agency: Gives best price guarantee, wants to detect pricing attacks, - Road Construction Company: Find new public tenders ( appels d offre ) - Hedge Fund ( fonds de placement ): Obtain recent house price changes from real-estate agent s Web pages before the weekly index is published. Anticipating the Consumer price index (index des prix à la consommation). - Governmental/Policy Making.
6 The Web corporate news pages hotel reservation airline booking sites real estate markets environmental data bookmakers jobs ebay Quarterly reports in pdf retail prices tenders blogs governmental info news etc Enterprise Data ETL Data Warehouse (entrepot de données) Analytics DECISION (e.g. pricing)
7 The Web corporate news pages hotel reservation airline booking sites real estate markets environmental data bookmakers jobs ebay Quarterly reports in pdf retail prices tenders blogs governmental info news etc Automatic web data extraction Data aggregation & integration & cleaning WEB ETL Enterprise Data ETL Data Warehouse (entrepot de données) Analytics DECISION (e.g. pricing)
8 Marketing & Business Intelligence Marketing Department goulet d'étranglement Business Objects report BI Tool Oracle 9 entrepot de données
9 The Wall Problem: Make web contents accessible to electronic data processing WEB HTML pages layout Corporate edp apps structured data, Databases, XML
10 WEB HTML pages layout Corporate edp apps structured data, Databases, XML Travail aliénant
11 Web wrapping Goal: Make web contents accessible to electronic data processing WEB HTML pages layout Corporate edp apps structured data, Databases, XML
12 Web wrapping Goal: Make web contents accessible to electronic data processing WEB HTML pages layout WRAPPER (adapteur) Corporate edp apps structured data, Databases, XML Wrappers: HTML select extract annotate XML
13 Enregistrement: hierarchie de données
14 Patterns:
15 Different approaches in the past Programming (Java, Perl, WebL, SQL+...) - very complicated & boring & expensive - testing very difficult Simple Screen scrapers ( gratte-écran ) - no complex data structures extracted Wrapper induction (apprentissage d adapteurs) - requires larger amounts of sample data - precision often not satisfactory - current systems text-based (not tree-based)
16 Different approaches in the past Programming (Java, Perl, WebL, SQL+...) - very complicated & boring & expensive - testing very difficult Simple Screen scrapers - no complex data structures extracted Wrapper induction - requires larger amounts of sample data - accuracy not satisfactory in all situations - current systems text-based (not tree-based)
17 Modern Solutions Semi-automatic tool (outils) - based on solid theory - modular knowledge representation - easy to use - commercial product since 2002 Fully automated extraction - for specific application domains - extracts from 1000s of websites - current research
18 Talk Outline Motivation: need of information extraction Logical foundations of information extraction The Lixto Visual Wrapper The Diadem Project
19 Web documents are trees! HTML: Hypertext Markup Language XML: Extensible Markup Language HTML, XML: Context free* languages. Represent a document by its parse tree (arbre syntaxique).
20 HTML Content Extractor Function f: HTML Parse tree Subtrees f Leaves of subtrees are among leaves of orig. tree
21 The Essence of Web Wrapping? Functional view: Wrapper defines functions f f: Tree P (Tree) t T subtrees(t) Equivalent logical view: Wrapper defines monadic predicates P over the nodes (arbre dom) of each input document
22 A HTML page <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <body> DBAI</h1> <table border="1" cellpadding="3" cellspacing="1"> <tr> <td>georg Gottlob</td> <td>gottlob@dbai.tuwien.ac.at</td> <td>18420</td> </tr> <tr> <td>christoph Koch</td> <td>koch@dbai.tuwien.ac.at</td> td <td>18449</td> </tr> </table> </body> </html> DBAI Georg Gottlob gottlob@ Christoph Koch koch@ Georg Gottlob h1 tr td gottlob@dbai.tuwien.ac.at html body table tr td td td td koch@dbai.tuwien.ac.at Christoph Koch 18420
23 Predicate employeetable <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <body> DBAI</h1> <table border="1" cellpadding="3" cellspacing="1"> <tr> <td>georg Gottlob</td> <td>gottlob@dbai.tuwien.ac.at</td> <td>18420</td> </tr> <tr> <td>christoph Koch</td> <td>koch@dbai.tuwien.ac.at</td> td <td>18449</td> </tr> </table> </body> </html> DBAI Georg Gottlob gottlob@ Christoph Koch koch@ Georg Gottlob h1 tr td gottlob@dbai.tuwien.ac.at html body table tr td td td td koch@dbai.tuwien.ac.at Christoph Koch 18420
24 Predicate employee <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <body> DBAI</h1> <table border="1" cellpadding="3" cellspacing="1"> <tr> <td>georg Gottlob</td> <td>gottlob@dbai.tuwien.ac.at</td> <td>18420</td> td </tr> <tr> <td>christoph Koch</td> <td>koch@dbai.tuwien.ac.at</td> <td>18449</td> </tr> </table> DBAI </body> </html> Georg Gottlob gottlob@ Christoph Koch koch@ Georg Gottlob h1 tr td gottlob@dbai.tuwien.ac.at html body table tr td td td td koch@dbai.tuwien.ac.at Christoph Koch 18420
25 Predicate phone <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <body> DBAI</h1> <table border="1" cellpadding="3" cellspacing="1"> <tr> <td>georg Gottlob</td> <td>gottlob@dbai.tuwien.ac.at</td> <td>18420</td> </tr> <tr> <td>christoph Koch</td> <td>koch@dbai.tuwien.ac.at</td> <td>18449</td> </tr> </table> </body> </html> DBAI Georg Gottlob gottlob@ Christoph Koch koch@ td Georg Gottlob h1 tr td gottlob@dbai.tuwien.ac.at html body table tr td td td td koch@dbai.tuwien.ac.at Christoph Koch 18420
26 Expressiveness Yardstick: MSO MSO captures exactly the essence of data extraction: - Define sets of nodes of a document Expressiveness, complexity, semantics well understood: MSO over trees: perfect logical semantics MSO over trees: high expressive power (tree automata) MSO over trees: low data complexity Drawbacks: - hard to use, no visual specification, - high query complexity (cpl. de requetes) ( bad scalability, mauvais passage à l échelle).
27 MSO on strings and trees Rich theory: Büchi: MSO = REG over strings (chaînes de caractères) Thatcher and Wright, Rabin: MSO = REG over ranked trees (arbres bornés) = tree automata Brüggemann-Klein/Wood/Murata: MSO = REG over unranked trees Neven & Schwentick: Unranked Query Automata Courcelle: MSO in LinTime on tree-like structures (treewidth <= k, data complexity)
28 Ordered Trees as finite structures html html body body h1 table firstchild h1 table tr tr tr tr nextsibling td td td td td td td td td td td td Christoph Koch Georg Gottlob label h1 () label td () Christoph Koch root() leaf() Georg Gottlob
29 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton: auxiliary state roots even subtree roots odd subtree
30 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
31 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
32 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
33 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
34 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
35 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
36 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
37 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
38 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
39 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
40 MSO over Trees Extract from a binary tree all roots of sub-trees with an odd number of leaves: S x [ S(u) & ( leaf(x) S(x)) & x,y,z (((firstchild(x,y) & nextsibling(y,z)) (S(x) (S(y) S(z))))] Tree automaton:
41 Logic heaven DB theory heaven DB programming heaven Application design heaven
42 Logic heaven MSO DB theory heaven DB programming heaven Application design heaven
43 = Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Application design heaven
44 = Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Elog Application design heaven
45 = Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Application design heaven Elog Lixto Visual Wrapper
46 Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Application design heaven = Elog Lixto Visual Wrapper Suite
47 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
48 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
49 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
50 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
51 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
52 Monadic Datalog as a Wrapping Language root entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). html body name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). table phone(x) :- (m), nextsibling(m, X), label[td](x). tr tr td td td td td td
53 entry(x) :- root(r), firstchild(r,u), label[html](u), firstchild(u,v), label[body](v), firstchild(v,w),label[table](w), firstchild(w,x), label[tr](x). entry(x):- entry(y), nextsibling(y,x). name(x) :- entry(e), firstchild(e, X), label[td](x). (x) :- name(n), nextsibling(n, X), label[td](x). phone(x) :- (m), nextsibling(m, X), label[td](x). root html <?xml version="1.0"?> <peopledb> <entry> <name>georg Gottlob</name> body table <phone>18420</phone> tr tr </entry> <entry> <name>christoph Koch</name> td td td td td td <phone>18449</phone> </entry> </peopledb>
54 Monadic Datalog over XML author fc paper ns fc title paperdb ns paper ns fc chandra ns merlin fc Conj. Queries Select titles of articles authored by Chandra and Merlin paper(x) root(r) & firstchild(r,x). paper(x) paper(y) & nextsibling(y,x). output(x) paper(p) & firstchild(p,a) & firstchild(a,z) & label[chandra](z) & nextsibling(z,v) & label[merlin](v) & nextsibling(a,t) & firstchild(t,x).
55 How expressive is monadic Datalog? It was known that over arbitrary structures: Monadic Datalog Π 1 -MSO Full Datalog = P (in presence of order) Theorem [G. & Koch 2002]: Over trees, monadic Datalog = MSO A unary query is definable in MSO iff it is definable via a monadic datalog program.
56 How complex is Monadic Datalog? Theorem [G. & Koch 2002]: Monadic Datalog over trees has combined complexity: O( data * query ) Query Complexity: P-complete and linear-time.
57 Proof idea: 1.) Transform datalog program + input tree in linear time into a ground propositional logic program (programme Datalog instancié) Exploit functional dependencies: nextsibling(x,y) has only a linear number of ground instances: nextsibling(n i,n j ), etc. Decouple independent atoms of rule bodies p(x) q(x) & r(y) & nextsibling(x,z) & s(z). p(x) q(x) & r & nextsibling(x,z) & s(z). r r(y). 2.) Execute ground program in linear time by using well-known algorithms: [Beeri&Bernstein][Dowling&Gallier] [Minoux]
58 = Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Application design heaven Elog Lixto Visual Wrapper
59 item description and link to detailpage one record # of bids date next page link price info
60 ELOG [Baumgartner, Flesca, G. VLDB 01] Examples of Special predicates: subelem(s,x,path, ) before(x,y,..) after(x,y, ) property(x,attribute, Op,Value..) document(url,d) getdocumentfromhref(x,d), etc. Xpath-like expression distance tolerance,etc. Additional features: Stratified negation, string processing ontological concepts phonenumber(x) ranges: H(S,X) :- body(..)[1,5] object hierarchies
61 <?xml version="1.0" encoding="utf-8"?> <document> <record> <number> </number> <item>98 Degrees - Notebook - New</item> <picture/> <price>2.99</price> <currency>$</currency> <bids>-</bids> </record> <record> <number> </number> <item>notebook - Compaq Presario 1207</item> <price>730.00</price> <currency>au $</currency> [...]
62 ELOG Program for ebay pages
63 = Logic heaven MSO DB theory heaven Monadic Datalog DB programming heaven Application design heaven Elog Lixto Visual Wrapper (outil: suite logicielle)
64 Lixto Visual Wrapper Architecture Web Visual Wrapper Generator similarly structured pages Example page(s) Extraction Module Extractionprogram XML Further processing: tracking changes, delivering ( ,sms)... ( transformatio server)
65 Product Architecture Transformation Server LiXto Extraction Engine
66 SHORT DEMO
67 Talk Outline Motivation: need of information extraction Logical foundations of information extraction The Lixto Visual Wrapper The Diadem Project: Fully automatic data extraction
68 Need for Automatic Extraction Technology (2) All search engine providers need it! Many work on it. Keywords: Vertical search, object search, semantic search. Raghu Ramakrishnan, Yahoo!, March 2009: no one really has done this successfully at scale yet Alon Halevy, Google, Feb. 2009: Current technologies are not good enough yet to provide what search engines really need. [ ] any successful approach would probably need a combination of knowledge and learning.
69
70
71 The Blackbox we are constructing URL BLACKBOX Application domain with thousands of websites Application relevant Structured data (XML or RDF) To achieve this, we combine a host of annotators with a new knowledge-based approach.
72 How to achieve it? Combine existing and new low level annotators with high level AI and reasoning.
73 <table>113 <tr>115 <tr> 134 <td>119 I m interested in radiobuttons <table>124 <tr>125 <tr>126 <td>129 <td>130 Buying Renting <td>135 Maximum price <select>136 <option>137 <option>138 <td>139 <td>140 GBP EUR
74 Bottom-up (low-level) annotation [(02873,227) (03900,417)] [(02873,227) (03900,417)] Geo-Price-Searchbox ISA Monochromatic Rectangle Occurs in Georaphic query form (formulaire de requete géo.) Occurs in ISA ISA Price search facility. 105 Postcode input field.. Active map (carte active).
75 Top-down reasoning Property Search Facility Property List part-of 1 m Single Property Description Specially highlighted property
76 Bottom-up processing Top-down reasoning Phenomenology [(02873,227) (03900,417)] 105 [(02873,227) (03900,417)] Postcode input field Occurs in Georaphic search facility. ISA Monochromatic Rectangle Occurs in ISA ISA. Geo-Price-Searchbox Active map. Price search facility. Property Search Facility Property List part-of 1 m Single Property Description Specially highlighted property
77 Phenomenological Record Segmentation data area p a img div img a img div img a img div img p set of uniform, non-overlapping records maximise sequence of evenly segmented (same distance pivot) minimise irregularity of records 77
78 precision recall precision recall data areas records attributes Real Estate (100 pages) precision 98 data areas records attributes recall Used Car (100 pages) (voitures d occasion) 90 price postcode location bathroom bedroom reception legal type 78
79 OPAL Form Interpretation Form Patterns Example Small set of ubiquitous patterns ranges, dates, options, etc. Ontology by instantiation 79
80 OPAL-TL Example Price range two successive fields in the same group at least one price type range connector in between TEMPLATE concept_minmax<c,c M,A> { concept<c M >(N 1 ) child(n 1,G),child(N 2,G),adjacent(N 1,N 2 ), N 2 ) N concept<c M >(N 2 ) child(n 1,G),child(N 2,G),follows(N 2,N 1 ), concept<c>(n 1 ),N (A 1 A, N 1 {d}) concept<c M >(N 1 ) child(n ( 1,G),child(N 2,G),adjacent(N 1,N 2 ), N (N ) (N 80
81 1 Precision Recall F-score UK Real Estate (100) UK Used Car (100) ICQ (98) Tel-8 (436) Dragut et al., VLDB, 2009 Airfare Auto Book Job US R.E.
82 Short Demo diadem-3min43.m4v 82
Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto
Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto Robert Baumgartner 1, Sergio Flesca 2, and Georg Gottlob 1 1 DBAI, TU Wien, Vienna, Austria {baumgart,gottlob}@dbai.tuwien.ac.at
More informationMonadic Queries over Tree-Structured Data
Monadic Queries over Tree-Structured Data Georg Gottlob and Christoph Koch Database and Artificial Intelligence Group Technische Universität Wien, A-1040 Vienna, Austria {gottlob, koch}@dbai.tuwien.ac.at
More informationJ. Carme, R. Gilleron, A. Lemay, J. Niehren. INRIA FUTURS, University of Lille 3
Interactive Learning o Node Selection Queries in Web Documents J. Carme, R. Gilleron, A. Lemay, J. Niehren INRIA FUTURS, University o Lille 3 Web Inormation Extraction Data organisation is : adapted to
More informationInteractive Learning of HTML Wrappers Using Attribute Classification
Interactive Learning of HTML Wrappers Using Attribute Classification Michal Ceresna DBAI, TU Wien, Vienna, Austria ceresna@dbai.tuwien.ac.at Abstract. Reviewing the current HTML wrapping systems, it is
More informationLogic-based Web Information Extraction
Logic-based Web Information Extraction Georg Gottlob and Christoph Koch Database and Artificial Intelligence Group, Technische Universität Wien, A-1040 Vienna, Austria. {gottlob, koch}@dbai.tuwien.ac.at
More informationKripke style Dynamic model for Web Annotation with Similarity and Reliability
Kripke style Dynamic model for Web Annotation with Similarity and Reliability M. Kopecký 1, M. Vomlelová 2, P. Vojtáš 1 Faculty of Mathematics and Physics Charles University Malostranske namesti 25, Prague,
More informationSchema-Guided Query Induction
Schema-Guided Query Induction Jérôme Champavère Ph.D. Defense September 10, 2010 Supervisors: Joachim Niehren and Rémi Gilleron Advisor: Aurélien Lemay Introduction Big Picture XML: Standard language for
More informationData Querying, Extraction and Integration II: Applications. Recuperación de Información 2007 Lecture 5.
Data Querying, Extraction and Integration II: Applications Recuperación de Información 2007 Lecture 5. Goal today: Provide examples for useful XML based applications Motivation: Integrating Legacy Databases,
More informationDeclarative Web data extraction and annotation
Declarative Web data extraction and annotation Carlo Bernardoni 1, Giacomo Fiumara 2, Massimo Marchi 1, and Alessandro Provetti 2 1 Dip. di Scienze dell Informazione, Università degli Studi di Milano Via
More informationXPath with transitive closure
XPath with transitive closure Logic and Databases Feb 2006 1 XPath with transitive closure Logic and Databases Feb 2006 2 Navigating XML trees XPath with transitive closure Newton Institute: Logic and
More informationarxiv:cs/ v2 [cs.db] 8 Oct 2003
Monadic Datalog and the Expressive Power of Languages for Web Information Extraction GEORG GOTTLOB and CHRISTOPH KOCH Technische Universität Wien, Austria arxiv:cs/0211020v2 [cs.db] 8 Oct 2003 Research
More informationTowards Schema-Guided XML Query Induction
Towards Schema-Guided XML Query Induction Jérôme Champavère Rémi Gilleron Aurélien Lemay Joachim Niehren Université de Lille INRIA, France ICML-2007 Workshop on Challenges and Applications of Grammar Induction
More informationMonadic Datalog Containment on Trees
Monadic Datalog Containment on Trees André Frochaux 1, Martin Grohe 2, and Nicole Schweikardt 1 1 Goethe-Universität Frankfurt am Main, {afrochaux,schweika}@informatik.uni-frankfurt.de 2 RWTH Aachen University,
More informationDatabase Theory VU , SS Codd s Theorem. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2011 3. Codd s Theorem Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 29 March, 2011 Pichler 29 March,
More informationConjunctive queries. Many computational problems are much easier for conjunctive queries than for general first-order queries.
Conjunctive queries Relational calculus queries without negation and disjunction. Conjunctive queries have a normal form: ( y 1 ) ( y n )(p 1 (x 1,..., x m, y 1,..., y n ) p k (x 1,..., x m, y 1,..., y
More informationLogic, Languages, and Rules for Web Data Extraction and Reasoning over Data
Logic, Languages, and Rules for Web Data Extraction and Reasoning over Data Georg Gottlob 1, Christoph Koch 2, and Andreas Pieris 3 1 University of Oxford georg.gottlob@cs.ox.ac.uk 2 École Polytechnique
More informationThe Web Data Factory. Search, Integrate, Organize The Real World. The Problem Today s search engines simply index the
The Web Data Factory Search, Integrate, Organize The Real World The Web has evolved from its early days as a big library of interconnected hypertext documents to its modern form as a giant database of
More informationCSE-6490B Final Exam
February 2009 CSE-6490B Final Exam Fall 2008 p 1 CSE-6490B Final Exam In your submitted work for this final exam, please include and sign the following statement: I understand that this final take-home
More informationFOUNDATIONS OF DATABASES AND QUERY LANGUAGES
FOUNDATIONS OF DATABASES AND QUERY LANGUAGES Lecture 14: Database Theory in Practice Markus Krötzsch TU Dresden, 20 July 2015 Overview 1. Introduction Relational data model 2. First-order queries 3. Complexity
More informationInnovations in Business Solutions. SAP Analytics, Data Modeling and Reporting Course
SAP Analytics, Data Modeling and Reporting Course Introduction: This course is design to cover SAP Analytics, Data Modeling and Reporting course content. After completion of this course students can go
More informationEXTRACTION AND ALIGNMENT OF DATA FROM WEB PAGES
EXTRACTION AND ALIGNMENT OF DATA FROM WEB PAGES Praveen Kumar Malapati 1, M. Harathi 2, Shaik Garib Nawaz 2 1 M.Tech, Computer Science Engineering, 2 M.Tech, Associate Professor, Computer Science Engineering,
More informationAutomatic Wrapper Adaptation by Tree Edit Distance Matching
Automatic Wrapper Adaptation by Tree Edit Distance Matching E. Ferrara 1 R. Baumgartner 2 1 Department of Mathematics University of Messina, Italy 2 Lixto Software GmbH Vienna, Austria 2nd International
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 10: XML Retrieval Hinrich Schütze, Christina Lioma Center for Information and Language Processing, University of Munich 2010-07-12
More informationCS425 Fall 2016 Boris Glavic Chapter 1: Introduction
CS425 Fall 2016 Boris Glavic Chapter 1: Introduction Modified from: Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Textbook: Chapter 1 1.2 Database Management System (DBMS)
More informationThe Context Interchange Approach
COntext INterchange (COIN) System Demonstration Aykut Firat (aykut@mit.edu) M. Bilal Kaleem (mbilal@mit.edu) Philip Lee (philee@mit.edu) Stuart Madnick (smadnick@mit.edu) Allen Moulton (amoulton@mit.edu)
More informationDatabase Technology Introduction. Heiko Paulheim
Database Technology Introduction Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query Processing Transaction Manager Introduction to the Relational Model
More informationDigital Marketing. Introduction of Marketing. Introductions
Digital Marketing Introduction of Marketing Origin of Marketing Why Marketing is important? What is Marketing? Understanding Marketing Processes Pillars of marketing Marketing is Communication Mass Communication
More informationIstat s Pilot Use Case 1
Istat s Pilot Use Case 1 Pilot identification 1 IT 1 Reference Use case X 1) URL Inventory of enterprises 2) E-commerce from enterprises websites 3) Job advertisements on enterprises websites 4) Social
More informationKnowledge Representation
Knowledge Representation References Rich and Knight, Artificial Intelligence, 2nd ed. McGraw-Hill, 1991 Russell and Norvig, Artificial Intelligence: A modern approach, 2nd ed. Prentice Hall, 2003 Outline
More informationIntroduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 14-15: XML CSE 414 - Spring 2013 1 Announcements Homework 4 solution will be posted tomorrow Midterm: Monday in class Open books, no notes beyond one hand-written
More informationthe rest of this course
the rest of this course Volume size does mattes (thousands of TBs of data) Veracity data is often incomplete/inconsistent Variety many data formats (structured, semi-structured, etc.) Velocity data often
More informationSemantic Web Lecture Part 1. Prof. Do van Thanh
Semantic Web Lecture Part 1 Prof. Do van Thanh Overview of the lecture Part 1 Why Semantic Web? Part 2 Semantic Web components: XML - XML Schema Part 3 - Semantic Web components: RDF RDF Schema Part 4
More informationHTML + CSS. ScottyLabs WDW. Overview HTML Tags CSS Properties Resources
HTML + CSS ScottyLabs WDW OVERVIEW What are HTML and CSS? How can I use them? WHAT ARE HTML AND CSS? HTML - HyperText Markup Language Specifies webpage content hierarchy Describes rough layout of content
More information10/24/12. What We Have Learned So Far. XML Outline. Where We are Going Next. XML vs Relational. What is XML? Introduction to Data Management CSE 344
What We Have Learned So Far Introduction to Data Management CSE 344 Lecture 12: XML and XPath A LOT about the relational model Hand s on experience using a relational DBMS From basic to pretty advanced
More informationAn Integrated Framework to Enhance the Web Content Mining and Knowledge Discovery
An Integrated Framework to Enhance the Web Content Mining and Knowledge Discovery Simon Pelletier Université de Moncton, Campus of Shippagan, BGI New Brunswick, Canada and Sid-Ahmed Selouani Université
More informationChapter 2 & 3: Representations & Reasoning Systems (2.2)
Chapter 2 & 3: A Representation & Reasoning System & Using Definite Knowledge Representations & Reasoning Systems (RRS) (2.2) Simplifying Assumptions of the Initial RRS (2.3) Datalog (2.4) Semantics (2.5)
More informationFoundations of Computer Science Spring Mathematical Preliminaries
Foundations of Computer Science Spring 2017 Equivalence Relation, Recursive Definition, and Mathematical Induction Mathematical Preliminaries Mohammad Ashiqur Rahman Department of Computer Science College
More informationLogic as a Query Language: from Frege to XML. Victor Vianu U.C. San Diego
Logic as a Query Language: from Frege to XML Victor Vianu U.C. San Diego Logic and Databases: a success story FO lies at the core of modern database systems Relational query languages are based on FO:
More information1 Introduction... 1 1.1 A Database Example... 1 1.2 An Example from Complexity Theory...................... 4 1.3 An Example from Formal Language Theory................. 6 1.4 An Overview of the Book.................................
More informationEstimating the Quality of Databases
Estimating the Quality of Databases Ami Motro Igor Rakov George Mason University May 1998 1 Outline: 1. Introduction 2. Simple quality estimation 3. Refined quality estimation 4. Computing the quality
More informationDistributed Database System. Project. Query Evaluation and Web Recognition in Document Databases
74.783 Distributed Database System Project Query Evaluation and Web Recognition in Document Databases Instructor: Dr. Yangjun Chen Student: Kang Shi (6776229) August 1, 2003 1 Abstract A web and document
More informationGalileo Web Services
Galileo Web Services Agenda > Evolution of APIs > What are Web services? > Key business requirements for Galileo Web Services > What are Galileo Web Services > How Web services work > XML Select vs. Galileo
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval WS 2008/2009 25.11.2008 Information Systems Group Mohammed AbuJarour Contents 2 Basics of Information Retrieval (IR) Foundations: extensible Markup Language (XML)
More informationA B2B Search Engine. Abstract. Motivation. Challenges. Technical Report
Technical Report A B2B Search Engine Abstract In this report, we describe a business-to-business search engine that allows searching for potential customers with highly-specific queries. Currently over
More informationXML RETRIEVAL. Introduction to Information Retrieval CS 150 Donald J. Patterson
Introduction to Information Retrieval CS 150 Donald J. Patterson Content adapted from Manning, Raghavan, and Schütze http://www.informationretrieval.org OVERVIEW Introduction Basic XML Concepts Challenges
More informationChapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query
More informationKeywords Data alignment, Data annotation, Web database, Search Result Record
Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web
More informationLecture 1: Conjunctive Queries
CS 784: Foundations of Data Management Spring 2017 Instructor: Paris Koutris Lecture 1: Conjunctive Queries A database schema R is a set of relations: we will typically use the symbols R, S, T,... to denote
More informationDigital Asset Management 2. Introduction to Digital Media Format
Digital Asset Management 2. Introduction to Digital Media Format 2009-09-24 Outline Image format and coding methods Audio format and coding methods Video format and coding methods Introduction to HTML
More informationa standard database system
user queries (RA, SQL, etc.) relational database Flight origin destination airline Airport code city VIE LHR BA VIE Vienna LHR EDI BA LHR London LGW GLA U2 LGW London LCA VIE OS LCA Larnaca a standard
More informationInteractive Learning of Node Selecting Tree Transducers
Interactive Learning of Node Selecting Tree Transducers Julien Carme, Rémi Gilleron, Aurélien Lemay, Joachim Niehren To cite this version: Julien Carme, Rémi Gilleron, Aurélien Lemay, Joachim Niehren.
More informationData Presentation and Markup Languages
Data Presentation and Markup Languages MIE456 Tutorial Acknowledgements Some contents of this presentation are borrowed from a tutorial given at VLDB 2000, Cairo, Agypte (www.vldb.org) by D. Florescu &.
More informationSemantic Extensions to Defuddle: Inserting GRDDL into XML
Semantic Extensions to Defuddle: Inserting GRDDL into XML Robert E. McGrath July 28, 2008 1. Introduction The overall goal is to enable automatic extraction of semantic metadata from arbitrary data. Our
More informationEXTRACTION INFORMATION ADAPTIVE WEB. The Amorphic system works to extract Web information for use in business intelligence applications.
By Dawn G. Gregg and Steven Walczak ADAPTIVE WEB INFORMATION EXTRACTION The Amorphic system works to extract Web information for use in business intelligence applications. Web mining has the potential
More informationIntroduction to Data Management CSE 344
Introduction to Data Management CSE 344 Lecture 11: XML and XPath 1 XML Outline What is XML? Syntax Semistructured data DTDs XPath 2 What is XML? Stands for extensible Markup Language 1. Advanced, self-describing
More informationChapter 13 XML: Extensible Markup Language
Chapter 13 XML: Extensible Markup Language - Internet applications provide Web interfaces to databases (data sources) - Three-tier architecture Client V Application Programs Webserver V Database Server
More informationAn ECA Engine for Deploying Heterogeneous Component Languages in the Semantic Web
An ECA Engine for Deploying Heterogeneous Component s in the Semantic Web Erik Behrends, Oliver Fritzen, Wolfgang May, and Daniel Schubert Institut für Informatik, Universität Göttingen, {behrends fritzen
More informationWeb Programming Paper Solution (Chapter wise)
What is valid XML document? Design an XML document for address book If in XML document All tags are properly closed All tags are properly nested They have a single root element XML document forms XML tree
More informationBig Trend in Business Intelligence: Data Mining over Big Data Web Transaction Data. Fall 2012
Big Trend in Business Intelligence: Data Mining over Big Data Web Transaction Data Fall 2012 Data Warehousing and OLAP Introduction Decision Support Technology On Line Analytical Processing Star Schema
More informationCMPT 354 Database Systems I. Spring 2012 Instructor: Hassan Khosravi
CMPT 354 Database Systems I Spring 2012 Instructor: Hassan Khosravi Textbook First Course in Database Systems, 3 rd Edition. Jeffry Ullman and Jennifer Widom Other text books Ramakrishnan SILBERSCHATZ
More informationSemantic Optimization of Preference Queries
Semantic Optimization of Preference Queries Jan Chomicki University at Buffalo http://www.cse.buffalo.edu/ chomicki 1 Querying with Preferences Find the best answers to a query, instead of all the answers.
More informationAn Approach To Web Content Mining
An Approach To Web Content Mining Nita Patil, Chhaya Das, Shreya Patanakar, Kshitija Pol Department of Computer Engg. Datta Meghe College of Engineering, Airoli, Navi Mumbai Abstract-With the research
More informationDatabase Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2011 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 8 March,
More informationIntroduction to Database Systems CSE 414
Introduction to Database Systems CSE 414 Lecture 13: XML and XPath 1 Announcements Current assignments: Web quiz 4 due tonight, 11 pm Homework 4 due Wednesday night, 11 pm Midterm: next Monday, May 4,
More informationWeb Data Extraction and Generating Mashup
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 9, Issue 6 (Mar. - Apr. 2013), PP 74-79 Web Data Extraction and Generating Mashup Achala Sharma 1, Aishwarya
More informationPart XII. Mapping XML to Databases. Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321
Part XII Mapping XML to Databases Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321 Outline of this part 1 Mapping XML to Databases Introduction 2 Relational Tree Encoding Dead Ends
More informationGoogle Marketing Boot Camp 3 Days
Google Marketing Boot Camp 3 Days Course Overview If your business is online, you need to know how to successfully implement and analyze Google-based Internet marketing campaigns. There are a large number
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 12 (Wrap-up) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
More informationDatabase Theory VU , SS Introduction: Relational Query Languages. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2018 1. Introduction: Relational Query Languages Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 6 March,
More informationFoundations of SPARQL Query Optimization
Foundations of SPARQL Query Optimization Michael Schmidt, Michael Meier, Georg Lausen Albert-Ludwigs-Universität Freiburg Database and Information Systems Group 13 th International Conference on Database
More informationIntroduction to Semistructured Data and XML
Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of Washington Database Management Systems, R. Ramakrishnan 1 How the Web is Today HTML documents often
More informationAutomata Theory for Reasoning about Actions
Automata Theory for Reasoning about Actions Eugenia Ternovskaia Department of Computer Science, University of Toronto Toronto, ON, Canada, M5S 3G4 eugenia@cs.toronto.edu Abstract In this paper, we show
More informationSensor Data Management
Wright State University CORE Scholar Kno.e.sis Publications The Ohio Center of Excellence in Knowledge- Enabled Computing (Kno.e.sis) 8-14-2007 Sensor Data Management Cory Andrew Henson Wright State University
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 1, 2017 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 12 (Wrap-up) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2457
More informationSemantic Web Systems Introduction Jacques Fleuriot School of Informatics
Semantic Web Systems Introduction Jacques Fleuriot School of Informatics 11 th January 2015 Semantic Web Systems: Introduction The World Wide Web 2 Requirements of the WWW l The internet already there
More informationWeb Scraping Framework based on Combining Tag and Value Similarity
www.ijcsi.org 118 Web Scraping Framework based on Combining Tag and Value Similarity Shridevi Swami 1, Pujashree Vidap 2 1 Department of Computer Engineering, Pune Institute of Computer Technology, University
More informationOntology based Model and Procedure Creation for Topic Analysis in Chinese Language
Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,
More informationTansu Alpcan C. Bauckhage S. Agarwal
1 / 16 C. Bauckhage S. Agarwal Deutsche Telekom Laboratories GBR 2007 2 / 16 Outline 3 / 16 Overview A novel expert peering system for community-based information exchange A graph-based scheme consisting
More informationSmart Browser: A framework for bringing intelligence into the browser
Smart Browser: A framework for bringing intelligence into the browser Demiao Lin, Jianming Jin, Yuhong Xiong HP Laboratories HPL-2010-1 Keyword(s): smart browser, Firefox extension, XML message, information
More informationExample project: Fedenet portal site
Example project: Fedenet portal site Hans C. Arents Office Future International Services Atlas Park, Weiveldlaan 41 B. 32, B-1930 Zaventem, Belgium Tel: +32 (0)2 725 40 25 -Fax: +32 (0)2 725 40 12 Email:
More informationProcessing Regular Path Queries Using Views or What Do We Need for Integrating Semistructured Data?
Processing Regular Path Queries Using Views or What Do We Need for Integrating Semistructured Data? Diego Calvanese University of Rome La Sapienza joint work with G. De Giacomo, M. Lenzerini, M.Y. Vardi
More information9. Parameter Treewidth
9. Parameter Treewidth COMP6741: Parameterized and Exact Computation Serge Gaspers Semester 2, 2017 Contents 1 Algorithms for trees 1 2 Tree decompositions 2 3 Monadic Second Order Logic 4 4 Dynamic Programming
More informationProbabilistic/Uncertain Data Management
Probabilistic/Uncertain Data Management 1. Dalvi, Suciu. Efficient query evaluation on probabilistic databases, VLDB Jrnl, 2004 2. Das Sarma et al. Working models for uncertain data, ICDE 2006. Slides
More informationIndependent consultant. Oracle ACE Director. Member of OakTable Network. Available for consulting In-house workshops. Performance Troubleshooting
Independent consultant Available for consulting In-house workshops Cost-Based Optimizer Performance By Design Performance Troubleshooting Oracle ACE Director Member of OakTable Network Optimizer Basics
More informationSkyscanner Ltd. A leading global travel search company, providing free search of flights, hotels and car hire around the world.
Founded in 2003 Head Office Countries with offices Edinburgh - UK Company profile Ten offices across the world: Barcelona, Beijing, Budapest, Edinburgh, Glasgow, London, Miami, Shenzhen, Singapore and
More informationUSING SCHEMA MATCHING IN DATA TRANSFORMATIONFOR WAREHOUSING WEB DATA Abdelmgeid A. Ali, Tarek A. Abdelrahman, Waleed M. Mohamed
230 USING SCHEMA MATCHING IN DATA TRANSFORMATIONFOR WAREHOUSING WEB DATA Abdelmgeid A. Ali, Tarek A. Abdelrahman, Waleed M. Mohamed Abstract: Data warehousing is one of the more powerful tools available
More informationAn introduction to web scraping, IT and Legal aspects
An introduction to web scraping, IT and Legal aspects ESTP course on Automated collection of online proces: sources, tools and methodological aspects Olav ten Bosch, Statistics Netherlands THE CONTRACTOR
More informationCSE 544 Data Models. Lecture #3. CSE544 - Spring,
CSE 544 Data Models Lecture #3 1 Announcements Project Form groups by Friday Start thinking about a topic (see new additions to the topic list) Next paper review: due on Monday Homework 1: due the following
More informationFormal Methods for XML: Algorithms & Complexity
Formal Methods for XML: Algorithms & Complexity S. Margherita di Pula September 2004 Thomas Schwentick Schwentick XML: Algorithms & Complexity Introduction 1 Disclaimer About this talk It will be Theory
More informationRange Restriction for General Formulas
Range Restriction for General Formulas 1 Range Restriction for General Formulas Stefan Brass Martin-Luther-Universität Halle-Wittenberg Germany Range Restriction for General Formulas 2 Motivation Deductive
More informationFoundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution
Foundations of AI 9. Predicate Logic Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution Wolfram Burgard, Andreas Karwath, Bernhard Nebel, and Martin Riedmiller 09/1 Contents Motivation
More informationOutline. Database Management Systems (DBMS) Database Management and Organization. IT420: Database Management and Organization
Outline IT420: Database Management and Organization Dr. Crăiniceanu Capt. Balazs www.cs.usna.edu/~adina/teaching/it420/spring2007 Class Survey Why Databases (DB)? A Problem DB Benefits In This Class? Admin
More informationIndependent consultant. Oracle ACE Director. Member of OakTable Network. Available for consulting In-house workshops. Performance Troubleshooting
Independent consultant Available for consulting In-house workshops Cost-Based Optimizer Performance By Design Performance Troubleshooting Oracle ACE Director Member of OakTable Network Optimizer Basics
More informationanalyzing the HTML source code of Web pages. However, HTML itself is still evolving (from version 2.0 to the current version 4.01, and version 5.
Automatic Wrapper Generation for Search Engines Based on Visual Representation G.V.Subba Rao, K.Ramesh Department of CS, KIET, Kakinada,JNTUK,A.P Assistant Professor, KIET, JNTUK, A.P, India. gvsr888@gmail.com
More informationLabeled graph homomorphism and first order logic inference
ECI 2013 Day 2 Labeled graph homomorphism and first order logic inference Madalina Croitoru University of Montpellier 2, France croitoru@lirmm.fr What is Knowledge Representation? Semantic Web Motivation
More informationIntranet Search. Exploiting Databases for Document Retrieval. Christoph Mangold Universität Stuttgart
Intranet Search Exploiting Databases for Document Retrieval Christoph Mangold Universität Stuttgart 2 /6 The Big Picture: Assume. there is a glueing problem with product P7 Has this happened before? Is
More information6+ years of experience in IT Industry, in analysis, design & development of data warehouses using traditional BI and self-service BI.
SUMMARY OF EXPERIENCE 6+ years of experience in IT Industry, in analysis, design & development of data warehouses using traditional BI and self-service BI. 1.6 Years of experience in Self-Service BI using
More informationConcur Online Booking Tool: Tips and Tricks. Table of Contents: Viewing Past and Upcoming Trips Cloning Trips and Creating Templates
Travel Office: Concur Resource Guides Concur Online Booking Tool: Tips and Tricks This document will highlight some tips and tricks users may take advantage of within the Concur Online Booking Tool. This
More informationDomain Specific Semantic Web Search Engine
Domain Specific Semantic Web Search Engine KONIDENA KRUPA MANI BALA 1, MADDUKURI SUSMITHA 2, GARRE SOWMYA 3, GARIKIPATI SIRISHA 4, PUPPALA POTHU RAJU 5 1,2,3,4 B.Tech, Computer Science, Vasireddy Venkatadri
More informationTrees, Automata and XML
A Mini Course on Trees, Automata and XML Paris June 2004 Thomas Schwentick PODS 2004 Thomas Schwentick Trees, Automata & XML 1 Intro Three Questions Question 1 Why XML? Answer Have a look into the 20 XML
More information