Lecture 2: Internet/WWW
Anders Ardö
EIT Electrical and Information Technology, Lund University
September 11, 2014

Course homepage
    http://www.eit.lth.se/kurs/edt632
    http://www.eit.lth.se/kurs/eeia01

World Wide Web - WWW
- Tim Berners-Lee, inventor of the WWW
- 1989: proposed a global hypertext space
  - any information on the network is referred to by a single Universal Document Identifier (UDI)
- Three constituents: HTML + URL + HTTP
  - HTML is an SGML-based mark-up language for hypertext
  - URL is a notation for locating files on servers (and more)
  - HTTP is a high-level protocol for file transfers (and more)
- 1994: World Wide Web Consortium (W3C)
  - a neutral forum for new standards and protocols
Uniform Resource Locator - URL
- A Web resource is located by a URL
    http://www.w3.org/tr/html4/def.html
    protocol: http    server: www.w3.org    path: /tr/html4/def.html
- Optionally a port on the server (default 80)
    http://www.w3.org:5234/tr/html4/def.html
    port: 5234
- Relative URLs
    sgml/dtd.html
- Fragment identifier
    http://www.w3.org/tr/html4/def.html#minitoc
- General form
    protocol://server[:port]/path[?parameters][#fragment]

Hyper Text Transfer Protocol - HTTP
- communication between a browser and a Web server
- HTTP uses TCP
- client-server model
  1 Client (browser): type a URL or click a link
  2 The server part of the URL is converted to an IP address
  3 A TCP connection is made via port 80, the default port for HTTP
  4 Client sends the HTTP request to the server
  5 Server handles the request (generates content)
  6 The HTTP response (including content) is sent by the server
  7 The TCP connection is closed (v1.0)
  8 Content is rendered on screen by the browser
- Stateless
- v1.0 - RFC 1945
- v1.1 - RFC 2616

Static vs Dynamic Pages
- Static - just copy a file from server to client
- Dynamic - do some data processing
- Parameters - Forms

The Design of HTML
- Simple, purist design principles
- HTML describes the logical structure of a document
- Browsers are free to interpret tags differently
- HTML is a lightweight file format
- Size of a file containing just "Hello World!":
    Format        Size
    Postscript    11274 bytes
    PDF            4915 bytes
    MS Word       19456 bytes
    HTML             28 bytes
- HTML 4.01 - http://www.w3.org/tr/html401/ (Recommendation)
- HTML5 - http://www.w3.org/tr/html5/ (Draft)
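The URL anatomy and step 4 of the HTTP exchange described in the slides above can be sketched in a few lines of Python. This is only an illustration using the standard library: the helper names split_url and build_get_request are made up for this example, the default port of 80 is assumed to apply because the scheme is http, and the request text is a bare minimum rather than what a real browser sends.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_HTTP_PORT = 80  # default port for HTTP, as stated on the slide


def split_url(url):
    """Split a URL into the parts named on the URL slide (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        # Assumption: an http URL without an explicit port uses port 80.
        "port": parts.port or DEFAULT_HTTP_PORT,
        "path": parts.path,
        "parameters": parts.query,
        "fragment": parts.fragment,
    }


def build_get_request(url):
    """Build the text of a minimal HTTP/1.0 GET request (hypothetical helper).

    The fragment is not part of the request; the browser uses it locally.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"


base = "http://www.w3.org/tr/html4/def.html"

# Absolute URL with an explicit port and a fragment identifier.
print(split_url("http://www.w3.org:5234/tr/html4/def.html#minitoc"))

# A relative URL is resolved against the URL of the current document.
print(urljoin(base, "sgml/dtd.html"))

# The request the client would send once the TCP connection is open.
print(build_get_request(base + "#minitoc"))
```

Nothing is sent over the network here; opening the TCP connection and reading the response (steps 3 and 5 to 7) are left out on purpose.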
Cascading Style Sheets (CSS)
- main idea: separate presentation from content
- allow enhancement of Web pages which would not be possible with HTML only
- manipulate text, add background images, colors, boxes, borders, margins in an easy way
- positioning and layout
  - in HTML: either through natural flow or by tables
  - in CSS: precise control, either relative or to absolute pixel precision

Lecture 2 agenda
Read:
- W3Schools: Semantic Web Tutorial
  http://www.w3schools.com/semweb/default.asp
- A. Rodriguez: RESTful Web services: The basics
  http://www.ibm.com/developerworks/webservices/library/ws-restful/
- D. Merrill: Mashups: The new breed of Web app
  http://www.ibm.com/developerworks/xml/library/x-mashups.html

Publication citation
- Example from EIT
    K. Golub, T. Hamon, A. Ardö: Automated Classification of Textual Documents Based on a Controlled Vocabulary in Engineering. Knowledge Organisation, Vol. 34, No. 4, pp. 247-263, 2007.
- Example from LU
    Koraljka Golub, Thierry Hamon, Anders Ardö
    Automated Classification of Textual Documents Based on a Controlled Vocabulary in Engineering
    Article in journal, Knowledge Organisation, 2007, 34, 4, 247-263
What is what - XML?
K. Golub, T. Hamon, A. Ardö: Automated Classification of Textual Documents Based on a Controlled Vocabulary in Engineering. Knowledge Organisation, Vol. 34, No. 4, pp. 247-263, 2007.

ex1.xml (the opening tag Author does not match its closing tag author):
    <?xml version="1.0" encoding="utf-8"?>
    <publication>
    <Author>Golub, Koraljka</author>
    <author>Ardö, Anders</author>
    <title>Automated Classification of...</title>
    <journal>Knowledge Organisation</journal>
    <pages>
      <start>247</start>
      <end>263</end>
    </pages>
    </publication>

    $ xmllint --noout ex1.xml
    ex1.xml:3: parser error : Opening and ending tag mismatch: Author line 3 and author
    <Author>Golub, Koraljka</author>
                                    ^

ex2.xml (corrected, xmllint reports nothing):
    <?xml version="1.0" encoding="utf-8"?>
    <publication>
    <author>Golub, Koraljka</author>
    <author>Ardö, Anders</author>
    <title>Automated Classification of...</title>
    <journal>Knowledge Organisation</journal>
    <pages><start>247</start><end>263</end></pages>
    </publication>

    $ xmllint --noout ex2.xml

XML = Structure - not semantics
- eXtensible Markup Language
- Google: about 1,440,000,000 results
- Roots in SGML
- Rules for how (new) markup formats can be defined
- Lots of standard tools
- Structure information
- Nothing about layout
- Just a tool - useful for computers!
- Semantics in the application profile

Tagging
- Only one root element
- Open/close tag: <book>...</book>
- Tree structured
- Must be correctly balanced - well formed
    WRONG:   <title>What is <author>Anders</title></author>
    CORRECT: <title>What is <author>Anders</author></title>
- Letter case matters; these are different elements:
    <title>What is</title>
    <Title>Where is</Title>
- Elements can have attributes:
    <author role="primary">Anders</author>
    <journal vol="34" no="4">Knowledge Organisation</journal>
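The well-formedness rules above (one root element, balanced tags, case-sensitive names) can also be checked programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree parser in place of xmllint; the helper name is_well_formed and the shortened sample documents are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the XML text parses, False otherwise (hypothetical helper)."""
    try:
        # Encode to bytes so the declared utf-8 encoding matches the input.
        ET.fromstring(document.encode("utf-8"))
    except ET.ParseError:
        return False
    return True


# Tag names are case sensitive: <Author> is not closed by </author>.
bad_case = "<publication><Author>Golub, Koraljka</author></publication>"

# Elements must be nested, not overlapping.
bad_nesting = "<title>What is <author>Anders</title></author>"

# Only one root element is allowed.
two_roots = "<title>What is</title><Title>Where is</Title>"

good = """<?xml version="1.0" encoding="utf-8"?>
<publication>
  <author role="primary">Ardö, Anders</author>
  <journal vol="34" no="4">Knowledge Organisation</journal>
</publication>"""

for name, doc in [("bad_case", bad_case), ("bad_nesting", bad_nesting),
                  ("two_roots", two_roots), ("good", good)]:
    print(name, is_well_formed(doc))

# Attributes are available as strings once the document has been parsed.
journal = ET.fromstring(good.encode("utf-8")).find("journal")
print(journal.get("vol"), journal.get("no"), journal.text)
```

A successful parse only says that the document is well formed; it says nothing about what the elements mean, which is the point of the "structure - not semantics" slide.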
Semantic Web
- The Semantic Web = a Web with a meaning
- The Semantic Web is a web that is able to describe things in a way that computers can understand.
- The Semantic Web is not about links between web pages.
- The Semantic Web describes the relationships between things (RDF).
- To use the Semantic Web, we will need "Semantic Web Agents" or "Semantic Web Services".

Semantic Web example
- at a working lunch: go to San Francisco for a customer
- Semantic Web agent:
  - books a non-stop flight to San Francisco, aisle seat if it's available
  - assigns the charges to the customer
  - info: you will miss a dentist appointment back home; adds a note to your calendar reminding you to reschedule
  - books a car service to the client's site
  - books you at your favorite hotel in San Francisco
  - updates your calendar and your manager's calendar

Linked data
- publishing structured data, interlinked and more useful
- Identification: HTTP URIs
- Information about: RDF/XML
- Relationships: links to related items
- Example "Linking Open Data":
  - goal: build a data commons
  - making various open data sources available on the Web as RDF
  - setting RDF links between data items
- http://linkeddata.org/
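The idea that RDF describes relationships between things identified by HTTP URIs can be illustrated with plain Python tuples. This is a toy sketch, not an RDF library: every statement is a (subject, predicate, object) triple, the http://example.org/ URIs are hypothetical identifiers invented for this example, and the choice of Dublin Core and FOAF property URIs as predicates is an assumption about how the publication from the XML slides might be described.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms vocabulary
FOAF = "http://xmlns.com/foaf/0.1/"    # FOAF vocabulary
EX = "http://example.org/"             # hypothetical namespace for this sketch

PUBLICATION = EX + "publication/1"

# Each statement is one (subject, predicate, object) triple.
triples = {
    (PUBLICATION, DCTERMS + "title",
     "Automated Classification of Textual Documents Based on a "
     "Controlled Vocabulary in Engineering"),
    (PUBLICATION, DCTERMS + "creator", EX + "person/golub"),
    (PUBLICATION, DCTERMS + "creator", EX + "person/hamon"),
    (PUBLICATION, DCTERMS + "creator", EX + "person/ardo"),
    (EX + "person/golub", FOAF + "name", "Koraljka Golub"),
    (EX + "person/hamon", FOAF + "name", "Thierry Hamon"),
    (EX + "person/ardo", FOAF + "name", "Anders Ardö"),
}


def objects(subject, predicate):
    """All objects of the triples that match the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def creator_names(publication):
    """Follow creator links to the persons, then read their names."""
    names = []
    for person in objects(publication, DCTERMS + "creator"):
        names.extend(objects(person, FOAF + "name"))
    return sorted(names)


print(creator_names(PUBLICATION))
```

Because the persons are identified by URIs rather than by name strings, other data sets could link to the same persons, which is the interlinking the Linked data slide refers to.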
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Semantic Web - a reality?
- Not now :-(
- But RDF, RSS, and Linked Data are!
- Read: T. Berners-Lee, Long Live the Web: A Call for Continued Open Standards and Neutrality, Scientific American, November 22, 2010.
  http://www.scientificamerican.com/article.cfm?id=long-live-the-web

Outline
1 Reiteration
2 Introduction XML
3 Semantic Web
4 Web services
5 Mashups
6
7 Privacy, Filter bubble

What is a Web Service?
- Web Service: software that makes services available on a network using technologies such as XML and HTTP
- Service-Oriented Architecture (SOA): development of applications from distributed collections of smaller, loosely coupled service providers
What do We Need?
- We already know how to
  - represent information with XML
  - communicate with HTTP
- Fault tolerance
- Intermediaries
- RPC
- Interface descriptions
- Locating services

Web Service Standards
- SOAP: a transport-neutral protocol for XML data interchange (but focusing on HTTP)
- WSDL: description of Web service interfaces
- UDDI: registries and discovery of Web services
- REST - Representational State Transfer

REST - Representational State Transfer
- client/server
- simple
- use HTTP (and other protocols)
- use HTTP methods explicitly
  - To create a resource on the server, use POST.
  - To retrieve a resource, use GET.
  - To change the state of a resource or to update it, use PUT.
  - To remove or delete a resource, use DELETE.
- stateless
- expose directory structure-like URIs
- transfer XML, JavaScript Object Notation (JSON), or both
- predominant model

REST - Example
Create a new user:
    POST /users HTTP/1.1
    Host: myserver
    Content-Type: application/xml

    <?xml version="1.0"?>
    <user>
      <name>robert</name>
    </user>

Retrieve information about a user:
    GET /users/robert HTTP/1.1
    Host: myserver
    Accept: application/xml
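The mapping from HTTP methods to operations on a resource can be sketched with Python's standard urllib.request module. The function names below are hypothetical, the host myserver is the placeholder from the slide, and the URIs used for PUT and DELETE are an assumption that follows the directory-like /users/robert pattern of the GET example. The requests are only built and inspected, not sent.

```python
from urllib.parse import quote
from urllib.request import Request
from xml.sax.saxutils import escape

BASE = "http://myserver"  # placeholder host taken from the slide


def user_xml(name):
    """XML representation of a user, as in the slide's POST body."""
    return f'<?xml version="1.0"?>\n<user>\n  <name>{escape(name)}</name>\n</user>\n'


def create_user(name):
    # POST to the collection URI creates a new resource.
    return Request(BASE + "/users", data=user_xml(name).encode("utf-8"),
                   headers={"Content-Type": "application/xml"}, method="POST")


def get_user(name):
    # GET retrieves a representation of the resource.
    return Request(f"{BASE}/users/{quote(name)}",
                   headers={"Accept": "application/xml"}, method="GET")


def update_user(name, new_name):
    # PUT replaces the state of an existing resource (URI is an assumption).
    return Request(f"{BASE}/users/{quote(name)}",
                   data=user_xml(new_name).encode("utf-8"),
                   headers={"Content-Type": "application/xml"}, method="PUT")


def delete_user(name):
    # DELETE removes the resource (URI is an assumption).
    return Request(f"{BASE}/users/{quote(name)}", method="DELETE")


for req in (create_user("robert"), get_user("robert"),
            update_user("robert", "bob"), delete_user("robert")):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to urllib.request.urlopen would send it, which only makes sense against a real server. Each request carries everything the server needs, in line with the stateless constraint on the slide.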
Mashups
- combine data from several sources to create new services
- main characteristics
  - combination
  - visualization
  - aggregation
- examples:
  - Weather, maps, cameras: http://www.vackertvader.se/
  - Health, maps: http://www.germtrax.com/map.aspx
  - Traffic, maps, tweets, blogs: http://roadskillmap.com/
  - Directory: http://www.programmableweb.com/mashups/directory

Mashup architecture
- Participants
  - Content provider(s)
  - Mashup site
  - Client Web browser
- Server side: CGI, PHP, ASP, Java servlets, ...
- Client side: client-side scripting (JavaScript), ...
- or both

Mashup server side
Mashup client side

JavaScript
- What is JavaScript?
  - an interpreted programming language
  - widely used and supported
  - simple and easy to use
  - embedded in HTML
  - run by the browser
- JavaScript use
  - add multimedia elements
  - create pages dynamically
  - validate user input in forms

JavaScript example
    <head>
    <script>
    function dispdate() {
      // Replace the text of the paragraph with id "demo" by the current date.
      document.getElementById("demo").innerHTML = Date();
    }
    </script>
    </head>
    <body>
    <h1>My First JavaScript</h1>
    <p id="demo">This is a paragraph.</p>
    <button type="button" onclick="dispdate()">Display Date</button>
    </body>
Images
    <img src="me.jpg" width="800" height="640" .../>

Sound (HTML5)
    <audio controls="controls">
      <source src="car.mp3" type="audio/mpeg" />
      Your browser does not support the audio tag.
    </audio>

Video (HTML5)
    <video width="320" height="240" controls="controls">
      <source src="train.mp4" type="video/mp4" />
      Your browser does not support the video tag.
    </video>

Filter bubble
- What do search engines or social sites know about me?
  - At least location, search history, click history, likes, and more...
- They personalize what's shown (search results, ...) using this info
- They show us what we want/like to see, chosen algorithmically...
- and not what's relevant (who decides that?)
- Problem?

Filter bubble example I
From http://www.thefilterbubble.com/what-is-the-internet-hiding-lets-find-out
Filter bubble example II
From http://www.thefilterbubble.com/what-is-the-internet-hiding-lets-find-out

ToS-DR
- Terms of Service; Didn't Read: http://tos-dr.info/
- Google: "you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content."
- Facebook: "you grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook (IP License)."

Privacy
- Search history, clicks, photos, documents, comments, ...
- lead to a profile
- that can be used for ads or sold, or even stolen
- which might lead to it ending up in unwanted places
- and being used against you
- Beware!
- Not to mention the NSA...