10. The World Wide Web (WWW)

The World Wide Web (WWW) is a hypertext system that is accessible via the Internet. (The WWW is only one way of using the Internet; others are e-mail, FTP, telnet, Internet telephony, ...)

Hypertext: pages of text containing hyperlinks (short: links) that refer to other pages (definition from www.wikipedia.org, the open WWW encyclopedia).

The link structure of the Web forms a very large graph; the following is a very small subgraph of it:

(figure: a small subgraph of the Web's link structure)
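As a minimal sketch of the graph view, the link structure can be stored as a directed graph in which each page maps to the set of pages it links to. The page names and both helper functions below are hypothetical placeholders chosen for illustration; they are not taken from the original figure.

```python
# Hypothetical mini-web: each key is a page, each value the set of pages it links to.
links = {
    "home.html": {"about.html", "news.html"},
    "about.html": {"home.html"},
    "news.html": {"home.html", "about.html"},
    "orphan.html": {"home.html"},
}

def incoming(page, graph):
    """Return the set of pages that contain a link to `page`."""
    return {src for src, targets in graph.items() if page in targets}

def reachable(start, graph):
    """Return all pages reachable from `start` by following links (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        page = stack.pop()
        if page not in seen:
            seen.add(page)
            stack.extend(graph.get(page, ()))
    return seen

print(incoming("home.html", links))
print(reachable("home.html", links))
```

In this toy graph no page links to orphan.html, so it cannot be reached by following links from home.html, although it links to home.html itself.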

The Web can be seen as a sort of database, but one that is very different from relational databases:
- highly distributed and decentralized;
- based on the hypertext model instead of the entity-relationship model;
- with only very weak standards restricting the form and content of the pages;
- very large (in 2001: more than 550 billion documents on the Web);
- without a universal query language. (Search engines try to compensate for this last point; see below.)

History of the WWW:
- Origin: a project at CERN (Geneva) in 1989 by Tim Berners-Lee and Robert Cailliau. It built on ENQUIRE, a system Berners-Lee had written at CERN in 1980 that already realized core ideas of the Web. The aim was to enable access to library information that was scattered over several different computers at CERN.
- November 12, 1990: proposal for the WWW published by Berners-Lee.
- November 13, 1990: first web page, on a NeXT workstation.
- Christmas 1990: Berners-Lee built the first web browser and the first web server.
- The idea of hypertext was older (Vannevar Bush, 1945).
- August 6, 1991: summary of the WWW project posted in an Internet newsgroup.
- April 30, 1993: CERN announced that the WWW would be free to anyone.
- 1993: the browser Mosaic (forerunner of Netscape) popularizes the WWW.

The three core standards of the Web:
- Uniform Resource Locator (URL): specifies how each page of information is given a unique address at which it can be found (e.g., http://en.wikipedia.org/wiki/world_wide_web).
- Hypertext Transfer Protocol (HTTP): specifies how the browser and the server send information to each other.
- Hypertext Markup Language (HTML): a web page description language used to encode the information so that it can be displayed on a variety of devices and under different operating systems.

Later extensions:
- Cascading Style Sheets (CSS): define the appearance of the elements of a web page, separating appearance from content.
- XML: a more general language than HTML, designed to enable a better separation of appearance and content; also applicable to other sorts of information.
- ECMAScript (the standardized language behind JavaScript and JScript): a programming language with commands for the browser. It enables programs (scripts) to be embedded in web pages, so that web pages can be changed dynamically.
- Hypertext Transfer Protocol Secure (HTTPS): an extension of HTTP in which the SSL protocol (today its successor, TLS) is invoked to encrypt the complete data transfer.
- Java applets: can be embedded in web pages and run on the computer of the Web user.

The World Wide Web Consortium (W3C) develops and maintains some of these standards (HTML, CSS) in order to enable computers to store and communicate different kinds of information effectively.
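To make the structure of a URL concrete, the following minimal sketch splits the example address from above into its parts using Python's standard urllib.parse module. The variable names are illustrative choices only.

```python
from urllib.parse import urlsplit

url = "http://en.wikipedia.org/wiki/world_wide_web"
parts = urlsplit(url)

scheme = parts.scheme    # protocol used to fetch the page
host = parts.hostname    # server that holds the page
path = parts.path        # location of the page on that server

print(scheme, host, path)
```

The scheme tells the browser which protocol to speak (here HTTP), the host name identifies the server, and the path identifies the page on that server.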

Problems with the Web:
- Highly decentralized, no control of the content: there is a lot of false and misleading information, hate campaigns, promotion of sexual exploitation and of other crimes, ...
- Highly dynamic: web pages change all the time! Links point to nowhere when the target page has been removed.
  When you give a web address in the References section of a scientific paper or in your thesis, you should add the date on which you visited that page!
  Archive of (a part of) the Web: http://www.archive.org. Lost web references can (in some cases) be reconstructed if the date is known.
- Highly chaotic: no global index or table of contents is available; searching for specific content is complicated and time-consuming.
  This led to the development of specialized search engines, the best-known one being Google (http://www.google.com).

How does a search engine work?

First component: a web crawler, which visits all accessible web pages worldwide, one after the other, by following the hyperlinks.
But: if this were done each time you look for a certain keyword, it would take much too long!

Second component: a large database, containing keywords and the web addresses where these keywords have already been found.
- The web crawler works in the background and only updates the database.
- When you invoke Google, you search in Google's database, not in the Web!
- Not all web pages can be found, because not all of them are in the database.

Usually, you get very many web pages containing a given keyword (often millions).
- First remedy: make more intelligent queries, e.g., by combining several keywords with "and", or by looking for phrases instead of keywords (use quotation marks). Google provides such facilities under "advanced search".
- Still, there are often too many results, so a prioritization of the pages found is necessary.
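The second component can be sketched as an inverted index: a mapping from each keyword to the set of addresses where it was found. This is only an illustration under simple assumptions; the page texts, the example.org addresses and the function names build_index and search are hypothetical, and a real search engine database is far more elaborate.

```python
from collections import defaultdict

# Hypothetical crawl result: web address -> page text.
pages = {
    "http://example.org/a": "the web is a hypertext system",
    "http://example.org/b": "a search engine indexes the web",
    "http://example.org/c": "hypertext pages contain links",
}

def build_index(pages):
    """Map each keyword to the set of addresses where it was found."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word].add(address)
    return index

def search(index, *keywords):
    """Return the addresses that contain all given keywords ("and" query)."""
    results = [index.get(word.lower(), set()) for word in keywords]
    return set.intersection(*results) if results else set()

index = build_index(pages)
print(search(index, "web"))
print(search(index, "web", "hypertext"))
```

A query only looks things up in the prebuilt index and never touches the pages themselves, which is why a page that has not been crawled cannot be found.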

Third component of the search engine (and the most valuable asset of the Google company): a ranking algorithm for the search results.

Basic principles of the Google ranking of web pages
(Attention: the exact algorithm changes continuously and is not published.)

"Importance" of a web page: defined recursively, using the hyperlink structure of the Web.
A page is the more important, the more important pages refer to it!

More precisely: let FLinks(A) be the set of all outgoing links (forward links) of a page A, and BLinks(A) the set of all incoming links (backward links) of A.
- A has a high page rank if the sum of the page ranks of the pages linking to it is high.
- A page B distributes its importance in equal parts to all pages to which it refers:

\[ PR(A) = c \sum_{B \in BLinks(A)} \frac{PR(B)}{|FLinks(B)|} \]

(c = normalisation factor)

Iterative determination of the page rank:
- Initially, arbitrary values are assigned to all web pages (typically, the constant value 1 is used).
- The calculation with the above formula is iterated for all pages until the values remain stable.
- The values converge to an eigenvector of the (normalized) adjacency matrix of the graph consisting of the web pages (nodes) and their links (edges). (Cf. chapter 8.4.)

In addition to the page rank, the Google ranking uses:
- the proximity of the given keywords to each other (in the text);
- the anchor texts of the links, i.e., the texts which can be clicked on. A page A gets higher importance when the anchor texts of links referring to A contain the keywords, too.
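The iteration can be sketched in a few lines. This is a simplified illustration, not Google's actual algorithm: it applies the rule "the rank of A is c times the sum, over all pages B linking to A, of the rank of B divided by the number of outgoing links of B", and chooses c so that the ranks always sum to the number of pages. The four-page graph, the function name page_rank and its parameters are assumptions made for this example, and the sketch assumes that every page has at least one outgoing link.

```python
def page_rank(flinks, iterations=100, tol=1e-9):
    """Iterate PR(A) = c * sum(PR(B) / |FLinks(B)| for B in BLinks(A))."""
    pages = list(flinks)
    rank = {p: 1.0 for p in pages}          # start with the constant value 1
    for _ in range(iterations):
        new = {p: 0.0 for p in pages}
        for b, targets in flinks.items():
            for a in targets:               # B passes equal shares to its targets
                new[a] += rank[b] / len(targets)
        c = len(pages) / sum(new.values())  # normalisation: ranks sum to len(pages)
        new = {p: c * v for p, v in new.items()}
        if max(abs(new[p] - rank[p]) for p in pages) < tol:
            return new                      # values have become stable
        rank = new
    return rank

# Hypothetical four-page web: page -> list of pages it links to.
flinks = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = page_rank(flinks)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this toy graph, page D ends up with rank 0 because no page links to it, while C, which is referred to by three pages, is ranked above B. The simple iteration does not converge for every graph; the published PageRank method adds a damping factor to handle such cases, which is omitted here.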

The underlying technology for the WWW: the Internet (short for "Interconnected Networks")

Predecessor (end of the 1960s): ARPANET (a U.S. military project), which was later used to connect universities and research labs.

The Internet today: a worldwide network of computer networks.
- Computers in this network communicate using the standardized TCP/IP protocol (Transmission Control Protocol / Internet Protocol: rules governing the communication).
- The information is transmitted in small portions (packets).
- For identification, each computer in the net has a unique number, the IP address.

IP address (IPv4): a 32-bit integer; for better readability it is usually split into 4 bytes, which are written as decimal integers separated by dots, e.g., 194.77.124.35. This gives more than 4 billion addresses.

To get identifiers that are easier to remember: the Domain Name System (DNS)
- a system of (textual) names, with an association between names and IP addresses;
- hierarchy: domains, subdomains, sub-subdomains, ..., e.g., www-gs.informatik.tu-cottbus.de (read from right to left!);
- top-level domains: country abbreviations and some others ("generics"): .de, .fr, .com, .edu, .gov, ...;
- lowest level: the host name of a single computer (here: www-gs, the web server of the graphics systems chair).
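The relation between the dotted notation and the 32-bit integer can be checked with a short sketch. The helper names dotted_to_int and int_to_dotted are illustrative; the cross-check uses Python's standard ipaddress module.

```python
import ipaddress

def dotted_to_int(dotted):
    """Combine the four bytes of a dotted-decimal address into one 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value

def int_to_dotted(value):
    """Split a 32-bit integer back into four decimal bytes."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

address = "194.77.124.35"          # example address from the text
as_int = dotted_to_int(address)

# The standard library performs the same conversion.
assert as_int == int(ipaddress.IPv4Address(address))
assert int_to_dotted(as_int) == address

print(as_int)                      # the address as a single integer
print(2 ** 32)                     # size of the 32-bit address space
```

Each of the four decimal numbers is one byte (0 to 255), so 32 bits allow 2^32 different addresses, which is the "more than 4 billion" mentioned above.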

A domain name corresponds to an IP address.
- The transformation of domain names into IP addresses and vice versa is the task of special computers, so-called nameservers.
- This transformation takes place whenever you click on a hyperlink on a web page!
- Each nameserver is responsible for a certain part of the hierarchical name space.
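The division of responsibility among nameservers can be imitated with a toy model in which each nested dictionary stands for a nameserver that knows only the next level of the hierarchy. This is a sketch under strong simplifications: the table ROOT, the function resolve and the address 192.0.2.10 are hypothetical placeholders (the address is from a range reserved for documentation, not the real address of the host).

```python
# Hypothetical name space: each nested dict plays the role of a nameserver
# that is responsible for one level of the hierarchy.
ROOT = {
    "de": {
        "tu-cottbus": {
            "informatik": {
                "www-gs": "192.0.2.10",   # placeholder address, not the real one
            },
        },
    },
}

def resolve(name, root=ROOT):
    """Resolve a domain name by walking its labels from right to left."""
    zone = root
    for label in reversed(name.lower().split(".")):
        if not isinstance(zone, dict) or label not in zone:
            raise KeyError(f"unknown name: {name}")
        zone = zone[label]            # hand over to the next, more specific level
    if isinstance(zone, dict):
        raise KeyError(f"{name} is a domain, not a host")
    return zone

ip = resolve("www-gs.informatik.tu-cottbus.de")
print(ip)
```

The lookup starts at the top-level domain (de) and descends one label at a time until it reaches the host name. In a real program the lookup is delegated to the operating system's resolver, for example via Python's socket.gethostbyname.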