Web browser architecture
Web Oriented Technologies and Systems
Master's Degree Course in Computer Engineering (A.Y. 2017/2018)
What is a web browser?
A web browser is a program that retrieves documents from remote web servers and displays them on screen. A web browser:
o displays HTML pages either within the browser window itself or by passing the document to an external helper application
o allows particular resources to be requested explicitly by URI, or implicitly by following embedded hyperlinks
o keeps track of recently visited web pages and provides a mechanism for bookmarking pages of interest
o stores commonly entered form values as well as usernames and passwords
o provides accessibility features to accommodate users with disabilities such as blindness and low vision, hearing loss, and motor impairments
History and evolution (1/2)
o 1991: Tim Berners-Lee wrote the first web browser
   o it was graphical and also served as an HTML editor
o 1993: the National Center for Supercomputing Applications (NCSA) released a graphical web browser called Mosaic
   o allowed users to view images directly interspersed with text
   o NCSA founded an offshoot company called Spyglass to commercialize its technologies
   o Mosaic's primary developer left NCSA to cofound his own company, Netscape
o 1994: Berners-Lee founded the World Wide Web Consortium (W3C)
o 1995: Microsoft released Internet Explorer (IE), based on code licensed from Spyglass
   o a period of intense competition with Netscape followed, known as the browser wars
o 1998: Netscape released its browser as open source under the name Mozilla
History and evolution (2/2)
o Since 1998: several Mozilla variations have appeared, reusing the browser core but offering alternative design decisions for user-level features
   o Firefox is a standalone browser, eliminating Mozilla's integrated mail, news, and chat clients
   o Galeon is a browser for the GNOME desktop environment that integrates with other GNOME applications and technologies
o The open source Konqueror browser has also been reused
   o Apple has integrated Konqueror's core subsystems into its OS X web browser, Safari
   o Apple's modifications have in turn been reused by other browsers
o Internet Explorer's closed source engine has also seen reuse
   o Maxthon, Avant, and NetCaptor each add features to IE such as tabbed browsing and ad-blocking
o Although each browser engine typically produces a similar result, there can be differences in how web pages look and behave
   o Netscape 8, based on Firefox, allows the user to switch between IE-based rendering and Mozilla-based rendering on the fly
Web browser timeline (1/2)
Web browser timeline (2/2)
Browser fork
o In software engineering, a project fork happens when developers take a copy of the source code of one software package and start independent development on it, creating a distinct and separate piece of software
o The term often implies not merely a development branch, but a split in the developer community, a form of schism
Web browser architecture
Architecture of Firefox
Architecture of Chrome
Architecture of Internet Explorer
Architecture of Microsoft Edge
o Internet App Container (AC): hosts content from Internet sites
o Intranet AC: hosts content from Intranet sites
   o enterprise web sites
   o web sites that are control interfaces for devices on your home network, such as your Wi-Fi router or IoT devices
o Service UI AC: hosts special web pages, such as about:flags, and the default home page
User interface (UI)
o The window frame of the web browser
o Features
   o includes title bar and sizing borders
   o contains tab strip, toolbars, bookmarks bar
   o handles visual page-load progress, downloads, preferences, printing
o Latest innovations
   o tab navigation
   o extensions: allow developers to add functionality to the browser and enhance the UI [introduced by Firefox]
   o custom search bar
   o multi-touch gesture interaction
o The UI is not defined in any formal specification
   o it comes from good practices shaped over years of experience and by browsers imitating each other
Data persistence
o Manages user data
o Features
   o stores various data associated with the browsing session on disk
   o high-level data (bookmarks, toolbar settings)
   o low-level data (cookies, security certificates, cache)
o Latest innovations
   o Embedded database (SQLite)
      o the HTML5 effort defined a web database (Web SQL Database, on which the W3C has since stopped work), a complete although light database in the browser
   o Cloud storage synchronization
      o makes bookmarks, history, passwords, form-fill data and open tabs accessible from other browser instances running on other computers and mobile phones
      o all synced data can be encrypted; nobody can read a user's encrypted data without knowing the passphrase
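As a rough illustration of the embedded-database approach, the sketch below stores bookmarks in SQLite through Python's standard sqlite3 module. The table name, the columns and the helper functions are hypothetical and chosen only for this example; real browsers use their own, far richer schemas.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a bookmark store; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn

def add_bookmark(conn, url, title):
    # INSERT OR REPLACE keeps one row per URL, so re-bookmarking updates the title.
    conn.execute(
        "INSERT OR REPLACE INTO bookmarks (url, title) VALUES (?, ?)",
        (url, title),
    )
    conn.commit()

def list_bookmarks(conn):
    """Return all bookmarks as (url, title) tuples, sorted by URL."""
    rows = conn.execute("SELECT url, title FROM bookmarks ORDER BY url")
    return rows.fetchall()

store = open_store()  # in-memory database; pass a file path to persist on disk
add_bookmark(store, "https://www.w3.org/", "W3C")
add_bookmark(store, "https://www.mozilla.org/", "Mozilla")
print(list_bookmarks(store))
```

Passing a file path instead of ":memory:" keeps the data across sessions, which is the point of the data persistence subsystem.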
Browser engine
o The controller that marshals actions between the UI and the rendering engine
o Features
   o provides methods to initiate the loading of a URL and other high-level browsing actions (reload, back, forward)
   o provides the UI with various messages relating to errors and loading progress
   o allows the querying and manipulation of rendering engine settings
   o manages plugins: third-party libraries that can be embedded inside a web page
      o a plugin affects only the specific page in which it is used
      o examples of common plugins:
         o Macromedia Flash
         o Microsoft Silverlight
         o Apple QuickTime
         o Adobe Reader
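The high-level browsing actions (load, reload, back, forward) can be sketched as a small controller that keeps a history list and delegates drawing to a callback. The class name, its methods and the example URLs are assumptions made for illustration, not the API of any real browser engine.

```python
class BrowserEngine:
    """Toy controller between the UI and a rendering engine (hypothetical API)."""

    def __init__(self, render):
        self._render = render   # callback standing in for the rendering engine
        self._history = []      # visited URLs, oldest first
        self._index = -1        # position of the current page in _history

    def load(self, url):
        # Loading a new URL discards any "forward" entries.
        self._history = self._history[: self._index + 1]
        self._history.append(url)
        self._index += 1
        self._render(url)

    def reload(self):
        if self._index >= 0:
            self._render(self._history[self._index])

    def back(self):
        if self._index > 0:
            self._index -= 1
            self._render(self._history[self._index])

    def forward(self):
        if self._index < len(self._history) - 1:
            self._index += 1
            self._render(self._history[self._index])

    @property
    def current(self):
        return self._history[self._index] if self._index >= 0 else None

rendered = []                          # records every URL handed to the renderer
engine = BrowserEngine(rendered.append)
engine.load("http://a.example/")
engine.load("http://b.example/")
engine.back()
engine.load("http://c.example/")       # drops b.example from the forward history
print(engine.current, rendered)
```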
Rendering engine
o Produces the visual representation of a given URL
o Features
   o interprets the HTML and CSS
   o calculates the exact page layout and may use reflow algorithms to incrementally adjust the position of elements on the page
o Latest innovations
   o multi-process architecture
      o runs multiple instances of the rendering engine in separate processes (one for each tab)
      o protects the overall application from bugs and glitches in the rendering engine
Rendering engine frameworks
Engine    Browser                               Open source
Blink     Chrome 28+, Opera 15+                 Y
Gecko     Firefox                               Y
KHTML     Konqueror                             Y
Presto    Opera 14-                             N
Trident   Internet Explorer                     N
WebKit    Safari, Chrome 27-, Android browser   Y
Rendering engine components
o HTML Parser
   o parses the HTML document and converts elements to DOM nodes in a tree called the DOM tree
o CSS Parser
   o parses the style data, both in external CSS files and in style elements
   o the style information, together with visual instructions in the HTML, is used to build the render tree
   o CSS has a context-free grammar and can be parsed with parser generators such as flex and bison; the CSS specification defines the CSS lexical and syntax grammar
o Layout
   o the render tree goes through a layout process
   o each node is given the exact coordinates where it should appear on the screen
o Painting
   o the render tree is traversed and each node is painted using the UI backend layer
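The HTML parsing step can be illustrated with Python's standard html.parser module. This is a minimal sketch that builds a simplified DOM-like tree: the Node and TreeBuilder classes are hypothetical, and the code assumes well-formed input without void elements such as <br>, whereas real browser parsers recover from malformed markup.

```python
from html.parser import HTMLParser

class Node:
    """Simplified DOM node: a tag name, a parent and a list of children."""
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""

class TreeBuilder(HTMLParser):
    """Builds a DOM-like tree; no error recovery, unlike real browsers."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self._current)
        self._current.children.append(node)
        self._current = node

    def handle_endtag(self, tag):
        # Assumes well-formed input: every end tag closes the current node.
        if self._current.parent is not None:
            self._current = self._current.parent

    def handle_data(self, data):
        self._current.text += data.strip()

def dump(node, depth=0):
    """Return the tree as indented lines, one tag per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines

builder = TreeBuilder()
builder.feed("<html><body><h1>Title</h1><p>Hello <b>web</b></p></body></html>")
print("\n".join(dump(builder.root)))
```

A layout pass would then walk such a tree, combined with style data, to assign coordinates to each node.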
Rendering engine basic flow
DOM Tree
CSSOM Tree
Render tree
Rendering timeline
[Figure: rendering timelines for three cases: HTML, HTML+CSS, HTML+CSS+JS]
Gecko rendering engine
o Style system
   o contains the CSS Parser and is responsible for getting the CSS data from Necko (Mozilla's networking library)
o Image Library
   o interacts with Necko to retrieve image data
o Content Model
   o interacts with the various components of Gecko
o Frame Constructor
   o pieces together all the information before sending the rendered web page to the display backend
WebKit rendering engine
o WebKit embedding API
   o interface between the rendering engine and the browser UI
o WebCore
   o application logic: loading, parsing, layout, style resolution, painting, event handling, editing, JavaScript bindings
o JSCore (JavaScript engine)
   o V8 or JavaScriptCore
Ports of WebKit
             Chrome (OS X)           Safari (OS X)   QtWebKit                          Android Browser                   Chrome for iOS
Rendering    Skia                    CoreGraphics    QtGui                             Android stack/Skia                CoreGraphics
Networking   Chromium network stack  CFNetwork       QtNetwork                         Fork of Chromium's network stack  Chromium stack
Fonts        CoreText via Skia       CoreText        Qt internals                      Android stack                     CoreText
JavaScript   V8                      JavaScriptCore  JSC (V8 is used elsewhere in Qt)  V8                                JavaScriptCore (without JITting) *
Blink
o Blink is a fork of WebKit
o developed as part of the Chromium project by Google
o used in
   o Chrome (28+)
   o Opera (15+)
   o Amazon Silk
   o Android WebView (4.4+)
   o Qt WebEngine
Networking and XML parser
Networking
o Features
   o provides functionality to handle URLs using file transfer protocols such as HTTP and FTP
   o translates between different character sets, and resolves MIME media types for files
   o may implement a cache of recently retrieved resources to minimize network traffic
XML Parser
o Features
   o parses XML documents into a Document Object Model (DOM) tree
   o manages XML data exchanged between the browser and the server using the AJAX paradigm
   o almost all browser implementations leverage an existing XML parser rather than creating their own from scratch
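The two subsystems can be illustrated with Python's standard library, without any network access. The sketch splits a URL, guesses a MIME type from the file extension, and parses a small XML payload into a DOM tree. The describe helper, the example URL and the XML content are hypothetical; real browsers rely mainly on the Content-Type header and content sniffing rather than on file extensions.

```python
import mimetypes
from urllib.parse import urlsplit
from xml.dom.minidom import parseString

def describe(url):
    """Split a URL and guess the MIME type from its path (no network access)."""
    parts = urlsplit(url)
    mime, _encoding = mimetypes.guess_type(parts.path)
    return parts.scheme, parts.hostname, mime

print(describe("http://www.example.org/docs/index.html"))

# Parse a small XML payload, such as an AJAX response, into a DOM tree.
doc = parseString(
    "<items><item id='1'>first</item><item id='2'>second</item></items>"
)
titles = [el.firstChild.data for el in doc.getElementsByTagName("item")]
ids = [el.getAttribute("id") for el in doc.getElementsByTagName("item")]
print(titles, ids)
```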
JavaScript interpreter (1/2)
Executes the JavaScript code that is embedded in an HTML page
o Features
   o allows DOM manipulation
   o certain JavaScript functionality, such as the opening of pop-up windows, may be disabled by the browser engine or rendering engine for security purposes
o Latest innovations
   o the principal problem with the classic architecture is that runtime bytecode interpretation is slow
   o JIT (Just-In-Time) compiler
      o compiles parts of the code into machine code
      o the ambitious objective is to run JavaScript code as fast as native C code
      o JIT compilers come in a variety of categories, each with its own optimization strategies
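The difference between interpreting bytecode and compiling it once can be sketched with a toy stack machine. The opcodes and both functions are invented for this example, and the "compiled" form is a Python function used as a stand-in for machine code. A real JIT emits native code for frequently executed paths and must also cope with JavaScript's dynamic types.

```python
def interpret(program, x):
    """Classic loop: decode and dispatch every instruction on every run."""
    stack = []
    for op, arg in program:
        if op == "PUSH_ARG":
            stack.append(x)
        elif op == "PUSH_CONST":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()

def compile_program(program):
    """Translate the bytecode once into a Python function (stand-in for machine code)."""
    stack = []  # holds source-code fragments instead of runtime values
    for op, arg in program:
        if op == "PUSH_ARG":
            stack.append("x")
        elif op == "PUSH_CONST":
            stack.append(repr(arg))
        elif op in ("ADD", "MUL"):
            b, a = stack.pop(), stack.pop()
            symbol = "+" if op == "ADD" else "*"
            stack.append(f"({a} {symbol} {b})")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    # The dispatch cost is paid here once, not on every call.
    return eval(f"lambda x: {stack.pop()}")

# Bytecode for the expression (x + 2) * 3
program = [("PUSH_ARG", None), ("PUSH_CONST", 2), ("ADD", None),
           ("PUSH_CONST", 3), ("MUL", None)]
compiled = compile_program(program)
print(interpret(program, 5), compiled(5))
```

Both paths compute the same value; the compiled function simply skips the per-instruction decoding on each call.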
JavaScript interpreter (2/2)
JavaScript interpreter engines
Engine         Browser                   Open source
Carakan        Opera                     N
Chakra         Internet Explorer 9+      N
Nitro          Safari                    Y
SpiderMonkey   Firefox                   Y
V8             Chrome, Android Browser   Y
Display backend
Provides drawing and windowing primitives, a set of user interface widgets, and a set of fonts
o Features
   o may be tied closely with the operating system
o Latest innovations
   o Hardware acceleration
      o the render tree consists of render objects, the elements to be rendered on the page
      o each render object is assigned to a graphic layer
      o each layer is uploaded to the GPU as a texture
      o the layer may be transformed in the GPU without repainting, as in the case of 3D graphics
   o Memory issue related to hardware acceleration
      o loading too many textures to the GPU may cause memory issues
      o this is really critical on mobile devices and can even crash a mobile browser
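A back-of-the-envelope estimate shows why too many layers become a memory problem. The sketch assumes uncompressed RGBA textures at 4 bytes per pixel; the layer sizes and the 64 MiB budget are hypothetical figures chosen only for illustration.

```python
def texture_bytes(width, height, bytes_per_pixel=4):
    """Memory for one uncompressed layer texture (RGBA, 8 bits per channel assumed)."""
    return width * height * bytes_per_pixel

def layers_fit(layers, budget_bytes):
    """Return the total texture memory and whether it fits in the given budget."""
    total = sum(texture_bytes(w, h) for w, h in layers)
    return total, total <= budget_bytes

# Hypothetical page: ten full-screen 1080x1920 layers against a 64 MiB budget.
total, fits = layers_fit([(1080, 1920)] * 10, 64 * 1024 * 1024)
print(total, fits)
```

Under these assumptions a single full-screen layer takes about 8 MB, so ten of them already exceed the budget.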
Proxy-based browser
o Proxy-based web browsers reduce bandwidth usage by compressing the resources of the rendered page on a proxy server (usually the browser vendor's) before sending them to the client browser
o To make a request
   1. the client requests a page from the proxy server
   2. the proxy server requests the page from the origin server
   3. the proxy server renders the page, runs JavaScript, and compresses the page
   4. the proxy server sends the rendered page to the client
   5. client interaction with the page is sent to the proxy server, which forwards it on to the origin server
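The request flow can be simulated in a single process. In this sketch the origin server is a dictionary, "rendering" is a placeholder string transformation, and compression uses zlib. All names, the URL and the page content are hypothetical, and commercial proxy browsers use their own formats.

```python
import zlib

# Stand-in for origin servers: URL -> page source (hypothetical content).
ORIGIN = {
    "http://example.org/": "<html><body>" + "<p>hello</p>" * 200 + "</body></html>"
}

def origin_fetch(url):
    """Origin server: returns the page source for a URL."""
    return ORIGIN[url]

def proxy_handle(url):
    """Steps 2 to 4: fetch from the origin, 'render', compress, return to the client."""
    page = origin_fetch(url)
    # Placeholder for real rendering and script execution on the proxy.
    rendered = page.replace("<p>", "").replace("</p>", "\n")
    return zlib.compress(rendered.encode("utf-8"))

def client_request(url):
    """Steps 1 and 4 on the client side: ask the proxy, then decompress the reply."""
    payload = proxy_handle(url)
    return len(payload), zlib.decompress(payload).decode("utf-8")

sent_bytes, page = client_request("http://example.org/")
print(sent_bytes, len(ORIGIN["http://example.org/"]))
```

The client receives far fewer bytes than the original page size, which is where the bandwidth saving comes from.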
Proxy-based browser
o Standard web browser
   o Total loading time of a web page (turnaround time)
   o $T = L_{cs} + T_{cs} + S + C$
      o $L_{cs}$: latency of the client-source server connection
      o $T_{cs}$: data transfer time
      o $S$: server processing time
      o $C$: client processing time (parsing and rendering)
o Proxy-based browser
   o $T' = L_{cp} + L_{ps} + T_{cp} + T_{ps} + S' + C'$
      o $L_{cp}$, $L_{ps}$: client-proxy and proxy-source server latency
      o $T_{cp}$, $T_{ps}$: client-proxy and proxy-source server data transfer time
      o $S'$, $C'$: server and client processing time when going through the proxy
o A proxy-based browser aims to minimize the turnaround time $T'$
   o $L_{cp}, L_{ps} \ll L_{cs}$: the vendor offers geographically distributed proxy servers
   o $T_{ps} \ll T_{cs}$: the proxy server uses caching and has large bandwidth
   o $T_{cp} \ll T_{cs}$: the proxy server performs data compression
   o $C' < C$: the proxy server encodes the page in a more rapidly processable format
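The two turnaround formulas can be compared numerically. The values below are hypothetical, in milliseconds, and were picked only to satisfy the inequalities listed on the slide; they are not measurements.

```python
def direct_time(l_cs, t_cs, s, c):
    """Standard browser: T = L_cs + T_cs + S + C."""
    return l_cs + t_cs + s + c

def proxied_time(l_cp, l_ps, t_cp, t_ps, s_p, c_p):
    """Proxy-based browser: T' = L_cp + L_ps + T_cp + T_ps + S' + C'."""
    return l_cp + l_ps + t_cp + t_ps + s_p + c_p

# Hypothetical values in milliseconds, chosen only to illustrate the inequalities.
t = direct_time(l_cs=200, t_cs=800, s=100, c=300)
t_proxy = proxied_time(l_cp=40, l_ps=20, t_cp=250, t_ps=50, s_p=150, c_p=100)
print(t, t_proxy)
```

With these numbers the proxy path wins even though S' is larger than S, because the latency, transfer and client processing terms shrink by more than the extra server-side work.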