HTML Basics ENGS 4 -- Technology of Cyberspace January 8, 2004 Marion Bates mbates@ists.dartmouth.edu
A very brief history
ARPAnet (see timeline link from lecture 1)
Pre-WWW protocols: FTP, telnet, email, news, gopher
Still out there; e.g., Newswatcher
But, World Wide Web -- pictures etc. but also changed information processing: Hypertext.
Tim Berners-Lee!
"In 1980, I wrote a program...it allowed one to store snippets of information, and to link related pieces together in any way. To find information, one progressed via the links from one sheet to another, rather like in the old computer game 'adventure'...it was similar to the application Hypercard produced more recently by Apple for the Macintosh."
Before hypertext, everything was a tree view -- difficult to modify, and time-consuming to navigate and use. Now it's hard to imagine it being any other way.
DNS Crash Course
1. You type in www.amazon.com
2. Your computer asks its local network's nameserver (it knows that already, from its own network configuration) to tell it the IP for Amazon.
3. The local NS might already know it, because maybe another user asked for Amazon two minutes ago. But, if not...
4. ...then the local NS asks the root nameservers: "What NS knows about the Amazon.com domain?"
5. The root NS answers with something like "Here, ask this NS: 1.2.3.4" (it knows this from Amazon's original domain registration; that's what you do when you buy a dot-com). Strictly speaking there is one more hop in between: the root servers point to the .com servers, which hold that registration data. The idea is the same.
6. Your local NS then asks 1.2.3.4 "what's the IP for www?"
7. Amazon's NS then tells your local NS that www.amazon.com is at 1.2.3.8.
8. Your local NS gives that info to your computer, and remembers it for a while, in case you or another local user decide to go to Amazon again right away.
That whole process is just to turn the name into an IP address. NOW your computer has to figure out how to connect to that IP.
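The lookup steps above can be sketched as a toy model in Python. This is only an illustration, not how a real resolver is written: nothing touches the network, the tables hold the made-up addresses from the slides, and the names `ROOT_NS`, `AUTHORITATIVE`, and `LocalNameserver` are invented for the example. A real cache also forgets its answers after a while; this one never does.

```python
# Toy model of the lookup steps; no real network traffic is involved.
# All tables and addresses are the made-up ones from the slides.

ROOT_NS = {"amazon.com": "1.2.3.4"}                          # domain -> its nameserver
AUTHORITATIVE = {"1.2.3.4": {"www.amazon.com": "1.2.3.8"}}   # nameserver -> its records


class LocalNameserver:
    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0  # questions sent beyond the local NS

    def resolve(self, hostname):
        # Step 3: answer from the cache if someone asked recently.
        if hostname in self.cache:
            return self.cache[hostname]
        # Steps 4-5: ask the root which nameserver handles the domain.
        domain = ".".join(hostname.split(".")[-2:])
        ns_ip = ROOT_NS[domain]
        self.upstream_queries += 1
        # Steps 6-7: ask that nameserver for the host's address.
        ip = AUTHORITATIVE[ns_ip][hostname]
        self.upstream_queries += 1
        # Step 8: remember the answer for next time.
        self.cache[hostname] = ip
        return ip


local_ns = LocalNameserver()
first = local_ns.resolve("www.amazon.com")   # goes to the root, then Amazon's NS
second = local_ns.resolve("www.amazon.com")  # served from the cache
print(first, second, local_ns.upstream_queries)
```

The second call returns the same address without asking anyone upstream, which is the whole point of step 3.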
Routing
Routing is another can of worms. Basically, every router on the Internet knows 2 things about life: How to reach IP addresses on its own network, and who to ask when it wants to reach something on another network. There's more to it, like routers telling each other about the shortest or fastest ways to get from point A to B, and they can have multiple interfaces and load balance traffic between them, but we won't get into that.
Cooperation
Suffice it to say that the protocols are designed to work in a decentralized way, and each node asks its neighbor to pass the information along until it reaches its intended recipient. And, if the recipient can't be found, the information has a finite lifetime, so it can't bounce around the Internet forever. These processes -- DNS lookup, IP routing -- happen every time you connect to a website.
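The "finite lifetime" idea can be sketched in a few lines of Python. In IP this counter is the time-to-live (TTL) field, which each router decrements as it forwards a packet. The function name `forward` and the starting value of 64 are assumptions for the example, and real routers do a good deal more than this.

```python
def forward(ttl, hops_needed):
    """Pass a packet from router to router; each hop uses up one unit of lifetime.

    Returns the number of hops taken and whether the packet was delivered.
    """
    hops = 0
    while hops < hops_needed:
        if ttl == 0:
            return hops, False   # lifetime exhausted: the router drops the packet
        ttl -= 1                 # each router decrements the counter as it forwards
        hops += 1
    return hops, True


print(forward(ttl=64, hops_needed=12))    # a destination that can be reached
print(forward(ttl=64, hops_needed=1000))  # one that can't: dropped after 64 hops
```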
HyperText Transfer Protocol
From RFC 2616: "The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred."
What?
Okay...the key phrase here is "the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred." In other words, it's a standardized channel for browsers and servers to exchange data, without knowing or caring about the underlying server type or network infrastructure. That's really the heart of any protocol. And, the language used over this channel is...
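To make "typing and negotiation" a bit more concrete, here is a rough Python sketch of one exchange. Nothing is sent anywhere: the request and response are hand-written strings, the host www.example.com is a placeholder, and `parse_response` is a helper invented for the example rather than part of any library.

```python
# A hand-written request and a canned response; nothing goes over the network.
request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Accept: text/html\r\n"        # negotiation: the browser says what it can handle
    "\r\n"
)

response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"  # typing: the server labels what it is sending
    "\r\n"
    "<html><body>Hello</body></html>"
)


def parse_response(raw):
    """Split a raw HTTP response into status line, header dict, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


status, headers, body = parse_response(response)
print(status)
print(headers["content-type"])
```

The browser decides how to treat the body from the Content-Type label, not from anything it knows about the server.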
HyperText Markup Language
Or, HTML. HTML is metadata (data about data), delivered via HTTP, that tells your browser how to display a web page. This metadata consists of "tags" -- in most cases, one tag before and one after the thing being marked up. The tags indicate what sort of modification gets applied to whatever's between them.
How it starts
The first HTML tag you should know is, rather ironically, the tag which denotes the beginning of HTML. ;)
<html>
the browser knows to expect html in here....
</html>
HTML closing tags contain a forward slash (the one on the ? key). Note: Good editors have, among other things, syntax highlighting. Highly recommended.
It Needs A Head
After we start the page with <html>, we should define the <head>.
<head>
<title>this part's in the title bar.</title>
...
</head>
The HEAD can contain other stuff, including META tags (more on that later if we have time). Nothing in the HEAD actually appears in the web page.
It Needs A Body
NOW, finally, after closing the HEAD with </head>, we can start on the actual page content -- the BODY of the page.
...
</head>
<body>
text, pictures, movies, downloads, more HTML, etc.
</body>
</html>
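Putting the three pieces together, one way to double-check the nesting is to feed a minimal page to Python's built-in HTML parser and watch the order in which tags open and close. The page text and the `TagRecorder` class are made up for this sketch; a real browser is far more forgiving than this check suggests.

```python
from html.parser import HTMLParser

# A minimal page using only the tags covered so far (the text is invented).
PAGE = """<html>
<head>
<title>My first page</title>
</head>
<body>
Hello, world.
</body>
</html>"""


class TagRecorder(HTMLParser):
    """Record the order in which tags open and close."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)


recorder = TagRecorder()
recorder.feed(PAGE)
recorder.close()
print(recorder.opened)
print(recorder.closed)
```

The head closes before the body opens, and html is the first tag opened and the last one closed.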
How to upload
We'll pause briefly here to cover uploading to Dartmouth's webserver. First, if you haven't done this part already, fill out the form here: http://www.dartmouth.edu/~apply One note: Pick a new password that you don't use for anything else. Connecting to Webster sends your password in cleartext. Once you get a blitz saying your account's ready, connect to www.dartmouth.edu with the name and password you chose.
How to upload
On Windows, you can do this right from Explorer, like Prof. Cybenko did in class. Just make sure you log in with that username and password before you try to upload web files, because you can connect to Webster anonymously, but it won't let you upload. You can also use a separate FTP client program, which some users prefer, since it gives you more/easier access to all the FTP options. For Macs, the standard is and always has been Fetch, developed here at Dartmouth.
How to upload
Once you're connected, you will automatically be in your home directory on the webserver. In there, you should see a directory called "public_html". This is where you will upload pages to make them viewable on the Internet. The URL (web address) of your pages will be: http://www.dartmouth.edu/~username/page.html If you want people to be able to stop after the ~username part, name a page "index.html".
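The mapping from a file in public_html to its web address can be written out as a tiny Python function. This is just a sketch of the naming rule above: `page_url` is a made-up helper, and the trailing slash on the short form is an assumption about how the server presents it.

```python
BASE = "http://www.dartmouth.edu"


def page_url(username, path_in_public_html):
    """Return the web address for a file uploaded into public_html."""
    # index.html is what visitors get when they stop after ~username,
    # so its short address leaves the filename off.
    if path_in_public_html == "index.html":
        return f"{BASE}/~{username}/"
    return f"{BASE}/~{username}/{path_in_public_html}"


print(page_url("username", "page.html"))
print(page_url("username", "index.html"))
```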
How to upload
You can just drag-and-drop your html files (which need to end in ".html" or ".htm") into the public_html directory and they will then be viewable online. But, if you ever have trouble with images or other non-text files refusing to load, check the upload settings on your ftp program and/or check the filename suffix.
How to upload
Images need to end in .jpg or .gif or .png, and the suffix needs to match the image filetype. Text and HTML files should end in .txt and .html/.htm respectively. Images need to be uploaded as "raw data" or "binary", while html/text files need to go as "raw data" or "text" (depending on your ftp program).
How to upload
When you want multiple pages to link to each other, keep in mind their filenames and their locations relative to each other; if you start creating links, then later you move or rename the file those links point to, they'll break. This is another reason to use a good editor. BBEdit can do a search/replace on an entire folder of HTML documents. :)
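Here is a small Python sketch of both checks: picking a transfer mode from the suffix, and confirming that an image's first few bytes agree with its suffix. The function names are invented for the example, and the byte signatures are the standard opening bytes of JPEG, GIF, and PNG files. Your ftp program may label the modes differently.

```python
import os

TEXT_SUFFIXES = {".html", ".htm", ".txt"}
IMAGE_SUFFIXES = {".jpg", ".gif", ".png"}


def transfer_mode(filename):
    """Pick the FTP transfer mode from the filename suffix."""
    suffix = os.path.splitext(filename)[1].lower()
    if suffix in TEXT_SUFFIXES:
        return "text"
    if suffix in IMAGE_SUFFIXES:
        return "binary"
    raise ValueError(f"unexpected suffix: {suffix!r}")


def looks_like(data, suffix):
    """Check that a file's first bytes match what its suffix claims."""
    signatures = {
        ".jpg": (b"\xff\xd8\xff",),
        ".gif": (b"GIF87a", b"GIF89a"),
        ".png": (b"\x89PNG\r\n\x1a\n",),
    }
    return data.startswith(signatures[suffix])


print(transfer_mode("index.html"))
print(transfer_mode("Photo.JPG"))
print(looks_like(b"GIF89a" + b"\x00" * 10, ".gif"))
print(looks_like(b"GIF89a" + b"\x00" * 10, ".png"))  # a GIF renamed to .png
```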
DEMO
Sprucing it up
(Everything we're doing is inside the BODY). Most of the basic tags are fairly intuitive:
New paragraph <p>
New line <br>
Boldface <b>
Italic <i>
Underline <u>
Center <center>
Keep in mind that carriage returns are ignored by the browser (except when in <PRE> tags). It's as if your HTML code is all on one really long line.
Lots and lots of tags
Not gonna print 'em all here. There are lots of references on the web and zillions of books. So far, it's all about formatting text. What about HYPER text? We use the ANCHOR tag for this, and it's a little different from the ones we've done so far.
<a href="http://www.amazon.com">Books</a>
Anatomy of the anchor tag
We start the anchor tag logically enough:
<a
Then we tell the browser that this anchor will have a hypertext reference to another web page:
<a href="http://www.amazon.com">
We've completed the opening anchor tag. It has some data embedded in it, but otherwise it's a normal tag. Now...
Anatomy of the anchor tag
...it's time to put something inside the tags. Whatever we put in here, will appear on the web page as a clickable hypertext link.
<a href="http://www.amazon.com">Books
Close the tag, of course:
<a href="http://www.amazon.com">Books</a>
"Books" will link to www.amazon.com. "Books" is the anchor.
More on links
Absolute vs. relative links -- this will be clearer when you start making your own website.
Absolute:
<a href="http://www.asite.com/page.html">
Relative:
<a href="somepage.html">
<a href="pagesfolder/somepage.html">
<a href="./somepage.html">
<a href="../somepage.html">
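To see where each relative link ends up, Python's standard `urljoin` applies the same resolution rules a browser does. The page address below, including its docs folder, is an assumption made up for the example; the links are the ones from the slide.

```python
from urllib.parse import urljoin

# The page the links appear on (a made-up location for illustration).
current_page = "http://www.asite.com/docs/page.html"

links = [
    "somepage.html",
    "pagesfolder/somepage.html",
    "./somepage.html",
    "../somepage.html",
    "http://www.asite.com/page.html",
]

for href in links:
    print(href, "->", urljoin(current_page, href))
```

Relative links are worked out from the folder of the page they sit on, so moving that page changes where they point. The absolute link comes out unchanged no matter where the page lives.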
Images
Formats: jpg, gif, png, sometimes bmp.
<img src="http://www.asite.com/picture.jpg">
<img src="picture.jpg" border="0">
Note that there's no closing tag here -- we're sticking something new into the page, rather than formatting something that's already there. This isn't really a tag in that sense, but rather, it's a special kind of content. Don't forget the alt text (see example). Browsers built for users with disabilities, such as screen readers, rely on it.
Combo deal
<a href="http://www.amazon.com"> <img src="picture-of-book.gif"> </a>
What will this block of HTML do?
Colors
HTML colors are expressed as RGB values, in hex.
<body bgcolor="#ff0000">
Nice, huh? Let's take it apart. RGB = red, green, blue, which are the primary colors for computers. In the number above, the first pair of characters (after the #) represents the amount of red, the second pair represents green, and the third represents blue.
Colors
The amount (intensity) of each of the three colors is a hexadecimal (base 16) number between 00 (zero) and ff (one f = 15, so ff = 15*16 + 15 = 255). I like to think of ff as "full" -- the maximum amount of that color possible. So...what color would ff0000 give us? What about 00ff00? And ffffff? And, mathematically, how many web colors should be possible?
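If you'd rather let the computer do the hex arithmetic, a few lines of Python will take a color apart. The helper name `parse_color` is made up for this sketch. Each pair of hex digits runs from 0 to 255, which gives 256 possible levels per color.

```python
def parse_color(code):
    """Turn a six-digit hex color such as 'ff0000' into (red, green, blue)."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


print(int("ff", 16))           # 15*16 + 15
print(parse_color("#ff0000"))  # all red, no green, no blue
print(parse_color("00ff00"))
print(parse_color("ffffff"))
print(256 ** 3)                # 256 levels for each of three colors
```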
Colors
No normal human bothers to think in hex for doing web colors. Just Google for "html color chart". Here's a good one: http://www.islandnet.com/colors.html Also, newer browsers support stuff like <body bgcolor="red"> But to be fully compatible with older software, you should stick with the hex values. (Plus, it's cooler, in that twisted, geeky way.)
Special text
Headings: <h1>, <h2>, etc. -- not just aesthetic, also affect search engine placement.
Lists: <ul>, <ol>, <li>, <dl>
Blockquote: <blockquote> (indents the paragraph on both margins -- useful since there is no "tab" in html)
Tricky: How do you print a < in a page?
Entities!
If you just type a < sign into your code, it won't show up, plus it could mess up the rest of the page because the browser thinks it's part of a tag. Answer: &lt; Well, that's obvious, right? ;)
Entities!
There are lots of HTML entities, and they exist so you can include characters in a web page that would normally be a problem, either because they're part of HTML syntax (like <), or they're ignored (like runs of more than one space), or they're otherwise weird (like ü and å). Some commonly-used entities:
&lt; for <
&gt; for >
&amp; for &
&quot; for "
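You don't have to do these substitutions by hand. As one illustration, Python's standard `html` module converts in both directions; the sample sentence is made up.

```python
import html

# Text containing all four troublemakers: <, >, & and "
escaped = html.escape('if a < b & b > c, say "ok"')
print(escaped)

# Going the other way: entities back into the characters they stand for.
print(html.unescape("&uuml; &aring; &lt;"))
```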
TABLES
At first glance, HTML tables don't seem like anything really special. Basically, a table defines rows and columns, like in a spreadsheet:
<table>
<tr> <td>row 1, cell 1</td> <td>row 1, cell 2</td> </tr>
<tr> <td>row 2, cell 1</td> <td>row 2, cell 2</td> </tr>
</table>
Positioning via tables
But, tables are very flexible, and can be used in complex ways to achieve exact placement of content. (see examples) Cells can span several columns or several rows, rows/cells can be any size and have different colors, tables can have borders or not, sizes of rows/cells/tables can be expressed in absolutes (pixels) or relative (percent of browser window), etc. Tables are really what let web designers create fancy pages.
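As a rough sketch of table-based layout, this Python helper writes out a two-row table in which the top cell stretches across both columns using the `colspan` attribute. The function `build_table` and the cell labels are invented for the example. It only produces the HTML text; what it looks like is up to the browser.

```python
def build_table(rows, border=0):
    """Build an HTML table from rows of (text, colspan) cells."""
    lines = [f'<table border="{border}">']
    for row in rows:
        lines.append("<tr>")
        for text, colspan in row:
            # Only write the attribute when the cell covers more than one column.
            span = f' colspan="{colspan}"' if colspan > 1 else ""
            lines.append(f"<td{span}>{text}</td>")
        lines.append("</tr>")
    lines.append("</table>")
    return "\n".join(lines)


layout = build_table([
    [("page banner", 2)],                         # one cell across both columns
    [("navigation", 1), ("main content", 1)],     # two ordinary cells
])
print(layout)
```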
DEMO
Miscellaneous, if time allows
Steal This Code
mailto links, and why they're sorta bad
Resizing images in the browser, and why THAT's sorta bad
Image formats and optimization
Formatting and commenting code
Frames, style sheets, forms and CGI, other web "expansion modules"
Getting Help
Google for "html introduction" and the like
Dartmouth site has FAQs and tutorials
For specific questions or problems, join a discussion forum or list and ask
If you want a book, go with O'Reilly
To contact me: mbates@ists.dartmouth.edu, nu11dev1ce@aim, 646-0739@phone.