Search Like a Pro
Nancy Warren
AkLA Conference 2010

How Search Engines Work
http://computer.howstuffworks.com/search-engine1.htm

How Search Engines Crawl a Web Site
[Figure: crawl patterns of Google, Yahoo and MSN, plus Google with a projection based on PageRank. Source: http://www.drunkmenworkhere.org/]

Index Freshness

Search Engine    Days to Re-Index*
MSN              21
Yahoo            21-28
Google           28

*Outlying pages are indexed less often

Cached Pages
[Figure: search result showing the snippet and the cached page]
Highlights your search terms
Date and time the web page was visited by the spider

How Search Engines Work
http://computer.howstuffworks.com/search-engine1.htm

Search Techniques - Keywords are King

Search Techniques - Word Order
T/F: Word order influences search results
TRUE

Search Techniques - Weighting Keywords

Search Techniques - Phrases
T/F: Quotation marks aren't necessary anymore
TRUE: most of the time
FALSE: still useful when searching proper nouns or phrases

Search Techniques - Boolean Syntax
T/F: Search engines support full Boolean syntax
FALSE

Search Techniques - site: command
T/F: Using the [site:] command lets you search a specific domain or web site
TRUE

Search Techniques - Stop Words
T/F: Search engines are now tolerant of stop words, even without quotation marks
TRUE

Search Techniques - Wildcards
T/F: Google, Bing & Yahoo all support the wildcard *
FALSE: Google and Yahoo both support the wildcard *, but in different ways

Engine           Wildcard meaning
Yahoo            * = one word
Google           * = one or many words
Google, Yahoo    ** = 2 words
Google, Yahoo    *** = 3 words

Examples:
1. nancy * warren
2. vice president of *
3. motion to * evidence
4. robins * * ciresi
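
The phrase, site: and wildcard techniques above can be combined in one query. The following is a minimal Python sketch, not part of the original presentation. The helper names build_query and search_url are hypothetical, example.org is a placeholder domain, and the search URL with its q parameter is assumed to behave like an ordinary Google search link. How an engine interprets the wildcard still depends on the engine.

```python
from urllib.parse import quote_plus


def build_query(keywords, phrase=None, site=None):
    """Join keywords, an optional exact phrase, and an optional site: restriction.

    Hypothetical helper for illustration only.
    """
    parts = list(keywords)
    if phrase:
        # Quotation marks keep a proper noun or phrase together.
        parts.append(f'"{phrase}"')
    if site:
        # site: limits results to one domain or web site.
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Return a search URL with the query URL-encoded (base URL is an assumption)."""
    return f"{base}?q={quote_plus(query)}"


# Wildcard example 1 from the slide, restricted to a placeholder domain.
wildcard_query = build_query(["nancy", "*", "warren"], site="example.org")
print(wildcard_query)

# A quoted phrase combined with a plain keyword.
phrase_query = build_query(["alaska"], phrase="bike trails")
print(phrase_query)
print(search_url(phrase_query))
```

Building the query as plain text first and encoding it only at the end keeps the operators readable and lets the same query be pasted into any search box.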

Search Techniques - filetype: command
T/F: Google, Bing and Yahoo all support limiting your search results by file type
TRUE

Search Techniques - Stemming
T/F: Stemming means that we don't need to worry about using different forms of a word in our search query
TRUE

Search Techniques - Location
T/F: Your location influences your search results
TRUE
Search: bike trails
Google #1
Yahoo #5-6
Bing #4

Search Techniques - the + operator
T/F: You can demand the inclusion of search terms using the + operator
TRUE

Search Engine Optimization

How Search Engines are Changing

Visible vs. Deep Web
1. Visible Web: Web pages that general-purpose search engines like Google, Yahoo and Bing can crawl, index and make available to us to search.
2. Visible Web, Social Content: The fastest-growing type of content. Specialty search engines are better at searching this less-structured and more ephemeral content.
3. Invisible Web: Web pages that are not publicly available; not crawled by general-purpose or specialty search engines.
4. Deep Web: Web pages that are not available through general-purpose search engines and that usually require a proprietary search engine. This area of the web is one of the most dynamic because sites continually filter up from the deep web to the visible web as the major search engines figure out how to crawl them or get access to them.

Partnering for Specialized Content
Google + Pandora, Rhapsody, imeem, and Lala

Partnering for Specialized Content
Bing + Wolfram Alpha

Partnering for Specialized Content
Bing + Twitter

Embedded Material in Results
Yahoo with Twitter, News, Photos, Videos

Embedded Material in Results
Google with Twitter
Google and Bing with statistics

Embedded Material in Results
Search: bike trails

Results Options

Keeping Up

Thank You

Search Like a Pro Resources
Nancy Warren / nlwarren2@yahoo.com
AkLA Conference 2010

Selected Sources for Monitoring Search Engines
Information Today Online
ReadWriteWeb
SearchEngineUpdate
SearchEngineWatch

Comparison Search Engines
Browsys <http://www.browsys.com/finder/>
Search3 <http://www.search3.com/>
Bing vs Google <http://www.bing-vs-google.com/>
Google vs Bing <http://www.blackdog.ie/google-bing/>
    This site also compares Google's and Bing's search engines for various European countries

Web Sites Mentioned in the Presentation (other than Google, Yahoo and Bing)
On Bots by Drunk Men Work Here <http://drunkmenworkhere.org/219>
    Large-scale experiment on search engine behavior
SearchCloud.net <http://searchcloud.net/>
    Allows you to weight your search terms
Wordle <http://www.wordle.net>
    Create your own word cloud art

Search Tips
1. To find a cached page on Google, add this in front of the URL of the page you want:
   http://www.google.com/search?source=ig&hl=en&rlz=&q=cache:[full URL]
   You can also use cache:[full URL]
2. To find pages recently added to Google's index, run a search and add the following text to the resulting URL:
   &tbs=qdr:x##&tbo=1
   Replace the x## with:
   s## for the number of seconds
   n## for the number of minutes (m## is for months)
   h## for the number of hours
   Be sure to replace the # signs with the number you want.

Search Engine Optimization Tips for building web pages
1. Make sure your important keywords are in the title of the page, within headers, and within the first sentence and paragraph
2. Make sure all images have good alt tags
3. Words used for links within the site are important
4. Check how often your terms appear in various forms
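
Some of the optimization tips above can be checked automatically. The following is a rough Python sketch, not part of the original handout, using only the standard library's html.parser. The class name SeoChecker, the function check_page and the sample page are all hypothetical. It covers the title and header check in tip 1, the alt text check in tip 2, and a simple count for tip 4. It does not look at the first sentence or at link text.

```python
from html.parser import HTMLParser


class SeoChecker(HTMLParser):
    """Collects the title, header text, all page text, and images without alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headers = []            # text of each h1/h2/h3 element
        self.text = []               # every text chunk seen on the page
        self.images_missing_alt = 0
        self._current = None         # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = tag
        elif tag in ("h1", "h2", "h3"):
            self._current = tag
            self.headers.append("")
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if not alt or not alt.strip():
                self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        self.text.append(data)
        if self._current == "title":
            self.title += data
        elif self._current is not None:
            self.headers[-1] += data


def check_page(html, keyword):
    """Report where a keyword appears and how many images lack alt text."""
    checker = SeoChecker()
    checker.feed(html)
    checker.close()
    kw = keyword.lower()
    return {
        "keyword_in_title": kw in checker.title.lower(),
        "keyword_in_headers": any(kw in h.lower() for h in checker.headers),
        "keyword_count": " ".join(checker.text).lower().count(kw),
        "images_missing_alt": checker.images_missing_alt,
    }


# Hypothetical sample page for illustration.
sample = """
<html><head><title>Bike Trails in Anchorage</title></head>
<body><h1>Best bike trails</h1>
<img src="map.png" alt="Trail map"><img src="logo.png">
</body></html>
"""
print(check_page(sample, "bike trails"))
```

The count matches only the exact keyword text, so checking "various forms" of a term as tip 4 suggests would mean calling check_page once per form.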

Which search engines save cached copies of pages?
A. All search engines
B. Only Google
C. Only search engines with proprietary crawlers and databases
D. Only search engines that get enough advertising revenue

How can a webmaster remove a cached page from a search engine's database of cached pages?
A. Change the content of the page, so that the next time the crawler visits the page, it will save the new version and write over the old version.
B. Include a robots.txt file with commands like noindex, telling the crawler not to index the page.
C. Put a redirect command on the page.
D. Call each search engine's customer support center and ask them to remove the cached page.
E. A, B and C.

T/F: Word order influences search results.
T/F: Quotation marks aren't necessary anymore.
T/F: Search engines support full Boolean syntax.
T/F: Search engines are now tolerant of stop words, even without quotation marks.
T/F: Using the [site:] command lets you search a specific domain or web site.
T/F: Google, Bing and Yahoo all support limiting your search results by file type.
T/F: Google, Bing and Yahoo all support the wildcard *.
T/F: Stemming gives you the same results regardless of whether the keywords are plural or not.
T/F: Your location influences your search results.
T/F: You can demand the inclusion of search terms using the + operator.