
Jacob Wenger
Programming Paradigms
Thursday, May 12, 2011

Six Degrees of Wikipedia

Abstract: I created an application which allows you to find the shortest paths between any two Wikipedia articles. Paths, in this sense, follow along the hyperlinks found on every Wikipedia page. Data about Wikipedia was found online in the form of queries returning every row in various tables from the Wikipedia database. I downloaded these files, parsed them, sorted them, and stored them in a local SQLite database. I created a simple, intuitive front-end graphical user interface (GUI) using PyQt. The GUI requires the user to provide only the starting and ending article names and click a large button to find the paths between them. Additional features which determine if the current query is an actual article and which provide suggested article names are also provided by querying the local database. Aspects of functional programming (e.g. list comprehensions) and regular expressions (e.g. command-line utility sed and querying the database for suggested articles) were used in this project. Although Python was the predominant language used, a shell script which retrieved and parsed the online data was also created.
Key words:
Wikipedia - a non-profit encyclopedia project available online and written collaboratively by volunteers from around the world
Page - a Wikipedia article
Link - a hyperlink on a page which leads to another page
Redirect - a page which does not exist but redirects to an existing page; for example, Airplane is a redirect to Fixed-wing aircraft and Gandhi redirects to Mohandas Karamchand Gandhi
Breadth first search - a graph search algorithm that begins at the root node and explores all of its neighboring nodes; then, for each of those nodes, it explores their unexplored neighbors; this process continues until the end node is found or until all nodes are explored
SQLite - an embedded relational database management system contained in a relatively small C programming library; it is essentially a less-powerful version of an SQL database
PyQt - a Python binding of the cross-platform graphical user interface (GUI) toolkit Qt

Introduction: The articles which comprise Wikipedia [5][6] represent, in some sense, the collective knowledge of humanity. In only ten short years since its founding in 2001, Wikipedia has already amassed over 3 million articles, a number which rises every day. I was interested in seeing how closely this knowledge is tied together. Concepts such as six degrees of separation, the idea that every person is connected to every other person through a chain of at most six acquaintances, further piqued my interest. As a result, I decided to create an application called Six Degrees of Wikipedia (SDOW for short) which determines the shortest path from one Wikipedia article to another by following the hyperlinks found on the Wikipedia pages themselves.

To my surprise, this problem had not previously been sufficiently solved. Through my research, I discovered that only one person, namely Stephen Dolan, had even attempted to complete a similar project [3]. However, his implementation was simple, buggy, and sometimes incorrect (as I can now prove) and thus was not satisfactory. Besides him, I saw no mention of anyone else who had attempted to tackle the problem of finding the shortest path between two Wikipedia articles. This was an exciting discovery since it meant I was the first person to provide a suitable solution to the problem I posed.

This project began by obtaining data on Wikipedia articles from an online dump provided by Wikipedia itself [1]. This information was parsed and stored in a local SQLite database. A breadth-first search algorithm was implemented in Python and made use of the local database to find the paths between two Wikipedia articles. Finally, a graphical user interface (GUI) was created in PyQt to make the experience for users quick and intuitive.

Usage examples: To open the GUI for Six Degrees of Wikipedia, simply type the following command:

    python ./sixdegreesofwikipedia.py

The GUI is very simple and intuitive. There are two line edits which represent the starting and ending Wikipedia article names. There is also a button titled Find Wikipedia Links! which finds the paths between the two articles when pressed. Finally, there is a search log at the bottom which keeps track of the recent search history. The GUI is shown on the following page.

If you want to find the shortest path between the Wikipedia articles Search-based application and Wikipedia, you can start by typing Search-based application into the starting article line edit. As you type, you will notice that the color of the text changes. The text will be one of three colors: (1) green when it corresponds to an actual Wikipedia article, (2) yellow when it corresponds to a Wikipedia redirect, and (3) red in every other case. Another feature you will notice is suggested search terms, similar to those found in Google. The suggestions are filtered in order to provide you with the pages you most likely care about. Together, these two features tell the user whether the article they are searching for exists and save them time by auto-completing their query through suggestions. When Search is typed in the image below, it is colored yellow to signify that it is a redirect. In addition, suggested articles, including Search-based application, are given.
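The color coding and suggestions described above can be sketched against a small SQLite database. This is a minimal sketch, not the application's actual code: the table and column names (pages, redirects, hits), the helper names classify_title and suggest, and the sample rows are all assumptions made for illustration, and a plain LIKE prefix match is used here as a stand-in for whatever query the application really runs.

```python
import sqlite3

# Hypothetical schema and sample rows; the real SDOW tables may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE redirects (name TEXT PRIMARY KEY, target_id INTEGER);
    CREATE TABLE hits (page_id INTEGER PRIMARY KEY, count INTEGER);
    INSERT INTO pages VALUES (1, 'Search-based application'),
                             (2, 'Search engine'),
                             (3, 'Search algorithm'),
                             (4, 'Wikipedia');
    INSERT INTO redirects VALUES ('Search', 3);
    INSERT INTO hits VALUES (1, 40), (2, 900), (3, 300), (4, 5000);
""")


def classify_title(conn, title):
    """Return 'green' for an article, 'yellow' for a redirect, else 'red'."""
    if conn.execute("SELECT 1 FROM pages WHERE name = ?", (title,)).fetchone():
        return "green"
    if conn.execute("SELECT 1 FROM redirects WHERE name = ?", (title,)).fetchone():
        return "yellow"
    return "red"


def suggest(conn, prefix, limit=3):
    """Return article names starting with prefix, most-viewed first."""
    # Escape LIKE wildcards so the user's text is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT p.name FROM pages AS p JOIN hits AS h ON h.page_id = p.id "
        "WHERE p.name LIKE ? ESCAPE '\\' ORDER BY h.count DESC LIMIT ?",
        (escaped + "%", limit),
    )
    return [name for (name,) in rows]
```

With these sample rows, classify_title(conn, "Search") gives "yellow" because Search exists only in the redirects table, and suggest(conn, "Search") lists the three matching articles ordered by their assumed hit counts.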

Once both the starting and ending articles are specified, the Find Wikipedia Links! button can be pressed. If either of the article names does not correspond to a real Wikipedia article (i.e. it is red), the search log will alert you to this error. If either of them is a redirect (i.e. it is yellow), the search log will tell you to which article it redirects. If both articles exist and neither of them is a redirect (i.e. they are green), the search for the shortest path between them will begin. Most searches take between five and ten seconds. A message which tells you how many paths were found between the two articles and of what degree the paths are will be written to the search log. This will be followed by lists of paths, each element of which is separated by a pipe. The results will look like those seen in the image below.

Approach and methods: A lot went into the making of Six Degrees of Wikipedia. First, the Wikipedia data needed to be obtained. It came from an online dump of the Wikipedia database [1]. I downloaded gzipped files which contained a query of all the rows in three tables: pages, links, and redirects. I also found a file containing the hit counts of every Wikipedia article [2]. These large files were unzipped, parsed using command-line utilities like grep, awk, and sed, sorted, and placed into a local SQLite database through the use of Python scripts. A shell script named getdata.sh was run as an SGE job on opteron.crc.nd.edu and took over six hours to run.

Next, a searching algorithm needed to be written to find the paths between any two Wikipedia pages. I modeled Wikipedia as a directed graph, where each article was a node and each link between two articles was an edge. In order to find the shortest path between two nodes, a breadth first search (BFS) was the best option. The search began at the node represented by the starting article. All of its neighbors (i.e. the articles to which it links directly) were found, and if any of them was the node representing the ending article, the BFS finished. Otherwise, the unexplored neighbors of the starting node's neighbors were found and again checked to see if they were the end node. The BFS continues until the end node is reached or all the nodes are explored. This procedure guarantees that the shortest path between the two nodes will be found if one exists. However, it is quite slow because even after only a few iterations, millions of nodes are being searched.

To speed up this process I implemented a bi-directional BFS. This essentially ran two separate searches, one forward from the starting node and one backward from the ending node. The two searches alternated, each completing one level before handing over to the other. Whenever the two searches overlapped (i.e. contained the same node), the entire process finished. This sped things up considerably. One final enhancement I made was implementing an intelligent bi-directional BFS. By intelligent, I mean that the algorithm always chose to expand the search whose current frontier had the fewest nodes. This was an improvement over the plain bi-directional BFS and again resulted in a good speedup. The Python script breadthfirstsearch.py contains the BFS algorithm.

The BFS algorithm made use of the local database tables I created. First, I used the pages table to convert the article names provided in the GUI to article IDs. Next, I ran the BFS from the starting article ID to the ending article ID using the links and redirects tables. This returned a list of paths of IDs, which were converted back to article names using the pages table.

Although the GUI is rather plain, it contains a lot of functionality. I created a modified QLineEdit class which changes colors according to its current text and which includes a dynamically changing QCompleter which provides search suggestions. The color is determined by querying the redirects and pages tables, which returns nearly instantaneous results. The suggestions are found by using a regular expression to query the pages table and are filtered by using the page hit counts table. Finally, the Find Wikipedia Links! button is a simple QPushButton which calls the BFS when it is pressed, and the search log is a simple QTextBrowser.
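The intelligent bi-directional BFS described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of breadthfirstsearch.py: the links(from_id, to_id) schema, the function names, and the sample edges are hypothetical, redirects are ignored, and the sketch returns a single shortest path rather than every shortest path.

```python
import sqlite3


def neighbors(conn, ids, forward):
    """Map each neighbor of the frontier `ids` to one frontier node that reaches it."""
    src, dst = ("from_id", "to_id") if forward else ("to_id", "from_id")
    found = {}
    for node in ids:
        query = f"SELECT {dst} FROM links WHERE {src} = ?"
        for (nbr,) in conn.execute(query, (node,)):
            found.setdefault(nbr, node)
    return found


def build_path(meeting, parents_f, parents_b):
    """Join the forward and backward parent chains at the meeting node."""
    path, node = [], meeting
    while node is not None:          # walk back to the start
        path.append(node)
        node = parents_f[node]
    path.reverse()
    node = parents_b[meeting]
    while node is not None:          # walk on to the end
        path.append(node)
        node = parents_b[node]
    return path


def shortest_path(conn, start, end):
    """Return one shortest path of page IDs from start to end, or None."""
    if start == end:
        return [start]
    parents_f, parents_b = {start: None}, {end: None}
    frontier_f, frontier_b = [start], [end]
    while frontier_f and frontier_b:
        # "Intelligent" step: expand whichever search has the smaller frontier.
        forward = len(frontier_f) <= len(frontier_b)
        frontier = frontier_f if forward else frontier_b
        parents, other = (parents_f, parents_b) if forward else (parents_b, parents_f)
        next_frontier, meeting = [], None
        for nbr, via in neighbors(conn, frontier, forward).items():
            if nbr in parents:
                continue             # already reached by this search
            parents[nbr] = via
            next_frontier.append(nbr)
            if meeting is None and nbr in other:
                meeting = nbr        # the two searches overlap here
        if meeting is not None:
            return build_path(meeting, parents_f, parents_b)
        if forward:
            frontier_f = next_frontier
        else:
            frontier_b = next_frontier
    return None


# Hypothetical sample graph: 1->2->3->4, 1->5->4, 4->6->1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (from_id INTEGER, to_id INTEGER)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4), (1, 5), (5, 4), (4, 6), (6, 1)],
)
print(shortest_path(conn, 1, 4))  # prints [1, 5, 4]
```

Because each search finishes a whole level before the overlap check decides the result, the first node reached by both searches lies on a shortest path. The backward search follows links in reverse (to_id to from_id), which is what lets it start from the ending article in a directed graph.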

Discussion and future work: I accomplished the goal I set out to achieve. My Six Degrees of Wikipedia application properly finds the shortest path between any two Wikipedia articles. However, there are still many ways I could improve and expand SDOW. My main concern is speeding up the BFS algorithm. The best solution to this problem would most likely be parallelizing the algorithm using MPI or Work Queue. Another area for improvement is the suggested articles shown while a user is typing an article name. Providing more relevant suggestions through the use of more rigorous filters would be a nice improvement. Also, my program does not accept article names with special characters (e.g. é, ü, ñ, etc.), so support for these is another area for improvement. Conversion to a fully functioning web-based application which allows you to visualize the paths between articles, as shown in the image below, is already underway. However, much progress can still be made. Finally, an analysis of the results (e.g. finding strongly connected components or determining the average distance between any two nodes through an all-pairs analysis) could yield interesting results on the relatedness of human knowledge.

References:
[1] Database dump progress. Wikipedia. Accessed November Available at wikimedia.org/ backup-index.html.
[2] Index of /wikistats/. Domas Mituzas. Accessed 2 May Available at wikistats/.
[3] Six Degrees of Wikipedia. Stephen Dolan. Accessed November Available at ~mu/wiki/.
[4] The top 500 sites on the web. Accessed 4 May Available at topsites.
[5] Wikipedia. Wikipedia. Accessed 4 May Available at Wikipedia.
[6] Wikipedia: The Free Encyclopedia. Wikipedia. Accessed November Available at

Acknowledgements: First, I would like to thank my brother, Aaron Wenger, for his help when this project was in its infancy. This began as a collaboration between the two of us and would not have gotten off the ground without him. Stephen Dolan should also be thanked for the information I obtained from the description he wrote about how he completed a project similar to SDOW. Although he never actually responded to my e-mails, he still gave me some helpful pointers. I would also like to thank Chad Heise for his help coming up with this project idea and for being a tester throughout the evolution of the code. Finally, thanks go out to RJ Nowling for showing me how to submit jobs to SGE.

Authors: Jacob Wenger is a junior majoring in computer science and the third in a line of Wengers (preceded by his brother '05 and sister '10) who have attended the University of Notre Dame. When he is not coding the night away, he enjoys watching and attending sporting events, playing basketball and football out on the quad, going out to trivia with his friends, and watching movies. Jacob has spent the past two summers in a research setting and studying abroad in Alcoy, Spain, but will spend next summer out in Seattle working for Microsoft. Upon graduation, Jacob will most likely work in industry for a few years and eventually attend graduate school to pursue a degree in higher education.


More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Time - Experience Report. By Thanou Thirakul

Time - Experience Report. By Thanou Thirakul Large Scale Testing In Agile Time - Experience Report Large Scale Testing In Agile Time - Experience Report What we re going to talk about: 1. Background on the application 2. Background on the build process

More information
