Jacob Wenger
Programming Paradigms
Thursday, May 12, 2011

Six Degrees of Wikipedia

Abstract: I created an application which finds the shortest paths between any two Wikipedia articles. Paths, in this sense, follow the hyperlinks found on every Wikipedia page. Data about Wikipedia was found online in the form of dumps containing every row of various tables from the Wikipedia database. I downloaded these files, parsed them, sorted them, and stored them in a local SQLite database. I created a simple, intuitive front-end graphical user interface (GUI) using PyQt. The GUI requires the user to provide only the starting and ending article names and click a large button to find the paths between them. Two additional features, both backed by queries to the local database, check whether the current text names an actual article and suggest article names as the user types. Aspects of functional programming (e.g. list comprehensions) and regular expressions (e.g. the command-line utility sed and the database query for suggested articles) were used in this project. Although Python was the predominant language used, a shell script which retrieved and parsed the online data was also created.
Key words:
- Wikipedia - a non-profit encyclopedia project available online and written collaboratively by volunteers from around the world
- Page - a Wikipedia article
- Link - a hyperlink on a page which leads to another page
- Redirect - a page which does not exist but redirects to an existing page; for example, Airplane is a redirect to Fixed-wing aircraft and Gandhi redirects to Mohandas Karamchand Gandhi
- Breadth first search - a graph search algorithm that begins at the root node and explores all of its neighboring nodes; then, for each of those nodes, it explores their unexplored neighbors; this process continues until the end node is found or until all nodes are explored
- SQLite - an embedded relational database management system contained in a relatively small C programming library; it is essentially a lightweight SQL database that runs inside the application rather than as a separate server
- PyQt - a Python binding of the cross-platform graphical user interface (GUI) toolkit Qt
Introduction: The articles which comprise Wikipedia [5][6] represent, in some sense, the collective knowledge of humanity. In the ten short years since its founding in 2001, Wikipedia has amassed over 3 million articles, a number which rises every single day. I was interested in seeing how closely this knowledge is tied together. Concepts such as six degrees of separation, the idea that every person is connected to every other person through a chain of at most six acquaintances, further piqued my interest. As a result, I decided to create an application called Six Degrees of Wikipedia (SDOW for short) which determines the shortest path from one Wikipedia article to another by following the hyperlinks found on the Wikipedia pages themselves.

To my surprise, this problem had not previously been sufficiently solved. Through my research, I discovered that only one person, namely Stephen Dolan, had even attempted to complete a similar project [3]. However, his implementation was simple, buggy, and sometimes incorrect (as I can now prove) and thus was not satisfactory. Besides him, I saw no mention of anyone else who had attempted to tackle the problem of finding the shortest path between two Wikipedia articles. This was an exciting discovery since it meant I was the first person to provide a suitable solution to the problem I posed.

This project began by obtaining data on Wikipedia articles from an online dump provided by Wikipedia itself [1]. This information was parsed and stored in a local SQLite database. A breadth-first search algorithm was implemented in Python and made use of the local database to find the paths between two Wikipedia articles. Finally, a graphical user interface (GUI) was created in PyQt to make the experience for users quick and intuitive.

Usage examples: To open the GUI for Six Degrees of Wikipedia, simply type the following command:

python ./sixdegreesofwikipedia.py

The GUI is very simple and intuitive.
There are two line edits which represent the starting and ending Wikipedia article names. There is also a button titled Find Wikipedia Links! which finds the paths between the two articles when pressed. Finally, there is a search log at the bottom which keeps track of the recent search history. The GUI is shown on the following page.
If you want to find the shortest path between the Wikipedia articles Search-based application and Wikipedia, you can start by typing Search-based application into the starting article line edit. As you type, you will notice that the color of the text changes. The text will be one of three colors: (1) green when it corresponds to an actual Wikipedia article, (2) yellow when it corresponds to a Wikipedia redirect, and (3) red in every other case. Another feature you will notice is suggested search terms, similar to those found in Google. The suggestions are filtered in order to provide you with the pages you most likely care about. Together, these two features tell the user whether the article they are searching for exists and save them time by auto-completing their query through suggestions. When Search is typed in the image below, it is colored yellow to signify that it is a redirect. In addition, suggested articles, including Search-based application, are given.
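The three-color rule described above can be sketched as two lookups against the local database. This is only a minimal illustration: the table layout (`pages(id, name)` and `redirects(name, target_id)`), the helper names `make_title_db` and `classify`, and the sample rows are assumptions made for the example, not the schema or code actually used in SDOW.

```python
import sqlite3


def make_title_db():
    """Build a tiny in-memory stand-in for the local database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute("CREATE TABLE redirects (name TEXT PRIMARY KEY, target_id INTEGER)")
    conn.executemany(
        "INSERT INTO pages (id, name) VALUES (?, ?)",
        [(1, "Wikipedia"), (2, "Fixed-wing aircraft"), (3, "Search-based application")],
    )
    # "Airplane" is a redirect to "Fixed-wing aircraft", as in the key words above.
    conn.execute("INSERT INTO redirects (name, target_id) VALUES (?, ?)", ("Airplane", 2))
    return conn


def classify(conn, title):
    """Return 'green' for an article, 'yellow' for a redirect, 'red' otherwise."""
    if conn.execute("SELECT 1 FROM pages WHERE name = ?", (title,)).fetchone():
        return "green"
    if conn.execute("SELECT 1 FROM redirects WHERE name = ?", (title,)).fetchone():
        return "yellow"
    return "red"


if __name__ == "__main__":
    conn = make_title_db()
    for title in ("Wikipedia", "Airplane", "Not a real article"):
        print(title, "->", classify(conn, title))
```

Because both lookups are exact matches on keyed columns, each keystroke costs at most two small queries, which is consistent with the color changing as you type.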
Once both the starting and ending articles are specified, the Find Wikipedia Links! button can be pressed. If either of the article names does not correspond to a real Wikipedia article (i.e. it is red), the search log will alert you to this error. If either of them is a redirect (i.e. it is yellow), the search log will tell you to which article it redirects. If both articles exist and neither of them is a redirect (i.e. both are green), the search for the shortest path between them will begin. Most searches take between five and ten seconds. A message which tells you how many paths were found between the two articles and of what degree the paths are will be written to the search log. This will be followed by the list of paths, with the articles in each path separated by a pipe. The results will look like those seen in the image below.

Approach and methods: A lot went into the making of Six Degrees of Wikipedia. First, the Wikipedia data needed to be obtained. It came from an online dump of the Wikipedia database [1]. I downloaded gzipped files which contained a dump of all the rows in three tables: pages, links, and redirects. I also found a file containing the hit counts of every Wikipedia article [2]. These large files were unzipped, parsed using command-line utilities like grep, awk, and sed, sorted, and placed into a local SQLite database through the use of Python scripts. A shell script named getdata.sh was run as an SGE job on opteron.crc.nd.edu and took over six hours to run.

Next, a searching algorithm needed to be written to find the paths between any two Wikipedia pages. I modeled Wikipedia as a directed graph, where each article was a node and each link between two articles was an edge. In order to find the shortest path between two nodes, a breadth first search (BFS) was the best option. The search began at the node represented by the starting article. All of its neighbors (i.e. the articles to which it links directly) were found, and if any of them was the node representing the ending article, the BFS finished. Otherwise, the unexplored neighbors of the starting node's neighbors were found and again checked to see if they were the end node. The BFS continues until the end node is reached or all the nodes are explored. This procedure guarantees that the shortest path between the two nodes will be found if it exists. However, it is quite slow because, even after only a few iterations, millions of nodes are being searched.

To speed up this process I implemented a bi-directional BFS. This essentially ran two separate searches, one forward from the starting node and one backward from the ending node. The two searches took turns, switching after each one completed one level of the search. Whenever the two searches overlapped (i.e. contained the same node), the entire process finished. This sped things up considerably. One final enhancement I made was implementing an intelligent bi-directional BFS. By intelligent, I mean that the algorithm always chose to explore the next level of whichever of the two searches had the fewest neighbors to explore. This was an improvement over the plain bi-directional BFS and again resulted in a good speedup. The Python script breadthfirstsearch.py contains the BFS algorithm.

The BFS algorithm made use of the local database I created. First, I used the pages table to convert the article names provided in the GUI to article IDs. Next, I ran the BFS from the starting article ID to the ending article ID using the links and redirects tables.
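The bi-directional search described above can be sketched as follows. This is not the contents of breadthfirstsearch.py: the `links(source_id, target_id)` table, the helper names, and the demo edges are hypothetical, the size of each frontier stands in for "fewest neighbors", and the sketch returns a single shortest path rather than all of them.

```python
import sqlite3


def make_links_db():
    """In-memory demo graph stored in a hypothetical links(source_id, target_id) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (source_id INTEGER, target_id INTEGER)")
    edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 1)]
    conn.executemany("INSERT INTO links VALUES (?, ?)", edges)
    conn.execute("CREATE INDEX idx_source ON links (source_id)")
    conn.execute("CREATE INDEX idx_target ON links (target_id)")
    return conn


def neighbors(conn, page_id, forward):
    """Outgoing links when searching forward, incoming links when searching backward."""
    if forward:
        sql = "SELECT target_id FROM links WHERE source_id = ?"
    else:
        sql = "SELECT source_id FROM links WHERE target_id = ?"
    return [row[0] for row in conn.execute(sql, (page_id,))]


def build_path(parents_f, parents_b, meeting):
    """Join the forward half (start..meeting) with the backward half (meeting..end)."""
    path = []
    node = meeting
    while node is not None:
        path.append(node)
        node = parents_f[node]
    path.reverse()
    node = parents_b[meeting]
    while node is not None:
        path.append(node)
        node = parents_b[node]
    return path


def shortest_path(conn, start, end):
    """Return one shortest path of page IDs from start to end, or None if unreachable."""
    if start == end:
        return [start]
    parents_f, parents_b = {start: None}, {end: None}  # visited node -> predecessor
    frontier_f, frontier_b = [start], [end]
    while frontier_f and frontier_b:
        # Expand one full level of whichever search currently has the smaller frontier.
        forward = len(frontier_f) <= len(frontier_b)
        frontier = frontier_f if forward else frontier_b
        parents = parents_f if forward else parents_b
        other = parents_b if forward else parents_f
        next_frontier = []
        for node in frontier:
            for nb in neighbors(conn, node, forward):
                if nb in parents:
                    continue
                parents[nb] = node
                if nb in other:  # the two searches overlap at nb
                    return build_path(parents_f, parents_b, nb)
                next_frontier.append(nb)
        if forward:
            frontier_f = next_frontier
        else:
            frontier_b = next_frontier
    return None


if __name__ == "__main__":
    print(shortest_path(make_links_db(), 1, 5))
```

Stopping at the first overlap is safe here because each search only ever advances by whole levels, so the first node seen by both sides lies on a shortest path. Finding every shortest path, as SDOW reports, would require finishing the current level and collecting all meeting nodes instead of returning immediately.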
The search returned a list of paths of IDs, which were converted back to article names using the pages table.

Although the GUI is rather plain, it contains a lot of functionality. I created a modified QLineEdit class which changes color according to its current text and which includes a dynamically changing QCompleter that provides search suggestions. The color is determined by querying the redirects and pages tables, which returns results nearly instantaneously. The suggestions are found by using a regular expression to query the pages table and are filtered by using the page hit counts table. Finally, the Find Wikipedia Links! button is a simple push button (QPushButton) which calls the BFS when it is pressed, and the search log is a simple QTextBrowser.
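One way the suggestion lookup might work is sketched below: a regular expression selects candidate titles and the hit counts rank them. The schema (`pages(id, name)`, `hit_counts(page_id, hits)`), the function names, and the hit numbers are made up for illustration. SQLite has no built-in REGEXP implementation, so the sketch registers one through the standard `sqlite3` `create_function` call.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's REGEXP operator; `X REGEXP Y` is evaluated as regexp(Y, X)."""
    return 1 if value is not None and re.search(pattern, value, re.IGNORECASE) else 0


def make_suggest_db():
    """In-memory demo data; titles and hit counts are invented for the example."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, regexp)
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE hit_counts (page_id INTEGER PRIMARY KEY, hits INTEGER)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?)",
        [(1, "Search engine"), (2, "Search-based application"),
         (3, "Seattle"), (4, "Wikipedia")],
    )
    conn.executemany(
        "INSERT INTO hit_counts VALUES (?, ?)",
        [(1, 900), (2, 50), (3, 700), (4, 1000)],
    )
    return conn


def suggest(conn, typed, limit=5):
    """Return up to `limit` titles starting with `typed`, most-viewed first."""
    pattern = "^" + re.escape(typed)  # escape so user input is matched literally
    rows = conn.execute(
        "SELECT p.name FROM pages AS p "
        "JOIN hit_counts AS h ON h.page_id = p.id "
        "WHERE p.name REGEXP ? ORDER BY h.hits DESC LIMIT ?",
        (pattern, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(suggest(make_suggest_db(), "Sea"))
```

In a PyQt line edit, a list like this could be handed to the QCompleter's model each time the text changes. A regular-expression match cannot use an ordinary index, so on a table with millions of titles a prefix range query may be needed to keep suggestions responsive.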
Discussion and future work: I accomplished the goal I set out to achieve. My Six Degrees of Wikipedia application properly finds the shortest path between any two Wikipedia articles. However, there are still many ways I could improve and expand SDOW. My main concern is speeding up the BFS algorithm. The best solution to this problem would most likely be parallelizing the algorithm using MPI or Work Queue. Another area for improvement is the suggested articles shown while a user is typing an article name. Providing more relevant suggestions through the use of more rigorous filters would be a nice improvement. Also, my program does not accept article names with special characters (e.g. é, ü, ñ), so support for these is another area for improvement. Conversion to a fully functioning web-based application which allows you to visualize the paths between articles, as shown in the image below, is already underway. However, much progress can still be made. Finally, an analysis of the results (e.g. finding strongly connected components or determining the average distance between any two nodes through an all-pairs analysis) could yield interesting insights into the relatedness of human knowledge.

References:
[1] Database dump progress. Wikipedia. Accessed November Available at wikimedia.org/ backup-index.html.
[2] Index of /wikistats/. Domas Mituzas. Accessed 2 May Available wikistats/.
[3] Six Degrees of Wikipedia. Stephen Dolan. Accessed November Available at ~mu/wiki/.
[4] The top 500 sites on the web. Accessed 4 May Available at topsites.
[5] Wikipedia. Wikipedia. Accessed 4 May Available at Wikipedia.
[6] Wikipedia: The Free Encyclopedia. Wikipedia. Accessed November Available at

Acknowledgements: First, I would like to thank my brother, Aaron Wenger, for his help when this project was in its infancy. This began as a collaboration between the two of us and would not have gotten off the ground without him. Stephen Dolan should also be thanked for the information I obtained from the description he wrote about how he completed a project similar to SDOW. Although he never actually responded to my e-mails, his write-up still gave me some helpful pointers. I would also like to thank Chad Heise for his help coming up with this project idea and for being a tester throughout the evolution of the code. Finally, thanks go out to RJ Nowling for showing me how to submit jobs to SGE.

Authors: Jacob Wenger is a junior majoring in computer science and the third in a line of Wengers (preceded by his brother '05 and sister '10) who have attended the University of Notre Dame. When he is not coding the night away, he enjoys watching and attending sporting events, playing basketball and football out on the quad, going out to trivia with his friends, and watching movies. Jacob has spent the past two summers in a research setting and studying abroad in Alcoy, Spain, but will spend next summer out in Seattle working for Microsoft. Upon graduation, Jacob will most likely work in industry for a few years and eventually attend graduate school to pursue a degree in higher education.
More informationProgramming Project #6: Password File Cracker
CSE231 Spring 2017 Programming Project #6: Password File Cracker (Edits: changed itertools permutations to product either works for these passwords, but product is the correct one. Removed lists and tuples
More information1 Counting triangles and cliques
ITCSC-INC Winter School 2015 26 January 2014 notes by Andrej Bogdanov Today we will talk about randomness and some of the surprising roles it plays in the theory of computing and in coding theory. Let
More informationSavvy Auto Group Drives Down Costs
Savvy Auto Group Drives Down Costs Avanade database project saves $135,000 in annual costs Business Situation Dollar Thrifty Auto Group (DTAG) operates two well-known vehicle rental brands, Dollar Rent
More informationUniversity of Maryland. Tuesday, March 2, 2010
Data-Intensive Information Processing Applications Session #5 Graph Algorithms Jimmy Lin University of Maryland Tuesday, March 2, 2010 This work is licensed under a Creative Commons Attribution-Noncommercial-Share
More informationA program that performs lexical analysis may be termed a lexer, tokenizer, or scanner, though scanner is also a term for the first stage of a lexer.
Compiler Design A compiler is computer software that transforms computer code written in one programming language (the source language) into another programming language (the target language). The name
More informationBrent Kastor GIS Coordinator Coweta County, GA Newnan, GA Ph: E mail:
Brent Kastor GIS Coordinator Coweta County, GA Newnan, GA 30265 Ph: 678.854.0029 E mail: bkastor@coweta.ga.us WinGap Sketch Conversion Process An analysis of the specific steps in Mr. Chad Rupert and Mr.
More informationBalancing the pressures of a healthcare SQL Server DBA
Balancing the pressures of a healthcare SQL Server DBA More than security, compliance and auditing? Working with SQL Server in the healthcare industry presents many unique challenges. The majority of these
More informationTitle Unknown Annapurna Valluri
Title Unknown Annapurna Valluri 1. Introduction There are a number of situations, one comes across in one s life, in which one has to find the k nearest neighbors of an object, be it a location on a map,
More informationCS 595: Cryptography Final Project
CS 595: Cryptography Final Project Tim Wylie December 7, 2009 Project Overview I have implemented a basic covert multi-party communication instant messaging program. The users can communicate with any
More information(Worth 50% of overall Project 1 grade)
第 1 页共 8 页 2011/11/8 22:18 (Worth 50% of overall Project 1 grade) You will do Part 3 (the final part) of Project 1 with the same team as for Parts 1 and 2. If your team partner dropped the class and you
More informationEssential Skills for Bioinformatics: Unix/Linux
Essential Skills for Bioinformatics: Unix/Linux WORKING WITH COMPRESSED DATA Overview Data compression, the process of condensing data so that it takes up less space (on disk drives, in memory, or across
More informationSmart Video Replay Game Day Preparation & Troubleshooting Guide
Smart Video Replay Game Day Preparation & Troubleshooting Guide Game Day Preparation Make sure your batteries are charged. Make sure you have enough batteries for the entire game. Make sure all your equipment
More informationCheckpoint. User s Guide
Checkpoint User s Guide Welcome to Checkpoint. This user guide will show you everything you need to know to access and utilize the wealth of tax information available from Checkpoint. The Checkpoint program
More informationUndirected Graphs. V = { 1, 2, 3, 4, 5, 6, 7, 8 } E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6 } n = 8 m = 11
Chapter 3 - Graphs Undirected Graphs Undirected graph. G = (V, E) V = nodes. E = edges between pairs of nodes. Captures pairwise relationship between objects. Graph size parameters: n = V, m = E. V = {
More informationMichael Greenberg. September 13, 2004
Finite Geometries for Those with a Finite Patience for Mathematics Michael Greenberg September 13, 2004 1 Introduction 1.1 Objective When my friends ask me what I ve been studying this past summer and
More informationMidterm Exam Solutions March 7, 2001 CS162 Operating Systems
University of California, Berkeley College of Engineering Computer Science Division EECS Spring 2001 Anthony D. Joseph Midterm Exam March 7, 2001 CS162 Operating Systems Your Name: SID AND 162 Login: TA:
More informationSiemens Digitalized Production Cell Boosts Auto Parts Manufacturing by 20% siemens.com/global/en/home/products/automation
Siemens Digitalized Production Cell Boosts Auto Parts Manufacturing by 20% siemens.com/global/en/home/products/automation Overview Increases in order size and quantity led Wisconsin-based auto parts manufacturer
More informationTRANSANA and Chapter 8 Retrieval
TRANSANA and Chapter 8 Retrieval Chapter 8 in Using Software for Qualitative Research focuses on retrieval a crucial aspect of qualitatively coding data. Yet there are many aspects of this which lead to
More informationContents. Mount Holyoke College Volunteer Hub UPDATED 03/23/18 2
Contents Welcome to the Mount Holyoke College Volunteer Hub!... 3 Web Browsers... 3 Logging in to the Volunteer Hub... 4 Navigating the Volunteer Hub... 5 Volunteer Groups... 9 Help... 14 Search Center...
More informationIntegrated Math 1 Module 3 Honors Sequences and Series Ready, Set, Go! Homework
1 Integrated Math 1 Module 3 Honors Sequences and Series Ready, Set, Go! Homework Adapted from The Mathematics Vision Project: Scott Hendrickson, Joleigh Honey, Barbara Kuehl, Travis Lemon, Janet Sutorius
More informationCollaborative projects:
Collaborative projects: Mail Art and other collaborative development have helped to connect people who were on the TEA programme and has continued in many forms since then. Some of the projects are short
More informationMSc(IT) Program. MSc(IT) Program Educational Objectives (PEO):
MSc(IT) Program Master of Science (Information Technology) is an intensive program designed for students who wish to pursue a professional career in Information Technology. The courses have been carefully
More informationHow I helped Enterprise DNA launch a Power BI course and grow their list by 2,401% in less than 1 year.
How I helped Enterprise DNA launch a Power BI course and grow their email list by 2,401% in less than 1 year www.zoranorak.com THE CLIENT Enterprise DNA is one of the leading Power BI training solutions
More informationExpert Reference Series of White Papers. Five Simple Symbols You Should Know to Unlock Your PowerShell Potential
Expert Reference Series of White Papers Five Simple Symbols You Should Know to Unlock Your PowerShell Potential 1-800-COURSES www.globalknowledge.com Five Simple Symbols You Should Know to Unlock Your
More informationAlgorithms: Lecture 10. Chalmers University of Technology
Algorithms: Lecture 10 Chalmers University of Technology Today s Topics Basic Definitions Path, Cycle, Tree, Connectivity, etc. Graph Traversal Depth First Search Breadth First Search Testing Bipartatiness
More informationMayhem Make a little Mayhem in your world.
Mayhem Make a little Mayhem in your world. Team Group Manager - Eli White Documentation - Meaghan Kjelland Design - Jabili Kaza & Jen Smith Testing - Kyle Zemek Problem and Solution Overview Most people
More informationH1-212 Capture the Flag Solution Author: Corben Douglas
H1-212 Capture the Flag Solution Author: Corben Douglas (@sxcurity) Description: An engineer of acme.org launched a new server for a new admin panel at http://104.236.20.43/. He is completely confident
More information