Project Problems SNU , Fall 2014 Chung-Kil Hur due: 12/21(Sun), 24:00

Similar documents
Agenda. Math Google PageRank algorithm. 2 Developing a formula for ranking web pages. 3 Interpretation. 4 Computing the score of each page

Web search before Google. (Taken from Page et al. (1999), The PageRank Citation Ranking: Bringing Order to the Web.)

.. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar..

Lecture 8: Linkage algorithms and web search

Page rank computation HPC course project a.y Compute efficient and scalable Pagerank

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Finite Math - J-term Homework. Section Inverse of a Square Matrix

Pagerank Scoring. Imagine a browser doing a random walk on web pages:

Math 414 Lecture 30. The greedy algorithm provides the initial transportation matrix.

MITOCW ocw f99-lec07_300k

Chapter 1. Introduction

ALGEBRA I Summer Packet

Application of PageRank Algorithm on Sorting Problem Su weijun1, a

Computer Project #2 (Matrix Operations)

Einführung in Web und Data Science Community Analysis. Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme

(Lec 14) Placement & Partitioning: Part III

Web consists of web pages and hyperlinks between pages. A page receiving many links from other pages may be a hint of the authority of the page

Graph Theory Problem Ideas

Link Analysis and Web Search

Modeling web-crawlers on the Internet with random walksdecember on graphs11, / 15

Outline. What is semantics? Denotational semantics. Semantics of naming. What is semantics? 2 / 21

Watkins Mill High School. Algebra 2. Math Challenge

Introduction to Information Retrieval

Advanced Computer Architecture: A Google Search Engine

Vector Calculus: Understanding the Cross Product

UNC Charlotte 2010 Comprehensive

Output: For each size provided as input, a figure of that size is to appear, followed by a blank line.

A brief history of Google

Lecture 27: Learning from relational data

AMS /672: Graph Theory Homework Problems - Week V. Problems to be handed in on Wednesday, March 2: 6, 8, 9, 11, 12.

Math Summer 2012

Vector Algebra Transformations. Lecture 4

Math 340 Fall 2014, Victor Matveev. Binary system, round-off errors, loss of significance, and double precision accuracy.

Lecture 2: SML Basics

Single link clustering: 11/7: Lecture 18. Clustering Heuristics 1

Collaborative filtering based on a random walk model on a graph

(Refer Slide Time 01:41 min)

The PageRank Computation in Google, Randomized Algorithms and Consensus of Multi-Agent Systems

Discrete Mathematics Course Review 3

Introduction to Computer Science. Homework 1

Reasoning About Imperative Programs. COS 441 Slides 10

MATRIX REVIEW PROBLEMS: Our matrix test will be on Friday May 23rd. Here are some problems to help you review.

Using Spam Farm to Boost PageRank p. 1/2

Web Search Engines: Solutions to Final Exam, Part I December 13, 2004

EXTRACTION OF RELEVANT WEB PAGES USING DATA MINING

8.NS.1 8.NS.2. 8.EE.7.a 8.EE.4 8.EE.5 8.EE.6

UNIVERSITY OF CALIFORNIA, SANTA CRUZ BOARD OF STUDIES IN COMPUTER ENGINEERING

The PageRank Computation in Google, Randomized Algorithms and Consensus of Multi-Agent Systems

Efficient Sequential Algorithms, Comp309. Problems. Part 1: Algorithmic Paradigms

END-TERM EXAMINATION

Information Retrieval. Lecture 11 - Link analysis

CS101 Lecture 04: Binary Arithmetic

Lecture 1: Overview

COMP Page Rank

Social Network Analysis

Introduction to Data Mining

WEEK 4 REVIEW. Graphing Systems of Linear Inequalities (3.1)

Data Partitioning. Figure 1-31: Communication Topologies. Regular Partitions

Framework for Design of Dynamic Programming Algorithms

n = 1 What problems are interesting when n is just 1?

Chapter 18 out of 37 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M. Cargal.

The Matrix-Tree Theorem and Its Applications to Complete and Complete Bipartite Graphs

Allocating Storage for 1-Dimensional Arrays

Carnegie Learning Math Series Course 2, A Florida Standards Program

CS 3114 Data Structures and Algorithms READ THIS NOW!

Clustering and Visualisation of Data

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Due Date: Friday, September 9 th Attached is your summer review packet for the Algebra 1 course.

Chapter 5. Transforming Shapes

Large-Scale Networks. PageRank. Dr Vincent Gramoli Lecturer School of Information Technologies

Introduction to numerical algorithms

CS 506, Sect 002 Homework 5 Dr. David Nassimi Foundations of CS Due: Week 11, Mon. Apr. 7 Spring 2014

University of Toronto Department of Electrical and Computer Engineering. Midterm Examination. ECE 345 Algorithms and Data Structures Fall 2012

COMPUTER SCIENCE TRIPOS Part II (General) DIPLOMA IN COMPUTER SCIENCE

1 Random Walks on Graphs

Markov Chains and Multiaccess Protocols: An. Introduction

PageRank and related algorithms

Homework # 2 Due: October 6. Programming Multiprocessors: Parallelism, Communication, and Synchronization

CW Middle School. Math RtI 7 A. 4 Pro cient I can add and subtract positive fractions with unlike denominators and simplify the result.

Rectangular Coordinates in Space

Ratios and Proportional Relationships (RP) 6 8 Analyze proportional relationships and use them to solve real-world and mathematical problems.

REGULAR GRAPHS OF GIVEN GIRTH. Contents

Section 1.8. Simplifying Expressions

22. Two-Dimensional Arrays. Topics Motivation The numpy Module Subscripting functions and 2d Arrays GooglePage Rank

Data Structure for Language Processing. Bhargavi H. Goswami Assistant Professor Sunshine Group of Institutions

Conic Duality. yyye

Searching the Web [Arasu 01]

Linear and quadratic Taylor polynomials for functions of several variables.

The Simplex Algorithm

EE 301 Signals & Systems I MATLAB Tutorial with Questions

Practice Problems for the Final

Dynamic Programming. Lecture Overview Introduction

Computer Science Spring 2005 Final Examination, May 12, 2005

Integrated Algebra/Geometry 1

Lecture #3: PageRank Algorithm The Mathematics of Google Search

CS 380 ALGORITHM DESIGN AND ANALYSIS

Sequences and Series. Copyright Cengage Learning. All rights reserved.

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #16 Loops: Matrix Using Nested for Loop

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

Transcription:

Project Problems SNU 4910.210, Fall 2014 Chung-Kil Hur due: 12/21(Sun), 24:00 Problem 1 (10%) Dust Storm The number of Mars exploration robots are more than one hundred now in A.D. 2032, and the number is still growing fast in order to gather rare metals found at the planet. A satellite is revolving around Mars so that it exchanges signals with the robots which are carrying out their missions on the surface of Mars. One of the tasks of this satellite is that, when a dust storm at the surface of Mars is expected, it should evacuate all robots to the shelters. The satellite should assign one particular shelter to each robot by transmitting signals to the robot. One shelter can accommodate only one robot at a time. After a robot receives the signal, it immediately moves to the assigned shelter along the shortest (linear) path with its maximum speed. A problem is, there is a risk of collisions between robots during the evacuation process. The satellite should assign the shelters to the robots without 1

that risk. Write a program shelterassign which does this job, according to the module type below. module type DUSTSTORM = sig type robot = string (* robot s name *) type shelter = int (* shelta s id number *) type location = int * int (* coordinate *) type robot_locs = (robot * location) list type shelter_locs = (shelter * location) list val shelterassigcommentatedn: robot_locs -> shelter_locs -> (robot * shelter) list end Assume the surface of Mars is a 2-dimensional plane (no donut-shaped or spheral.) The top-left corner is (0,0) and the bottom-right is (100,100) in the coordinate. Also, write a comment in the program that explains why this program always calculates a right answer within finite time. Right answer : There should be no possibility that any robots collide with others during evacuation. Within finite time : The program always halts with an answer. A hint: First, make pairs of robots and shelters in random, and repeat the following: find pairs which are possible to collide and then do. 2

Problem 2 (10%) Conjecture of Ranking Google s PageRank technology speculates the rank of importance of all web pages in the world. The Google search engine first picks web pages containing the phrase that user entered, and then uses this PageRank in order to show the pages to user in the order of the importance ranking, hoping that the ranking matches well with the user s want. So, how can we calculate this importance? The idea of the founders of Google, Page and Brin, is like this: Assumption: Pages that people visit frequently are important pages. Then, how can we find out the frequencies of visits? Idea: Let s simulate people s behavior while surfing the Internet. People start from one certain page and then move along the links on the page to visit another page. Often they type a URL directly in order to move to a new page, and then start surfing again. If we simulate this behavior to calculate the relative ratio of visits among the web pages, then maybe the ratio represents the importance of each page. If we assume that, how can we simulate the behavior? Answer: We can simulate it by constructing a Markov chain model. Example 1 One example of Markov chain modelling Every year, the movement of populations between Seoul and Sejong follows this: 5% of Seoul s people move to Sejong, and 15% of Sejong s people move to Seoul. The population of Seoul S n and that of Sejong J n satisfy the following relation. S n = 0.95 S n 1 + 0.15 J n 1 J n = 0.05 S n 1 + 0.85 J n 1 3

in other words, ( Sn ) = J n ( ) ( ) 0.95 0.15 Sn 1 0.05 0.85 J n 1 Let S 0 and J 0 be the populations of the two cities at the beginning. Then we can obtain the final populations of them by calculating the equations above or the recurrence formulas over and over. Like the matrix above, we call a matrix a Markov matrix if the sum of elements in each column is exactly 1. We call the sequence of values (such as (S n, J n ) for all n) produced by the recurrence formulas of it a Markov chain. Back to the PageRank problem, we can model the ratio of visits for each page like this Markov chain: Suppose that there are only 3 web pages A, B, and C. Each page has some links that point to other pages. Assume that the ratio that A is visited is proportional to the number of links from A, B, and C to A. For example, if B has 3 links and one of them points to A, then a person who is currently in B will visit A at the next time with the probability of 1/3. For example: (there are 2 links in A, 3 links in B, and 1 link in A.) A n = 1 2 A n 1 + 1 3 B n 1 + 0 1 C n 1 B n = 1 2 A n 1 + 1 3 B n 1 + 1 1 C n 1 C n = 0 2 A n 1 + 1 3 B n 1 + 0 1 C n 1 In other words, A n 1/2 1/3 0 A n 1 B n = 1/2 1/3 1 B n 1 C n 0 1/3 0 C n 1 Generally, for N pages the N N Markov matrix M is like this. x ij M: If page i has a link to page j then 1/r i (r i = the number of links in page i) otherwise 0. Then, the change of ratio of pages along the time is expressed with this recurrence formulas: S n = MS n 1 Goal: We want to calculate the eventual ratio, in other words, we want to calculate the limit of S. S = lim i S i We can let the first vector S 0 have 1/N for each element. (i.e. We suppose that the probability of each page to be the start page is uniform.) 4

Problem: does the limit of the Markov chain exist uniquely? And does the program that calculates the limit by repeatedly solving the recurrence formulas always halt within finite time? hint: Perron-Frobenius Theorem Find the answer for the questions above, and write a function markov_limit that converts the input matrix into a Markov matrix with the unique limit. It should satisfy the module type below. (The functions row, column, add_row, add_column, size, ij are the functions for constructing/using matrices.) module type MARKOV = sig type matrix val row: float list -> matrix val column: float list -> matrix val add_row: float list -> matrix -> matrix val add_column: float list -> matrix -> matrix val size: matrix -> int * int (* numbers of columns and rows *) val ij: matrix -> int -> int -> float (* Given a Markov matrix and an initial column, markov_limit returns the limit of the Markov chain. *) val markov_limit: matrix -> matrix -> matrix end 5

Problem 3 (10%) Transformers Let s create a Transformer that transforms objects in one world into objects in another world. The objects in our transformation are programs which handle integers. The objects that we are going to transform are integer expressions which are constructed according to these rules: E n integer E+E addition expression -E sign-change expression read input to be read from outside if E E E conditional expression repeat E E repeated expression The input expression read is an integer which is entered from outside. The range of input is from -100 to 100. The result of the conditional expression if E 1 E 2 E 3 is determined according to the value of E 1 : if E 1 s value is not 0 then the result becomes the value of E 2, and otherwise, the result becomes the value of E 3. The value of the repeated expression repeat E 1 E 2 is defined only when the value of E 1 is non-negative, and then the result value is by adding the value of E 2 in E 1 times. In other words, it is (E 1 s value) (E 2 s value). Therefore, when the value of E 1 is 0, the result is 0. 6

The resulting objects from the transformation are imperative programs constructed by these rules: C x has n variable x has integer n x has x variable has value of another variable x has x+x variable has the result of addition of two variables x has x-x has result of subtraction of two variables x has read has integer from input say x print value of variable goto l on x jump according to value of variable l: C tagged command C ; C commands executed sequencially As you see, variables play a crucial role in these programs. All values are always stored in variables, and all calculations are executed with variables. Before using variables, they should contain values. The command x has read gets the input from outside and store it at x. As we saw earlier, the value of the input is between -100 and 100. The command goto l on x terminates without any operation when the value of x is 0, and otherwise it jumps to the command to which the tag l is attached. For example, the following program prints 1: x has 1 ; y has 2 ; x has x+y ; goto L on x ; x has 0 ; L: z has x-y ; say z The following program cannot be executed because the variable z is used while it has no value. x has 1 ; y has x+x ; z has y+z ; say z The transformer outputs an imperative program that does the same thing with the input integer expression. The output program should print the same value with the value that the input expression is evaluated to. 7

Check before transformation: there exist some meaningless expressions that the transformer should reject before transformation. There is only one case for the meaningless programs, which happens when the repeated expression repeat E 1 E 2 has a negative value of E 1. In this case the calculation is not defined. The transformer should check whether the input expression contains this meaningless repeated expression. Before transformation, it should predict the value of E 1 in order to check whether it is negative. Check after transformation: you should check whether the resulting program C works correctly. In fact, you should check whether the input and output programs do the same thing, but here we are going to check these two conditions: All values should contain values before use. The jump statements should have well-defined target commands, which means, for each jump statement there should be only one command which has the corresponding tag. The transformer transform and the checkers, check_exp and check_cmd, should satisfy the following module type: module type TRANS = sig type exp = Num of int Add of exp * exp Minus of exp Read If of exp * exp * exp Repeat of exp * exp type var = string type tag = string type cmd = HasNum of var * int HasVar of var * var HasSum of var * var * var HasSub of var * var * var HasRead of var Say of var Goto of tag * var Tag of tag * cmd Seq of cmd * cmd val transform: exp -> cmd val check_exp: exp -> bool val check_cmd: cmd -> bool end 8