COSC 6397 Big Data Analytics
Master-Worker Programming Pattern
Edgar Gabriel
Fall 2018

Master-Worker pattern

General idea: distribute the work among a number of processes.
Two logically different entities:
- Master process: manages assignments to the worker processes
- Worker processes: execute a task assigned to them by the master
Communication only occurs between the master process and the worker processes; no direct worker-to-worker communication is possible.
Master-Worker framework

[Figure: master process connected to worker processes 1 and 2 through a task queue and a result queue]

Master process pseudo code:

    for each worker w = 0 ... no. of workers {
        f = FETCH next piece of work
        SEND f to w
    }
    while ( not done ) {
        RECEIVE result from any process
        w = process who sent the result
        STORE result
        f = FETCH next piece of work
        if ( f != "no more work left" )
            SEND f to w
        else
            SEND termination message to w
    }
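Since the slides leave the communication paradigm open (sockets, MPI, HTTP, ...), the following is a minimal sketch of this master loop in Python with mpi4py. The tag constants and the assumption of at least one task per worker are illustrative choices, not part of the slides.

```python
# Minimal master-loop sketch using mpi4py (one of several possible
# SEND/RECEIVE implementations; tags and task handling are illustrative).
from mpi4py import MPI

WORK_TAG, STOP_TAG = 1, 2          # assumed tag convention

def master(comm, tasks):
    status = MPI.Status()
    it = iter(tasks)
    nworkers = comm.Get_size() - 1
    # initial assignment: one task per worker (assumes enough tasks)
    for w in range(1, nworkers + 1):
        comm.send(next(it), dest=w, tag=WORK_TAG)
    results, stopped = [], 0
    while stopped < nworkers:
        r = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        results.append(r)                      # STORE result
        w = status.Get_source()                # process that sent the result
        try:
            comm.send(next(it), dest=w, tag=WORK_TAG)
        except StopIteration:                  # no more work left
            comm.send(None, dest=w, tag=STOP_TAG)
            stopped += 1
    return results
```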
Master process

List of things to worry about:
- Number of workers:
  - Want to maximize performance -> many workers
  - Too many workers: the master process can become the bottleneck
- Work distribution: too little work per worker can lead to large management overhead for the master
- SEND/RECEIVE can be implemented using any communication paradigm (sockets, MPI, HTTP, files on a shared file system, ...)
- More than one message might be required per communication step

Master process (cont.)

- Support for process failure:
  - Master needs to maintain which work items have been assigned to which worker
  - If a worker fails, its work is reassigned to another worker (see the bookkeeping sketch below)
  - Resilience of the master can be achieved through reliable back-end storage
- Correctness verification: assign the same work to more than one worker and compare the results obtained
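A small sketch of the assignment bookkeeping the slide describes, under the assumption that worker failure is detected externally (e.g. by a timeout); the data structures and names are illustrative.

```python
# Illustrative bookkeeping for fault tolerance: track which work item is
# assigned to which worker so it can be re-queued if that worker fails.
from collections import deque

class WorkTracker:
    def __init__(self, tasks):
        self.pending = deque(tasks)   # not yet assigned
        self.in_flight = {}           # worker id -> currently assigned item

    def assign(self, worker):
        if not self.pending:
            return None               # no more work left
        task = self.pending.popleft()
        self.in_flight[worker] = task
        return task

    def completed(self, worker):
        self.in_flight.pop(worker, None)

    def worker_failed(self, worker):
        # re-queue the failed worker's item so it gets reassigned
        task = self.in_flight.pop(worker, None)
        if task is not None:
            self.pending.appendleft(task)
```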
Worker process pseudo code:

    while ( not done ) {
        RECEIVE work f from master
        if ( f equals termination signal )
            exit
        else {
            result r = execute work on f
            SEND r to master
        }
    }

Worker process

- Dynamically generating worker instances vs. using a fixed number of workers
- A worker can support multiple functions
- An additional message might be required to identify the task that the worker needs to execute
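The matching worker loop as a hedged mpi4py sketch, using the same assumed tag convention as the master sketch above; execute_work stands in for the application-specific task.

```python
# Worker-side counterpart to the earlier master sketch (same assumed tags).
from mpi4py import MPI

WORK_TAG, STOP_TAG = 1, 2

def worker(comm, execute_work):
    status = MPI.Status()
    while True:
        f = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == STOP_TAG:          # termination signal
            return
        comm.send(execute_work(f), dest=0, tag=WORK_TAG)

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    if comm.Get_rank() == 0:
        pass   # run master(comm, tasks) from the earlier sketch here
    else:
        worker(comm, execute_work=lambda f: f * f)   # placeholder task
```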
1st Example: Word Count

- Application counting the number of occurrences of each word in a given input file
- Input: a file with n lines
- Output: list of words and their numbers of occurrences
- Two-step approach:
  - Step 1: Each worker gets one (or a fixed number of) line(s) of the file
    - Worker returns a list of <word, #occurrences> to the master (see the sketch below)
    - Master writes the list of words and #occurrences into a temporary file
  - Step 2: the same word will appear multiple times in the temporary file
    - Sort the temporary file
    - Add up the numbers of occurrences of the same word
    - Write each word and its final number of occurrences to the output file

Word count step 1:

    for each worker w = 0 ... no. of workers {
        f = READLINE (file)
        SEND f to w
    }
    while ( not done ) {
        RECEIVE number of elements in the returned list
        w = process who sent the result
        allocate temporary buffer to receive the list
        RECEIVE list <words, occurrences>
        WRITE list of <words, occurrences> to temporary file
        release temporary buffer
        f = READLINE (file)
        if ( f != EOF )
            SEND f to w
        else
            SEND termination message to w
    }
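A sketch of the per-line work a step-1 worker performs: turning one line of text into a list of <word, #occurrences> pairs. The tokenization rules (lower-casing, whitespace splitting) are simplifying assumptions.

```python
# Per-line work of a word-count worker in step 1.
from collections import Counter

def count_words_in_line(line):
    # lower-casing and whitespace splitting are simplifying assumptions;
    # real tokenization (punctuation handling etc.) is application-specific
    return list(Counter(line.lower().split()).items())

# count_words_in_line("the quick fox sees the dog")
#   -> [('the', 2), ('quick', 1), ('fox', 1), ('sees', 1), ('dog', 1)]
```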
Word count step 2:

    SORT temporary file based on word
    <word, occurrence> = READ (temporary file)
    currentword  = word
    currentcount = occurrence
    while ( not done ) {
        <word, occurrence> = READ (temporary file)
        if ( word != currentword ) {
            WRITE (outputfile, currentword, currentcount)
            currentword  = word
            currentcount = 0
        }
        currentcount += occurrence
    }
    WRITE (outputfile, currentword, currentcount)   // flush the last word

Alternative design possibilities

- Master searches for each word's entry and adds the value received from a worker process immediately
  - Avoids the temporary file
  - Requires a large number of search operations in the file
  - Only the final output file is sorted
- Worker process does not return a list of <word, occurrence> for each line received
  - Just signals that it is ready for a new assignment
  - Returns only the final list after the termination signal
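A compact Python rendering of the sorted-merge in step 2, under the assumption that the temporary file holds one "word count" pair per line.

```python
# Step 2 as a compact sketch: sort the temporary file's <word, occurrence>
# pairs, then sum up adjacent runs of the same word.
from itertools import groupby
from operator import itemgetter

def merge_counts(tmp_path, out_path):
    with open(tmp_path) as f:
        pairs = [(w, int(c)) for w, c in (line.split() for line in f)]
    pairs.sort(key=itemgetter(0))                  # SORT based on word
    with open(out_path, "w") as out:
        for word, group in groupby(pairs, key=itemgetter(0)):
            out.write(f"{word} {sum(c for _, c in group)}\n")
```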
Alternative design possibilities (II)

- Using workers for step 2 is often also necessary
- Using workers for sorting:
  - Requires sending large data volumes
  - Requires direct worker-to-worker communication -> virtually impossible in a classic master-worker setting -> Amdahl's law strikes!
- Aggregation of multiple entries for the same word can be distributed to workers

2nd Example: k-means clustering

- An unsupervised clustering algorithm
- K stands for the number of clusters, typically a user input to the algorithm
- Iterative algorithm
- Might obtain only a local minimum
- Works only for numerical data
- x_1, ..., x_N are the data points
- Each data point (vector x_i) will be assigned to exactly one cluster
- Dissimilarity measure: Euclidean distance metric
K-means Algorithm

- C(i): cluster that data point x_i is currently assigned to
- For the current set of cluster means m_k, assign each data point to exactly one cluster:

    C(i) = \arg\min_{1 \le k \le K} \lVert x_i - m_k \rVert^2, \quad i = 1, ..., N

- N_k: number of data points in cluster k
- For a given cluster assignment C of the data points, compute the cluster means m_k:

    m_k = \frac{1}{N_k} \sum_{i:\, C(i)=k} x_i, \quad k = 1, ..., K

- Iterate the above two steps until convergence

K-means clustering: sequential version

Input:
    X = {x_1, ..., x_N}: instances to be clustered
    K: number of clusters
Output:
    M = {m_1, ..., m_K}: cluster centroids
    C: X -> {1, ..., K}: cluster membership

Algorithm:
    set initial values for M
    while ( M has changed ) {
        c_n = 0,  n = 1, ..., K
        N_n = 0,  n = 1, ..., K
        for each x_j in X {
            C(j) = arg min_n distance (x_j, m_n),  n = 1, ..., K
            c_{C(j)} += x_j
            N_{C(j)} ++
        }
        recompute M based on c and N   // m_n = c_n / N_n
    }
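A hedged NumPy rendering of the sequential loop above; initialization by sampling K data points and the np.allclose convergence test are assumptions, not prescribed by the slide.

```python
# Sequential k-means sketch following the pseudocode above.
import numpy as np

def kmeans(X, K, max_iter=100):
    # initialize M with K distinct data points (one common choice)
    M = X[np.random.choice(len(X), K, replace=False)].copy()
    for _ in range(max_iter):
        # assignment step: nearest centroid per point (squared Euclidean)
        d = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
        C = d.argmin(axis=1)
        # update step: mean of the points assigned to each cluster
        newM = np.array([X[C == k].mean(axis=0) if (C == k).any() else M[k]
                         for k in range(K)])
        if np.allclose(newM, M):       # M has not changed -> converged
            return newM, C
        M = newM
    return M, C
```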
K-means clustering

Choosing the initial cluster means:
- Using some other/simpler pre-clustering algorithm
- Random values
  - Random values are difficult when using multiple processes (pseudo-random number generators)
  - Master process creates the random numbers and distributes them to the workers

K-means example

Let us assume a 2-D problem, 6 data points, 2 clusters:

    x_0 = (0.6, 0.325)    x_1 = (1.0, 0.87)    x_2 = (0.35, 0.5)
    x_3 = (0.2, 0.7)      x_4 = (0.9, 0.65)    x_5 = (0.85, 0.9)

Initial cluster centroids (chosen arbitrarily):

    m_0 = (0.8, 0.4)      m_1 = (1.1, 1.0)
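A small sketch of the initialization strategy the slide suggests: only the master draws the random initial means, then distributes them so every worker sees identical centroids. mpi4py's bcast is one way to do the distribution; point-to-point sends from the master would match the strict master-worker model more literally.

```python
# Only the master draws the random initial means; all processes then
# hold the same m. Sizes (K=2 centroids in 2-D) match the example.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
if comm.Get_rank() == 0:
    m = np.random.rand(2, 2)   # master creates the random numbers
else:
    m = None
m = comm.bcast(m, root=0)      # distribute identical centroids to all
```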
K-means example

Distance calculation, e.g. data point 0 to cluster centroid 0:

    d_{0,0} = \sqrt{(x_0(0) - m_0(0))^2 + (x_0(1) - m_0(1))^2}
            = \sqrt{(0.6 - 0.8)^2 + (0.325 - 0.4)^2} = 0.2136

    Distance        x_0       x_1       x_2       x_3       x_4       x_5
    to cluster 0    0.2136    0.510784  0.460977  0.67082   0.269258  0.502494
    to cluster 1    0.840015  0.164012  0.901388  0.948683  0.403113  0.269258
    Chosen cluster  0         1         0         0         0         1

Recomputing the centroids:

    C_0 = x_0 + x_2 + x_3 + x_4 = (0.6, 0.325) + (0.35, 0.5) + (0.2, 0.7) + (0.9, 0.65) = (2.05, 2.175)
    C_1 = x_1 + x_5 = (1.0, 0.87) + (0.85, 0.9) = (1.85, 1.77)

    N_0 = 4,  N_1 = 2

    m_0 = (2.05 / 4, 2.175 / 4) = (0.5125, 0.54375)
    m_1 = (1.85 / 2, 1.77 / 2)  = (0.925, 0.885)
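A few NumPy lines that reproduce the table and centroid update above, useful for checking the arithmetic.

```python
# Reproduces the worked example: distances, assignments, new centroids.
import numpy as np

X = np.array([[0.6, 0.325], [1.0, 0.87], [0.35, 0.5],
              [0.2, 0.7], [0.9, 0.65], [0.85, 0.9]])
M = np.array([[0.8, 0.4], [1.1, 1.0]])

d = np.sqrt(((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=2))
C = d.argmin(axis=1)                 # -> [0 1 0 0 0 1], matching the table
newM = np.array([X[C == k].mean(axis=0) for k in range(2)])
# -> [[0.5125 0.54375], [0.925 0.885]], matching m_0 and m_1 above
```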
Master-worker k-means

    m = randomly generated initial cluster means
    x = data points
    for each worker w = 0 ... no. of workers {
        SEND portion of x to worker w
    }
    do {
        for each worker w = 0 ... no. of workers {
            SEND m to w
        }
        for each worker w = 0 ... no. of workers {
            RECEIVE C from w
        }
        calculate N and c for all clusters
        recalculate m
    } while ( m has changed )
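A hedged mpi4py sketch of both sides of this scheme. Following the worked example on the next slide, each worker returns partial sums (c, N) for its portion rather than raw assignments; function and variable names are assumptions.

```python
# Master-worker k-means sketch (partial-sum variant of the pseudocode).
from mpi4py import MPI
import numpy as np

def kmeans_master(comm, X, K, max_iter=100):
    nworkers = comm.Get_size() - 1
    # the portion of x per worker is constant -> sent only once
    for w, part in enumerate(np.array_split(X, nworkers), start=1):
        comm.send(part, dest=w)
    m = X[np.random.choice(len(X), K, replace=False)].copy()
    for _ in range(max_iter):
        for w in range(1, nworkers + 1):   # separate send loop ...
            comm.send(m, dest=w)
        c, N = np.zeros_like(m), np.zeros(K)
        for w in range(1, nworkers + 1):   # ... and receive loop
            pc, pN = comm.recv(source=w)
            c, N = c + pc, N + pN          # sequential aggregation (Amdahl!)
        new_m = np.where(N[:, None] > 0, c / np.maximum(N, 1)[:, None], m)
        if np.allclose(new_m, m):          # m has not changed
            break
        m = new_m
    for w in range(1, nworkers + 1):
        comm.send(None, dest=w)            # termination message
    return m

def kmeans_worker(comm):
    Xw = comm.recv(source=0)               # fixed portion of the data points
    while True:
        m = comm.recv(source=0)
        if m is None:                      # termination message
            return
        d = ((Xw[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
        C = d.argmin(axis=1)               # local cluster assignments
        c = np.array([Xw[C == k].sum(axis=0) for k in range(len(m))])
        N = np.array([(C == k).sum() for k in range(len(m))])
        comm.send((c, N), dest=0)          # partial sums back to the master
```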
Master-worker k-means

- The portion of data points x assigned to a worker remains constant -> has to be sent only once
- Separate loops are needed to send m and receive C in order to avoid serializing the worker processes
- The problem is strongly synchronized:
  - All worker processes have to finish before the master process can recalculate N_k, c and m
  - Calculating N_k, c and m is sequential in this version -> Amdahl's law strikes!

Worked example (same data as before), split across two workers; both receive m_0 = (0.8, 0.4) and m_1 = (1.1, 1.0):

    Worker 0: x_0 = (0.6, 0.325),  x_1 = (1.0, 0.87),  x_2 = (0.35, 0.5)

        Distance        x_0       x_1       x_2
        to cluster 0    0.2136    0.510784  0.460977
        to cluster 1    0.840015  0.164012  0.901388
        Chosen cluster  0         1         0

        C_0 = x_0 + x_2,  N_0 = 2
        C_1 = x_1,        N_1 = 1

    Worker 1: x_3 = (0.2, 0.7),  x_4 = (0.9, 0.65),  x_5 = (0.85, 0.9)

        Distance        x_3       x_4       x_5
        to cluster 0    0.67082   0.269258  0.502494
        to cluster 1    0.948683  0.403113  0.269258
        Chosen cluster  0         0         1

        C_0 = x_3 + x_4,  N_0 = 2
        C_1 = x_5,        N_1 = 1

    Master:
        C_0 = C_0 of Worker 0 + C_0 of Worker 1
        N_0 = N_0 of Worker 0 + N_0 of Worker 1
        (likewise for C_1 and N_1)
        recalculate m_0 and m_1
        send the new m_0 and m_1 to Worker 0 and Worker 1
Summary

- The master-worker pattern is useful for a number of application scenarios
  - Mostly trivially parallel problems
- Supports dynamic load balancing
  - A more powerful worker will return work faster and get more work assigned than a slow worker
  - Well suited for heterogeneous environments
- Easy to integrate fault tolerance
- Not well suited for strongly synchronized problems
  - Amdahl's law will prevent the model from scaling up to extreme problem sizes