ACE: And/Or-parallel Copying-based Execution of Logic Programs


Gopal Gupta†, Manuel Hermenegildo*, Enrico Pontelli†, and Vítor Santos Costa‡

† Laboratory for Logic, Database, and Advanced Programming, Dept. of Computer Science, New Mexico State University, Las Cruces, NM, USA.
* Facultad de Informática, U. Madrid (UPM), Madrid, Spain.
‡ Dept. of Computer Science, University of Bristol, Bristol, UK.

Abstract

In this paper we present a novel execution model for parallel implementation of logic programs which is capable of exploiting both independent and-parallelism and or-parallelism in an efficient way. This model extends the stack copying approach, which has been successfully applied in the Muse system to implement or-parallelism, by integrating it with proven techniques used to support independent and-parallelism. We show how all solutions to non-deterministic and-parallel goals are found without repetitions. This is done through recomputation as in Prolog (and in various and-parallel systems, like &-Prolog and DDAS), i.e., solutions of and-parallel goals are not shared. We propose a scheme for the efficient management of the address space in a way that is compatible with the apparently incompatible requirements of both and- and or-parallelism. We also show how the full Prolog language, with all its extra-logical features, can be supported in our and-or parallel system so that its sequential semantics is preserved. The resulting system retains the advantages of both purely or-parallel systems as well as purely and-parallel systems. The stack copying scheme together with our proposed memory management scheme can also be used to implement models that combine dependent and-parallelism and or-parallelism, such as Andorra and Prometheus.

1 Introduction

Recently, stack copying has been demonstrated to be a very successful alternative for representing multiple environments in or-parallel execution of logic programs. In this approach, stack frames are explicitly copied from the stack(s) of one processor (footnote 1) to that of another whenever the latter processor needs to share a branch of the or-parallel tree of the former. In practice, by having an identical logical address space for all processors and allocating the stack(s) of each processor in identical locations of this address space, the copying of stack frames can be reduced to copying large contiguous blocks of memory from the address space of one processor to that of the other, an operation which most current architectures perform quite efficiently, without requiring any sort of pointer relocation. The chief advantage of the stack copying approach is that program execution in a single processor is exactly the same as in a sequential system. This considerably simplifies the building of parallel systems from existing sequential systems, as was shown by MUSE [2, 1], which was built using the sequential SICStus Prolog System.

Footnote 1: Throughout the paper we will often refer to the "stack" of a "processor", meaning the memory areas that a computing agent is using.

Similar arguments can also be made for the design of independent and-parallel systems based on program annotation (i.e. using Conditional Graph Expressions) and recomputation of subgoals [7] (i.e. non-deterministic and-parallel subgoals are recomputed and not shared), as proven by the experiences of &-Prolog [16] and DDAS [23]. Briefly, a program annotated for independent and-parallelism contains expressions of the form (<conditions> => literal_1 & ... & literal_n), where literal_1, ..., literal_n will be executed in (and-) parallel only if the <conditions> are satisfied.

A long-standing goal of parallel logic programming systems designers has been to obtain more general systems by combining different forms of parallelism into a single framework. In particular, one would expect that independent and-parallelism and or-parallelism, which have been exploited so well in Prolog, could naturally be exploited together. In fact, this is a hard problem, as the difficulties (e.g.
supporting full Prolog) faced by several previous proposals [14, 26, 22] do show. Recently an abstract model, called the Composition Tree [12], has been designed that allows efficient realization of systems that combine both forms of parallelism while supporting full Prolog. In this paper we design a novel model, a realization of the C-tree, exploiting or- and independent and-parallelism, which subsumes both the stack copying approach (for or-parallelism) and the subgoal recomputation approach (for and-parallelism). The resulting and-or parallel system, called ACE, is in the same category as PEPSys [26], ROPM [22] and the AO-WAM System [14]. However, our system is arguably better than the above systems in many respects, the chief ones being ease of implementation, sequential efficiency, and better support for the full Prolog language, in particular being able to incorporate side-effects in a more elegant way.

These advantages are due to several factors. One of them is that ACE recomputes independent and-parallel goals, rather than sharing their solutions (solution sharing was adopted in all the previously proposed models [14, 26, 22]). Recomputation means that, given a goal a(x) & b(y), where the two subgoals a and b are independent, the solutions for the subgoal b will be recomputed for every solution found for subgoal a. Recomputation has important advantages (they are discussed at length in [12, 25]), and was fundamental in the design of the C-tree model.

In [12] we presented the C-tree, along with a few preliminary ideas on how to realize the C-tree using an environment representation technique based on stack copying as well as binding arrays. In this paper we show how the complete independent and- and or-parallel system based on the C-tree can be constructed using stack copying. ACE subsumes both MUSE [2] and &-Prolog [16] in terms of execution behaviour. One of our aims is to have ACE subsume the performance characteristics of MUSE and &-Prolog as well, namely, their low parallel overhead, their considerable speedups for interesting applications, and their support for the full Prolog language. To accomplish this we need to carefully address the many issues that arise in combining both forms of parallelism. These issues are:

Synchronization between independent and-work and or-work: that is, deciding when the alternatives created by goals working in independent and-parallel should be made available for or-parallel processing. In ACE we lay down a set of sharing requirements that a choice point should satisfy before a processor can pick an or-parallel alternative from it.
Memory management: for or-parallel execution in MUSE the stacks of one processor should not be visible to the other processors (except during copying), while in independent and-parallel execution in &-Prolog the stacks of one processor should be visible to all other processors. In ACE processors are organized into teams to get around these conflicting requirements for or- and independent and-parallelism respectively.

Scheduling: ACE can use the existing schedulers of MUSE and &-Prolog for scheduling or- and independent and-parallelism respectively. However, in addition, it should also balance the amount of resources allocated for exploiting or- and independent and-parallelism.

Efficient implementation of copying: while MUSE copies stacks of a single processor at a time, ACE needs to copy stacks of multiple processors. Therefore, developing optimized copying techniques is even more fundamental for ACE.

Implementation of Prolog's extra-logical features (such as cuts and side-effects): both MUSE and &-Prolog have developed techniques for supporting full Prolog. In ACE we need to extend these techniques to support sequential Prolog semantics in the presence of both or- and independent and-parallelism. Here we can benefit from the principles designed for the C-tree abstraction [10].

In designing the solutions to these problems, our aim is to obtain a full parallel Prolog system that will have low sequential overheads and good parallel speedups. Also, we try to follow the techniques that have been used for &-Prolog and MUSE as much as possible, as they have proven to be effective in practice.

Our perspective so far of ACE is as concretizing the C-tree framework for combining independent and- and or-parallelism using stack copying. Once ACE is fully described, it will be apparent that ACE can be seen in quite a different perspective.
In this new perspective, ACE generalizes the principle of copying, from the copying used in MUSE to obtain or-parallelism between sequential computations, to copying to obtain or-parallelism between and-parallel computations. In the paper we show that this principle, Generalized Copying, not only gives a way to understand ACE, but also applies, and should be useful, to combine or-parallelism with many forms of and-parallelism, such as parallelism between determinate and-goals as exploited in Basic Andorra [6], or with dependent and independent and-parallelism as exploited in DDAS [23].

The paper is organized as follows. We first present the ACE model. Although the C-tree abstraction is implicit to our reasoning, it is not needed for understanding the rest of the paper. We use the stack copying approach to give a more intuitive feel for the model. We then enter the more specific problems of memory management, and how copying can be implemented between stack sets. We give a brief overview of the new scheduling problems that arise, and present and discuss two schemes for the optimization of copying in ACE. We also propose a scheme to support cut and side-effects in ACE. We finally discuss the effectiveness of ACE and show how our scheme can be generalized to dependent and-parallel systems.

Throughout the paper, we assume some familiarity with the implementation of Prolog, &-Prolog, and MUSE. As in &-Prolog, we assume that programs are annotated to express the and-parallelism using basic Conditional Graph Expressions (CGEs) before execution commences. The &-Prolog parallelization tools [20] will be used to automatically generate such annotations from standard Prolog code. Alternatively,

programs can always be annotated by the user (footnote 2).

2 The Stack Copying Approach to And-Or Parallelism

In ACE, the multiple environments that are needed to implement or-parallelism are supported through explicit stack copying. We first summarize the stack copying approach (as used by the MUSE system). In a stack-copying or-parallel system several processors explore different alternatives in the search tree independently (modulo side-effect synchronization). The execution of each processor is identical to a sequential Prolog execution. Whenever a processor P1 exhausts its branch and wants to share work with another processor P2, it selects an untried alternative from a choice point cp in P2's stack. It then copies the entire stack of P2, backtracks up to the choice point cp in order to undo all the conditional bindings made below cp, and starts executing one of the untried alternatives. In this approach, provided there is a mechanism for copying stacks, the only cells that need to be shared during execution are those corresponding to the choice points. Shared choice points are thus copied from P2's private memory to shared memory, where they can be accessed from both P1's and P2's private memory via pointers (footnote 3); these choice points are said to have been made public, following MUSE's terminology.

If we consider the presence of and-parallelism in addition to or-parallelism, then, depending on the actual types of parallelism appearing in the program and the nesting relation between them, a number of cases can be distinguished. The simplest two cases are of course those where the execution is purely or-parallel or purely and-parallel. Trivially, in these situations standard or-parallel and independent and-parallel execution respectively apply, modulo the memory management issues, which will be dealt with later. We next discuss the interesting cases where both forms of parallelism are present in the computation.
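The copy-then-backtrack step described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not the MUSE or ACE implementation: the `Worker` class, the dictionary standing in for the heap, and the `share_work` function are all hypothetical names introduced for illustration, and the choice point is updated in place rather than moved to a shared area.

```python
import copy


class Worker:
    """Hypothetical model of one processor's stack set (not MUSE's real layout)."""

    def __init__(self):
        self.bindings = {}       # variable name -> value; stands in for the heap
        self.trail = []          # variables bound conditionally, in binding order
        self.choice_points = []  # each entry records a trail mark and untried alternatives

    def push_choice_point(self, alternatives):
        # Remember how long the trail was, so later bindings can be undone.
        self.choice_points.append(
            {"trail_mark": len(self.trail), "alternatives": list(alternatives)}
        )

    def bind(self, var, value):
        self.bindings[var] = value
        self.trail.append(var)

    def backtrack_to(self, cp_index):
        # Undo every binding recorded on the trail after the choice point.
        mark = self.choice_points[cp_index]["trail_mark"]
        while len(self.trail) > mark:
            del self.bindings[self.trail.pop()]
        # Discard choice points created below the selected one.
        del self.choice_points[cp_index + 1:]


def share_work(busy, cp_index):
    """An idle worker takes an untried alternative from `busy`.

    The alternative is removed first (a real system would do this on a
    shared choice point record), then the whole stack set is copied and
    the copy backtracks to the chosen choice point.
    """
    alternative = busy.choice_points[cp_index]["alternatives"].pop(0)
    idle = copy.deepcopy(busy)
    idle.backtrack_to(cp_index)
    return idle, alternative
```

In this sketch the busy worker keeps running undisturbed with all of its bindings, while the copy ends up in exactly the state the busy worker had when it created the choice point, ready to try the stolen alternative.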
2.1 And "Under" Or

And "under" or refers to cases where or-parallelism present inside and-parallel goals is not exploited [25]. Thus, only alternatives in those choice points that are not nested inside any CGE, i.e. not created during processing of and-parallel goals, are made available for or-parallel processing. The two cases are, first, the case in which the goal that gives rise to or-parallelism is not preceded by any CGE; and second, the case in which this goal is in the continuation (but not inside) of some CGE.

Footnote 2: In &-Prolog unrestricted dependency graphs can be expressed (i.e. more general than those possible with CGEs), by combining the "&" operator and synchronization builtins. However, since such graphs can be handled in a similar way to that given in the description that follows, the discussion will be limited for simplicity and without loss of generality to CGEs.

Footnote 3: To be precise, shared choice points are not copied but a record representing the choice point is created in the shared area.

The first case is illustrated in Figure 1. Consider the tree, shown in Figure 1.(i), that is generated as a result of executing a query a which during its execution calls a clause containing a CGE (true => b(x) & c(y)). In Figure 1.(i) processor P1 has started execution of goal a, left an untried alternative ("embryonic branch") a2, and then entered the CGE. And-parallel execution can remain identical to the standard subgoal recomputation approach (as in &-Prolog), hence a processor P2 can simply pick up the execution of goal c. Or-parallel execution can also remain identical to pure stack copying. If processor P3 wants to pick up the a2 branch left behind by P1, it can simply copy the portion of the tree from the root to the embryonic node, and continue with the untried alternative (Figure 1.(ii)). This resembles a standard stack-copying execution (as in MUSE).
Figure 1.(iii) and Figure 1.(iv) present the second case, when a processor selects an untried alternative from a choice point created during execution of a goal g_j in the body of a goal which occurs after a CGE. In other words, there has been and-parallelism above the selected alternative, but all the and-tasks are finished. A processor selecting such an alternative will have to copy all the stack portions in the branch from the root to the CGE, the portions of stacks corresponding to all the and-tasks inside the CGE, and those of the goals between the end of the CGE and g_j. All these portions have in principle to be copied because the untried alternative may need access to variables in all of them.

In Figure 1.(iii), processor P1 started execution of the goal creating a CGE (b & c), and fully executes b. Processor P2 executed the goal c in and-parallel. Both have finished execution of the CGE (leaving no choice points behind) and then processor P1 has taken the continuation and left an untried alternative d2. This alternative can be picked up by another processor P3. The processor P3 has therefore to copy the portion of the tree from the root to the CGE, the portions inside the CGE, and the portion of the continuation up to the embryonic node. The processor P3 can then start execution of the d2 alternative (Figure 1.(iv)).

2.2 Or "Under" And

In "Or Under And" the untried alternatives of choice points created within and-parallel goals in CGEs are also made available for or-parallel processing. One could simplify, and disallow or-parallel processing of such alternatives, trying them sequentially via backtracking instead, but there is experimental evidence that a considerable amount of or-parallelism may be lost [25]. Therefore, ACE does support or-under-and parallelism. When an alternative created within and-parallel goals in a CGE is selected, one

needs to carefully decide which portions of the stacks to copy. Our guiding principle is the following: copy all branches that would be copied in an equivalent or-parallel (MUSE in this case) execution, and recompute all those branches that would be recomputed in an equivalent pure and-parallel computation. As far as the and-parallel execution is concerned, we want to be as close as possible to the recomputation approach, hence implementing the PWAM "point backtracking" strategy [19] used in &-Prolog. As we will see, our strategy results in copying only parts that &-Prolog reuses during backtracking and recomputing those that &-Prolog (and also MUSE and Prolog) recomputes.

[Figure 1: And Under Or. Panels (i) to (iv). Legend: branch executed locally; copied branch; embryonic branch (untried alternative); end of and-parallel goal, beginning of execution of continuation of CGE.]

Consider a CGE (true => g_1 & ... & g_i & ... & g_n) that is encountered during execution, and whose goal g_i has an untried alternative in one of the choice points in its search tree. Assume a processor picks up this untried alternative for or-parallel processing. Then this processor will have to copy all the stack portions in the branch from the root to the CGE, including the CGE descriptor (footnote 4), called C-node in [12] and parcall frame in &-Prolog [18]. It will also have to copy the stack portions corresponding to the goals g_1 ... g_{i-1} (i.e. goals to the left of g_i). The stack portions up to the CGE need to be copied because each different alternative within g_i might produce a different binding for a variable, X, defined in an ancestor goal of the CGE. The stack portions corresponding to goals g_1 through g_{i-1} have to be copied because execution of the goals following the CGE may need to access some of the bindings generated by the goals g_1 ... g_{i-1}. The stack portions corresponding to the goals g_{i+1} ... g_n need not be copied, because these goals would be recomputed.

Footnote 4: The CGE descriptor records the control information for the CGE and its independent and-parallel goals for exploiting and-parallelism.

[Figure 2: Execution tree with alternatives inside the CGE.]

The issue is further illustrated with a simple example. Figure 2 shows the and-or tree for the query q containing a CGE (true => a(x) & b(y)), each of whose goals leads to two solutions. For the sake of simplicity, we have only shown the path from the root of the tree to the CGE. Execution in ACE begins with processor P1 executing the top level query q. When P1 encounters the

CGE, it picks the subgoal a for execution, leaving b for some other processor. Let us assume that processor P2 picks up goal b for execution (Figure 3.(i)). As execution continues P1 finds solution a1 for a, generating a choice point along the way. Likewise, P2 finds solution b1 for b.

[Figure 3: Or Under And. Panels (i) to (iv) show processors P1 to P6 working on copies of the CGE (a & b). Legend: branch executed locally; embryonic branch (untried alternative); copied branch; choice point (branch point).]

Since we allow for full or-parallelism within and-parallel goals, a processor can steal the untried alternative in the choice point created during execution of a by P1. Let us assume that processor P3 steals this alternative, and sets itself up for executing it (before P3 can steal the alternative, P1 has to move the choice point into the shared area). To begin execution of this untried alternative P3 copies the stack of processor P1 (Figure 3.(ii) shows this process; see the index at the bottom of Figure 3 for an explanation of the symbols). P3 then simulates failure to remove conditional bindings made below the choice point, and restarts the goals to its right (i.e. the goal b). Processor P4 picks up the restarted goal b and finds a solution, b1, for it. In the meantime, P3 finds the solution a2 for a (see Figure 3.(ii)).

Note that before P3 can commence with the execution of the untried alternative and P4 can execute the restarted goal b, they have to make sure that any conditional bindings made by P1 after the selected choice point have been cleared, as in MUSE, and that any bindings made by P2 while executing b have been cleared as well. Clearing P2's bindings can be implemented by either (i) P4 copying b from P2 and completely backtracking over it (footnote 5); or (ii) P3 (or P4) getting a copy of the trail stack of P2 and resetting all the variables that appear in it (see later). At this point, two copies of b are being executed in or-parallel, one for each solution of a.
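The copy-or-recompute rule applied in this example can be written down as a short Python sketch. It only partitions the goals of a CGE according to the rule stated in the text; the function name `plan_or_under_and`, the 0-based index, and the `"root..CGE"` label are hypothetical choices made for illustration, not part of ACE.

```python
def plan_or_under_and(goals, i):
    """Partition the work for stealing an alternative inside a CGE.

    goals: the and-parallel goals of the CGE, left to right.
    i:     0-based position of the goal whose choice point holds the
           stolen alternative.
    """
    if not 0 <= i < len(goals):
        raise IndexError("no such goal in the CGE")
    return {
        # Branch from the root to the CGE (including the CGE descriptor),
        # plus every goal to the left of the stolen one, is copied.
        "copy": ["root..CGE"] + goals[:i],
        # The goal holding the choice point is copied, then failure is
        # simulated to undo bindings made below the choice point.
        "copy_then_fail": goals[i],
        # Goals to the right are restarted, i.e. recomputed.
        "recompute": goals[i + 1:],
    }
```

For the CGE (a & b) of Figure 3, calling `plan_or_under_and(["a", "b"], 0)` models P3 stealing a2: only the branch up to the CGE and goal a are copied, and b is recomputed. Calling it with index 1 models a processor stealing an alternative of b: goal a is copied as well and nothing is left to recompute.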
Note that the process of finding the solution b1 for b leaves a choice point behind. The untried alternative in this choice point can be picked up for execution by another processor. This is indeed what is done by processors P5 and P6 for each copy of b that is executing. These processors copy the stack of P2 and P4 respectively, up to the choice point. The stack portions corresponding to goal a are also copied (Figures 3.(iii), 3.(iv)) from processors P1 and P3, respectively. The processors P5 and P6 then proceed to find the solution b2 for b.

Note that if there were no processors available to steal the alternative (corresponding to solution b2) from b then this solution would have been found by processors P2 and P4 (in their respective copies of b) through backtracking as in &-Prolog. The same would apply if no processors were available to steal the alternative from a corresponding to solution a2.

Footnote 5: An optimization could be that P4 chooses not to backtrack over b or recompute it again; rather, P4 simply copies b and reuses it. This optimization is only valid if b has not yet generated a solution (or at least, execution of the continuation of the CGE, which may bind variables conditionally in b, should not have begun). Some problems may also arise with extra-logical predicates in b, and in general only the part before such an extra-logical predicate can be copied into P4.

In the above example, all other operations that are performed during and-parallel execution remain the same as in &-Prolog. Thus, execution of the continuation of the CGE can begin only after at least one solution has been found for all goals in the CGE. Also, backtracking in the CGE takes place just as in &-Prolog, i.e. goals to the right should be completely explored before a processor can backtrack inside a goal to the left.

We place a restriction (called the sharing requirement) on choice points inside a CGE that can be made available for or-parallel processing: given a goal g_i in a CGE, choice points arising in it can be made available for or-parallel processing only if the goals to the left of g_i in that CGE have reached a solution. If the CGE containing g_i is nested inside another CGE, then all goals to the left of the goal leading to the inner CGE should also have found a solution, and so on. Thus, in the example above (Figure 3.(i)), the alternative b2 of b cannot be picked up by any team (footnote 6) until the solution a1 has been found. The sharing requirement serves two purposes: (i) as far as or-parallel scheduling is concerned it keeps us very close to the scheduling strategy employed by MUSE; (ii) it avoids (a form of) speculative or-parallelism, because if the goal to the left (for which a solution had not been found yet) failed, the work would have gone to waste.

We could go one step further, and stipulate that choice points inside CGEs will be made available only if all goals in the CGE have found at least one solution. Although this will keep us closer to &-Prolog and enable us to do a limited form of intelligent backtracking (the kind that is also present in &-Prolog), this will overly restrict the amount of or-parallelism. So this restriction is not adopted, although its lack might result in extra work in some situations. For instance, in the example above (Figure 3), if b were a failing goal, i.e.
a goal without any solutions, then trying multiple alternatives of a in or-parallel would result in wasted work: b's failure would be discovered multiple times since b is recomputed for every alternative of a.

3 Memory Management in ACE

One of the main features of stack-copying based or-parallel systems which greatly facilitates stack copying is that each processor has an identical logical memory address space. This enables one processor to copy (part of) the stack of another without relocating any pointer. In the presence of and-parallelism this feature may be hard to ensure, as each goal in a CGE may be executed in and-parallel by a different processor. In other words, as far as and-parallel execution is concerned, all the participating processors should work on separate segments of a common address space, whereas for or-parallel execution each processor should have an identical but independent logical address space (so that stack portions can be copied without any pointer relocation). Thus, the requirements for or- and and-parallelism seem to be antithetical to each other.

Footnote 6: The idea of viewing a set of workers as a team will be analyzed in detail later on.

The problem can be resolved by dividing all the available processors into teams such that and-parallel work can only be shared between processors of the same team, and or-parallel work can only be shared between teams. All processors in a team thus share the same logical address space, but each team has its own independent logical address space (which must be identical to the address space of all other teams to allow copying without any pointer relocation).

To implement and-parallelism, the address space of each team is divided up into k memory segments (as happens in &-Prolog), where k is the maximum number of processors allowed in any given team. The memory segments are numbered from 1 to k. Each processor of the team allocates its stack set (heap, local stacks, trail etc.) in one of the segments.
The sizes of the k different memory segments in the address space of a team are not required to be the same. However, once one team's address space has been divided into segments using some scheme for division, the address spaces of all other teams should be divided into segments in an identical way, so that during copying of stacks no pointer relocation is needed (footnote 7) (Figure 4). Processors belonging to other teams are allowed to join a different team as long as there is a memory segment available for them to allocate their stacks in the new team's address space.

Footnote 7: This constraint may be relaxed quite a bit, since identical division of address space needs to be done only for those teams that will share computation, and then only for the parts that are shared.

[Figure 4: Address Space in Muse. Each team (Team 1 to Team t) has its own addressing space, running from 0...0 to f...f and divided into Segment 1 to Segment m, with one processor per segment. Legend: choice point; parcall frame.]

Consider the simple scenario where a choice point, belonging to a team T1 and outside the scope of any CGE, is picked by a team T2. Let i be the memory segment number in T1 in which this choice point lies. For simplicity, we assume that the root of the Prolog execution tree also lies in memory segment i. T2 will thus copy the stack from the i-th memory segment of T1 into its own i-th memory segment. Since the logical address space of each team is identical and is divided into identical segments, no pointer relocation is needed. Failure is then simulated and the execution of the untried alternative of the stolen choice point begun.

Now consider the more interesting scenario where a choice point, created by a team T1 and which lies within the scope of a CGE, is picked up by a processor in a team T2. Let this CGE be (true => g_1 & ... & g_n) and let g_i be the goal in the CGE whose sub-tree contains the stolen choice point. T2 needs to copy the stack segments corresponding to the computation from the root up to the CGE and the stack segments corresponding to the goals g_1 through g_i. Let us assume these stack segments lie in the memory segments of team T1 numbered i_1, ..., i_k. They will be copied, at the same position, into the memory segments numbered i_1, ..., i_k of team T2 (section 7 describes a strategy for incremental copying). Failure would then be simulated on g_i.

We further need to remove the conditional bindings made during the execution of the goals g_{i+1} ... g_n by team T1. Let i_{k+1}, ..., i_l be the stack segments where g_{i+1} ... g_n are executing in team T1. As before, we can either copy the trail stacks of these segments, reinitialize (i.e. mark unbound) all variables that appear in them, and then discard the copied trails, or we can copy the stack segments corresponding to goals g_{i+1} ... g_n themselves into the appropriate memory segments of T2 and then backtrack over them. Once removal of conditional bindings is done the execution of the untried alternative of the stolen choice point is begun. The execution of the goals g_{i+1} ... g_n is reinitiated (since we are following a recomputation approach) and these can be executed by other processors which are members of the team (some of this recomputation can be avoided, as mentioned earlier).
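Why identical segment layouts make pointer relocation unnecessary can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the `Team` class, the flat list standing in for a team's logical address space, the 0-based segment numbers, and the `("ref", address)` tuples standing in for pointers do not come from the paper.

```python
class Team:
    """Hypothetical team address space, divided into fixed segments."""

    def __init__(self, segment_sizes):
        self.sizes = list(segment_sizes)
        self.bases = []
        base = 0
        for size in self.sizes:
            self.bases.append(base)  # first address of each segment
            base += size
        self.memory = [None] * base  # the whole logical address space

    def address(self, segment, offset):
        return self.bases[segment] + offset

    def write(self, segment, offset, value):
        self.memory[self.address(segment, offset)] = value


def copy_segments(src, dst, segments):
    """Copy whole segments to the same positions in another team."""
    if src.sizes != dst.sizes:
        # Differently divided address spaces would require relocation.
        raise ValueError("teams must divide their address space identically")
    for s in segments:
        lo = src.bases[s]
        hi = lo + src.sizes[s]
        # Same addresses on both sides, so stored pointers stay valid.
        dst.memory[lo:hi] = src.memory[lo:hi]
```

In this model a cell in one segment may hold the absolute address of a cell in another segment. After `copy_segments` that address designates the corresponding cell in the destination team, which is the property the identical division of address spaces is meant to guarantee; segments that are not listed are left untouched and can host restarted goals.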
Note that whereas copied stack segments occupy the same memory segments as the original stack segments, restarted goals can be executed in any of the available memory segments (clearly if T2 decides to copy the computations done by team T1 for goals g_{i+1} through g_n to save recomputation or for untrailing, as mentioned earlier, then the corresponding stack segments will have to be copied into the same memory segments, i.e. i_{k+1} through i_l, of T2).

[Figure 5: Illustration of Stack Copying.]

Returning to the earlier example (Figure 3), for execution to proceed as shown there, each pair of processors (P1, P2), and (P3, P4) would have to be in the same team (respectively teams T1 and T2). Each of processors P5 and P6 will also have to be in a separate team (respectively teams T3 and T4). Assuming that P1 starts the execution of query q in memory segment number 1, and P2 starts the execution of b in memory segment number 2 (in the address space of team T1), then P3 would be forced to copy the stack segment corresponding to a into memory segment number 1 of its address space. Assuming that only the trail stack of b is copied (to reset conditional bindings), P4 is free to execute b in any memory segment of T2 (which will be a segment other than memory segment 1, because only one processor in a team can use a memory segment at a time). Suppose P4 has its stacks located in segment 4 of the address space of team T2; then, it will execute b in memory segment number 4. When P5 and P6 steal the alternative corresponding to solution b2 then each of them will copy stack segments corresponding to a to memory segment 1 of their respective address spaces, and the stack segment corresponding to b to memory segments 2 and 4 of their respective address spaces.

The copying of stacks by team T2 from team T1 corresponding to Figures 3.(i) and 3.(ii) is further illustrated in Figure 5. To keep the figure simple only

the local stacks are shown. In reality, the heap and the trail will be copied too. Also note that copied choice points are transferred to a shared area to which the choice points in the local address space now point, as in MUSE. The shared memory area is not shown in Figure 5. Note that because goals in a CGE are recomputed, parcall frames and any other structure used to support and-parallelism (such as the various markers used by the PWAM [18]) are copied rather than shared (see Figure 5).

Note also that each memory segment in a team's address space has a complete set of stacks for a processor to work on, corresponding to the "stack set" of &-Prolog [16]. Thus, the segmented memory management proposed can also be viewed as each team having a number of stack sets on which different processors ("agents") can work. This view allows the immediate application of standard memory management techniques developed for independent and-parallelism [17] within each team. This leads to a layering of the parallelism exploitation in ACE: at the lower layer, within each team, the computation is purely and-parallel, as in a group of "stack sets" in &-Prolog to which a number of "agents" are attached; at the higher layer, among the teams, the computation is purely or-parallel (as in MUSE).

Thus, it is easy to see that in the presence of only and-parallelism our system would be as efficient as &-Prolog, while in the presence of only or-parallelism it will be as efficient as the MUSE system. Also notice that the amount of stack copying that will be done, in the presence of both and- and or-parallelism, would be identical to that done in the MUSE system provided, of course, that the scheduling strategy is the same. However, the set-up time for executing the untried alternatives of choice points that fall within the scope of a CGE may be different than in MUSE, due to the spreading of the computation across different stacks.
On the other hand, the actual copying operation may turn out to be even faster than in MUSE, since ACE can take advantage of having multiple processors transferring parts of the computation tree in parallel. Note that these differences only appear when both and- and or-parallelism are being exploited simultaneously.

An interesting property of ACE, also related to memory management, is that it adapts quite naturally to a hybrid multiprocessor in which parts of the address space are shared among subsets of processors, as, for example, in a system containing multiple shared-memory multiprocessors connected by a message-passing or broadcast local network [4]. In this kind of system each shared-memory multiprocessor would naturally be a candidate for constituting a team with its own memory address space, and the various teams would then be spread over the different multiprocessors and communicate over the message-passing or broadcast local network. And-parallelism would be exploited within each shared-memory system, while or-parallelism would be exploited over the distributed network of these shared-memory systems. The argument above is based on locality of addressing space, but a perhaps even more important factor is that access time depends on location. From this point of view it also makes sense to keep the processors in a team, which communicate more often, within the fast communication area and to put different teams, which communicate less often, at a larger distance from the point of view of communication. Similar principles apply, and a similar approach can be taken, for implementing and-or parallel systems on general NUMA (Non-Uniform Memory Access) machines, even if they have a global addressing space.

4 Scheduling

The need to schedule work arises at two independent levels: (i) and-parallel work at the level of processors within a team, and (ii) or-parallel work at the level of teams.
Thus a processor can steal and-parallel work only from members of its own team, and an idle team can steal an untried alternative from a choice point. This suggests that separate schedulers can be used for managing the and-parallel and or-parallel work respectively. Schedulers developed for &-Prolog and for MUSE can be used for this purpose. For example, and-parallel work can be managed exactly as in the &-Prolog scheduler: idle processors steal available goals from the goal stacks of other processors in their team. Or-parallel work can be managed as in MUSE: idle teams request work from other teams (as in MUSE, it will be convenient to share as much work as possible). A distinction can be made between the public part and the private part of the execution tree: the choice points in the public part have been made sharable, while those in the private part have not been made sharable yet. Execution in a team continues normally as in an and-parallel system (as described above), until the team is interrupted by another that is looking for work. At this point all choice points in the private part that satisfy the sharing requirement are made sharable, or public. The requesting team picks an alternative from one of these choice points for or-parallel processing.

Finally, one has the new problem of balancing the number of teams and the number of processors in each team, in order to fully exploit all the and- and or-parallelism available in a given Prolog program. To solve this problem dynamically, processors can migrate from one team to another or start new teams. An idle processor first looks for and-parallel work in its own team. If no and-parallel work is found, it can decide to migrate to another team where there is work, provided (a) it is not the last remaining processor in the team it leaves, and (b) there is a free memory segment in the memory space of the team it joins. If no such team exists, it can start a new team of its own, perhaps with idle processors of other teams, provided there is a free address space available for the new team. The new team can now steal or-parallel work from other teams. Some of the 'flexible scheduling' techniques [8] that are being developed in the Andorra-I system for deciding when a processor should switch teams can also be used in ACE.

5 Implementation of ACE

The discussion so far has aimed at providing a general, high-level description of the ACE execution model. In this section we present a number of practical issues that arise in the implementation of the model and propose solutions for resolving them efficiently. The two main issues we study are memory management (in particular, how copying only parts of the stack sets between teams can be implemented efficiently) and how full Prolog should be supported in ACE. (Further details on the implementation of ACE, not included here for lack of space, can be found in [13].)

5.1 Goal Execution Order in CGEs

Memory management is a complex problem in the implementation of parallel logic programming systems, one that is closely related to scheduling [17]. Memory management is simplified in MUSE because each processor manipulates a separate Prolog stack set. In contrast, in ACE a team manipulates multiple stack sets that may have to be copied when teams fetch work from other teams. Furthermore, depending on and-scheduling, only parts of such stack sets may be needed: the order in which stack frames are pushed on a processor's stack may not obey the order in which they would have been pushed in a sequential Prolog implementation, and thus a stack segment may contain "trapped" stack frames (actually, whole "stack sections") that are not part of the computation surrounding it [17]. As a result, when copying stack segments we may copy sections that are unrelated to the branch we need.
We can completely avoid copying these non-relevant parts, but then many small fragments of the stack will have to be copied, making the copying operation somewhat inefficient [13], and, in any case, the hole created by the trapped goal would remain in the destination stacks because copying is address-to-address to avoid pointer relocation. Incremental copying is also made difficult by this potential lack of order. We explain these practical issues in more detail next.

Ideally, we would prefer that a parallel stack-based system implementing Prolog semantics obey the seniority constraint: given two stack frames, $f_1$ and $f_2$, corresponding to two nodes in the Prolog search tree, $f_1$ should be allowed to appear above $f_2$ (there might be other intervening nodes between $f_1$ and $f_2$) if and only if $f_1$ will appear above $f_2$ in the stack in standard sequential execution of Prolog. Thus frames of descendant nodes in the execution tree must appear on top of frames of their ancestors. The reason why this is helpful is that, while backtracking, if we reach a frame $f$ then we know that $f$ is on top of the stack and that the frames corresponding to all its descendants have been backtracked over and reclaimed, which considerably simplifies memory management. If the seniority constraint is not obeyed, then holes may appear in the stack in both and- and or-parallel systems, and "trapped goals" may appear in and-parallel systems [17].

In fact, the seniority constraint may impose severe restrictions on parallel systems. Enforcing it for an independent and-parallel system such as &-Prolog (or ACE) may severely constrain the way and-parallel goals can be scheduled. Given a CGE (true => a & b & c), if a processor picks the goal c for and-parallel processing, then following this constraint will effectively shut it out (after it has finished processing c) from picking any goals to the left of c, or goals from CGEs created during execution of a or b, or goals from ancestor CGEs that are to the left.
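The seniority constraint can be illustrated with a small check. This is only a sketch under simplifying assumptions: a processor's stack is modeled as a list of goal names from bottom to top, and the order in which sequential Prolog would push the same frames is given as a flat list. The function name `obeys_seniority` and both parameters are hypothetical and do not come from ACE or &-Prolog.

```python
def obeys_seniority(stack, sequential_order):
    """Return True if every frame sits above only frames that sequential
    Prolog would have pushed earlier.

    stack            -- frames on one processor's stack, bottom to top
    sequential_order -- the same frames in sequential Prolog push order
    (both representations are assumptions made for this illustration)
    """
    position = {frame: i for i, frame in enumerate(sequential_order)}
    ranks = [position[frame] for frame in stack]
    # Ranks must strictly increase from the bottom of the stack to the top.
    return all(lower < upper for lower, upper in zip(ranks, ranks[1:]))


# CGE (true => a & b & c): sequential Prolog would run a, then b, then c.
sequential_order = ["a", "b", "c"]
print(obeys_seniority(["a", "c"], sequential_order))  # c on top of a
print(obeys_seniority(["c", "b"], sequential_order))  # b picked after c
```

In this model a processor that has already pushed c cannot later push b without breaking the constraint, which is the scheduling restriction described in the text.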
The seniority constraint is obviously too severe in this case, and indeed systems such as &-Prolog [16] and Aurora [9] do not obey it. Rather, they let holes (and trapped goals, in the case of &-Prolog) be created in the stack, to be reclaimed when everything above the hole gets reclaimed (see Figure 6). This creates many problems in ACE, because now when we copy a stack set we may copy many trapped goals that are not part of the alternative being stolen. These trapped goals may need to be identified before execution can begin in the copying team. This problem and our solutions are further discussed in the next section in the context of techniques for incremental copying of stacks. (Footnote: Note that goal recomputation, as used in ACE, actually helps in maintaining seniority constraints, because every time we recompute goals, we execute them on top of existing solved goals that are to the left, thus righting the order somewhat.)

[Figure 6: Trapped Goals in And-parallel Execution. Processors P1, P2 and P3 are processing the and-or tree shown on the left of the figure. Processor P3 picks goal d from P1 for and-parallel processing, finds solution d1, and starts helping P2 to solve the goals in its CGE by picking up goal h. The goal d is now trapped under goal h, because it would be backtracked over first. Similarly, k is picked by P3 after finishing h1, which in turn traps h. Note that b&c&d etc. in the stacks denote the parcall frame for that CGE.]

5.2 Incremental Copying in ACE

An optimization that significantly improves the performance of stack-copying or-parallel systems, like MUSE, is incremental copying, i.e., when a processor copies the stack of another, only those parts in which the two processors differ are copied. This is illustrated in Figure 7 (only local stacks are shown). Suppose processor P1 is working on branch 1, and P2 on branch 2. At this point both P1 and P2 have a common stack up to the branch node a (modulo conditional bindings). Suppose now that after exploring branch 2, P2 decides to pick an alternative from P1 (along the branch marked 3) in node b. To do so it backtracks up to node a and steals the second alternative from b in P1. Therefore, before P2 can proceed, it needs to create on its stacks the state that existed in P1 at the time the choice point corresponding to b was created. To do so it copies P1's stacks. The copying and restoring of state can be done in three ways [3] (Figure 7):

[Figure 7: Incremental Copying in MUSE]

(i) Total: copy the entire stacks of P1 (everything from the root to the bottommost node along branch 1), then backtrack until choice point b is reached. Thus, the hatched, gray, and black shaded segments of P1's stack in Figure 7 will be copied.
(ii) Incremental: copy only frames below choice point a (those above a are already on P2's stack), then backtrack until choice point b. Thus, only the gray and black segments in P1's stack in Figure 7 will be copied.
(iii) Optimized incremental: copy only the stack segments between choice points b and a, because those above a are already on P2's stack, and those below b are not needed for execution. Thus, only the gray shaded portion of P1's stack in Figure 7 is copied. The exception is that the entire trail stack below a is copied, so that the parts of the trail stack below choice point b can be used for removing conditional bindings.
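The three options can be compared on a toy model of P1's local stack. The sketch below is an illustration only, not MUSE code: the stack is a list of frame labels from the root downward, `a` and `b` are the list indices of the two choice points, and the function name `frames_to_copy` is hypothetical. It assumes that choice point a itself is already on P2's stack and that choice point b must be included in the copy. The trail exception of option (iii) is not modeled.

```python
def frames_to_copy(stack, a, b, mode):
    """Return the frames of P1's local stack that P2 would copy.

    stack -- frame labels listed from the root (index 0) to the bottommost node
    a     -- index of the last choice point common to P1 and P2
    b     -- index of the choice point the alternative is stolen from (b > a)
    mode  -- "total", "incremental" or "optimized"
    """
    if mode == "total":
        return stack[:]              # everything, then backtrack to b
    if mode == "incremental":
        return stack[a + 1:]         # only what lies below a, then backtrack to b
    if mode == "optimized":
        return stack[a + 1:b + 1]    # only the part between a and b
    raise ValueError("unknown copying mode: " + mode)


p1_stack = ["root", "a", "x", "b", "y", "z"]
for mode in ("total", "incremental", "optimized"):
    print(mode, frames_to_copy(p1_stack, a=1, b=3, mode=mode))
```

The frames returned for "total" and "incremental" but not for "optimized" are exactly the ones that would be copied and then immediately backtracked over.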
Clearly, option (i) involves unnecessary copying because there are copied parts that are immediately backtracked over and reclaimed. (Footnote: Experiments on the Sequent Symmetry have shown that for memory chunks larger than 4K the copying time is proportional to the size of the memory chunk being copied [13].) Option (ii) also does unnecessary copying, unless the black shaded part of the stack in Figure 7 is very small. In MUSE the difference between Incremental and Optimized is almost irrelevant, since in most cases there will be hardly anything on the stack below the choice point from which the new alternative is taken (assuming the stack grows downward as in Figure 7). This is a consequence of the scheduling policy adopted by MUSE, in which alternatives are always taken from the bottommost choice point (known as dispatching on bottommost [2]).

In ACE, however, things are different due to the presence of and-parallelism. Referring again to the and-or tree shown in Figure 6, suppose that a team, T2, was working on alternative g2 of goal g in the inner CGE (which it stole from a team, T1, earlier). It finds a solution, looks for more work, and decides to pick an alternative h2 from node h (corresponding to solution g1 from T1). T2 and T1 have a common stack up to the CGE labeled (g & h & i). The stack frames leading up to choice point g are also present in both. Applying the idea of incremental copying, T2 will have to copy the difference between T1 and itself. As before, there are two ways of copying incrementally: (i) blindly copying the difference between corresponding stacks (of different processors of the two teams) onto T2's stacks (Incremental Copying); (ii) copying only those parts which will be useful for T2, i.e., leaving out the parts that will be immediately backtracked over (e.g., the frames corresponding to h1, i1, c1 and d1) and copying only the trail for such parts (Optimized Incremental Copying). Incremental Copying is illustrated in Figure 8.

While in MUSE Incremental copying (rather than Optimized incremental copying) results in very little space being copied that gets immediately backtracked over, in ACE this may not be the case, as ACE supports and-parallelism and follows the sharing requirement to make nodes public.
Consider the following scenario: suppose, as before, that T2 tries to steal alternative h2 from choice point h, and that T1 had not yet found a solution for goal i. In this case, T1 will not make public the choice points of any of the branches to the right of i (that is, choice points created from the goal d in the example). This is for two main reasons. First, as mentioned when presenting the sharing requirement, work available from these choice points will be very speculative, as i may yet fail (possibly after computing for a long time), and all the work in copying these branches and picking work from them may therefore be wasted. Second, making these choice points public will lead to mixing of the public and private parts of the logical search tree. (Footnote: Note that they are already mixed in the physical stacks.) For instance, if execution of i (or of the CGE's continuation) in T1 were to lead to further choice points, they would initially be private and hence not visible to other teams, although choice points of goals to the right of i (such as d) would be public. Thus, during normal backtracking through a CGE, private choice points might be encountered after a processor has backtracked into the public area (or worse yet, if another team steals an alternative from d and later backtracks, it will not see parts of i at all, since they were never copied because they were private to T1). This mixing of public and private areas of the logical search tree will thus result in complications in scheduling. Hence, choice points in goals to the right of an incomplete goal in a CGE are never made public.

As a consequence, Incremental copying will end up copying all the private goals, which may form a large part of the tree, and immediately discarding them (by backtracking over them), which is clearly a waste. Therefore, it makes sense to use Optimized copying for ACE, although it is more complicated to implement.
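The rule that choice points to the right of an incomplete goal stay private can be sketched as a left-to-right scan over the goals of a CGE. The representation below (a list of tuples) and the name `sharable_choice_points` are assumptions made for illustration. The sketch also assumes that the incomplete goal's own choice points remain candidates for sharing, which the text does not state explicitly.

```python
def sharable_choice_points(goals):
    """Return the choice points of a CGE that may be made public.

    goals -- list of (name, has_solution, choice_points) tuples in
             left-to-right CGE order (a hypothetical representation)
    """
    public = []
    for _name, has_solution, choice_points in goals:
        public.extend(choice_points)
        if not has_solution:
            # Everything to the right of an incomplete goal stays private.
            break
    return public


cge = [
    ("g1", True, ["cp_g1"]),
    ("g2", False, ["cp_g2"]),   # no solution found yet
    ("g3", True, ["cp_g3"]),    # to the right of g2, so it stays private
]
print(sharable_choice_points(cge))
```

Under blind incremental copying the frames of g3 would still be copied and then backtracked over, which is the waste the text describes.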
It is less clear, on the other hand, which incremental copying approach (Incremental or Optimized) will reduce the synchronization time between teams T1 and T2. However, Incremental copying is clearly the simpler of the two: with Incremental copying T1 has to synchronize only for the duration of the copying of the difference, while in the Optimized case we first have to figure out the limits of copying for each processor stack in T1 (which may require a complete traversal of the and-parallel tree computed by T1) and then do the copying. Optimized incremental copying for ACE for the and-or tree shown in Figure 8 is illustrated in Figure 9.

In ACE we propose to use both Incremental and Optimized incremental copying, depending on the situation. The following heuristic tries to balance the excessive unnecessary copying (Incremental copying) against the excessive synchronization time (Optimized copying) by dynamically detecting which of the two options may give the best results. If the choice point from which the alternative is being stolen is outside the scope of any CGE, or it is in the scope of some CGEs and all these CGEs have found a solution (i.e., each subgoal of each CGE has already found a solution) in the team from which the alternative is being stolen, then Incremental copying will be adopted. Otherwise, Optimized incremental copying will be used.

As mentioned, Optimized copying requires traversal of all CGEs in which the choice point is nested and obtaining the addresses of input markers and end markers [18, 16, 13] for each goal in these CGEs. From these addresses and the information in parcall frames we determine the useful part of the various stacks to be copied. Finally, note that all processors in a team can cooperate to speed up detecting the areas to be copied and copying the stack segments from one team to another.
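The heuristic for choosing between the two copying schemes reduces to a single test over the CGEs that enclose the choice point. The sketch below assumes a hypothetical representation in which each enclosing CGE is a list of booleans, one per subgoal, recording whether that subgoal has found a solution in the team being stolen from. The function name `choose_copy_mode` is likewise an assumption.

```python
def choose_copy_mode(enclosing_cges):
    """Pick the copying scheme for stealing from one choice point.

    enclosing_cges -- one list of booleans per CGE in whose scope the choice
                      point lies; each boolean says whether that subgoal has
                      already found a solution in the source team
    """
    # An empty list means the choice point is outside the scope of any CGE.
    if all(all(subgoals) for subgoals in enclosing_cges):
        return "incremental"
    return "optimized"


print(choose_copy_mode([]))                             # outside any CGE
print(choose_copy_mode([[True, True]]))                 # every subgoal solved
print(choose_copy_mode([[True, True], [True, False]]))  # one subgoal pending
```

The cheaper-to-set-up Incremental scheme is therefore used exactly when no private, unfinished goals would be copied and then discarded.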

[Figure 8: Incremental Copying in ACE. Team T1 = (P1, P2, P3); Team T2 = (P4, P5, P6). T2, initially positioned at choice point b, steals alternative g2 from g, does blind incremental copying, backtracks over d1, c1, i1, g1, and h1 (creating a hole in the stack of P6, shown shaded), computes g2, and restarts all goals to the right (h, i, and d). The state of the stacks is shown after all this is done.]

[Figure 9: Optimized Incremental Copying in ACE. Team T1 = (P1, P2, P3); Team T2 = (P4, P5, P6). Panels: (i) T1 computes along g1; (ii) T2 steals from g and computes along g2; (iii) T2 backtracks to g; (iv) T2 uses optimized incremental copying to pick h2 from T1; (v) T2 finds solution h2 for h and recomputes i and d. In panel (iv) only the frames below CGE (g&h&i) up to choice point h are copied. Blind incremental copying, in addition to the copying done here, would also copy the frames above g in P2 and above h1 in P3 and then backtrack over them. Also note that useless space (holes) might later have useful information copied into it (e.g., stack frame h in panel (iv)).]

5.3 Implementing Side-effects and Cuts in the ACE Model

One advantage of an and-or parallel model that recomputes independent goals is that, since it closely mirrors traditional Prolog execution, it can quite easily support full Prolog, i.e., support the execution of order-sensitive predicates such as side-effect predicates (e.g., read, write, assert, retract, and calls to dynamic predicates) and cut (!).

Essentially, a side-effect predicate (sep for brevity) should be executed only after the sep "preceding" it (preceding in the sense of left-to-right, top-to-bottom Prolog order) has finished execution. If the preceding sep has not been executed, the current sep should suspend, and resume after the execution of the preceding sep is over. However, given a sep, determining the sep that "precedes" it is akin to solving the halting problem, and therefore the knowledge that the preceding sep has finished has to be approximated. For example, consider supporting seps in purely or-parallel systems [15]. Here, the preceding sep is assumed to be finished if the or-branch containing it has finished execution. In other words, a sep is executed only when the branch containing it becomes the leftmost in the or-parallel tree. Likewise, in a purely independent and-parallel system, such as &-Prolog, a sep encountered in an independent and-parallel goal g in a CGE C is executed only after all the independent and-parallel goals to the left of g in the CGE C have finished execution. If the CGE C containing the goal g is nested inside a goal h, which is an independent and-parallel goal in another CGE D, then all the independent and-parallel goals in CGE D that are to the left of goal h should have finished, and so on.

We can combine the conditions for executing seps in a purely or-parallel system with those for a purely and-parallel system to generate the conditions for executing a sep in an and-or parallel system such as ACE. Given a CGE (cond => $g_1$ & ... & $g_i$ & ... & $g_n$), where we assume that the parallel execution of goal $g_i$ leads to a side-effect, the conditions under which this side-effect will be executed are given below. Note that the goal $g_i$ is being recomputed in response to solutions $s_1 \ldots s_{i-1}$ that will be found for goals $g_1 \ldots g_{i-1}$ respectively. Let $b_1 \ldots b_{i-1}$ be the or-branches in the respective search trees of goals $g_1 \ldots g_{i-1}$ that lead to these solutions. The conditions are as follows:

(i) The or-branch that contains the sep in the search tree of goal $g_i$ should become leftmost. (Footnote: With respect to the equivalent or-parallel tree.)
(ii) The computation of solutions $s_1 \ldots s_{i-1}$ should have finished, and the or-branches $b_1 \ldots b_{i-1}$ should be leftmost in the search trees of their respective goals $g_1 \ldots g_{i-1}$.
(iii) If the CGE containing $g_i$ is nested inside another CGE, then conditions (i) and (ii) must recursively hold for the inner CGE with respect to the outer CGE. If the CGE is not nested inside other CGEs, then the or-branch in which it appears should be leftmost with respect to the root of the whole computation tree.

In the rest of this section we present a concrete technique for determining when a sep's turn for execution has come during and-or parallel execution. The technique makes use of techniques developed for &-Prolog [7, 21, 5], MUSE [3], and Aurora [9]. For simplicity, and without loss of generality, we assume that when a processor reaches a sep it repeatedly performs the above check until it succeeds (thus the processor busy-waits rather than suspends). However, suspension would be used in practice, so that the processor that encountered the sep, rather than busy-waiting and wasting CPU cycles, can do useful work elsewhere.

5.3.1 Side-Effects in ACE

Note that while verifying the above conditions to check whether a side-effect can be executed, processors need to access shared choice points recorded in the shared memory (to do the leftmost check). This can be expensive, especially in a non-shared-memory or hybrid multiprocessor system.
One can reduce the number of accesses to shared memory by requiring a processor that has reached a side-effect to first:

(a) check whether all goals to the left of $g_i$ in the current CGE, and those to the left in all the ancestor CGEs, have produced a solution (first part of condition (ii), and condition (iii));
(b) check whether the side-effect is in the leftmost branch, and the solutions to preceding goals in all the CGEs are in leftmost branches (condition (i), second part of condition (ii), and condition (iii)).

Note that check (a) does not require access to the shared area; it is performed wholly within the address space of the team executing the side-effect. Check (b) will be made only after check (a) succeeds, thus reducing the number of accesses to the shared area. The above decomposition also neatly separates the and-parallel and the or-parallel components of the check. Both checks (a) and (b) must be implemented efficiently, particularly check (a), since it is going to be performed more often. (Footnote: Indeed, check (a) can be implemented quite efficiently, since the appropriate information about the status of and-parallel goals is maintained in the CGE's descriptor, and therefore performing check (a) involves a simple look-up of the corresponding parcall frame(s).)

(Footnote on the use of suspension in practice: Implementation of suspension does not present problems in &-Prolog. Techniques for implementing suspension more efficiently in MUSE, by storing the difference between the suspended branch and the one that the processor switches to, have recently also been developed by the MUSE group. These techniques can be adapted for ACE.)
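The staging of the two checks can be sketched as follows. This is a minimal illustration under stated assumptions, not the ACE data layout: the class `ParcallFrame` and its fields `solved`, `goal_index` and `parent`, and the functions `check_a` and `side_effect_may_run`, are hypothetical names. Check (b) is passed in as an opaque callable because it needs the shared area.

```python
class ParcallFrame:
    """Hypothetical CGE descriptor kept in the team's own address space."""

    def __init__(self, solved, goal_index, parent=None):
        self.solved = solved          # solved[k] is True once goal k has a solution
        self.goal_index = goal_index  # position of the goal leading to the side-effect
        self.parent = parent          # parcall frame of the enclosing CGE, if any


def check_a(frame):
    """Check (a): all goals to the left, in this CGE and in every ancestor
    CGE, have produced a solution. Only local look-ups are needed."""
    while frame is not None:
        if not all(frame.solved[:frame.goal_index]):
            return False
        frame = frame.parent
    return True


def side_effect_may_run(frame, check_b):
    """Run the expensive shared-memory check (b) only after (a) succeeds."""
    return check_a(frame) and check_b()


outer = ParcallFrame(solved=[True, False, False], goal_index=1)
inner = ParcallFrame(solved=[True, True, False], goal_index=2, parent=outer)
print(side_effect_may_run(inner, check_b=lambda: True))
```

Because `and` short-circuits, `check_b` is never called while some goal to the left is still unsolved, which is the saving in shared-memory accesses that the decomposition is meant to achieve.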

The presence of the sharing requirement allows us to separate the side-effect checks for or- and and-parallelism in a different way. In fact, the sharing requirement guarantees that all the branches to the left of a public choice point are complete (otherwise the choice point would not satisfy the requirement during the sharing operation). Because of this we need not perform check (a) in the public part of the tree. Furthermore, in the private part of the tree check (b) is unnecessary, since no sharing operations have been performed (the side-effect is certainly in the leftmost branch). Thanks to these observations, if P is the bottommost public node in the current branch, then we can organize the side-effect check as follows: (1) apply check (a) only to the subtree rooted at P; (2) apply check (b) only to the public part of the tree above P.

Two main algorithms have been proposed to handle side-effects in independent and-parallel systems (like &-Prolog): synchronization blocks [7, 21], and visiting each ancestor CGE and checking whether goals to the left have finished [5]. Either one of these can be used for performing check (a).

To check whether a given node is in the leftmost branch of a given subtree, we need access to the left sibling nodes of the immediate ancestor choice point nodes (given a node, if the choice point node above it does not have any left siblings, the node is in the leftmost branch of the subtree rooted at that choice point). However, the sibling nodes of a choice point are not directly accessible to a team doing the check, so we have to use some other technique to determine this. The technique that we use parallels the technique proposed for MUSE [3]. We use the fact that part of the choice point in ACE is shared, and hence the fields in the shared part of the choice point are visible to all teams. Each shared choice point in ACE includes a teamsbitmap (from MUSE's workersbitmap). The teamsbitmap indicates which teams are exploiting alternatives of that node.
When the $i$th alternative is picked by a team from a node, the $i$th bit in the teamsbitmap is set. When the subtree corresponding to the $i$th alternative has been completely explored and backtracked through, the $i$th bit is reset. In the alt-number field in the private part of the choice point, a team also records the number of the alternative which it picked from this choice point. Note that the alt-number field will occupy the same memory address in the address space of each team that is working below this choice point.

The algorithm for verifying leftmostness is thus as follows: the team goes up the execution tree; whenever it reaches a shared choice point it looks at the corresponding teamsbitmap; if there are other teams working on the alternatives of this node, the corresponding images of the choice point are checked in the address spaces of these teams to see if the current branch is leftmost. This is done by a simple arithmetic comparison of the alt-number fields in these choice points. Note that while checking for leftmostness of side-effect goals, of solutions of goals to the left in a CGE, etc., we are only concerned with determining leftmostness of nodes in the subtree of a goal in the CGE (local leftmostness), and not in the whole program search tree (global leftmostness).

Several optimizations are necessary to make this algorithm efficient. First, if two teams share a choice point N, they will also share all ancestor nodes. Thus, one needs to compare two teams only once, for the youngest node they share. Second, as in the Aurora schedulers (and as proposed as an optimization in MUSE), one can keep track of the current node up to where a worker or team is leftmost.
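The basic leftmostness walk, without the optimizations, can be sketched as below. The data layout is an assumption made for illustration: each shared choice point is a dictionary whose `"teamsbitmap"` entry is modeled as a set of team names, and whose `"alt_number"` entry maps each team to the alternative it picked, standing in for the images of the choice point in the teams' address spaces. The function name `is_locally_leftmost` is hypothetical.

```python
def is_locally_leftmost(shared_path, my_team):
    """Walk up the shared choice points of the current branch and compare
    alternative numbers with every other team recorded at each node.

    shared_path -- shared choice points from the youngest one up to the root
                   of the subtree being checked (local leftmostness)
    my_team     -- identifier of the team performing the check
    """
    for choice_point in shared_path:
        mine = choice_point["alt_number"][my_team]
        for other in choice_point["teamsbitmap"]:
            if other != my_team and choice_point["alt_number"][other] < mine:
                return False  # another team works on a branch to our left
    return True


node = {"teamsbitmap": {"T1", "T2"}, "alt_number": {"T1": 1, "T2": 2}}
print(is_locally_leftmost([node], "T1"))
print(is_locally_leftmost([node], "T2"))
```

Bounding `shared_path` at the root of the goal's subtree gives the local check described in the text; extending it to the root of the whole tree would give the global one.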
A further optimization is to avoid accessing any remote choice point altogether by storing in each shared frame a bitmap which indicates, for each alternative in the choice point, whether there is at least one active team working on that alternative.

5.3.2 Implementing Cut in the ACE Model

The effect of a cut is to prune all branches to the right of the path from the place where the cut is executed to the node where the clause containing the cut was introduced (cut level). Hence, because a cut can only cut up to the current CGE, condition (iii) is always trivially satisfied. The treatment of cuts is similar to that of side-effects, except that in the case of cut some action can be taken (i.e., some pruning can be done) without the cut becoming leftmost in the entire tree [15]. Basically, a cut can be immediately exercised in the subtree in which that cut is leftmost. Other branches can be pruned only after the cut becomes leftmost in the entire tree. Thus, in ACE, when a cut is encountered, pruning can be done immediately up to the point where conditions (i) and (ii) above succeed. To prune other parts the team has to wait until conditions (i) and (ii) are satisfied right up to the root node, i.e., until the cut becomes globally leftmost. Note that pruning a choice point consists of clearing that choice point and signaling any teams exploring alternatives to the right to terminate execution. Termination of execution by a team means that all the processors in the team abandon execution and backtrack. The efficient techniques used to deal with cuts in or-parallel systems (like those of MUSE [3] and Aurora [9]) can be adapted to ACE.

6 Efficiency and Generality of the ACE Model

We believe that an implementation of the ACE model will be quite an efficient realization of an or- and independent and-parallel system. This is primarily because, as may already be evident, in the presence of only or-parallelism ACE will be as efficient as MUSE, while in the presence of only independent and-parallelism it will be as efficient as &-Prolog. Therefore, it appears clear that having an ACE system would be at least as powerful and efficient as having both a MUSE and an &-Prolog system, in the sense that a single system will now run or-parallel-only programs and and-parallel-only programs with performance similar to that of the MUSE and &-Prolog systems respectively. ACE should also combine speedups for programs where both or- and independent and-parallelism are available, hence performing even better than the best of MUSE or &-Prolog for such applications.

Note that, with respect to MUSE, the parts that are copied in and-or parallel execution in ACE for a given program are exactly those that will be copied by MUSE in an equivalent purely or-parallel execution of the same program. However, whereas MUSE will copy one large stack segment at any given time, ACE, by exploiting independent and-parallelism, may spread this segment over many memory segments in the address space of the team. This may in principle add some overhead to the copying cost (since many small segments rather than one large segment may have to be copied). However, because each team has multiple processors, the copying of multiple segments can be done in parallel. With respect to &-Prolog, ACE does not introduce any new overheads. The only inefficiency present in the ACE model is with respect to memory consumption, but that cannot be avoided if we want to use stack copying for the representation of multiple environments. Given that memory is inexpensive, we hope that this will not be a major bottleneck.
Another important point is that the approach outlined in this paper for implementing and-or parallel systems, while presented in terms of combining the types of parallelism present in MUSE and &-Prolog, is actually quite general, and can be applied to implement other systems that exploit and- and or-parallelism, such as Andorra-I [6], Prometheus [24], and IDIOM [11]. It is quite easy to see how Andorra-I, a system that exploits or-parallelism and determinate dependent and-parallelism, can be implemented using stack copying (the implementation of Andorra-I by Yang, Santos Costa, and Warren is based on binding arrays). In Andorra-I there is no or-parallelism within and-parallel goals, since only deterministic goals can be processed in and-parallel (thus it reduces to the case described in Section 2), so and-parallel execution can be performed by each team locally. Or-parallelism will be implemented using stack copying and the memory management scheme described above. Likewise, Prometheus [24], a system that exploits or-parallelism and non-determinate dependent and-parallelism (with no coroutining) by extending CGEs, can be easily implemented using the ACE scheme. In fact, since the DAS-WAM abstract machine on which Prometheus is based is itself based on that of &-Prolog, no extra measures need to be taken apart from those needed to support dependent and-parallelism, which are for the most part orthogonal to the issues dealt with by ACE. IDIOM, which adds independent and-parallelism to Andorra-I, can also be implemented using the ACE approach. Its implementation can be thought of as a combination of the ACE and Andorra-I implementations and, again, is straightforward to derive.

7 Conclusions

In this paper we presented ACE, a model capable of exploiting both non-deterministic and-parallelism and or-parallelism. We have discussed both high-level and low-level implementation issues and shown how, using recomputation, the scheme can incorporate side-effects and easily support Prolog as the user language.
We have shown how ACE subsumes two of the most successful approaches for exploiting parallelism in logic programming (MUSE and &-Prolog). We have argued how the resulting system has a good potential for low sequential overhead, can be implemented in a reasonably easy way by extending existing systems, and retains the advantages of both purely or-parallel systems as well as (even non-deterministic) purely and-parallel systems. A collaborative implementation of ACE on Sequent and other multiprocessors is under way at New Mexico State University and University of Madrid (UPM).

References

[1] K.A.M. Ali. Or-parallel Execution of Prolog on the BC-Machine. In Fifth International Con-

Online Appendix to: Generalizing Database Forensics

Online Appendix to: Generalizing Database Forensics Online Appenix to: Generalizing Database Forensics KYRIACOS E. PAVLOU an RICHARD T. SNODGRASS, University of Arizona This appenix presents a step-by-step iscussion of the forensic analysis protocol that

More information

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics CS 106 Winter 2016 Craig S. Kaplan Moule 01 Processing Recap Topics The basic parts of speech in a Processing program Scope Review of syntax for classes an objects Reaings Your CS 105 notes Learning Processing,

More information

And-Or Parallel Prolog: A Recomputation Based Approachf

And-Or Parallel Prolog: A Recomputation Based Approachf And-Or Parallel Prolog: A Recomputation Based Approachf Gopal Gupta Department of Computer Science Box 30001, Dept. CS, New México State University Las Cruces, NM 88003, USA guptaonmsu.edu Manuel V. Hermenegildo

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract

The Reconstruction of Graphs. Dhananjay P. Mehendale Sir Parashurambhau College, Tilak Road, Pune , India. Abstract The Reconstruction of Graphs Dhananay P. Mehenale Sir Parashurambhau College, Tila Roa, Pune-4030, Inia. Abstract In this paper we iscuss reconstruction problems for graphs. We evelop some new ieas lie

More information

Message Transport With The User Datagram Protocol

Message Transport With The User Datagram Protocol Message Transport With The User Datagram Protocol User Datagram Protocol (UDP) Use During startup For VoIP an some vieo applications Accounts for less than 10% of Internet traffic Blocke by some ISPs Computer

More information

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. Preface Here are my online notes for my Calculus I course that I teach here at Lamar University. Despite the fact that these are my class notes, they shoul be accessible to anyone wanting to learn Calculus

More information

A Plane Tracker for AEC-automation Applications

A Plane Tracker for AEC-automation Applications A Plane Tracker for AEC-automation Applications Chen Feng *, an Vineet R. Kamat Department of Civil an Environmental Engineering, University of Michigan, Ann Arbor, USA * Corresponing author (cforrest@umich.eu)

More information

Non-homogeneous Generalization in Privacy Preserving Data Publishing

Non-homogeneous Generalization in Privacy Preserving Data Publishing Non-homogeneous Generalization in Privacy Preserving Data Publishing W. K. Wong, Nios Mamoulis an Davi W. Cheung Department of Computer Science, The University of Hong Kong Pofulam Roa, Hong Kong {wwong2,nios,cheung}@cs.hu.h

More information

Chapter 9 Memory Management

Chapter 9 Memory Management Contents 1. Introuction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threas 6. CPU Scheuling 7. Process Synchronization 8. Dealocks 9. Memory Management 10.Virtual Memory

More information

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication

More information

Coupling the User Interfaces of a Multiuser Program

Coupling the User Interfaces of a Multiuser Program Coupling the User Interfaces of a Multiuser Program PRASUN DEWAN University of North Carolina at Chapel Hill RAJIV CHOUDHARY Intel Corporation We have evelope a new moel for coupling the user-interfaces

More information

Frequent Pattern Mining. Frequent Item Set Mining. Overview. Frequent Item Set Mining: Motivation. Frequent Pattern Mining comprises

Frequent Pattern Mining. Frequent Item Set Mining. Overview. Frequent Item Set Mining: Motivation. Frequent Pattern Mining comprises verview Frequent Pattern Mining comprises Frequent Pattern Mining hristian Borgelt School of omputer Science University of Konstanz Universitätsstraße, Konstanz, Germany christian.borgelt@uni-konstanz.e

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science

More information

PART 2. Organization Of An Operating System

PART 2. Organization Of An Operating System PART 2 Organization Of An Operating System CS 503 - PART 2 1 2010 Services An OS Supplies Support for concurrent execution Facilities for process synchronization Inter-process communication mechanisms

More information

Divide-and-Conquer Algorithms

Divide-and-Conquer Algorithms Supplment to A Practical Guie to Data Structures an Algorithms Using Java Divie-an-Conquer Algorithms Sally A Golman an Kenneth J Golman Hanout Divie-an-conquer algorithms use the following three phases:

More information

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien

Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama an Hayato Ohwaa Faculty of Sci. an Tech. Tokyo University of Science, 2641 Yamazaki, Noa-shi, CHIBA, 278-8510, Japan hiroyuki@rs.noa.tus.ac.jp,

More information

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2 This paper appears in J. of Parallel an Distribute Computing 10 (1990), pp. 167 181. Intensive Hypercube Communication: Prearrange Communication in Link-Boun Machines 1 2 Quentin F. Stout an Bruce Wagar

More information

Recitation Caches and Blocking. 4 March 2019

Recitation Caches and Blocking. 4 March 2019 15-213 Recitation Caches an Blocking 4 March 2019 Agena Reminers Revisiting Cache Lab Caching Review Blocking to reuce cache misses Cache alignment Reminers Due Dates Cache Lab (Thursay 3/7) Miterm Exam

More information

Skyline Community Search in Multi-valued Networks

Skyline Community Search in Multi-valued Networks Syline Community Search in Multi-value Networs Rong-Hua Li Beijing Institute of Technology Beijing, China lironghuascut@gmail.com Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China yu@se.cuh.eu.h

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Svärm, Linus; Stranmark, Petter Unpublishe: 2010-01-01 Link to publication Citation for publishe version (APA): Svärm, L., & Stranmark, P. (2010). Shift-map Image Registration.

More information

Supporting Fully Adaptive Routing in InfiniBand Networks

Supporting Fully Adaptive Routing in InfiniBand Networks XIV JORNADAS DE PARALELISMO - LEGANES, SEPTIEMBRE 200 1 Supporting Fully Aaptive Routing in InfiniBan Networks J.C. Martínez, J. Flich, A. Robles, P. López an J. Duato Resumen InfiniBan is a new stanar

More information

Learning Subproblem Complexities in Distributed Branch and Bound

Learning Subproblem Complexities in Distributed Branch and Bound Learning Subproblem Complexities in Distribute Branch an Boun Lars Otten Department of Computer Science University of California, Irvine lotten@ics.uci.eu Rina Dechter Department of Computer Science University

More information

6.823 Computer System Architecture. Problem Set #3 Spring 2002

6.823 Computer System Architecture. Problem Set #3 Spring 2002 6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem

More information

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources An Algorithm for Builing an Enterprise Network Topology Using Wiesprea Data Sources Anton Anreev, Iurii Bogoiavlenskii Petrozavosk State University Petrozavosk, Russia {anreev, ybgv}@cs.petrsu.ru Abstract

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks TR-IIS-05-021 Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu, Pangfeng Liu, Da-Wei Wang, Jan-Jan Wu December 2005 Technical Report No. TR-IIS-05-021 http://www.iis.sinica.eu.tw/lib/techreport/tr2005/tr05.html

More information

Chapter 5 Proposed models for reconstituting/ adapting three stereoscopes

Chapter 5 Proposed models for reconstituting/ adapting three stereoscopes Chapter 5 Propose moels for reconstituting/ aapting three stereoscopes - 89 - 5. Propose moels for reconstituting/aapting three stereoscopes This chapter offers three contributions in the Stereoscopy area,

More information

Loop Scheduling and Partitions for Hiding Memory Latencies

Loop Scheduling and Partitions for Hiding Memory Latencies Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:

More information

AnyTraffic Labeled Routing

AnyTraffic Labeled Routing AnyTraffic Labele Routing Dimitri Papaimitriou 1, Pero Peroso 2, Davie Careglio 2 1 Alcatel-Lucent Bell, Antwerp, Belgium Email: imitri.papaimitriou@alcatel-lucent.com 2 Universitat Politècnica e Catalunya,

More information

Overview. Operating Systems I. Simple Memory Management. Simple Memory Management. Multiprocessing w/fixed Partitions.

Overview. Operating Systems I. Simple Memory Management. Simple Memory Management. Multiprocessing w/fixed Partitions. Overview Operating Systems I Management Provie Services processes files Manage Devices processor memory isk Simple Management One process in memory, using it all each program nees I/O rivers until 96 I/O

More information

1 Surprises in high dimensions

1 Surprises in high dimensions 1 Surprises in high imensions Our intuition about space is base on two an three imensions an can often be misleaing in high imensions. It is instructive to analyze the shape an properties of some basic

More information

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization 1 Offloaing Cellular Traffic through Opportunistic Communications: Analysis an Optimization Vincenzo Sciancalepore, Domenico Giustiniano, Albert Banchs, Anreea Picu arxiv:1405.3548v1 [cs.ni] 14 May 24

More information

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition ITERATIOAL JOURAL OF MATHEMATICS AD COMPUTERS I SIMULATIO A eural etwork Moel Base on Graph Matching an Annealing :Application to Han-Written Digits Recognition Kyunghee Lee Abstract We present a neural

More information

PALS: Efficient Or-Parallel Execution of Prolog on Beowulf Clusters

PALS: Efficient Or-Parallel Execution of Prolog on Beowulf Clusters PALS: Efficient Or-Parallel Execution of Prolog on Beowulf Clusters K. Villaverde and E. Pontelli H. Guo G. Gupta Dept. Computer Science Dept. Computer Science Dept. Computer Science New Mexico State University

More information

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help CS2110 Spring 2016 Assignment A. Linke Lists Due on the CMS by: See the CMS 1 Preamble Linke Lists This assignment begins our iscussions of structures. In this assignment, you will implement a structure

More information

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks : a Movement-Base Routing Algorithm for Vehicle A Hoc Networks Fabrizio Granelli, Senior Member, Giulia Boato, Member, an Dzmitry Kliazovich, Stuent Member Abstract Recent interest in car-to-car communications

More information

Using Vector and Raster-Based Techniques in Categorical Map Generalization

Using Vector and Raster-Based Techniques in Categorical Map Generalization Thir ICA Workshop on Progress in Automate Map Generalization, Ottawa, 12-14 August 1999 1 Using Vector an Raster-Base Techniques in Categorical Map Generalization Beat Peter an Robert Weibel Department

More information

Questions? Post on piazza, or Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)!

Questions? Post on piazza, or  Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)! EE122 Fall 2013 HW3 Instructions Recor your answers in a file calle hw3.pf. Make sure to write your name an SID at the top of your assignment. For each problem, clearly inicate your final answer, bol an

More information

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation Solution Representation for Job Shop Scheuling Problems in Ant Colony Optimisation James Montgomery, Carole Faya 2, an Sana Petrovic 2 Faculty of Information & Communication Technologies, Swinburne University

More information

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques Politehnica University of Timisoara Mobile Computing, Sensors Network an Embee Systems Laboratory ing Techniques What is testing? ing is the process of emonstrating that errors are not present. The purpose

More information

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method Southern Cross University epublications@scu 23r Australasian Conference on the Mechanics of Structures an Materials 214 Transient analysis of wave propagation in 3D soil by using the scale bounary finite

More information

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation DEIM Forum 2018 I4-4 Abstract Ranom Clustering for Multiple Sampling Units to Spee Up Run-time Sample Generation uzuru OKAJIMA an Koichi MARUAMA NEC Solution Innovators, Lt. 1-18-7 Shinkiba, Koto-ku, Tokyo,

More information

Comparison of Methods for Increasing the Performance of a DUA Computation

Comparison of Methods for Increasing the Performance of a DUA Computation Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,

More information

A shortest path algorithm in multimodal networks: a case study with time varying costs

A shortest path algorithm in multimodal networks: a case study with time varying costs A shortest path algorithm in multimoal networks: a case stuy with time varying costs Daniela Ambrosino*, Anna Sciomachen* * Department of Economics an Quantitative Methos (DIEM), University of Genoa Via

More information

CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing

CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing Tim Roughgaren October 19, 2016 1 Routing in the Internet Last lecture we talke about elay-base (or selfish ) routing, which

More information

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems On the Role of Multiply Sectione Bayesian Networks to Cooperative Multiagent Systems Y. Xiang University of Guelph, Canaa, yxiang@cis.uoguelph.ca V. Lesser University of Massachusetts at Amherst, USA,

More information

Study of Network Optimization Method Based on ACL

Study of Network Optimization Method Based on ACL Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department

More information

State Indexed Policy Search by Dynamic Programming. Abstract. 1. Introduction. 2. System parameterization. Charles DuHadway

State Indexed Policy Search by Dynamic Programming. Abstract. 1. Introduction. 2. System parameterization. Charles DuHadway State Inexe Policy Search by Dynamic Programming Charles DuHaway Yi Gu 5435537 503372 December 4, 2007 Abstract We consier the reinforcement learning problem of simultaneous trajectory-following an obstacle

More information

Learning convex bodies is hard

Learning convex bodies is hard Learning convex boies is har Navin Goyal Microsoft Research Inia navingo@microsoftcom Luis Raemacher Georgia Tech lraemac@ccgatecheu Abstract We show that learning a convex boy in R, given ranom samples

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Linus Svärm Petter Stranmark Centre for Mathematical Sciences, Lun University {linus,petter}@maths.lth.se Abstract Shift-map image processing is a new framework base on energy

More information

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks 1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stolyar Abstract Backpressure-base aaptive routing

More information

Handling missing values in kernel methods with application to microbiology data

Handling missing values in kernel methods with application to microbiology data an Machine Learning. Bruges (Belgium), 24-26 April 2013, i6oc.com publ., ISBN 978-2-87419-081-0. Available from http://www.i6oc.com/en/livre/?gcoi=28001100131010. Hanling missing values in kernel methos

More information

Research Article Inviscid Uniform Shear Flow past a Smooth Concave Body

Research Article Inviscid Uniform Shear Flow past a Smooth Concave Body International Engineering Mathematics Volume 04, Article ID 46593, 7 pages http://x.oi.org/0.55/04/46593 Research Article Invisci Uniform Shear Flow past a Smooth Concave Boy Abullah Mura Department of

More information

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks

Coordinating Distributed Algorithms for Feature Extraction Offloading in Multi-Camera Visual Sensor Networks Coorinating Distribute Algorithms for Feature Extraction Offloaing in Multi-Camera Visual Sensor Networks Emil Eriksson, György Dán, Viktoria Foor School of Electrical Engineering, KTH Royal Institute

More information

Lecture 1 September 4, 2013

Lecture 1 September 4, 2013 CS 84r: Incentives an Information in Networks Fall 013 Prof. Yaron Singer Lecture 1 September 4, 013 Scribe: Bo Waggoner 1 Overview In this course we will try to evelop a mathematical unerstaning for the

More information

NAND flash memory is widely used as a storage

NAND flash memory is widely used as a storage 1 : Buffer-Aware Garbage Collection for Flash-Base Storage Systems Sungjin Lee, Dongkun Shin Member, IEEE, an Jihong Kim Member, IEEE Abstract NAND flash-base storage evice is becoming a viable storage

More information

Variable Independence and Resolution Paths for Quantified Boolean Formulas

Variable Independence and Resolution Paths for Quantified Boolean Formulas Variable Inepenence an Resolution Paths for Quantifie Boolean Formulas Allen Van Geler http://www.cse.ucsc.eu/ avg University of California, Santa Cruz Abstract. Variable inepenence in quantifie boolean

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

PART 5. Process Coordination And Synchronization

PART 5. Process Coordination And Synchronization PART 5 Process Coorination An Synchronization CS 503 - PART 5 1 2010 Location Of Process Coorination In The Xinu Hierarchy CS 503 - PART 5 2 2010 Coorination Of Processes Necessary in a concurrent system

More information

Figure 1: 2D arm. Figure 2: 2D arm with labelled angles

Figure 1: 2D arm. Figure 2: 2D arm with labelled angles 2D Kinematics Consier a robotic arm. We can sen it commans like, move that joint so it bens at an angle θ. Once we ve set each joint, that s all well an goo. More interesting, though, is the question of

More information

Considering bounds for approximation of 2 M to 3 N

Considering bounds for approximation of 2 M to 3 N Consiering bouns for approximation of to (version. Abstract: Estimating bouns of best approximations of to is iscusse. In the first part I evelop a powerseries, which shoul give practicable limits for

More information

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks Architecture Design of Mobile Access Coorinate Wireless Sensor Networks Mai Abelhakim 1 Leonar E. Lightfoot Jian Ren 1 Tongtong Li 1 1 Department of Electrical & Computer Engineering, Michigan State University,

More information

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic WLAN Inoor Positioning Base on Eucliean Distances an Fuzzy Logic Anreas TEUBER, Bern EISSFELLER Institute of Geoesy an Navigation, University FAF, Munich, Germany, e-mail: (anreas.teuber, bern.eissfeller)@unibw.e

More information

Improving Spatial Reuse of IEEE Based Ad Hoc Networks

Improving Spatial Reuse of IEEE Based Ad Hoc Networks mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos

More information

Compiler Optimisation

Compiler Optimisation Compiler Optimisation Michael O Boyle mob@inf.e.ac.uk Room 1.06 January, 2014 1 Two recommene books for the course Recommene texts Engineering a Compiler Engineering a Compiler by K. D. Cooper an L. Torczon.

More information

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH Galen H Sasaki Dept Elec Engg, U Hawaii 2540 Dole Street Honolul HI 96822 USA Ching-Fong Su Fuitsu Laboratories of America 595 Lawrence Expressway

More information

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means Classifying Facial Expression with Raial Basis Function Networks, using Graient Descent an K-means Neil Allrin Department of Computer Science University of California, San Diego La Jolla, CA 9237 nallrin@cs.ucs.eu

More information

Using Ray Tracing for Site-Specific Indoor Radio Signal Strength Analysis 1

Using Ray Tracing for Site-Specific Indoor Radio Signal Strength Analysis 1 Using Ray Tracing for Site-Specific Inoor Raio Signal Strength Analysis 1 Michael Ni, Stephen Mann, an Jay Black Computer Science Department, University of Waterloo, Waterloo, Ontario, NL G1, Canaa Abstract

More information

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member

More information

Additional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication

Additional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication Aitional Divie an Conquer Algorithms Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication Divie an Conquer Closest Pair Let s revisit the closest pair problem. Last

More information

Modifying ROC Curves to Incorporate Predicted Probabilities

Modifying ROC Curves to Incorporate Predicted Probabilities Moifying ROC Curves to Incorporate Preicte Probabilities Cèsar Ferri DSIC, Universitat Politècnica e València Peter Flach Department of Computer Science, University of Bristol José Hernánez-Orallo DSIC,

More information

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 1, NO. 4, APRIL 01 74 Towar Efficient Distribute Algorithms for In-Network Binary Operator Tree Placement in Wireless Sensor Networks Zongqing Lu,

More information

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks 1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stoylar arxiv:15.4984v1 [cs.ni] 27 May 21 Abstract

More information

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES OLIVIER BERNARDI AND ÉRIC FUSY Abstract. We present bijections for planar maps with bounaries. In particular, we obtain bijections for triangulations an quarangulations

More information

On the Placement of Internet Taps in Wireless Neighborhood Networks

On the Placement of Internet Taps in Wireless Neighborhood Networks 1 On the Placement of Internet Taps in Wireless Neighborhoo Networks Lili Qiu, Ranveer Chanra, Kamal Jain, Mohamma Mahian Abstract Recently there has emerge a novel application of wireless technology that

More information

Characterizing Decoding Robustness under Parametric Channel Uncertainty

Characterizing Decoding Robustness under Parametric Channel Uncertainty Characterizing Decoing Robustness uner Parametric Channel Uncertainty Jay D. Wierer, Wahee U. Bajwa, Nigel Boston, an Robert D. Nowak Abstract This paper characterizes the robustness of ecoing uner parametric

More information

New Version of Davies-Bouldin Index for Clustering Validation Based on Cylindrical Distance

New Version of Davies-Bouldin Index for Clustering Validation Based on Cylindrical Distance New Version of Davies-Boulin Inex for lustering Valiation Base on ylinrical Distance Juan arlos Roas Thomas Faculta e Informática Universia omplutense e Mari Mari, España correoroas@gmail.com Abstract

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Politecnico i Torino Porto Institutional Repository [Proceeing] Automatic March tests generation for multi-port SRAMs Original Citation: Benso A., Bosio A., i Carlo S., i Natale G., Prinetto P. (26). Automatic

More information

Learning Polynomial Functions. by Feature Construction

Learning Polynomial Functions. by Feature Construction I Proceeings of the Eighth International Workshop on Machine Learning Chicago, Illinois, June 27-29 1991 Learning Polynomial Functions by Feature Construction Richar S. Sutton GTE Laboratories Incorporate

More information

Problem Paper Atoms Tree. atoms.pas. atoms.cpp. atoms.c. atoms.java. Time limit per test 1 second 1 second 2 seconds. Number of tests

Problem Paper Atoms Tree. atoms.pas. atoms.cpp. atoms.c. atoms.java. Time limit per test 1 second 1 second 2 seconds. Number of tests ! " # %$ & Overview Problem Paper Atoms Tree Shen Tian Harry Wiggins Carl Hultquist Program name paper.exe atoms.exe tree.exe Source name paper.pas atoms.pas tree.pas paper.cpp atoms.cpp tree.cpp paper.c

More information

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem

Inuence of Cross-Interferences on Blocked Loops: to know the precise gain brought by blocking. It is even dicult to determine for which problem Inuence of Cross-Interferences on Blocke Loops A Case Stuy with Matrix-Vector Multiply CHRISTINE FRICKER INRIA, France an OLIVIER TEMAM an WILLIAM JALBY University of Versailles, France State-of-the art

More information

Overlap Interval Partition Join

Overlap Interval Partition Join Overlap Interval Partition Join Anton Dignös Department of Computer Science University of Zürich, Switzerlan aignoes@ifi.uzh.ch Michael H. Böhlen Department of Computer Science University of Zürich, Switzerlan

More information

Real-time concepts for Software/Hardware Engineering

Real-time concepts for Software/Hardware Engineering Real-time concepts for Software/Harware Engineering Master s thesis of M.C.W. Geilen Date: August 996 Coaches: ing.p.h.a. van er Putten ir.j.p.m. Voeten Supervisor: prof.ir.m.p.j. Stevens Section of Information

More information

Kinematic Analysis of a Family of 3R Manipulators

Kinematic Analysis of a Family of 3R Manipulators Kinematic Analysis of a Family of R Manipulators Maher Baili, Philippe Wenger an Damien Chablat Institut e Recherche en Communications et Cybernétique e Nantes, UMR C.N.R.S. 6597 1, rue e la Noë, BP 92101,

Evolutionary Optimisation Methods for Template Based Image Registration

Lukasz A Machowski, Tshilidzi Marwala, School of Electrical and Information Engineering, University of Witwatersrand, Johannesburg, South

Open Access Adaptive Image Enhancement Algorithm with Complex Background

Send Orders for Reprints to reprints@benthamscience.ae. The Open Cybernetics & Systemics Journal, 2015, 9, 594-600. Open Access. Adaptive Image Enhancement Algorithm with Complex Background. Zhang Pai*, Department

Throughput Characterization of Node-based Scheduling in Multihop Wireless Networks: A Novel Application of the Gallai-Edmonds Structure Theorem

Bo Ji and Yu Sang, Dept. of Computer and Information Sciences, Temple

Uninformed search methods

CS 1571 Introduction to AI, Lecture 4: Uninformed search methods. Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square. Announcements: Homework assignment 1 is out. Due on Thursday, September 11, 2014 before the

Adjacency Matrix Based Full-Text Indexing Models

1000-9825/2002/13(10)1933-10. 2002 Journal of Software, Vol.13, No.10. Adjacency Matrix Based Full-Text Indexing Models. ZHOU Shui-geng 1, HU Yun-fa 2, GUAN Ji-hong 3. 1 (Department of Computer Science and Engineering,

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE

BFU International Conference - 2. Evgeniya Nikolova, Veselina Jecheva, Burgas Free University. Abstract:

k-nn Graph Construction: a Generic Online Approach

Wan-Lei Zhao. arxiv:80.00v [cs.ir] Sep 08. Abstract: Nearest neighbor search and k-nearest neighbor graph construction are two fundamental issues arise from

An Adaptive Routing Algorithm for Communication Networks using Back Pressure Technique

International OPEN ACCESS Journal Of Modern Engineering Research (IJMER). Khasimpeera Mohammed 1, K. Kalpana 2. 1 M. Tech

Multi-camera tracking algorithm study based on information fusion

International Conference on Advanced Electronic Science and Technology (AEST 2016). Guoqiang Wang, Shangfu Li and Xue Wen, School of Electronic

A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Based on Gravity

World Applied Sciences Journal 16 (10): 1387-1392, 2012. ISSN 1818-4952. IDOSI Publications, 2012. Aliasghar Rahmani Hosseinabadi,

Lesson 11 Interference of Light

Physics 30, Lesson 11: Interference of Light. I. Light: Wave or Particle? The fact that light carries energy is obvious to anyone who has focused the sun's rays with a magnifying glass on a piece of paper and

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER

A. Drozd, M. Lobachev, Department of Computer Systems, Odessa State Polytechnic University, Odessa, Ukraine. Drozd@ukr.net, Lobachev@ukr.net. Abstract: In

HOW DO SECURITY TECHNOLOGIES INTERACT WITH EACH OTHER TO CREATE VALUE? THE ANALYSIS OF FIREWALL AND INTRUSION DETECTION SYSTEM

Huseyin CAVUSOGLU, Srinivasan RAGHUNATHAN, Hasan CAVUSOGLU. Tulane University, University of

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters

Available online at www.sciencedirect.com. Procedia Engineering 24 (2011) 324-328. 2011 International Conference on Advances in Engineering. Cluster Center Initialization Method for K-means Algorithm Over Data Sets

Optimizing the quality of scalable video streams on P2P Networks

Paper #7. ABSTRACT. The volume of multimedia data, including video, served through Peer-to-Peer (P2P) networks is growing rapidly. Unfortunately, high
