Design Space Exploration in System Level Synthesis under Memory Constraints


Radoslaw Szymanek and Krzysztof Kuchcinski
Dept. of Computer and Information Science, Linköping University, Sweden

(This work was supported by the Foundation for Strategic Research, Integrated Electronic Systems program.)

Abstract

This paper addresses the problem of component selection, task assignment and task scheduling for distributed embedded computer systems. Such systems are subject to a large number of constraints of different nature, such as cost, execution time, memory capacity and limitations on resource usage. Previous approaches have concentrated on a specific class of requirements and thus limit the number of constraints which can be handled in the design process. This very often results in infeasible or too expensive solutions. The system presented in this paper, CLASS (Constraint Logic based System Synthesis), makes it possible to impose different design constraints and thus model the design more realistically. It is also efficient in finding good solutions or, in some cases, optimal solutions even for nontrivial problems.

1. Introduction

Embedded systems are needed in more and more areas, and the expectations concerning their cost, reliability and functionality grow constantly. The gap between the technology and the ability of design methods to address system synthesis problems also increases. Synthesis of embedded systems is usually decomposed into a chain of problems which must be solved to obtain the final solution. The first task is to decide which components should be used in the target architecture. In the next step, all tasks are assigned to the selected processing units. The last step is scheduling, defined as the assignment of particular time slots for the execution of tasks and for data transmissions between processing units. All these decisions influence each other, which makes finding (near) optimal solutions for industrial designs very difficult or even impossible. The properties of the final design are determined by the quality of the solutions to these three main tasks. The complexity of designed systems and the number of heterogeneous constraints imposed on these designs increase, which makes the synthesis challenging. The development of new modeling techniques and heuristics is needed to address this challenge. The aim of this development is to explore as much as possible of the whole design space while not becoming trapped in local optima.

As complexity grows, there is a shift from single-processor systems to heterogeneous multiprocessor systems. These systems have the potential to achieve superior results in terms of cost and performance over homogeneous systems, since they can match the problem more closely. The multiprocessor systems considered in this paper consist of three types of units. Processing units, such as processor cores or ASICs, belong to the first group. The next group consists of communication devices, such as buses and links, used to transfer data between processing units. Last, but not least important, is memory, which can be divided into at least two groups according to its usage. Data memory is used to store temporary data which are produced and consumed during the execution of an algorithm. The need for this memory can vary dramatically and does not depend directly on the choice of the processing unit. The second group is code memory, for which the requirement is fixed during task execution.
Assignment of a task to a different processor can change its code and data memory requirements. Memory constraints are difficult to model, but neglecting them can lead to a considerable waste of memory components. The main contribution of this paper is the inclusion of memory constraints in the design process of distributed embedded systems.

CLASS uses the Constraint Logic Programming (CLP) paradigm. Optimization methods based on CLP have several advantages over other approaches. The CLP framework provides an elegant formalism for modeling system level synthesis problems. Already defined models can easily be extended by adding new types of constraints. Finally, the process of creating new synthesis heuristics is easy, since it can be based on the primitive constructs provided by the CLP framework. CLASS supports interactive exploration of the design space. This interaction guides the designer towards the part of the design space which has the highest potential of containing (near) optimal solutions.

Designer knowledge, gained during this interaction, is usually very hard to express in a formal framework, but this knowledge is very important in tailoring the architecture to a specific application.

CLASS assumes that the functional description of a system is given as a set of cooperating tasks. This description has to be compiled into a task graph which captures the data dependencies between tasks. Each task is characterized by its estimated execution time and memory requirements. Real-time DSP and image processing applications belong to the class of problems which can be modeled using the above assumptions. They are fairly deterministic, which makes them suitable for static scheduling. In these applications, an important aspect is also the scheme of data memory usage. Very often, unbalanced use of data memory results in too high data memory requirements. The amount of data memory used can, however, be decreased by taking data memory into account during scheduling. The importance of memory considerations was also indicated, for example, in [1].

In this paper we focus on the new synthesis system based on the CLP methodology. In Section 2, we outline related work in this area. Section 3 defines the model of the architecture and the model of the system. Section 4 describes the synthesis process. Section 5 gives an example to better illustrate our system and the complexity of the synthesis problem. Section 6 describes the system synthesis tool and possible designer interaction with the system. Section 7 presents experimental results. Finally, the last section concludes the paper and gives some directions for future work.

2. Related Work

The synthesis problem encompasses a large number of subproblems which can be studied in isolation. In our work we try to integrate these problems and include as many heterogeneous constraints as possible. The system closest to ours was described in [1], and its previous version in [2]. That approach uses Mixed Integer Linear Programming, which results in many inequalities and binary decision variables. Since the system aims at finding optimal solutions, the runtimes are prohibitively large even for examples consisting of nine tasks. The architecture of the synthesized systems consists of processors and buses or links only. Other components, such as ASICs, are not considered. The approach was extended by the inclusion of a simple memory model in [1]. The memory cost was directly included in the cost function.

Another system synthesis approach which guarantees optimal solutions is described in [8]. The presented algorithm makes it possible to compute the same task multiple times on several processing units in order to remove some communications from the buses. The target architecture is restricted by assuming that buses have the same transmission rate and that tasks assigned to ASICs are executed sequentially. A global memory is used to store input and output data of the whole computation, but intermediate data cannot be stored there.

Clustering is the main way of dealing with the complexity of co-synthesis of embedded systems. The COSYN system and its extension CASPER, described in [6] and [7], use this method. The algorithm presented there can cope with large industrial-size problems, but memory requirements are not considered. During allocation of clusters, the architecture is gradually extended by adding new components when deadlines are not met.
In [12], clustering is guided by detailed information about communication requirements in order to merge transfers which do not interfere with each other too much. In [9], multiprocessor task assignment is modeled as a vector packing problem. The target architecture consists of an arbitrary number of heterogeneous processing units that communicate over one bus. This often results in a bus bottleneck for data-dominated applications, and this assumption causes the best heuristic to produce solutions with the lowest bus utilization. That work was extended in [10], where a configuration selection problem is also solved. The goal is to minimize the cost of the system while a correct assignment and schedule can still be found.

Evolutionary algorithms for system synthesis are used in [11, 15]. Those approaches give good solutions for middle-size problems while not limiting the target architecture. The target architecture consists of general purpose processors, ASICs, buses and memories. Their advantage is that the obtained result defines a set of solutions which is often close to the Pareto set. The main problem of that method is the difficulty of adding new types of constraints to the model.

Our work is built on the research presented in [3]. That work uses CLP to represent the system synthesis problem by a set of finite domain constraints defining different timing requirements. For small problems optimal solutions can be obtained, while heuristics are used for larger design problems. The system can minimize the design cost for a given execution time or vice versa. The efficiency of the CLP approach is compared with other approaches, and the comparison comes out in favor of CLP.

Our approach has a more general target architecture than other approaches. It considers processors, ASICs, buses, links, and local memories. The memory is divided into two groups, code memory and data memory. The designer has the freedom to influence the final design by making decisions concerning the final architecture, task assignment, and scheduling. He can also supply the system with a partial solution for the selection of components and task assignments. This makes it possible to guide the synthesis process in a clean manner and still use the full power of automatic synthesis methods. We implemented an optimization heuristic which gives good results for large designs.

3. System Modeling

In our approach, CLP is used to model the system architecture and the design problem. Therefore we first briefly introduce the concept of finite domain variables and constraints over these variables, and then present both models and the relations between them in two consecutive subsections.

Each finite domain variable (FDV) is initially defined by a set of integer values which constitute its domain. Constraints specify relations among these variables; therefore restricting the domain of one variable usually results in restricting the domains of other FDVs. The CLP model consists of FDVs and constraints imposed on these variables. We use the Constraint Handling in Prolog (CHIP) system, version 5.1. The CHIP system implements basic and global constraints. Basic constraints are equality, inequality and conditional constraints. The problem can be described using these constraints only. To avoid an exponential growth of the number of constraints when the complexity of problems increases, global constraints are also used. Global constraints usually impose restrictions on cumulative use of resources, rectangle placement or partitioning of graphs [18]. Modeling the problem using global constraints gives a clean and understandable description of the problem.

3.1. Architecture Model

The target architecture, in our approach, consists of processing units, such as processors and ASICs, and communication devices, such as buses and links. Each processor has two local memories: one for data and one for code. The architecture is described by specifying processors, ASICs, buses and the interconnections between them. Each processor can be described by the following tuple:

P = (λ, β, κ, ϕ)   (1)

where λ is an integer value and denotes the cost of the processor, β is a 0/1 FDV denoting whether the processor is used or not, κ is also an integer and denotes the amount of data memory, and finally the amount of code memory is denoted by ϕ. In the case of an ASIC, ϕ = 0.

The data memory is used to store data computed by the tasks. Each task requires data memory which is reserved from the start time of the task until all communications from this task are completed. The amount of data memory needed on each processor changes during the schedule because of data transfers between tasks. A processor can compute and send or receive data concurrently, which makes the data memory usage scheme even more dynamic. During synthesis we have to ensure that the maximal usage of data memory does not exceed the memory size.

ASICs consist of any number of parts. The ASIC parts operate independently, making parallel execution of tasks possible. The ASIC cost is fixed regardless of the number of tasks assigned to it. All tasks assigned to an ASIC have access to local data memory. ASICs do not have code memory.

Each bus or link is described by the following tuple:

B = (λ, β, ϖ)   (2)

where λ and ϖ are numbers and denote the cost and the speed of the bus/link respectively, and β is a 0/1 FDV which denotes whether the bus or link is used in the final architecture.

The processing units and communication devices have an associated cost. The cost of processing components includes the cost of their memory. This suits the situation when a designer is creating the system from off-the-shelf components, where all features of the components are fixed. Differentiating the cost of a processor with different amounts of memory available for it can be done by creating a set of processors with the same performance but different cost and memory capacity.
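To make the notation concrete, the sketch below encodes the tuples (1) and (2) and the architecture cost that the synthesis later minimizes, i.e. the sum of the costs of components whose selection variable β equals 1. It is plain Python rather than the CHIP model used by CLASS; the class names, fields and numeric values are ours and only illustrate the intended meaning of the tuples.

```python
from dataclasses import dataclass

@dataclass
class Processor:
    """Processor tuple P = (lambda, beta, kappa, phi) of (1)."""
    cost: int          # lambda: component cost (includes its memories)
    used: int          # beta: 0/1 selection FDV in the CLP model
    data_memory: int   # kappa: available data memory
    code_memory: int   # phi: available code memory (0 for an ASIC)

@dataclass
class Bus:
    """Bus/link tuple B = (lambda, beta, varpi) of (2)."""
    cost: int    # lambda: component cost
    used: int    # beta: 0/1 selection FDV
    speed: int   # varpi: transfer speed

def architecture_cost(components):
    """Cost of the selected architecture: only components with beta = 1 count."""
    return sum(c.cost * c.used for c in components)

# Placeholder values, only to show the intended use; an ASIC has no code memory.
p1 = Processor(cost=5, used=1, data_memory=16, code_memory=8)
a1 = Processor(cost=7, used=1, data_memory=16, code_memory=0)
b1 = Bus(cost=2, used=1, speed=1)
print(architecture_cost([p1, a1, b1]))   # -> 14
```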
There is no restriction on the nature of the interconnection structure. The designer has to specify the possible connections between processing and communication devices. This specification is used to impose constraints on bus or link selection for transferring data between two cooperating tasks when they are executed on different processors. An example of an interconnection structure is presented in Figure 1b, where one bus, B1, and two links, L1 and L2, are used.

3.2. Problem Definition

In our approach, the functional description of the system contains a number of cooperating tasks. This description, together with estimated execution times and memory requirements, is compiled into a task graph. The task graph is an acyclic graph, such as the one presented in Figure 1a. The nodes of this graph represent computational tasks. Each task is described by the following tuple of FDVs:

T = (τ, ρ, δ, µ)   (3)

where τ denotes the start time of the task execution, ρ denotes the resource on which the task is executed, δ denotes the task duration, and finally µ denotes the amount of code memory needed for the task execution.

Figure 1. Task data flow graph and target architecture: a) a task graph example, b) an example of the system architecture.

The execution time and code memory required by a task depend on the processor. Tasks must always be scheduled on one of the processing units and they cannot be preempted. This is modeled by imposing constraints which define finite relations between the FDVs of (3) representing different tasks [3].

The arcs in the task graph represent data transfers between tasks. Each arc is described by a tuple of FDVs:

C = (τ, ρ, δ, α)   (4)

where τ denotes the start time of the communication, ρ denotes the resource which is used to transfer the data, δ denotes the duration of the communication, and α denotes the amount of transferred data.

Each arc in the task graph imposes a constraint which defines an execution order between two tasks. When task Ti communicates data (communication Cc) to task Tj, this is modeled by a constraint which is a conjunction of the two following inequalities:

τi + δi ≤ τc,   τc + δc ≤ τj   (5)

There are two possible scenarios for exchanging data between two cooperating tasks. In the first one, the two tasks are executed on different processors. The communication must then be assigned to and scheduled on a communication device which transfers the data between the two tasks involved in this communication. In the second case, both tasks are executed on the same processor and they communicate using the local memory. In this case the FDV δc = 0 and the previous constraint reduces to τi + δi ≤ τj. All these constraints together create a partial ordering of the tasks.

Data memory has to be allocated for each task assigned to a processing device. This memory needs to be reserved during the time interval spanning from the task start to the end of the related communication. More formally, if we have a task Ti and a communication Cj from this task, then the data memory is reserved for the time interval [τi, τj + δj]. Note that the time interval during which the data memory is allocated depends on both task assignment and scheduling. The constraints on data memory allocation are imposed using cumulative and conditional constraints. These constraints limit the allocation of data memory on each processor to the level of the available data memory. The code memory constraints are simpler. The code memory requirements do not change during execution, so these constraints can be expressed using a conditional sum.

Example: Consider two cooperating tasks and the communication between them as depicted in Figure 2a. T1 is executed on processor P1 and T2 is executed on processor P2. The communication C1 is scheduled on bus B1. This is depicted in Figure 2b as a Gantt diagram. The data transfer can appear freely between the finish time of T1 and the start time of T2, which is expressed by the following inequalities:

τt1 + δt1 ≤ τc1,   τc1 + δc1 ≤ τt2

Processor P1 must reserve data memory for task T1, denoted by D1, from τt1 until τc1. Processor P2 reserves data memory for task T2, denoted by D2, from τc1 until τt2. The memory sizes of D1 and D2 are the same. This constraint is represented for each processing unit using FDVs and cumulative constraints. Figure 2c depicts the possible memory usage and the situation when there is some data memory left on both processors. This memory can be used to store data from other tasks.

Figure 2. Data memory requirements: a) two cooperating tasks, b) schedule for two cooperating tasks, c) data memory usage for the processors executing these tasks.
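These relations can be illustrated with a small, self-contained Python check. It verifies constraint (5) for each communication and computes the peak data memory reserved on a processor over the interval [τi, τj + δj]. In CLASS these conditions are finite domain constraints propagated by CHIP during the search rather than checked after the fact; the data layout and the numeric values below are ours, not taken from the paper.

```python
# Records mirror tuples (3) and (4); values are illustrative only.
tasks = {
    "T1": {"tau": 0, "rho": "P1", "delta": 2, "mu": 3},
    "T2": {"tau": 4, "rho": "P2", "delta": 2, "mu": 2},
}
comms = {
    # C1 carries 4 units of data from T1 to T2 over bus B1.
    "C1": {"src": "T1", "dst": "T2", "tau": 2, "rho": "B1", "delta": 2, "alpha": 4},
}
data_capacity = {"P1": 8, "P2": 8}   # kappa of each processor

def precedence_ok(c):
    """Constraint (5): the source task finishes before the transfer starts and
    the transfer finishes before the destination task starts."""
    s, d = tasks[c["src"]], tasks[c["dst"]]
    return s["tau"] + s["delta"] <= c["tau"] and c["tau"] + c["delta"] <= d["tau"]

def peak_data_memory(proc):
    """Peak data memory reserved on proc: each outgoing communication keeps its
    data alive over [tau_i, tau_j + delta_j], i.e. from the source task start
    until the transfer ends.  (The receiving side, D2 in the example, would be
    handled analogously.)"""
    events = []
    for c in comms.values():
        if tasks[c["src"]]["rho"] == proc:
            events.append((tasks[c["src"]]["tau"], +c["alpha"]))  # reserve
            events.append((c["tau"] + c["delta"], -c["alpha"]))   # release
    peak = level = 0
    for _, change in sorted(events):
        level += change
        peak = max(peak, level)
    return peak

assert all(precedence_ok(c) for c in comms.values())
assert all(peak_data_memory(p) <= cap for p, cap in data_capacity.items())
```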
4. System Synthesis

The synthesis problem is to find an architecture with a minimal cost which can execute all tasks while fulfilling the timing and memory constraints. The architecture is created from a set of components specified by the designer. The whole process is guided by the constraint system, which enforces the correctness of the solution by rejecting all decisions which violate constraints. The synthesis process assigns to each FDV one of the values from its domain. All domain variables introduced in the previous section, such as τi, ρi, δi, µi and βi, must be assigned to one specific value. After each assignment, the correctness of the partial solution is checked by the constraint engine. In case of inconsistency, the last decision is withdrawn and another value is tried.

In our approach, CLP is used for modeling the synthesis problem and finding solutions. Since adding new constraints is a relatively easy task compared to other approaches, the designer has the possibility to add his own constraints and influence the solution. Designer constraints concern the deadline for the execution of the whole task graph and deadlines for selected tasks. He also has the possibility to guide our synthesis system by specifying the components which should be used in the final architecture and the assignment of tasks to these components. This kind of guidance can lead to good solutions more easily and, in consequence, results in better exploration of the design space.
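The assign-check-backtrack loop just described, together with the cost-tightening restart used later in this section, can be pictured with the schematic Python search below. It is a plain depth-first labeling with a user-supplied consistency test, not the credit-search heuristic of CLASS or any CHIP primitive; all function names and the toy data are ours.

```python
def label(variables, domains, consistent, assignment=None):
    """Depth-first labeling: assign a value to each FDV, check the partial
    solution, and withdraw the last decision (trying the next value) whenever
    the check fails."""
    assignment = assignment if assignment is not None else {}
    if len(assignment) == len(variables):
        yield dict(assignment)
        return
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):
            yield from label(variables, domains, consistent, assignment)
        del assignment[var]

def minimize_cost(variables, domains, consistent, cost):
    """Each time a solution is found, add the constraint that the next one must
    be cheaper, and restart the search."""
    best, bound = None, float("inf")
    while True:
        check = lambda a: consistent(a) and (
            len(a) < len(variables) or cost(a) < bound)
        sol = next(label(variables, domains, check), None)
        if sol is None:
            return best
        best, bound = sol, cost(sol)

# Toy use: map two tasks to processors, minimizing cost, under one constraint.
doms = {"T1": ["P1", "P2"], "T2": ["P1", "P2"]}
proc_cost = {"P1": 5, "P2": 3}
no_overload = lambda a: list(a.values()).count("P1") <= 1   # at most one task on P1
total_cost = lambda a: sum(proc_cost[p] for p in a.values())
print(minimize_cost(list(doms), doms, no_overload, total_cost))
# -> {'T1': 'P2', 'T2': 'P2'}
```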

The ordering of FDVs in the assignment influences how efficiently the search space is pruned. A good heuristic should wisely choose the next variable and the value assigned to this variable, thus obtaining good solutions faster. We have implemented several heuristics for finding solutions which are based on domain-specific knowledge. In this paper, we use one of these heuristics; its decision flow is the following:

1. assignment of tasks to resources,
2. assignment of execution intervals to each task,
3. assignment of time slots for executing tasks.

The first step of the heuristic is to assign tasks to processing units and communications to communication devices. The assignment tries to select the cheapest processor for a given task while minimizing the code memory usage. When a task cannot be assigned to any of the processing units which are already present in the architecture, a new processing unit is added.

In the second step, we assign an execution interval to each task and communication. The number of intervals depends on the duration of the task; a larger task duration gives a smaller number of intervals. Tasks and communications are then divided into three groups depending on their position in the graph. For tasks which are close to the start of the task graph, the intervals with the smallest starting times are tried first. For tasks positioned in the middle of the execution period, the middle intervals are selected first, and finally, for tasks from the end of the execution period, we assign the intervals with the largest starting times. This approach makes it possible to scatter tasks and communications evenly in the time domain. (A small sketch illustrating this interval ordering is given at the end of this section.) The search for a correct assignment is done using a heuristic method for traversing the search space called credit search [17].

In the previous step, the intervals for task execution were decided. The third step assigns the actual time slots within the previously decided intervals. Since the search space is very restricted, an exhaustive search is performed to find the first correct assignment.

The heuristic backtracks whenever it cannot find a correct assignment at any step. For example, if no assignment is found during the credit search over intervals, then our heuristic finds a new assignment of tasks and communications to resources and the credit search is performed again. After finding a solution, a new constraint is added which restricts the cost of the next solution to be smaller than the one just obtained, and the heuristic is restarted.

5. Example

In this section, we use the simple task graph and target architecture depicted in Figure 1. The task graph consists of five tasks and five communications. The target architecture can consist of at most three processors, one ASIC, one bus, and two links. The costs of the processors P1, P2 and P3, the ASIC A1, the bus B1, and the links L1 and L2 are given in Table 1. The execution times of the tasks on the different processing units and their code memory requirements are given in Table 2.

Table 1: Characteristics of the processing units and communication devices (cost, data memory and speed of P1, P2, P3, A1, B1, L1 and L2).

Table 2: Execution time and code memory requirements of T1–T5 on P1, P2, P3 and A1.

For example, T3 can be executed on processors P1, P2 or P3. Its execution time on P3 is 2 time units and its code memory requirement is 3 units.
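As announced above, the following sketch illustrates only the value-ordering idea of the second heuristic step: candidate execution intervals, sorted by start time, are tried earliest-first for tasks near the beginning of the graph, middle-first for tasks in the middle, and latest-first for tasks near the end. How CLASS measures a task's position and how many intervals it generates are not shown here; the position metric and the one-third thresholds are our assumptions.

```python
def interval_order(intervals, position):
    """Value ordering for step two of the heuristic.  `intervals` is the list
    of candidate execution intervals sorted by start time; `position` is an
    assumed measure in [0, 1] of how far the task lies from the start of the
    task graph (0 = start, 1 = end)."""
    if position < 1 / 3:                      # early tasks: earliest intervals first
        return list(intervals)
    if position > 2 / 3:                      # late tasks: latest intervals first
        return list(reversed(intervals))
    mid = (len(intervals) - 1) / 2            # middle tasks: middle intervals first
    return sorted(intervals, key=lambda iv: abs(intervals.index(iv) - mid))

print(interval_order([(0, 3), (4, 7), (8, 11)], position=0.9))
# -> [(8, 11), (4, 7), (0, 3)]
```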
The amount of data transferred between tasks is given in Table 3.

Table 3: Data transfer characteristics (amount of data transferred by C1–C5).

For this small example there are five different solutions, depending on the deadline of the whole schedule. These solutions, generated by our synthesis system, are presented in Table 4.

Table 4: Synthesis results (solution number, deadline, cost and architecture of the five solutions).

Solution 5, for the deadline 7, is depicted in Figure 3. This solution consists of two processors, P1 and P2, one ASIC A1, one bus B1 and one link L2. The Gantt diagram produced by our synthesis system for this solution is depicted in Figure 4. For this schedule, the requirements for code and data memory are presented in Table 5.

Figure 3. Solution 5 for deadline 7.

Figure 4. Gantt diagram of the solution with deadline 7.

Table 5: Code and data memory requirements for solution 5.

6. Synthesis System Tool

The synthesis system was implemented using the Constraint Handling in Prolog (CHIP) package, version 5.1 [18]. This synthesis system allows the designer to specify the synthesis problem and to find solutions interactively. The system guides the designer in the process of design space exploration. The designer has the possibility of saving solutions and of finding new solutions while keeping some decisions made during previous design exploration steps.

There are several ways of influencing the final design by making decisions regarding component selection, task assignment, and scheduling. The designer can specify which components must be used in the final architecture, thus partially creating the final architecture. This helps to reduce the design space during the task assignment step. The decisions of the designer can also influence the task assignment. The designer can explicitly assign all or selected tasks and communications to chosen processing or communication units. Finally, the designer can specify deadlines for specific tasks and help the synthesis system to cut the search during scheduling.

The overall view of the system is presented in Figure 6. The main window is divided into two parts. The first one is used to display one of the four spreadsheets which specify the characteristics of the architecture components or the characteristics of the computational and communication tasks. The second window, the bottom one, is a Gantt diagram representing the solution graphically to make evaluation easier. The graphical interface allows better interaction with the system. The system also gives the possibility to display the current task graph; an example is presented in Figure 5.

Figure 5. The tool's presentation of the task graph.

7. Experimental Results

The efficiency of the synthesis tool has been evaluated on six randomly generated examples. They represent relatively small synthesis problems as well as large ones. All experiments were targeted at the same type of architecture as depicted in Figure 1b, but instead of the ASIC A1 we use processor P4. Since the number of tasks and communications grows, the deadlines were extended and the processors were enhanced by adding additional data and code memory. The possible components of the architecture are presented in Table 6. Bus B1 connects all processors, link L1 connects processors P1 and P2, and link L2 connects processors P3 and P4.

Table 6: Architecture components (cost, data memory and speed of P1–P4, B1, L1 and L2).

Figure 6. The system synthesis tool.

The main characteristics of the examples are presented in Table 7. All tasks and communications in each example form a single task graph. All examples were run on a Pentium 200 MHz processor using the heuristic presented in Section 4. The results are shown in Table 8.

Table 7: Example task graphs (number of tasks, number of communications, code memory and data memory).

Our objective was to minimize the cost of the architecture, not the utilization of the processors and buses, which are therefore not utilized very much, up to 79%. However, the utilization of code memory and data memory is very high, always around 90%. The runtimes for the large examples are less than 12 minutes. All these results were obtained without designer interaction.

The tool allows the designer to express a wide variety of constraints on the designs and helps to prune the enormous design space. The designer can work iteratively on the design by saving the currently obtained design into a file and improving it later. Improvements can usually be made by enforcing good decisions from the old design in a new one and leaving other decisions to the synthesis system. An example of a possible scenario for iterative improvements is presented in Table 9. We start with design 6.1, which is identical to design 6 from Table 8. The synthesis of this example was done without interaction from the designer. Design 6.2 was obtained by specifying that the architecture in the new design is the same as in 6.1. With this information the system can improve the assignment and the schedule. Finally, design 6.3, which is depicted in Figure 6, was obtained by fixing the architecture as well as the task and communication assignment. The final design has very good characteristics in terms of the utilization of the processors and memories, which is very high, all above 90%.

Table 8: Experimental results (cost, deadline, components, code and data memory utilization, CPU and bus/link utilization, and runtime in seconds for the six examples).
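For reference, the sketch below shows one plausible way to compute the utilization figures reported above: busy time over schedule length for processors and buses, and required memory over available memory for code and data memories. The paper does not spell out its exact definitions, so these formulas and the numbers are assumptions.

```python
def resource_utilization(busy_intervals, schedule_length):
    """Fraction of the schedule during which a processor or bus is busy."""
    busy = sum(end - start for start, end in busy_intervals)
    return busy / schedule_length

def memory_utilization(required, available):
    """Fraction of a code or data memory that the design actually needs."""
    return required / available

# Hypothetical numbers, only to show the intended use.
print(resource_utilization([(0, 3), (5, 9)], schedule_length=10))   # -> 0.7
print(memory_utilization(required=29, available=32))                # -> 0.90625
```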

Table 9: Interactive improvements of design 6 (cost, deadline, components, code and data memory utilization, CPU and bus/link utilization, and runtime in seconds for designs 6.1–6.3).

8. Conclusions and Future Work

CLASS copes with heterogeneous constraints in system level design. The main contribution is the inclusion of data memory constraints. The peak data memory usage varies greatly with the assignment and schedule. Incorporating timing constraints together with data memory constraints gives better propagation and better decisions during component selection, assignment and scheduling. We can cope with large examples using the developed heuristic. The presented synthesis system has been evaluated on large task graph examples. It has been shown that it provides good quality results while having acceptable runtimes. By providing more information from the designer, the design quality can be significantly improved. The future directions of this research work are to create better heuristics, to extend the model of the problem, and to add the possibility of pipelined designs, as indicated in [4, 13].

9. References

[1] S. Prakash and A. C. Parker, "Synthesis of Application-Specific Multiprocessor Systems Including Memory Components", VLSI Signal Processing, 1994.
[2] S. Prakash and A. C. Parker, "SOS: Synthesis of Application-Specific Heterogeneous Multiprocessor Systems", Parallel and Distributed Computing, 1992.
[3] K. Kuchcinski, "Embedded System Synthesis by Timing Constraint Solving", ISSS, 1997.
[4] K. Kuchcinski, "Integrated Resource Assignment and Scheduling of Task Graphs Using Finite Domain Constraints", DATE Conference, 1999.
[5] R. Beckmann and J. Herrmann, "Synthesis for General Purpose Computers by Use of Constraint Logic Programming", University of Dortmund, Department of Computer Science, Research Report 684, 1998.
[6] B. P. Dave, G. Lakshminarayana and N. K. Jha, "COSYN: Hardware-Software Co-Synthesis of Embedded Systems", DAC, 1997.
[7] B. P. Dave and N. K. Jha, "CASPER: Concurrent Hardware-Software Co-Synthesis of Hard Real-Time Aperiodic and Periodic Specifications of Embedded System Architecture", DATE Conference, 1998.
[8] A. Bender, "MILP Based Task Mapping for Heterogeneous Multiprocessor Systems", European Design Automation Conference with Euro-VHDL, 1996.
[9] J. Beck and D. P. Siewiorek, "Modeling Multicomputer Task Allocation as a Vector Packing Problem", International Symposium on System Synthesis, 1996.
[10] J. E. Beck and D. P. Siewiorek, "Automatic Configuration of Embedded Multicomputer Systems", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 2, Feb. 1998.
[11] T. Blickle, J. Teich and L. Thiele, "System-Level Synthesis Using Evolutionary Algorithms", Design Automation for Embedded Systems, 1998.
[12] M. Gasteier and M. Glesner, "Bus-Based Communication Synthesis on System Level", 9th International Symposium on System Synthesis, 1996.
[13] S. Bakshi and D. D. Gajski, "A Scheduling and Pipelining Algorithm for Hardware/Software Systems", 10th International Symposium on System Synthesis, 1997.
[14] C. Lee, M. Potkonjak and W. Wolf, "System-Level Synthesis of Application Specific Systems Using A* Search and Generalized Force-Directed Heuristics", 9th International Symposium on System Synthesis, 1996.
[15] R. P. Dick and N. K. Jha, "MOGAC: A Multiobjective Genetic Algorithm for Hardware-Software Cosynthesis of Distributed Embedded Systems", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 10, Oct. 1998.
[16] B. P. Dave and N. K. Jha, "COHRA: Hardware-Software Cosynthesis of Hierarchical Heterogeneous Distributed Embedded Systems", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, Oct. 1998.
[17] N. Beldiceanu, E. Bourreau, H. Simonis and P. Chan, "Partial Search Strategy in CHIP", presented at the 2nd Metaheuristics International Conference (MIC97), Sophia Antipolis, France, July 1997.
[18] CHIP System Documentation, COSYTEC, 1996.


More information

Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao

Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao Bus Encoding Technique for hierarchical memory system Anne Pratoomtong and Weiping Liao Abstract In microprocessor-based systems, data and address buses are the core of the interface between a microprocessor

More information

System partitioning. System functionality is implemented on system components ASICs, processors, memories, buses

System partitioning. System functionality is implemented on system components ASICs, processors, memories, buses System partitioning System functionality is implemented on system components ASICs, processors, memories, buses Two design tasks: Allocate system components or ASIC constraints Partition functionality

More information

Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms

Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms Yuankai Chen and Hai Zhou Electrical Engineering and Computer Science, Northwestern University, U.S.A. Abstract With the increasing

More information

Efficient Test Compaction for Combinational Circuits Based on Fault Detection Count-Directed Clustering

Efficient Test Compaction for Combinational Circuits Based on Fault Detection Count-Directed Clustering Efficient Test Compaction for Combinational Circuits Based on Fault Detection Count-Directed Clustering Aiman El-Maleh, Saqib Khurshid King Fahd University of Petroleum and Minerals Dhahran, Saudi Arabia

More information

A New Exam Timetabling Algorithm

A New Exam Timetabling Algorithm A New Exam Timetabling Algorithm K.J. Batenburg W.J. Palenstijn Leiden Institute of Advanced Computer Science (LIACS), Universiteit Leiden P.O. Box 9512, 2300 RA Leiden, The Netherlands {kbatenbu, wpalenst}@math.leidenuniv.nl

More information

CELLULAR automata (CA) are mathematical models for

CELLULAR automata (CA) are mathematical models for 1 Cellular Learning Automata with Multiple Learning Automata in Each Cell and its Applications Hamid Beigy and M R Meybodi Abstract The cellular learning automata, which is a combination of cellular automata

More information

A Multiobjective Optimization Model for Exploring Multiprocessor Mappings of Process Networks

A Multiobjective Optimization Model for Exploring Multiprocessor Mappings of Process Networks A Multiobjective Optimization Model for Exploring Multiprocessor Mappings of Process Networks Cagkan Erbas Dept. of Computer Science University of Amsterdam Kruislaan 43, 198 SJ Amsterdam, The Netherlands

More information

Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath

Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath 972 PAPER Special Section on Formal Approach Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath Tasuku NISHIHARA a), Member, Takeshi MATSUMOTO, and Masahiro

More information

Chapter 2 Designing Crossbar Based Systems

Chapter 2 Designing Crossbar Based Systems Chapter 2 Designing Crossbar Based Systems Over the last decade, the communication architecture of SoCs has evolved from single shared bus systems to multi-bus systems. Today, state-of-the-art bus based

More information

Mathematical Programming Formulations, Constraint Programming

Mathematical Programming Formulations, Constraint Programming Outline DM87 SCHEDULING, TIMETABLING AND ROUTING Lecture 3 Mathematical Programming Formulations, Constraint Programming 1. Special Purpose Algorithms 2. Constraint Programming Marco Chiarandini DM87 Scheduling,

More information

SAMBA-BUS: A HIGH PERFORMANCE BUS ARCHITECTURE FOR SYSTEM-ON-CHIPS Λ. Ruibing Lu and Cheng-Kok Koh

SAMBA-BUS: A HIGH PERFORMANCE BUS ARCHITECTURE FOR SYSTEM-ON-CHIPS Λ. Ruibing Lu and Cheng-Kok Koh BUS: A HIGH PERFORMANCE BUS ARCHITECTURE FOR SYSTEM-ON-CHIPS Λ Ruibing Lu and Cheng-Kok Koh School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 797- flur,chengkokg@ecn.purdue.edu

More information

Static Compaction Techniques to Control Scan Vector Power Dissipation

Static Compaction Techniques to Control Scan Vector Power Dissipation Static Compaction Techniques to Control Scan Vector Power Dissipation Ranganathan Sankaralingam, Rama Rao Oruganti, and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer

More information

FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders

FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders 770 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 48, NO. 8, AUGUST 2001 FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders Hyeong-Ju

More information

Determination of the Minimum Break Point Set of Directional Relay Networks based on k-trees of the Network Graphs

Determination of the Minimum Break Point Set of Directional Relay Networks based on k-trees of the Network Graphs IEEE TPWRD 2011. THIS IS THE AUTHORS COPY. THE DEFINITIVE VERSION CAN BE FOUND AT IEEE. 1 Determination of the Minimum Break Point Set of Directional Relay Networks based on k-trees of the Network Graphs

More information

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington

More information

Introduction VLSI PHYSICAL DESIGN AUTOMATION

Introduction VLSI PHYSICAL DESIGN AUTOMATION VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.

More information

Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano

Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano Introduction to Multiprocessors (Part I) Prof. Cristina Silvano Politecnico di Milano Outline Key issues to design multiprocessors Interconnection network Centralized shared-memory architectures Distributed

More information

Hardware/Software Codesign

Hardware/Software Codesign Hardware/Software Codesign SS 2016 Prof. Dr. Christian Plessl High-Performance IT Systems group University of Paderborn Version 2.2.0 2016-04-08 how to design a "digital TV set top box" Motivating Example

More information

COMPARATIVE STUDY OF CIRCUIT PARTITIONING ALGORITHMS

COMPARATIVE STUDY OF CIRCUIT PARTITIONING ALGORITHMS COMPARATIVE STUDY OF CIRCUIT PARTITIONING ALGORITHMS Zoltan Baruch 1, Octavian Creţ 2, Kalman Pusztai 3 1 PhD, Lecturer, Technical University of Cluj-Napoca, Romania 2 Assistant, Technical University of

More information

Process Scheduling for Performance Estimation and Synthesis of Hardware/Software Systems

Process Scheduling for Performance Estimation and Synthesis of Hardware/Software Systems Downloaded from orbit.dtu.dk on: Dec 16, 2017 Process Scheduling for Performance Estimation and Synthesis of Hardware/Software Systems Eles, Petru; Kuchcinski, Krzysztof; Peng, Zebo; Doboli, Alex; Pop,

More information

Efficient Symbolic Multi Objective Design Space Exploration

Efficient Symbolic Multi Objective Design Space Exploration This is the author s version of the work. The definitive work was published in Proceedings of the 3th Asia and South Pacific Design Automation Conference (ASP-DAC 2008), pp. 69-696, 2008. The work is supported

More information

Using Dynamic Voltage Scaling to Reduce the Configuration Energy of Run Time Reconfigurable Devices

Using Dynamic Voltage Scaling to Reduce the Configuration Energy of Run Time Reconfigurable Devices Using Dynamic Voltage Scaling to Reduce the Configuration Energy of Run Time Reconfigurable Devices Yang Qu 1, Juha-Pekka Soininen 1 and Jari Nurmi 2 1 Technical Research Centre of Finland (VTT), Kaitoväylä

More information

Lossless Compression using Efficient Encoding of Bitmasks

Lossless Compression using Efficient Encoding of Bitmasks Lossless Compression using Efficient Encoding of Bitmasks Chetan Murthy and Prabhat Mishra Department of Computer and Information Science and Engineering University of Florida, Gainesville, FL 326, USA

More information

Mapping pipeline skeletons onto heterogeneous platforms

Mapping pipeline skeletons onto heterogeneous platforms Mapping pipeline skeletons onto heterogeneous platforms Anne Benoit and Yves Robert GRAAL team, LIP École Normale Supérieure de Lyon January 2007 Yves.Robert@ens-lyon.fr January 2007 Mapping pipeline skeletons

More information