Annual Report: 0087487
Annual Report for Period: 09/2001-09/2002
Submitted on: 07/06/2004
Principal Investigator: Veeraraghavan, Malathi.
Award ID: 0087487
Organization: Polytechnic Univ of NY
Title: Towards enabling a 2-3 orders of magnitude improvement in call handling capacities of switches

Project Participants

Senior Personnel
Name: Veeraraghavan, Malathi
Worked for more than 160 Hours: Yes
Contribution to Project: Malathi Veeraraghavan guided two PhD students, as well as MS and undergraduate students, in various tasks. These included the identification of a subset of CR-LDP for hardware implementation, the design and implementation of OCSP, and a study of network processors.
Name: Karri, Ramesh
Worked for more than 160 Hours: Yes
Contribution to Project: Ramesh Karri provided expertise in our selection of hardware (FPGA) and related design choices. He guided the many students involved in this project on the hardware aspects of the work.

Post-doc

Graduate Student
Name: Wang, Haobo
Worked for more than 160 Hours: Yes
Contribution to Project: Haobo Wang focused on implementing the signaling protocol subset in an FPGA. He modeled it in VHDL, implemented it on an FPGA, and tested it on a Wildforce prototype board. He was supported as an RA with stipend and tuition for the year.
Name: Tao, Zhifeng
Worked for more than 160 Hours: Yes
Contribution to Project: Zhifeng (Jeff) Tao compared several network processors to determine whether any of them were suitable for our protocol implementation. He was supported as an RA with stipend and tuition for the year.

Undergraduate Student
Technician, Programmer
Other Participant
Research Experience for Undergraduates
Organizational Partners
Other Collaborators or Contacts

Activities and Findings
Research and Education Activities: (See PDF version submitted by PI at the end of the report)
Findings: (See PDF version submitted by PI at the end of the report)
Training and Development: Research and teaching skills in the area of hardware implementation of protocols were acquired in this project. We also learned to use Cadence tools for VHDL modeling and ModelSim.
Outreach Activities: We presented a poster on this project at the annual CATT review meeting, which is attended by external university and industry participants. CATT is the NY State funded Center for Advanced Technology in Telecommunications.

Journal Publications
Books or Other One-time Publications
Haobo Wang, Malathi Veeraraghavan, Ramesh Karri, "A hardware implementation of a signaling protocol", (2002). Conference Proceedings, Published Collection: Proceedings of Opticomm 2002. Bibliography: July 29-Aug. 2, 2002, Boston, MA.
URL(s): http://eeweb1.poly.edu/networks/html-files/index.htm
Description: Web/Internet Site
Other Specific Products

Contributions
Contributions within Discipline:
1. We demonstrated how to take a complex protocol and isolate a subset of it for hardware implementation. Our overall approach is to handle frequent operations in hardware and relegate infrequent operations to software.
2. By increasing the call handling capacity and lowering the call processing delays at switches by two to three orders of magnitude, we have enabled a new category of applications on connection-oriented networks. For example, even small file transfers can be carried out on high-speed end-to-end connections faster than on TCP/IP networks, especially in wide-area settings. Also, restoration of circuits following failures now becomes a practical alternative to protection. Restoration requires dynamic circuit setup after the failure, unlike protection schemes, in which the backup paths are pre-provisioned. The cost of protection schemes lies in the extra bandwidth required for the protection paths. Without hardware signaling, the delays incurred in restoration could lead to excessive data loss.

Contributions to Other Disciplines: We used the positive results obtained in this project to write a paper and a proposal for a project called CHEETAH (Circuit-switched High-speed End-to-End Transport ArcHitecture). The paper won the Best Student Paper Award at Opticomm 2003. The proposal was funded by NSF EIN (Experimental Infrastructure Networks). The goal of this funded project is to support an e-science project called the TeraScale Supernova Initiative (TSI). We would not have proposed the use of high-speed circuits for file transfers without hardware-accelerated signaling engines. Thus, this project on hardware implementation of signaling protocols has had a positive impact on e-science disciplines.

Contributions to Human Resource Development: This project had a significant impact on three undergraduate students, Brian Douglas, Alex Lankios, and Bin Troung. In addition, two MS students, Reinette Grobler and Shao Hui, based their research work on problems identified in this project. Among these students, one is a woman (Reinette Grobler). Some of these students were placed in good jobs because of their training on this project.
Contributions to Resources for Research and Education:
Contributions Beyond Science and Engineering:

Special Requirements
Special reporting requirements: None
Change in Objectives or Scope: None
Unobligated funds: less than 20 percent of current funds
Animal, Human Subjects, Biohazards: None

Organizational Partners
Categories for which nothing is reported:
Any Journal
Any Product
Contributions: To Any Resources for Research and Education
Contributions: To Any Beyond Science and Engineering
Year 1 Activities report for the NSF project 0087487
Title: Towards enabling a 2-3 orders of magnitude improvement in call handling capacities of switches

Please reiterate the goals and objectives of your efforts, and summarize the research and education activities you have engaged in that aim to achieve these objectives. Include experiments you have conducted, the simulations you have run, the collecting you have done, the observations you have made, the materials you have developed, and major presentations you have made about your efforts. In a later section you will list more formally any publications and other specific products (database, collections, software, inventions, etc.) that have resulted.

The goal of this project is to demonstrate a signaling engine for switches in connection-oriented networks, with either circuit or packet switches, that achieves high call handling capacities and low call setup delays. Our objectives are call setup delays on the order of microseconds and call handling capacities on the order of 100,000 to 1M calls/sec. Both are two to three orders of magnitude better than the signaling engines in existing connection-oriented switches.

Research activities include the definition and implementation of a signaling protocol in an FPGA; a study of network processors to examine their suitability for implementation of signaling protocols; and the identification of a subset of Constraint Routing based Label Distribution Protocol (CR-LDP) for hardware implementation.

Education activities include teaching two Ph.D. students, Haobo Wang and Zhifeng Tao, about both hardware issues and signaling protocol design. Three undergraduate students, Brian Douglas, Alex Lankios, and Bin Troung, worked on VHDL modeling of our original signaling protocol. Two M.S. students, Reinette Grobler and Shao Hui, also worked on this project. Reinette Grobler designed a signaling protocol suitable for hardware implementation, and Shao Hui implemented the protocol on an FPGA.
We later named this protocol the Optical Circuit Signaling Protocol (OCSP) in our document "Specification of a subset of RSVP-TE for hardware implementation," Nov. 2002, posted on our web site: http://eeweb1.poly.edu/networks/html-files/index.htm.
Our experiments included testing our hardware implementation of OCSP on a Wildforce prototype board. We developed a prototype VHDL model for the signaling hardware accelerator, used Synplify to synthesize the design, and used Xilinx Alliance for placement and routing. CPE0 (Xilinx XC4036XLA FPGA) uses 62% of its resources, while PE1 (XC4013XLA) uses 8% of its resources.

We performed timing simulations of the signaling hardware accelerator using the ModelSim simulator. The simulation results are as follows. Receiving and transmitting a Setup message (requesting a bandwidth of OC-12 at a cross connect rate of OC-1) consume 12 clock cycles each, and processing the Setup message consumes 53 clock cycles. Overall, this translates into 77 clock cycles to receive, process and transmit a Setup message. Processing the Setup-Success, Release and Release-Confirm messages consumes about 70 clock cycles in total, since these messages are much shorter (two 32-bit words versus eleven 32-bit words for Setup) and require simpler processing. The numbers of clock cycles needed are 77 to 101 (depending on the time to obtain alternative paths from the routing table), 9, 51, and 10 for the Setup, Setup-Success, Release and Release-Confirm messages, respectively. Assuming a 25 MHz clock, this translates into 3.1 to 4.0 microseconds for Setup message processing and about 2.8 microseconds for the combined processing of the Setup-Success, Release and Release-Confirm messages. Thus, a complete setup and teardown of a connection consumes about 6.6 microseconds. By comparison, software implementations of signaling protocols take on the order of milliseconds.

The materials we have developed include VHDL models, papers and presentations. A paper was published in the Proceedings of Opticomm 2002. Presentations were given at the annual CATT review and at Opticomm 2002. These materials are available through our project web site: http://eeweb1.poly.edu/networks/html-files/index.htm.
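The conversion from clock cycles to time in the simulation results above can be reproduced with a short Python sketch. The cycle counts and the 25 MHz clock come from the results; the function names are illustrative, and the calls-per-second estimate rests on an added assumption that the four messages of a call are processed back to back, with no pipelining and no other overhead.

```python
CLOCK_HZ = 25_000_000  # 25 MHz clock, as assumed in the simulation results

# Clock cycles per message type, taken from the timing simulation.
SETUP_CYCLES = (77, 101)  # best case and worst case (routing table lookup)
OTHER_CYCLES = {"Setup-Success": 9, "Release": 51, "Release-Confirm": 10}


def cycles_to_us(cycles, clock_hz=CLOCK_HZ):
    """Convert a clock-cycle count to microseconds."""
    return cycles / clock_hz * 1e6


def call_time_range_us():
    """Best-case and worst-case time for one complete setup and teardown."""
    other = sum(OTHER_CYCLES.values())  # 70 cycles for the three short messages
    return tuple(cycles_to_us(setup + other) for setup in SETUP_CYCLES)


def calls_per_second(call_time_us):
    """Calls handled per second, ASSUMING strictly serial processing."""
    return 1e6 / call_time_us


if __name__ == "__main__":
    best, worst = call_time_range_us()
    print(f"setup and teardown: {best:.2f} to {worst:.2f} microseconds")
    print(f"calls/sec: {calls_per_second(worst):.0f} to {calls_per_second(best):.0f}")
```

With these inputs the total comes to roughly 5.9 to 6.8 microseconds per call, a range that contains the quoted figure of about 6.6 microseconds. Under the serial-processing assumption this corresponds to somewhat more than 100,000 calls/sec.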
Year 1 Findings report for the NSF project 0087487
Title: Towards enabling a 2-3 orders of magnitude improvement in call handling capacities of switches

Please summarize the conclusions that have emerged from your activities. Later screens will invite you to identify publications and other concrete products (collections, databases, software, inventions, and so on) and to explain the significance and implications of both findings and products for your field, for other fields, and even beyond science and engineering. If you have no findings to report, at least for now, please click the corresponding button. We anticipate that as the project progresses your emphasis in reporting will shift from activities to findings and products, and ultimately to contributions.

Our key conclusions:

1. It is possible to identify a subset of Constraint Routing based Label Distribution Protocol (CR-LDP) for hardware implementation. CR-LDP is extremely complex, with many features that make hardware implementation a real challenge. This is because the designers of this protocol aimed for flexibility and wide-ranging applicability (to a large variety of networks) rather than high performance (low call handling delays and high call handling throughputs). Clearly, CR-LDP was designed for implementation in software on general-purpose processors. Our achievement this year was to isolate a subset of CR-LDP that is large enough for the hardware engine to handle the vast majority of signaling messages, yet small enough to remain implementable in hardware. Our overall approach is to handle frequent operations in hardware and relegate infrequent operations to software. Our resulting design document is posted at http://eeweb1.poly.edu/networks/papers/design.pdf.

2. We concluded that current Network Processors (NPs) are not equipped to handle the complex Tag-Length-Value (TLV) format used in signaling protocols such as CR-LDP and RSVP-TE.
We arrived at this conclusion after studying several NPs. The TLV format was developed for flexibility, allowing protocol designers to add new parameters easily. This format is in contrast to the fixed-position format used in existing protocols such as IP, ATM, and Ethernet. In these protocols, the header consists of multiple fields, each occurring in a specified position. Since most common protocols do not use the TLV structure, current NPs do not have built-in capabilities to handle it. Hence we designed TLV processing for an FPGA. The challenges we faced with this design are as follows. With the TLV format, the order of the fields (parameters) is flexible because the type of field (tag) is specified before the value of the field. For example, in a CR-LDP Label Request message, the TLV carrying the destination address, which is the parameter that needs to be processed first, could appear last in the message. Processing of all other parameters occurring before the one carrying the destination address must then be blocked, leading to larger processing delays and lower throughputs. To improve the throughput, we plan to use a pipelined solution with multiple data paths.

3. We found that it is possible to implement stateful protocols in hardware by using data tables in memory to hold state information. Most protocols that are currently implemented in hardware are stateless; that is, the protocol engine at a switch does not maintain state information about packets that traverse the switch. Signaling protocol engines, on the other hand, need to maintain state information about connections. Our approach to maintaining state information is to use data tables. Much of the processing done by a signaling protocol engine upon receiving a message involves reading or writing data tables. To accomplish this, we have created a design that will allow us to interface the FPGA implementing the CR-LDP protocol subset with function-specific NPs or Content Addressable Memory (CAM) chips for data table manipulation.

4. We showed that a signaling protocol designed specifically with high performance as a goal, such as our Optical Circuit Signaling Protocol (OCSP), can indeed be readily implemented in an FPGA.
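The ordering problem described in conclusion 2 can be illustrated with a minimal Python sketch. It assumes a simplified TLV layout (a 16-bit type, a 16-bit length, then the value bytes); the type codes, helper names and sample values are hypothetical and chosen only for illustration, and they are not the actual CR-LDP encodings or the FPGA design described here.

```python
import struct

# Hypothetical type codes for illustration; not the real CR-LDP values.
TLV_DEST_ADDR = 0x0001
TLV_TRAFFIC_PARAMS = 0x0002
TLV_LABEL = 0x0003


def encode_tlv(tlv_type, value):
    """Pack one TLV: 16-bit type, 16-bit length, then the value bytes."""
    return struct.pack("!HH", tlv_type, len(value)) + value


def parse_tlvs(body):
    """Walk the message body and return (type, value) pairs in wire order."""
    tlvs = []
    offset = 0
    while offset < len(body):
        if offset + 4 > len(body):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!HH", body, offset)
        offset += 4
        if offset + length > len(body):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, body[offset:offset + length]))
        offset += length
    return tlvs


def tlvs_before(body, wanted_type):
    """Count the TLVs a sequential parser passes before reaching wanted_type."""
    for index, (tlv_type, _value) in enumerate(parse_tlvs(body)):
        if tlv_type == wanted_type:
            return index
    raise KeyError(f"TLV type {wanted_type:#06x} not present")


# The same three parameters, sent in two different orders.
DEST = encode_tlv(TLV_DEST_ADDR, bytes([10, 0, 0, 1]))
TRAFFIC = encode_tlv(TLV_TRAFFIC_PARAMS, b"\x00\x0c")
LABEL = encode_tlv(TLV_LABEL, b"\x00\x00\x00\x11")

dest_first = DEST + TRAFFIC + LABEL
dest_last = TRAFFIC + LABEL + DEST

if __name__ == "__main__":
    print(tlvs_before(dest_first, TLV_DEST_ADDR))
    print(tlvs_before(dest_last, TLV_DEST_ADDR))
```

In a fixed-position header the destination address sits at a known offset and can be read immediately. Here, the position of the destination address is known only after every preceding TLV header has been read, so in the `dest_last` ordering a sequential parser must pass two other parameters before any processing that depends on the destination can begin.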