Is It Possible to Code an Efficient Software Switch?
|
|
- Dwight Armstrong
- 5 years ago
- Views:
Transcription
1 Is It Possible to Code an Efficient Software Switch? Linearizing the Heap, and the Pervasive Use of Hardware Accelerators! Nick Mitchell along with Ioana Baldini, Peter Sweeney IBM T.J. Watson Research Center!! July 31, 2014
2 What Nick Does: Work with IBM customers and developers to make their applications run well the data herein is backed by thousands of real apps
3 Summer School Means Nick Learns At Least As Much From You, As You From Him
4 0. The Apps
5 Most of Our Data Comes from Somewhere Else Your Service Request My Service MongoDB Response Hadoop FPGA, GPU
6 Each Source Has its Own Wire Protocol Thrift Your Service Request HTTP BSON MongoDB My Service Response HTTP RPC Hadoop streaming! binary FPGA, GPU
7 Each Service Has its Operational Form Your Service Request My Service MongoDB Response Hadoop FPGA, GPU
8 Solutions Denied: (but worth learning from)! keep everything in memory code in a systems language
9 Summary Programmers are plumbers who write software switches ECOOP Languages offer poor support for efficient software switches, because of the distance between operational and wire forms Amdahl s Law strikes again, severely limiting the use of specialized computational circuitry What are our options for doing better?
10 1. Initial Thought Exercises
11 Implement an Efficient Map Entry HashMap a conventional chained hash map Key Value ~3 pointers per entry ~3-5 cache misses per GET copying required for RPC c.f. Entry Key Value
12 Implement This Without Copying Any Bits Into the Heap Entry HashMap HashMap Key Value Java Heap Entry Key Value Entry Key Value Entry Key Value update this value
13 Implement an Efficient String String 4 8 bytes of data bytes of non-data 8 33% efficiency characters
14 2. Programmers as Plumbers
15 For Example (a JAX-RS RESTful Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); }
16 Dispatch Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); }
17 Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); }
18 Deserialization into Application Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); }
19 Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); } (a lonely increment operation)
20 Serialize Application Data Structures Back into Response addtocart(@cookieparam( session ) Cookie item ) int quantity ) int quantity) {! ShoppingCart cart = new ShoppingCart(session); cart.add(itemcode, quantity); return Response.ok(cart.status()).cookie(cart.toCookie()).build(); }
21 Marshalling and Data Formats operational form wire form serialize transfer deserialize
22 3. The Profitability Threshold
23 Trade-offs At what point does externalizing a computation become more pain than it s worth? granularity of kernel cost of externalization Amdahl s Law accelerator speedup
24 Amdahl s Law Tuned Kernel Grows Smaller Offloading Computation Hurts Offloading Computation Helps Externalization More Expensive
25 Amdahl s Law make kernel 6.25% faster Externalization More Expensive free externalization, kernel is 100% of overall computation
26 Amdahl s Law make kernel 6.25% faster Externalization More Expensive externalization equivalent to 1% of overall computation, kernel is 100% of overall computation
27 Amdahl s Law make kernel 12.5% faster
28 Amdahl s Law make kernel 25% faster
29 Amdahl s Law make kernel 50% faster
30 Amdahl s Law make kernel 2x faster
31 Amdahl s Law make kernel 4x faster
32 Amdahl s Law make kernel 8x faster
33 Amdahl s Law make kernel 16x faster
34 Amdahl s Law make kernel 32x faster
35 4. The Costs of Externalization
36 Rough Measurements Object Churn Fractal Webs of Invocations 3000 temps to ingest one document 70 temps to turn a SOAP date (as bytes) into a Java Calendar 200k calls to ingest that document 2000 calls to turn an XML timecard into a Java data structure 6 temps to turn month into int (Sevitsky 2006)
37 Spending CPU Cycles 43% 32% 21% Serialization String Operations Intra-object Copying Application Logic Caching Encryption Reflection Data-driven Dispatching Connection Management
38 Three Kinds of Expenses 43% 32% 21% Serialization String Operations Intra-object Copying Optimizable Data Motion Reflection Data-driven Dispatching Connection Management Amortizable Application Logic Caching Encryption Vital
39 Does the Story Vary? All Data Sets 50% Optimizable Data Motion 31% Amortizable 19% Vital Analytics Apps Web Apps SPECjbb % 53% Vital SPECjEnterprise 49% Optimizable 34% Amortizable 13% trade 54% Amortizable BENCHMARKS
40 What the JIT Doesn t Catch (numbers relative to original w/o JIT optimizations) Copies Comparisons ALU Loads/Stores 0% 25% 50% 75% 100% original with JIT optimizations handtuned w/o JIT optimizations (Xu 2009)
41 Allocate-Use Separation Thread.run How much do we have to inline to have a chance of removing temps and copies?
42 Method Inlining % temps eliminated Date field base J9 inliner JOLT inliner Eclipse 0.4% 1.9% JPetStore on Spring 0.7% 2.5% TPCW on JBoss 0% 4.3% DaCapo 3.4% 13.3% Inlining is hard! Small benchmarks don t reflect difficulties (Shankar 2008)
43 Spending Memory COBOL vs Java 01 Student occurs Name PIC X(10) 02 Score PIC 999V99 class Student { String name; BigDecimal score; } Student[] data = new Student[40]; bytes 4492 bytes (Suganuma 2008)
44 Memory Overhead 0-10% 10-20% 20-30% 30-40% Overhead 40-50% 50-60% 60-70% 70-80% 80-90% % Number of Heap Snapshots
45 5. Alternative Marshalling Schemes
46 Options operational form operational form operational form optimize the translation wire form wire form transmit less wire form transmit less
47 Options operational form operational form operational form optimize the translation Google protobuf, Apache Thrift, Apache Avro, Scala Pickling???? wire form wire form wire form transmit less transmit less
48 protobuf, avro, thrift declaratively specify schema of data automatically generate marshallers struct Work { 1: i32 num1 = 0, 2: i32 num2, 3: Operation op, 4: optional string comment, } operational form optimize the translation wire form operational form wire form transmit less
49 Does it Matter? java-manual protostuff-manual protostuff kryo protobuf thrift-compact thrift avro hessian java-built-in scala/java-built-in relative time to serialize and deserialize an object (
50 Does it Matter? Java Kryo v1 Kryo v2 Scala Pickling Pickler Combinators Unsafe Pickler Combinators Time [ms] e+06 Number of Elements (Miller 2013)
51 Experiment: specjbb % Fraction of CPU Spent in Serialization 40% 30% 20% 10% original implementation change ~100 lines of code in the marshaller 0% Stack Sample Frequency (seconds)
52 6. Alternative Storage Schemes
53 Value Types and Structs struct Student { char[] name; int score; } pay the header cost once per record
54 Ref Poisoning struct Student { String name; int score; }. unless we use a reference type in which case, we re back to where we started
55 Arrays of Records struct Student { chart[] name; int score; } Student[] data = new Student[10]; pay the header cost once per array
56 Marshalling Arrays of Records struct Student { string name; decimal score; } Student[] data = new Student[10]; RPC pay the header cost once per array data[3].name.charat(5) operational form and wire form
57 c.f. Cobol both are fine until we need to store non-scalar data, i.e. variable-length data
58 Column Stores class Student { String name; BigDecimal score; } namestart nameend scorestart scoreend scorepool Student # 1 Student # 2 PROS maintains easily serializability allows for mix-and-match use of attributes Student # CONS added overhead (start and end pointers) unspeakably horrible code
59 7. Alternative Compilation Schemes and Optimizable Language Kernels
60 High-level Languages for Targeting FPGAs LegUp (U. Toronto) Kiwi (Microsoft Research) Bluespec (MIT) make it easier to write computational kernels Lime (IBM Research)
61 asm.js! C/C++ Javascript LLVM subset that is easier to optimize Browser JIT with asm.js support
62 RPython (c.f. Truffle)! Python Human codes interpreter for language X subset that is easier to optimize JIT for language X
63 Models vs Storage class Student { String name; BigDecimal score; } code is hard to maintain, or impossible to express in the language, but performance is great code is easy to maintain, but performance sucks
64 Lowering class Student { String name; BigDecimal score; } Can we lower from one to the other?
65 Lowering class Student { String name; BigDecimal score; } What is lowering other than a partial evaluation of serialization?
66 Partial Marshal class Student { String name; BigDecimal score; } transformer?? student.getname(); transformer?? class StudentTable { CharTable names; int[] namestart; int[] nameend; IntTable scores; int[] scorestart; int[] scoreend; } students.getname(i); names.splice(students.namestart[i], students.nameend([i]);
67 Partial Marshal class Student { String name; BigDecimal score; } data model class StudentTable CharTable names; int[] namestart; int[] nameend; IntTable scores; int[] scorestart; int[] scoreend; } student.getname(); data access students.getname(i); names.splice(students.namestart[i], students.nameend([i]);
68 Attic
69 All Data Sets 50% Optimizable Data Motion 31% Amortizable 19% Vital Batch A Batch B 77% 55% Amortizable Framework T Framework U Framework Z Framework Y Framework X Framework W 59% 43% Amortizable J2EE Provider A jboss J2EE Provider B Analytics D Analytics C Analytics A Analytics F Analytics E SPECjEnterprise tradesoap SPECjbb2013 tradebeans 49% Optimizable 46% 35% 11% 65% Vital
CSCI-1680 RPC and Data Representation. Rodrigo Fonseca
CSCI-1680 RPC and Data Representation Rodrigo Fonseca Today Defining Protocols RPC IDL Problem Two programs want to communicate: must define the protocol We have seen many of these, across all layers E.g.,
More informationBuilding Memory-efficient Java Applications: Practices and Challenges
Building Memory-efficient Java Applications: Practices and Challenges Nick Mitchell, Gary Sevitsky (presenting) IBM TJ Watson Research Center Hawthorne, NY USA Copyright is held by the author/owner(s).
More informationCSCI-1680 RPC and Data Representation John Jannotti
CSCI-1680 RPC and Data Representation John Jannotti Original Slides from Rodrigo Fonseca Today Defining Protocols RPC IDL Problem Two programs want to communicate: must define the protocol We have seen
More informationKampala August, Agner Fog
Advanced microprocessor optimization Kampala August, 2007 Agner Fog www.agner.org Agenda Intel and AMD microprocessors Out Of Order execution Branch prediction Platform, 32 or 64 bits Choice of compiler
More informationProtocol Buffers, grpc
Protocol Buffers, grpc Szolgáltatásorientált rendszerintegráció Service-Oriented System Integration Dr. Balázs Simon BME, IIT Outline Remote communication application level vs. transport level protocols
More informationThe Evolution of Big Data Platforms and Data Science
IBM Analytics The Evolution of Big Data Platforms and Data Science ECC Conference 2016 Brandon MacKenzie June 13, 2016 2016 IBM Corporation Hello, I m Brandon MacKenzie. I work at IBM. Data Science - Offering
More informationJohn Davies LJP Tech. Understand your data
John Davies LJP Tech Understand your data Understand your data John Davies CTO 11 th October 2016 @jtdavies It worked so well someone bought it - yesterday Copyright 2016 LJP Technologies Ltd. 3 Why do
More informationOverview. Communication types and role of Middleware Remote Procedure Call (RPC) Message Oriented Communication Multicasting 2/36
Communication address calls class client communication declarations implementations interface java language littleendian machine message method multicast network object operations parameters passing procedure
More informationApache Thrift Introduction & Tutorial
Apache Thrift Introduction & Tutorial Marlon Pierce, Suresh Marru Q & A TIOBE Index Programming Language polyglotism Modern distributed applications are rarely composed of modules written in a single language.
More informationIdentifying the Sources of Cache Misses in Java Programs Without Relying on Hardware Counters. Hiroshi Inoue and Toshio Nakatani IBM Research - Tokyo
Identifying the Sources of Cache Misses in Java Programs Without Relying on Hardware Counters Hiroshi Inoue and Toshio Nakatani IBM Research - Tokyo June 15, 2012 ISMM 2012 at Beijing, China Motivation
More informationChapter 5. Names, Bindings, and Scopes
Chapter 5 Names, Bindings, and Scopes Chapter 5 Topics Introduction Names Variables The Concept of Binding Scope Scope and Lifetime Referencing Environments Named Constants 1-2 Introduction Imperative
More informationSoftware Analysis. Asymptotic Performance Analysis
Software Analysis Performance Analysis Presenter: Jonathan Aldrich Performance Analysis Software Analysis 1 Asymptotic Performance Analysis How do we compare algorithm performance? Abstract away low-level
More informationLecture 9 Dynamic Compilation
Lecture 9 Dynamic Compilation I. Motivation & Background II. Overview III. Compilation Policy IV. Partial Method Compilation V. Partial Dead Code Elimination VI. Escape Analysis VII. Results Partial Method
More informationTrace-based JIT Compilation
Trace-based JIT Compilation Hiroshi Inoue, IBM Research - Tokyo 1 Trace JIT vs. Method JIT https://twitter.com/yukihiro_matz/status/533775624486133762 2 Background: Trace-based Compilation Using a Trace,
More informationLIQUID METAL Taming Heterogeneity
LIQUID METAL Taming Heterogeneity Stephen Fink IBM Research! IBM Research Liquid Metal Team (IBM T. J. Watson Research Center) Josh Auerbach Perry Cheng 2 David Bacon Stephen Fink Ioana Baldini Rodric
More informationCSCI-1680 RPC and Data Representation. Rodrigo Fonseca
CSCI-1680 RPC and Data Representation Rodrigo Fonseca Administrivia TCP: talk to the TAs if you still have questions! ursday: HW3 out Final Project (out 4/21) Implement a WebSockets server an efficient
More informationA Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler
A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler Hiroshi Inoue, Hiroshige Hayashizaki, Peng Wu and Toshio Nakatani IBM Research Tokyo IBM Research T.J. Watson Research Center April
More informationRISC Architecture Ch 12
RISC Architecture Ch 12 Some History Instruction Usage Characteristics Large Register Files Register Allocation Optimization RISC vs. CISC 18 Original Ideas Behind CISC (Complex Instruction Set Comp.)
More informationRuntime Defenses against Memory Corruption
CS 380S Runtime Defenses against Memory Corruption Vitaly Shmatikov slide 1 Reading Assignment Cowan et al. Buffer overflows: Attacks and defenses for the vulnerability of the decade (DISCEX 2000). Avijit,
More informationJVA-563. Developing RESTful Services in Java
JVA-563. Developing RESTful Services in Java Version 2.0.1 This course shows experienced Java programmers how to build RESTful web services using the Java API for RESTful Web Services, or JAX-RS. We develop
More informationSpark 2. Alexey Zinovyev, Java/BigData Trainer in EPAM
Spark 2 Alexey Zinovyev, Java/BigData Trainer in EPAM With IT since 2007 With Java since 2009 With Hadoop since 2012 With EPAM since 2015 About Secret Word from EPAM itsubbotnik Big Data Training 3 Contacts
More informationECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.
ECE C61 Computer Architecture Lecture 2 performance Prof Alok N Choudhary choudhar@ecenorthwesternedu 2-1 Today s s Lecture Performance Concepts Response Time Throughput Performance Evaluation Benchmarks
More informationCpE 442 Introduction to Computer Architecture. The Role of Performance
CpE 442 Introduction to Computer Architecture The Role of Performance Instructor: H. H. Ammar CpE442 Lec2.1 Overview of Today s Lecture: The Role of Performance Review from Last Lecture Definition and
More informationNetezza The Analytics Appliance
Software 2011 Netezza The Analytics Appliance Michael Eden Information Management Brand Executive Central & Eastern Europe Vilnius 18 October 2011 Information Management 2011IBM Corporation Thought for
More informationWorkload Optimization on Hybrid Architectures
IBM OCR project Workload Optimization on Hybrid Architectures IBM T.J. Watson Research Center May 4, 2011 Chiron & Achilles 2003 IBM Corporation IBM Research Goal Parallelism with hundreds and thousands
More informationComputer Organization & Assembly Language Programming
Computer Organization & Assembly Language Programming CSE 2312 Lecture 11 Introduction of Assembly Language 1 Assembly Language Translation The Assembly Language layer is implemented by translation rather
More informationCOS 140: Foundations of Computer Science
COS 140: Foundations of Computer Science Variables and Primitive Data Types Fall 2017 Introduction 3 What is a variable?......................................................... 3 Variable attributes..........................................................
More informationProgramming Web Services in Java
Programming Web Services in Java Description Audience This course teaches students how to program Web Services in Java, including using SOAP, WSDL and UDDI. Developers and other people interested in learning
More informationOperating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski
Operating Systems 18. Remote Procedure Calls Paul Krzyzanowski Rutgers University Spring 2015 4/20/2015 2014-2015 Paul Krzyzanowski 1 Remote Procedure Calls 2 Problems with the sockets API The sockets
More informationHash-Based Indexes. Chapter 11
Hash-Based Indexes Chapter 11 1 Introduction : Hash-based Indexes Best for equality selections. Cannot support range searches. Static and dynamic hashing techniques exist: Trade-offs similar to ISAM vs.
More informationCOS 140: Foundations of Computer Science
COS 140: Foundations of Variables and Primitive Data Types Fall 2017 Copyright c 2002 2017 UMaine School of Computing and Information S 1 / 29 Homework Reading: Chapter 16 Homework: Exercises at end of
More informationMemory Management: The Details
Lecture 10 Memory Management: The Details Sizing Up Memory Primitive Data Types Complex Data Types byte: char: short: basic value (8 bits) 1 byte 2 bytes Pointer: platform dependent 4 bytes on 32 bit machine
More informationCS 417 9/18/17. Paul Krzyzanowski 1. Socket-based communication. Distributed Systems 03. Remote Procedure Calls. Sample SMTP Interaction
Socket-based communication Distributed Systems 03. Remote Procedure Calls Socket API: all we get from the to access the network Socket = distinct end-to-end communication channels Read/write model Line-oriented,
More informationDarek Mihocka, Emulators.com Stanislav Shwartsman, Intel Corp. June
Darek Mihocka, Emulators.com Stanislav Shwartsman, Intel Corp. June 21 2008 Agenda Introduction Gemulator Bochs Proposed ISA Extensions Conclusions and Future Work Q & A Jun-21-2008 AMAS-BT 2008 2 Introduction
More informationDistributed Systems. 03. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Fall 2017
Distributed Systems 03. Remote Procedure Calls Paul Krzyzanowski Rutgers University Fall 2017 1 Socket-based communication Socket API: all we get from the OS to access the network Socket = distinct end-to-end
More informationCS Programming In C
CS 24000 - Programming In C Week 16: Review Zhiyuan Li Department of Computer Science Purdue University, USA This has been quite a journey Congratulation to all students who have worked hard in honest
More informationC exam. IBM C IBM WebSphere Application Server Developer Tools V8.5 with Liberty Profile. Version: 1.
C9510-319.exam Number: C9510-319 Passing Score: 800 Time Limit: 120 min File Version: 1.0 IBM C9510-319 IBM WebSphere Application Server Developer Tools V8.5 with Liberty Profile Version: 1.0 Exam A QUESTION
More informationGo with the Flow: Profiling Copies to Find Run-time Bloat
Go with the Flow: Profiling Copies to Find Run-time Bloat Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, Gary Sevitsky Presented by Wei Fang February 10, 2015 Bloat Example A commercial document
More informationSoot A Java Bytecode Optimization Framework. Sable Research Group School of Computer Science McGill University
Soot A Java Bytecode Optimization Framework Sable Research Group School of Computer Science McGill University Goal Provide a Java framework for optimizing and annotating bytecode provide a set of API s
More informationLogCA: A High-Level Performance Model for Hardware Accelerators
Everything should be made as simple as possible, but not simpler Albert Einstein LogCA: A High-Level Performance Model for Hardware Accelerators Muhammad Shoaib Bin Altaf* David A. Wood University of Wisconsin-Madison
More informationCS558 Programming Languages
CS558 Programming Languages Fall 2016 Lecture 4a Andrew Tolmach Portland State University 1994-2016 Pragmatics of Large Values Real machines are very efficient at handling word-size chunks of data (e.g.
More informationG Programming Languages Spring 2010 Lecture 6. Robert Grimm, New York University
G22.2110-001 Programming Languages Spring 2010 Lecture 6 Robert Grimm, New York University 1 Review Last week Function Languages Lambda Calculus SCHEME review 2 Outline Promises, promises, promises Types,
More informationPragmatic Clustering. Mike Cannon-Brookes CEO, Atlassian Software Systems
Pragmatic Clustering Mike Cannon-Brookes CEO, Atlassian Software Systems 1 Confluence Largest enterprise wiki in the world 2000 customers in 60 countries J2EE application, ~500k LOC Hibernate, Lucene,
More informationComputer Principles and Components 1
Computer Principles and Components 1 Course Map This module provides an overview of the hardware and software environment being used throughout the course. Introduction Computer Principles and Components
More informationJava On Steroids: Sun s High-Performance Java Implementation. History
Java On Steroids: Sun s High-Performance Java Implementation Urs Hölzle Lars Bak Steffen Grarup Robert Griesemer Srdjan Mitrovic Sun Microsystems History First Java implementations: interpreters compact
More informationC++\CLI. Jim Fawcett CSE687-OnLine Object Oriented Design Summer 2017
C++\CLI Jim Fawcett CSE687-OnLine Object Oriented Design Summer 2017 Comparison of Object Models Standard C++ Object Model All objects share a rich memory model: Static, stack, and heap Rich object life-time
More informationSTARCOUNTER. Technical Overview
STARCOUNTER Technical Overview Summary 3 Introduction 4 Scope 5 Audience 5 Prerequisite Knowledge 5 Virtual Machine Database Management System 6 Weaver 7 Shared Memory 8 Atomicity 8 Consistency 9 Isolation
More informationUnit 2 : Computer and Operating System Structure
Unit 2 : Computer and Operating System Structure Lesson 1 : Interrupts and I/O Structure 1.1. Learning Objectives On completion of this lesson you will know : what interrupt is the causes of occurring
More informationThere are 16 total numbered pages, 7 Questions. You have 2 hours. Budget your time carefully!
UNIVERSITY OF TORONTO FACULTY OF APPLIED SCIENCE AND ENGINEERING MIDTERM EXAMINATION, October 27, 2014 ECE454H1 Computer Systems Programming Closed book, Closed note No programmable electronics allowed
More informationComprehensive Final Exam for Capacity Planning (CIS 4930/6930) Fall 2001 >>> SOLUTIONS <<<
Comprehensive Final Exam for Capacity Planning (CIS 4930/6930) Fall 001 >>> SOLUTIONS
More informationEfficient Runtime Tracking of Allocation Sites in Java
Efficient Runtime Tracking of Allocation Sites in Java Rei Odaira, Kazunori Ogata, Kiyokuni Kawachiya, Tamiya Onodera, Toshio Nakatani IBM Research - Tokyo Why Do You Need Allocation Site Information?
More informationBuilding up to today. Remote Procedure Calls. Reminder about last time. Threads - impl
Remote Procedure Calls Carnegie Mellon University 15-440 Distributed Systems Building up to today 2x ago: Abstractions for communication example: TCP masks some of the pain of communicating across unreliable
More informationApplication Development in JAVA. Data Types, Variable, Comments & Operators. Part I: Core Java (J2SE) Getting Started
Application Development in JAVA Duration Lecture: Specialization x Hours Core Java (J2SE) & Advance Java (J2EE) Detailed Module Part I: Core Java (J2SE) Getting Started What is Java all about? Features
More information/ / JAVA TRAINING
www.tekclasses.com +91-8970005497/+91-7411642061 info@tekclasses.com / contact@tekclasses.com JAVA TRAINING If you are looking for JAVA Training, then Tek Classes is the right place to get the knowledge.
More informationQUIZ on Ch.5. Why is it sometimes not a good idea to place the private part of the interface in a header file?
QUIZ on Ch.5 Why is it sometimes not a good idea to place the private part of the interface in a header file? Example projects where we don t want the implementation visible to the client programmer: The
More informationA new Mono GC. Paolo Molaro October 25, 2006
A new Mono GC Paolo Molaro lupus@novell.com October 25, 2006 Current GC: why Boehm Ported to the major architectures and systems Featurefull Very easy to integrate Handles managed pointers in unmanaged
More informationCS558 Programming Languages Winter 2018 Lecture 4a. Andrew Tolmach Portland State University
CS558 Programming Languages Winter 2018 Lecture 4a Andrew Tolmach Portland State University 1994-2018 Pragmatics of Large Values Real machines are very efficient at handling word-size chunks of data (e.g.
More informationOpenACC Course. Office Hour #2 Q&A
OpenACC Course Office Hour #2 Q&A Q1: How many threads does each GPU core have? A: GPU cores execute arithmetic instructions. Each core can execute one single precision floating point instruction per cycle
More informationLegUp: Accelerating Memcached on Cloud FPGAs
0 LegUp: Accelerating Memcached on Cloud FPGAs Xilinx Developer Forum December 10, 2018 Andrew Canis & Ruolong Lian LegUp Computing Inc. 1 COMPUTE IS BECOMING SPECIALIZED 1 GPU Nvidia graphics cards are
More informationChapter 4 Communication
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 4 Communication Layered Protocols (1) Figure 4-1. Layers, interfaces, and protocols in the OSI
More informationCS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018
CS 31: Intro to Systems Pointers and Memory Kevin Webb Swarthmore College October 2, 2018 Overview How to reference the location of a variable in memory Where variables are placed in memory How to make
More informationBuilding Robust Embedded Software
Building Robust Embedded Software by Lars Bak, OOVM A/S Demands of the Embedded Industy Increased reliability Low cost -> resource constraints Dynamic software updates in the field Real-time capabilities
More informationBuilding loosely coupled and scalable systems using Event-Driven Architecture. Jonas Bonér Patrik Nordwall Andreas Källberg
Building loosely coupled and scalable systems using Event-Driven Architecture Jonas Bonér Patrik Nordwall Andreas Källberg Why is EDA Important for Scalability? What building blocks does EDA consists of?
More information19 File Structure, Disk Scheduling
88 19 File Structure, Disk Scheduling Readings for this topic: Silberschatz et al., Chapters 10 11. File: a named collection of bytes stored on disk. From the OS standpoint, the file consists of a bunch
More informationVertical Profiling: Understanding the Behavior of Object-Oriented Applications
Vertical Profiling: Understanding the Behavior of Object-Oriented Applications Matthias Hauswirth, Amer Diwan University of Colorado at Boulder Peter F. Sweeney, Michael Hind IBM Thomas J. Watson Research
More informationEITF20: Computer Architecture Part2.1.1: Instruction Set Architecture
EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer
More informationHardware Acceleration for Tagging
Parallel Hardware Parallel Applications IT industry (Silicon Valley) Parallel Software Users Hardware Acceleration for Tagging Sarah Bird, David McGrogan, John Kubiatowicz, Krste Asanovic June 5, 2008
More informationProjects and Apache Thrift. August 30 th 2018 Suresh Marru, Marlon Pierce
Projects and Apache Thrift August 30 th 2018 Suresh Marru, Marlon Pierce smarru@iu.edu, marpierc@iu.edu Todays Outline Final Project Proposals Apache Thrift Overview Open Discussion Project Life Cycle
More informationIntroduction. L25: Modern Compiler Design
Introduction L25: Modern Compiler Design Course Aims Understand the performance characteristics of modern processors Be familiar with strategies for optimising dynamic dispatch for languages like JavaScript
More informationGoogle Protocol Buffers for Embedded IoT
Google Protocol Buffers for Embedded IoT Integration in a medical device project Quick look 1: Overview What is it and why is useful Peers and alternatives Wire format and language syntax Libraries for
More informationKasper Lund, Software engineer at Google. Crankshaft. Turbocharging the next generation of web applications
Kasper Lund, Software engineer at Google Crankshaft Turbocharging the next generation of web applications Overview Why did we introduce Crankshaft? Deciding when and what to optimize Type feedback and
More informationIntegrate MATLAB Analytics into Enterprise Applications
Integrate Analytics into Enterprise Applications Lyamine Hedjazi 2015 The MathWorks, Inc. 1 Data Analytics Workflow Preprocessing Data Business Systems Build Algorithms Smart Connected Systems Take Decisions
More information416 Distributed Systems. RPC Day 2 Jan 11, 2017
416 Distributed Systems RPC Day 2 Jan 11, 2017 1 Last class Finish networks review Fate sharing End-to-end principle UDP versus TCP; blocking sockets IP thin waist, smart end-hosts, dumb (stateless) network
More informationSista: Improving Cog s JIT performance. Clément Béra
Sista: Improving Cog s JIT performance Clément Béra Main people involved in Sista Eliot Miranda Over 30 years experience in Smalltalk VM Clément Béra 2 years engineer in the Pharo team Phd student starting
More informationJava without the Coffee Breaks: A Nonintrusive Multiprocessor Garbage Collector
Java without the Coffee Breaks: A Nonintrusive Multiprocessor Garbage Collector David F. Bacon IBM T.J. Watson Research Center Joint work with C.R. Attanasio, Han Lee, V.T. Rajan, and Steve Smith ACM Conference
More informationField Analysis. Last time Exploit encapsulation to improve memory system performance
Field Analysis Last time Exploit encapsulation to improve memory system performance This time Exploit encapsulation to simplify analysis Two uses of field analysis Escape analysis Object inlining April
More informationModern Processor Architectures. L25: Modern Compiler Design
Modern Processor Architectures L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant minimising the number of instructions
More informationVirtual Memory. Kevin Webb Swarthmore College March 8, 2018
irtual Memory Kevin Webb Swarthmore College March 8, 2018 Today s Goals Describe the mechanisms behind address translation. Analyze the performance of address translation alternatives. Explore page replacement
More informationTowards Ruby3x3 Performance
Towards Ruby3x3 Performance Introducing RTL and MJIT Vladimir Makarov Red Hat September 21, 2017 Vladimir Makarov (Red Hat) Towards Ruby3x3 Performance September 21, 2017 1 / 30 About Myself Red Hat, Toronto
More informationAttila Szegedi, Software
Attila Szegedi, Software Engineer @asz Everything I ever learned about JVM performance tuning @twitter Everything More than I ever wanted to learned about JVM performance tuning @twitter Memory tuning
More informationProfiling & Optimization
Lecture 11 Sources of Game Performance Issues? 2 Avoid Premature Optimization Novice developers rely on ad hoc optimization Make private data public Force function inlining Decrease code modularity removes
More informationWebservices In Java Tutorial For Beginners Using Netbeans Pdf
Webservices In Java Tutorial For Beginners Using Netbeans Pdf Java (using Annotations, etc.). Part of way) (1/2). 1- Download Netbeans IDE for Java EE from here: 2- Follow the tutorial for creating a web
More informationAgenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2
Lecture 3: Processes Agenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2 Process in General 3.3 Process Concept Process is an active program in execution; process
More informationOperating System. Chapter 4. Threads. Lynn Choi School of Electrical Engineering
Operating System Chapter 4. Threads Lynn Choi School of Electrical Engineering Process Characteristics Resource ownership Includes a virtual address space (process image) Ownership of resources including
More informationCaching and Buffering in HDF5
Caching and Buffering in HDF5 September 9, 2008 SPEEDUP Workshop - HDF5 Tutorial 1 Software stack Life cycle: What happens to data when it is transferred from application buffer to HDF5 file and from HDF5
More informationHigh Performance Computing and Programming, Lecture 3
High Performance Computing and Programming, Lecture 3 Memory usage and some other things Ali Dorostkar Division of Scientific Computing, Department of Information Technology, Uppsala University, Sweden
More informationCS 241 Honors Memory
CS 241 Honors Memory Ben Kurtovic Atul Sandur Bhuvan Venkatesh Brian Zhou Kevin Hong University of Illinois Urbana Champaign February 20, 2018 CS 241 Course Staff (UIUC) Memory February 20, 2018 1 / 35
More informationWAY MORE THAN YOU EVER WANTED TO KNOW ABOUT C++ TEMPLATES AND C++ TEMPLATE METAPROGRAMMING DYLAN KNUTSON, UCSD 13-17
WAY MORE THAN YOU EVER WANTED TO KNOW ABOUT C++ TEMPLATES AND C++ TEMPLATE METAPROGRAMMING DYLAN KNUTSON, UCSD 13-17 Creative Commons Attribution + Noncommercial WHO AM I Facebook Seattle s Core Systems/Infra
More informationC 1. Recap: Finger Table. CSE 486/586 Distributed Systems Remote Procedure Call. Chord: Node Joins and Leaves. Recall? Socket API
Recap: Finger Table Finding a using fingers CSE 486/586 Distributed Systems Remote Procedure Call Steve Ko Computer Sciences and Engineering University at Buffalo N102" 86 + 2 4! N86" 20 +
More information"Web Age Speaks!" Webinar Series
"Web Age Speaks!" Webinar Series Java EE Patterns Revisited WebAgeSolutions.com 1 Introduction Bibhas Bhattacharya CTO bibhas@webagesolutions.com Web Age Solutions Premier provider of Java & Java EE training
More informationMongoDB Web Architecture
MongoDB Web Architecture MongoDB MongoDB is an open-source, NoSQL database that uses a JSON-like (BSON) document-oriented model. Data is stored in collections (rather than tables). - Uses dynamic schemas
More informationRate-Limiting at Scale. SANS AppSec Las Vegas 2012 Nick
Rate-Limiting at Scale SANS AppSec Las Vegas 2012 Nick Galbreath @ngalbreath nickg@etsy.com Who is Etsy? Marketplace for Small Creative Businesses Alexa says #51 for USA traffic > $500MM transaction volume
More informationCS 31: Introduction to Computer Systems. 03: Binary Arithmetic January 29
CS 31: Introduction to Computer Systems 03: Binary Arithmetic January 29 WiCS! Swarthmore Women in Computer Science Slide 2 Today Binary Arithmetic Unsigned addition Subtraction Representation Signed magnitude
More informationPerformance evaluation. Performance evaluation. CS/COE0447: Computer Organization. It s an everyday process
Performance evaluation It s an everyday process CS/COE0447: Computer Organization and Assembly Language Chapter 4 Sangyeun Cho Dept. of Computer Science When you buy food Same quantity, then you look at
More informationOverview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::
Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized
More informationCS506 Web Design & Development Final Term Solved MCQs with Reference
with Reference I am student in MCS (Virtual University of Pakistan). All the MCQs are solved by me. I followed the Moaaz pattern in Writing and Layout this document. Because many students are familiar
More informationScalability, Performance & Caching
COMP 150-IDS: Internet Scale Distributed Systems (Spring 2015) Scalability, Performance & Caching Noah Mendelsohn Tufts University Email: noah@cs.tufts.edu Web: http://www.cs.tufts.edu/~noah Copyright
More informationCSCE 212: FINAL EXAM Spring 2009
CSCE 212: FINAL EXAM Spring 2009 Name (please print): Total points: /120 Instructions This is a CLOSED BOOK and CLOSED NOTES exam. However, you may use calculators, scratch paper, and the green MIPS reference
More informationIntroduction to Parallel Programming
Introduction to Parallel Programming Linda Woodard CAC 19 May 2010 Introduction to Parallel Computing on Ranger 5/18/2010 www.cac.cornell.edu 1 y What is Parallel Programming? Using more than one processor
More informationMemory Management. Kevin Webb Swarthmore College February 27, 2018
Memory Management Kevin Webb Swarthmore College February 27, 2018 Today s Goals Shifting topics: different process resource memory Motivate virtual memory, including what it might look like without it
More information