GPU Technology Conference 2015 Silicon Valley


1 GPU Technology Conference 2015 Silicon Valley
Big Data in Real Time: An Approach to Predictive Analytics for Alpha Generation and Risk Management
Yigal Jhirad and Blay Tarnoff
March 19, 2015

2 Table of Contents
I. State-Space Models: State Instantiation
   Portfolio and Risk Management: Big Data/Real Time Sensitivity
II. Parallel Approach/Predictive Analytics
   Cluster Analysis: Coherence, Correlation, Cointegration
   Evolutionary Optimization
III. Summary
IV. Author Biographies

DISCLAIMER: This presentation is for information purposes only. The presenter accepts no liability for the content of this presentation, or for the consequences of any actions taken on the basis of the information provided. Although the information in this presentation is considered to be accurate, this is not a representation that it is complete or should be relied upon as a sole resource, as the information contained herein is subject to change.

3 State-Space Models
Portfolio/Risk Management
  Financial markets display discrete dynamic micro states and regimes
  Traditional valuation techniques may not be as effective, e.g. Options markets: Implied Volatility vs. Jump Diffusion
  Regime Shifts
Big Data: Time Series
  Tick Data, Interday, and Intraday
Predictive Analytics: Identifying Structural Breaks
  Alpha Generation, Risk Management, Market Impact
Parallel Processing: APL/CPU and CUDA/GPU
  Physics based modeling; exploit hardware efficiently
  Array Based Processing paradigm
  Matrix/Vector thought process is key
  APL is a programming language whose quantum data object is an array, which is fundamental to parallel processing and can leverage parallel processing across CPUs
  CUDA leverages GPU hardware
Application in Econometrics and Applied Mathematics
  Monte Carlo Simulations
  Fourier Analysis
  Principal Components
  Optimization
  Cluster Analysis
  Cointegration

4 Cluster Analysis: Correlation & Cointegration
Cluster Analysis: A multivariate technique designed to identify relationships and cohesion
  Factor Analysis, Risk Model Development
Correlation Analysis: Pairwise analysis of data across assets. Each pairwise comparison can be run in parallel.
  Use Correlation or Cointegration as primary input to cluster analysis
  Apply proprietary signal filter to remove selected data and reduce spurious correlations
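As a rough illustration of how a correlation matrix can feed a clustering step, the sketch below groups assets whose pairwise correlation meets a threshold, using a union-find structure (equivalent to single-linkage clustering cut at a fixed level). This is an assumed stand-in, not the presenters' method: the function name `cluster_by_correlation`, the 0.8 threshold, and the sample matrix are all hypothetical, and the proprietary signal filter mentioned on the slide is not modeled.

```python
def cluster_by_correlation(corr, threshold=0.8):
    """Group asset indices whose correlation is at or above `threshold`.

    `corr` is a symmetric N x N list of lists; entries may be None when a
    correlation could not be computed. Hypothetical helper for illustration.
    """
    n = len(corr)
    parent = list(range(n))  # union-find forest: each asset starts alone

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each (i, j) comparison is independent of the others.
    for i in range(n):
        for j in range(i + 1, n):
            r = corr[i][j]
            if r is not None and r >= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 4-asset correlation matrix: assets 0-1 and 2-3 move together.
sample = [
    [1.00, 0.90, 0.10, 0.00],
    [0.90, 1.00, 0.20, 0.10],
    [0.10, 0.20, 1.00, 0.85],
    [0.00, 0.10, 0.85, 1.00],
]
clusters = cluster_by_correlation(sample)
print(clusters)
```

A production approach would more likely use a hierarchical or spectral method and a distance derived from correlation or cointegration statistics; the fixed threshold here only keeps the sketch short.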

5 Evolutionary Optimization
Asymptotic Multi-Phase Optimization
  Identify a target portfolio of stocks that is trending consistently over consecutive periods using specialized, possibly time-sensitive, optimization algorithms
  Establish a portfolio of stocks that is performing in a cohesive way
Identifying and Assessing factors driving outperformance
  Optimize a basket of factors to track target portfolio
  Look at factors such as Value vs. Growth, Large Cap vs. Small Cap

Optimized Factor Attribution of Targeted Portfolio

  Factor                            Relative Ranking
  Cash/Assets                       1
  Capex/Assets                      2
  Dividend Yield                    3
  Dividend Growth (1 and 5 Year)    4
  Market Cap (High - Low)           5
  Dividend Payout                   6
  ROIC                              7
  E/P                               8

Indicative factor attribution of target portfolio. Period: 2nd Half of

6 Cluster Analysis: Correlation
Application example: Correlation function removing N/A values
Correlation measures the direction and strength of a linear relationship between variables. The Pearson product moment correlation between two variables X and Y is calculated as:

$$r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}}$$

For N assets there are $N(N-1)/2$ unique correlation pairs.
Given an N x M matrix A in which each row is a list of returns for a particular equity, return an N x N matrix R in which each element is the scalar result of correlating each row of A with every other row.
Each element of A may be an N/A value.
When processing a pair of rows, the calculation must exclude each N/A value and also the corresponding element in the other row. This requires evaluating each pair separately.
As a result, the increased computational demand is more effectively handled by a parallel processing solution. As the matrix size increases, the benefits of parallel processing become more significant.
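The N/A rule above can be sketched on the CPU in plain Python. This is a minimal reference sketch, not the presenters' implementation: N/A values are assumed to be represented as `None`, and the names `pair_correlation`, `correlation_matrix`, and `unique_pairs` are hypothetical. It shows why each pair has to be evaluated separately: the set of usable columns depends on both rows.

```python
import math


def pair_correlation(x, y):
    """Pearson correlation of two rows, dropping any column where either is None."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None  # not enough overlapping observations
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a constant row has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(a):
    """Return the symmetric N x N matrix R for an N x M matrix A (list of rows)."""
    n = len(a)
    r = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # upper triangle, mirrored into the lower one
            r[i][j] = r[j][i] = pair_correlation(a[i], a[j])
    return r


def unique_pairs(n):
    """Number of distinct off-diagonal pairs among n assets."""
    return n * (n - 1) // 2


# Hypothetical data: row 2 has an N/A in column 1, so that column is skipped
# only for pairs that involve row 2.
A = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, None, 2.0, 1.0],
]
R = correlation_matrix(A)
```

Every `(i, j)` call is independent of the others, which is what makes the problem a natural fit for one-thread-group-per-pair execution on a GPU.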

7 Cluster Analysis: Configuration
Tesla K20c Hardware Constraints
  Compute capability: 3.5
  Warp size: 32
  Number of shared memory banks: 32
  Coalescence capacity (4-byte words): 128 bytes = 32 contiguous words
  Max blocks / warps / threads per multiprocessor: 16 / 64 / 2048
  Max registers per multiprocessor: 64K
  Max shared memory per multiprocessor: 48K
Software Constraints/Response
  Run 32 independent correlations per block for output coalescence
  Read 32 prices/returns per correlation for input coalescence
  Registers used per thread: 33 => max 15 blocks at 4 warps per block
  Shared memory used per block: per warp => max 16 blocks at 4 warps per block
  Thread block configuration: 4 x 32 => 15 blocks / 60 warps / 1920 threads per SM
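The register-limited block count quoted above can be reproduced with simple integer arithmetic. The sketch below is a simplified model under stated assumptions: it divides the register file evenly and ignores any register allocation granularity the hardware may apply, and it omits the shared memory limit because the per-block figure is not legible in the slide. The function name `blocks_per_sm` is hypothetical.

```python
# Limits taken from the slide (Tesla K20c, compute capability 3.5).
REGISTERS_PER_SM = 64 * 1024
MAX_BLOCKS_PER_SM = 16
MAX_WARPS_PER_SM = 64
WARP_SIZE = 32


def blocks_per_sm(registers_per_thread, warps_per_block):
    """Resident blocks per multiprocessor under register, warp, and block limits."""
    threads_per_block = warps_per_block * WARP_SIZE
    by_registers = REGISTERS_PER_SM // (registers_per_thread * threads_per_block)
    by_warps = MAX_WARPS_PER_SM // warps_per_block
    return min(by_registers, by_warps, MAX_BLOCKS_PER_SM)


blocks = blocks_per_sm(registers_per_thread=33, warps_per_block=4)
warps = blocks * 4
threads = warps * WARP_SIZE
print(blocks, warps, threads)  # 15 60 1920
```

With 33 registers per thread and 128 threads per block, each block needs 4,224 registers, so 15 blocks fit in 65,536 registers while 16 would not; registers are therefore the binding limit in this simplified model.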

8 Cluster Analysis: Kernel Operation
Each pair of rows produces a single correlation
[Figure: Input and Output matrices]

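9 Cluster Analysis: Kernel
Correlation computation

$$r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}}$$

Two passes through prices/returns in global memory required
First pass: Compute means of X and Y: $\bar{X}$, $\bar{Y}$
Second pass:
  Subtract means from X and Y: $X - \bar{X}$, $Y - \bar{Y}$
  Sum the products: $\sum (X - \bar{X})(Y - \bar{Y})$, $\sum (X - \bar{X})^2$, $\sum (Y - \bar{Y})^2$
  Multiply resultant $\sum (X - \bar{X})(Y - \bar{Y})$ by reciprocal square roots of $\sum (X - \bar{X})^2$ and $\sum (Y - \bar{Y})^2$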
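The two passes can be mirrored on the CPU for a single pair of rows. The following is a sequential Python sketch of the arithmetic only, not the CUDA kernel itself: it assumes N/A values are represented as `None`, uses a 0/1 validity mask in the role of the Ns described in the following slides, and the function name `masked_correlation` is hypothetical.

```python
import math


def masked_correlation(xs, ys):
    """Two-pass Pearson correlation with a 0/1 validity mask for N/A handling."""
    # Mask: 0.0 where either value is N/A (None), 1.0 where both are valid.
    ns = [0.0 if (x is None or y is None) else 1.0 for x, y in zip(xs, ys)]
    # Zero out masked positions so they contribute nothing to the sums.
    xv = [x if n else 0.0 for x, n in zip(xs, ns)]
    yv = [y if n else 0.0 for y, n in zip(ys, ns)]

    count = sum(ns)
    if count < 2:
        return None

    # First pass: means over the valid positions only.
    mean_x = sum(xv) / count
    mean_y = sum(yv) / count

    # Second pass: subtract means, multiply, and accumulate the three sums.
    sxy = sxx = syy = 0.0
    for x, y, n in zip(xv, yv, ns):
        dx = (x - mean_x) * n  # the mask removes the spurious (0 - mean) term
        dy = (y - mean_y) * n
        sxy += dx * dy
        sxx += dx * dx
        syy += dy * dy

    if sxx == 0.0 or syy == 0.0:
        return None
    # Multiply by reciprocal square roots, as in the final step above.
    return sxy * (1.0 / math.sqrt(sxx)) * (1.0 / math.sqrt(syy))


r = masked_correlation([1.0, 2.0, None, 4.0, 5.0], [2.0, None, 6.0, 8.0, 10.0])
```

Multiplying by the mask instead of branching keeps every lane doing the same work, which is the usual reason to prefer this form in a warp-synchronous kernel.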

10 Cluster Analysis: Kernel First Pass
Computation of means (row 33, col 0 of block grid)
Repeat: read next section of Prices/Returns into registers
Ns necessary for NA handling: 0 where either Xs or Ys is NA; 1 where Xs and Ys are both valid

11 Cluster Analysis: Kernel First Pass
Computation of means (row 33, col 0 of block grid)
Repeat: sum values in registers into shared memory

12 Cluster Analysis: Kernel First Pass
Computation of means (row 33, col 0 of block grid)
Sum rows to obtain totals for each row of Prices/Returns
Save totals to shared memory
Divide totals by N
[Figure: axis label: Assets]


13 Cluster Analysis: Kernel Second Pass
Computation of correlations
Read section of Prices/Returns into registers (reuse registers from first pass)
Ns necessary for NA handling: 0 where either Xs or Ys is NA; 1 where Xs and Ys are both valid

14 Cluster Analysis: Kernel Second Pass
Computation of correlations
Subtract means of X and Y

15 Cluster Analysis: Kernel Second Pass
Computation of correlations
Multiply and sum (reuse X, Y, N from first pass)

16 Cluster Analysis: Kernel Second Pass
Computation of correlations
Multiply and sum (reuse X, Y, N from first pass)

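17 Cluster Analysis: Kernel Second Pass
Computation of correlations
Sum rows to obtain the sums of terms for each row of Prices/Returns

$$r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}}$$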

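18 Cluster Analysis: Kernel Second Pass
Computation of correlations

$$r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}}$$

[Figure: per-asset terms labeled rsqrt(x2), rsqrt(y2), and XY; axis label: Assets]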

19 Cluster Analysis: Kernel
Output to global memory: row 33, col 0 of block grid
[Figure: axis labels: Assets]

20 Cluster Analysis: Big Data, Real Time
Host/Global Memory
Pinned-mapped memory eliminates transfer complexity, reduces overhead
Kernel processing speed approximately 0.1 seconds for 586 assets, 120 periods
[Figure: Input and Output matrices]


21 Cluster Analysis: Big Data, Real Time
Overview
Data source constantly streaming tick data
Simple routine periodically aggregates ticks into VWAPs or returns, producing an array of prices, one per ticker
VWAPs/returns loaded into one column of input Prices/Returns matrix in host memory at every interval
Aggregation by period enables arbitrary real time speed, limited by speed of GPU kernel algorithm
[Diagram labels: Streaming data source; Routine to periodically aggregate ticks into VWAPs; GPU; To target application]
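The aggregation step can be sketched as follows. This is a minimal illustration under assumptions that are not in the slides: ticks are taken to be `(ticker, price, volume)` tuples, tickers without trades in the interval yield `None` (an N/A), and the names `aggregate_vwaps` and `append_column` are hypothetical.

```python
from collections import defaultdict


def aggregate_vwaps(ticks, tickers):
    """Collapse one interval of ticks into a volume-weighted average price per ticker.

    Returns a list aligned with `tickers`; a ticker with no volume gets None.
    """
    notional = defaultdict(float)  # sum of price * volume per ticker
    volume = defaultdict(float)    # sum of volume per ticker
    for ticker, price, size in ticks:
        notional[ticker] += price * size
        volume[ticker] += size
    return [notional[t] / volume[t] if volume[t] > 0 else None for t in tickers]


def append_column(matrix, column):
    """Append one value per row, adding a new period column to the matrix."""
    for row, value in zip(matrix, column):
        row.append(value)


# Hypothetical interval of ticks for three tickers; CCC did not trade.
tickers = ["AAA", "BBB", "CCC"]
ticks = [("AAA", 10.0, 100), ("AAA", 11.0, 300), ("BBB", 20.0, 50)]

prices = [[] for _ in tickers]  # one row per ticker, one column per interval
column = aggregate_vwaps(ticks, tickers)
append_column(prices, column)
```

Because each interval produces exactly one column regardless of how many ticks arrived, the size of the matrix handed to the correlation step is set by the number of periods rather than by the tick rate.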

22 Summary
State-Space model Instantiation using Intraday Time Series data for Predictive Analytics
Our approach is based on Time Series Cluster Analysis and Evolutionary Optimization
Identifying Structural Breaks, Patterns, and Signals
Coherency and Group membership
Applications in Portfolio Management for Alpha Generation and Risk Management


27 Author Biographies
Yigal D. Jhirad, Senior Vice President, is Director of Quantitative and Derivatives Strategies and a Portfolio Manager for Cohen & Steers options and real assets strategies. Mr. Jhirad heads the firm's Investment Risk Committee. Prior to joining the firm in 2007, Mr. Jhirad was an Executive Director in the institutional equities division of Morgan Stanley, where he headed the company's portfolio and derivatives strategies effort. He was responsible for developing, implementing, and marketing quantitative and derivatives products to a broad array of institutional clients, including hedge funds, active and passive funds, pension funds and endowments. Mr. Jhirad holds a BS in Economics from the Wharton School of the University of Pennsylvania. He is a Financial Risk Manager (FRM) as certified by the Global Association of Risk Professionals.
yjhirad@yahoo.com

Blay A. Tarnoff is a senior applications developer and database architect. He specializes in array programming and database design and development. He has developed equity and derivatives applications for program trading, proprietary trading, quantitative strategy, and risk management. Mr. Tarnoff holds an AB in Mathematics from Brown University. He is currently a consultant at Cohen & Steers and was previously at Morgan Stanley.
blay@eblay.com
