Synchronization with shared memory. AMANO, Hideharu Textbook pp.60-68

Similar documents
Relaxed Consistency models and software distributed memory. Computer Architecture Textbook pp.79-83

Snoop cache. AMANO, Hideharu, Keio University Textbook pp.40-60

Introduction to Information and Communication Technology (a)

Unofficial Redmine Cooking - QA #782 yaml_db を使った DB のマイグレーションで失敗する

Googleの強みは ささえるのは世界一のインフラ. Google File System 2008年度後期 情報システム構成論2 第10回 クラウドと協調フィルタリング. 初期(1999年)の Googleクラスタ. 最近のデータセンタ Google Chrome Comicより

Lecture 4 Branch & cut algorithm

Centralized (Indirect) switching networks. Computer Architecture AMANO, Hideharu

Cloud Connector 徹底解説. 多様な基盤への展開を可能にするための Citrix Cloud のキーコンポーネント A-5 セールスエンジニアリング本部パートナー SE 部リードシステムズエンジニア. 哲司 (Satoshi Komiyama) Citrix

電脳梁山泊烏賊塾 構造体のサイズ. Visual Basic

携帯電話の 吸収率 (SAR) について / Specific Absorption Rate (SAR) of Mobile Phones

携帯電話の 吸収率 (SAR) について / Specific Absorption Rate (SAR) of Mobile Phones

今日の予定 1. 展開図の基礎的な知識 1. 正多面体の共通の展開図. 2. 複数の箱が折れる共通の展開図 :2 時間目 3. Rep-Cube: 最新の話題 4. 正多面体に近い立体と正 4 面体の共通の展開図 5. ペタル型の紙で折るピラミッド型 :2 時間目 ~3 時間目

PSLT Adobe Typekit Service (2016v1.1)

Yamaha Steinberg USB Driver V for Mac Release Notes

WD/CD/DIS/FDIS stage

A. 展開図とそこから折れる凸立体の研究 1. 複数の箱が折れる共通の展開図 2 通りの箱が折れる共通の展開図 3 通りの箱が折れる共通の展開図そして. 残された未解決問題たち 2. 正多面体の共通の展開図 3. 正多面体に近い立体と正 4 面体の共通の展開図 ( 予備 )

Androidプログラミング 2 回目 迫紀徳

API サーバの URL. <?xml version="1.0" encoding="utf-8"?> <!DOCTYPE COMPLIANCE_SCAN SYSTEM "

J の Lab システムの舞台裏 - パワーポイントはいらない -

Saki is a Japanese high school student who/ has just started to study/ in the US.//

Chapter 1 Videos Lesson 61 Thrillers are scary ~Reading~

暗い Lena トーンマッピング とは? 明るい Lena. 元の Lena. tone mapped. image. original. image. tone mapped. tone mapped image. image. original image. original.

Computer Programming I (Advanced)

Zabbix ログ解析方法. 2018/2/14 サイバートラスト株式会社 Linux/OSS 事業部技術統括部花島タケシ. Copyright Cybertrust Japan Co., Ltd. All rights reserved.

MySQL Cluster 7.3 リリース記念!! 5 分で作る MySQL Cluster 環境

フラクタル 1 ( ジュリア集合 ) 解説 : ジュリア集合 ( 自己平方フラクタル ) 入力パラメータの例 ( 小さな数値の変化で模様が大きく変化します. Ar や Ai の数値を少しずつ変化させて描画する. ) プログラムコード. 2010, AGU, M.

Preparing Information Design-Oriented. Posters. easy to. easy to. See! Understand! easy to. Convey!

Multiprocessor Synchronization

PRODUCT DESCRIPTIONS AND METRICS

Yamaha Steinberg USB Driver V for Windows Release Notes

本書について... 7 本文中の表記について... 7 マークについて... 7 MTCE をインストールする前に... 7 ご注意... 7 推奨 PC 仕様... 8 MTCE をインストールする... 9 MTCE をアンインストールする... 11

Certificate of Accreditation

Quick Install Guide. Adaptec SCSI RAID 2120S Controller

JASCO-HPLC Operating Manual. (Analytical HPLC)

Yamaha Steinberg USB Driver V for Windows Release Notes

Interdomain Routing Security Workshop 21 BGP, 4 Bytes AS. Brocade Communications Systems, K.K.

マルチビットアップセット耐性及びシングルビットアップセット耐性を備えた

IRS16: 4 byte ASN. Version: 1.0 Date: April 22, 2008 Cisco Systems 2008 Cisco, Inc. All rights reserved. Cisco Systems Japan

Online Meetings with Zoom

2. 集団の注目位置推定 提案手法では 複数の人物が同一の対象を注視している状況 置 を推定する手法を検討する この状況下では 図 1 のよう. 顔画像からそれぞれの注目位置を推定する ただし f は 1 枚 この仮説に基づいて 複数の人物を同時に撮影した低解像度顔

M4 Parallelism. Implementation of Locks Cache Coherence

サンプル. NI TestStand TM I: Introduction Course Manual

Vehicle Calibration Techniques Established and Substantiated for Motorcycles


Studies of Large-Scale Data Visualization: EXTRAWING and Visual Data Mining

Dream the sky. Make it yours.

UB-U01III/U02III/U03II User s Manual

UML. A Model Trasformation Environment for Embedded Control Software Design with Simulink Models and UML Models

Analysis on the Multi-stakeholder Structure in the IGF Discussions

サーブレットと Android との連携. Generated by Foxit PDF Creator Foxit Software For evaluation only.

Unified System Management Technology for Data Centres

PCIe SSD PACC EP P3700 Intel Solid-State Drive Data Center Tool

Verify99. Axis Systems

MetaSMIL : A Description Language for Dynamic Integration of Multimedia Content

IP Network Technology

Certificate of Accreditation

Video Annotation and Retrieval Using Vague Shot Intervals

Agilent. IO Libraries Suite 16.3/16.2 簡易取扱説明書. [ IO Libraries Suite 最新版 ]

DürrConnect the clever connection. The quick connection with the Click

Methods to Detect Malicious MS Document File using File Structure Inspection

URL IO オブジェクト指向プログラミング特論 2018 年度只木進一 : 工学系研究科

PGroonga 2. Make PostgreSQL rich full text search system backend!

びんぼうでいいの による LCD シールド aitendo UL024TF のオシロスコープモード試験

ポータブルメディア機器向けプロセッサ フォトフレーム向けメディアプロセッサ

Peering 101. August 2017 TPF. Walt Wollny, Director Interconnection Strategy Hurricane Electric AS6939

楽天株式会社楽天技術研究所 Autumn The Seasar Foundation and the others all rights reserved.

JCCT U.S.-China Cloud Computing Seminar

Rechargeable LED Work Light

アルゴリズムの設計と解析 (W4022) 教授 : 黄潤和 広野史明 (A4/A8)

TOOLS for MR V1.7.7 for Mac Release Notes

ONOS(/CORD) Update 樋口裕太 (NEC)

Clinical Data Acquisition Standards Harmonization (CDASH)

~ ソフトウエア認証への取り組みと課題 ~

IPv6 関連 WG の状況 (6man, v6ops, softwire)

船舶保安システムのセルフチェックリスト. Record No. Name of Ship 船名 flag 国籍 Name of Company 会社名 Date 点検日 Place 場所 Checked by 担当者名. MS-SELF-CHK-SHIP-j (2012.

Private Sub 終了 XToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles 終了 XToolStripMenuItem.

tp.responsewriter, r *http.request) { /* Hmmm, I wonder if this main */ hosttokens := strings.split(r.host, ":"); if len(hosttok

BMW Head Up Display (HUD) Teardown BMW ヘッドアップディスプレイティアダウン

Operational Precaution

Book Template. GGerry.

Codegate 2014 オンライン予選 Write Up Chrono (Logical) 300

振込依頼書記入要領 Entry Guide for Direct Deposit Request Form

BraindumpQuiz. Best exam materials provider - BraindumpQuiz! Choosing us, Benefit more!

Infragistics ASP.NET リリースノート

4. 今回のプログラム 4.2 解決のクラス SearchNumber.java

Kazunari Okada( 岡田一成 ) Sr. Technical Marketing Manager ISO Vibration Analyst (CAT II) National Instruments Corporation Japan

Infrared Data Association Trademark and Brand Guidelines

BraindumpStudy. BraindumpStudy Exam Dumps, High Pass Rate!

Web Billing User Guide

Industrial Solar Power PoE Switch

2007/10/17 ソフトウェア基礎課題布広. /** * SOFT * AppletTest01.java * 文字表示 */

Microchip 社ワイヤボンド変更のご案内

Structured Report Implement rev2.1. Konica Minolta Medical & Graphic Inc. R&D Center Software Development Division Hiroyuki KUBOTA 2006/7/20

MathWorks Products and Prices Japan September 2016

一覧にない機種は動作確認が未実施のものであり 正常動作しないという意味ではありません ACER 型番 ( 機種名 ) APPLE ASUS BRULE DELL. Windows Vista Home Premium

Future X Network for 5G and IoT

トランスポートレイヤ技術. - TCP; Transmission Control Protocol -

Apollo-LakeタブレットにUbuntu17.10を Install Ubuntu17.10 on Apollo-Lake Tablet

NI TB Introduction. Conventions INSTALLATION INSTRUCTIONS Wire Terminal Block for the NI PXI-2529

SteelEye Protection Suite for Linux

Transcription:

Synchronization with shared memory AMANO, Hideharu Textbook pp.60-68

Fork-join: Starting and finishing parallel processes fork Usually, these processes (threads) can share variables fork join Fork/Join is a way of synchronization However, frequent fork/joins degrade performance Operating Systems manage them join These processes communicate with each other Synchronization is required!

Synchronization An independent process runs on each (Processing Unit) in a multiprocessor. When is the data updated? Data sending/receiving (readers-writers problems) A must be selected from multiple s. Mutual exclusion All s wait for each other Barrier synchronization Synchronization

Readers-Writers Problem Writer D Reader Write(D,Data); Write(X,1); 1 0 X Polling until(x==1); Writer: writes data then sets the synchronization flag Reader:waits until flag is set

Readers-Writers Problem Writer D Reader Polling until(x==0); 1 X 0 Polling until(x==1); data=read(d); Write(X,0); Reader: reads data from D when flag is set, then resets the flag Writer:waits for the reset of the flag

Readers-Writers Problem Writer D Reader Polling until(x==0); Then write other data 0 X Reader: reads data from D when flag is set, then resets the flag Writer:waits for reset of the flag

Multiple Readers Writer D Reader Write(D,Data); Write(c,3); 3 0 c Writer: writes data into D, then writes 3 into c. Reader: Polling until( c!=0 ) (Each reader reads data once.)

Multiple Readers Iterative execution Writer Polling until(c!=0) data = Read(D); counter = Read(c); Write(c,counter-1); Polling until(c==0); Polling until(c==0); 3 2 1 0 3-1 1-1 2-1 Reader Writer:waits until c==0 Reader: decrements c There is a problem!!

The case of No Problem Shared data 3 2 1 An example of correct operation 3 1 0 2 1 2 3-1=2 1-1=0 2-1=1 counter=read(c); Write(c,counter-1); counter=read(c); Write(c,counter-1);

The Problematic Case 3 3-1=2 2 3 3 An example of incorrect operation 2 3-1=2 Multiple s may get the same number (2) The value never changes to 0

Indivisible (atomic) operation The counter is a competitive variable for multiple s A write/read to/from such a competitive variable must be protected inside the critical section. A section which is executed by only one process. For protection, an indivisible operation which executes continuous read/write operations is required.

An example of an atomic operation (Test and Set) 0 1 Polling until (Test & Set(X)==0); critical section Write(X,0); 0 1 1 1 t=test&set(x) Reads x and if 0 then writes 1 indivisibly.

Various atomic operations Swap(x,y): exchanges shared variable x with local variable y Compare & Swap (x,b.y): compares shared variable x and constant b, and exchanges according to the result. And-write/Or-write: reads bit-map on the shared memory and writes with bit operation. Fetch & *(x,y): general indivisible operation Fetch & Dec(x): reads x and decrements (if x is 0 do nothing). Fetch&Add(x): reads x and adds y Fetch&Write1: Test & set Fetch&And/Or: And-write/Or-write

An example using Fetch&Dec Writer Polling until(c==0); 3 2 1 0 3-1 1-1 2-1 Reader data = Read(D); Fetch&Dec(c); Polling until(c==0);

Load Linked(Locked)/Store Conditional Using a pair of instructions to make an atomic action. LL (Load Linked): Load Instruction with Lock. SC (Store Conditional): If the contents of the memory location specified by the load linked are changed before SC to the same address (or context switching occurs), it fails and returns 0. Otherwise, returns 1. Atomic Exchange using LL/SC (R4<-> Memory indicated by R1) try: MOV R3,R4 LL R2,0(R1) SC R3,0(R1) BEQZ R3,try MOV R4,R2 Became famous by Hennessy and Patterson s textbook.

The benefit of LL/SC Locking bus system is not needed. Easy for implementation LL: saves the memory address in Link register SC: checks it before storing the data Invalidated with writing the data in the same address like the snoop cache.

Implementing LL and SC PE1 PE2 R1=0x1000 LL R2,0(R1) R1=0x1000 LL R2,0(R1) Link register 1000 Data in 1000 1000 is read out. Memory Cache is omitted in this diagram

Implementing LL and SC PE2 executes SC first PE1 PE2 R1=0x1000 R1=0x1000 SC R3,0(R1) 1 is returned to R3 Link register - PE R3 1000 Snoop and Invalidate Memory

Implementing LL and SC Then, PE1 executes SC PE1 PE2 R1=0x1000 SC R3,0(R1) 0 is returned to R3 Link register - PE - R3 X The data is not written Snoop and Invalidate Memory

Quiz Implement Fetch and Decrement by using LL and SC.

Answer try: LL R2,0(R1) DADDI R3,R2,#-1 SC R3,0(R1) BEQZ R3,try If SC is successful, the memory was decremented without interference.

Multi-Writers/Readers Problem Writer Mutual exclusion Selects a writer from multiple writers 3 2 1 0 Reader data = Read(D); Fetch&Dec(c); Polling until(c==0); 1 Writer-Multi Readers

Glossary 1 Synchronization: 同期 今回のメインテーマ Mutual exclusion: 排他制御 一つのプロセッサ ( プロセス ) のみを選び 他を排除する操作 Indivisible(atomic) operation: 不可分命令 命令実行中 他のプロセッサ ( プロセス ) が操作対象の変数にアクセスすることができない Critical Section: 排他制御により一つのプロセッサ ( プロセス ) のみ実行することを保証する領域 土居先生はこれを 際どい領域 と訳したが あまり一般的になってない Fork/Join: フォーク / ジョイン Barrier Synchronization: バリア同期 Readers-writers problem: そのまま呼ばれる Producer-Consumer Problem( 生産者 消費者問題 ) と類似しているがちょっと違う

Implementation of a synchronization operation Write 1 Main Memory A large bandwidth shared bus Bus mastership is locked between Read/Write operation Modify 1 (DE) 0 Test & Set Read 0 Snoop Cache Snoop Cache Snoop Cache

Snoop Cache and Synchronization Main Memory A large bandwidth shared bus 1 (DE) CS Test & Set Snoop Cache Test & Set 1(CS) Snoop Cache 0 Even for the line with CS, Test & Set requires Bus transaction

Test,Test&Set Test&Set(c) if c == 0 Main Memory A large bandwidth shared bus Test Test&Set Snoop Cache 01 1 Test Snoop Cache Polling for CS line DE does not require bus transaction Test does not require bus, except the first trial

Test,Test&Set Test&Set(c) if c == 0 Main Memory A large bandwidth shared bus Invalidation signal Snoop Cache 0 1 1- Snoop Cache 0 Release critical section by writing 0 Cache line is invalidated

Test,Test&Set Test&Set(c) if c == 0 Main Memory A large bandwidth shared bus Write back Snoop Cache 0-0 Snoop Cache Test Release critical section by writing 0

Test,Test&Set Test&Set(c) if c == 0 Main Memory A large bandwidth shared bus Test & Set 0 - Snoop Cache 0 Snoop Cache Test & Set is really executed If other issues the request, only a is selected.

Lock with LL/SC lockit: LL R2,0(R1) ; load linked BNEZ R2,lockit ; not-available spin DADDUI R2,R0,#1 ; locked value SC R2,0(R1) ; store BEQZ R2,lockit ; branch if store fails Since the bus traffic is not caused by LL instruction, the same effect as test-test-and set can be achieved.

Semaphore High level synchronization mechanism in the operating system A sender (or active process) executes V(s) (signal), while a receiver (or passive process) executes P(s) (wait). P(s) waits for s = 1 and s 0 indivisibly. V(s) waits for s = 0 and s 1 indivisibly. Busy waiting or blocking Binary semaphore or counting semaphore

Monitor High level synchronization mechanism used in operating systems. A set of shared variables and operations to handle them. Only a process can execute the operation to handle shared variables. Synchronization is done by the Signal/Wait.

Synchronization memory Memory provides tag or some synchronization mechanism Full/Empty bit Memory with Counters I-Structure Lock/Unlock

Full/Empty Bit 0 1 Write A data cannot read from 0 A data can write into 1 1 X Write Suitable for 1-to-1 communication 1 Read 0 0 X Read

Memory with counters 0 Write A data cannot read from 0 A data cannot write except 0 5 5 X Write Suitable for 1-to-many communication 5 Read A large memory is required for tag 4 0 X Read

I-Structure An example using Write-update Snoop cache Main Memory A large bandwidth shared bus Snoop Cache Snoop Cache Snoop Cache Snoop Cache Interrupt Full/Empty with informing mechanism to the receiving

Lock/Unlock Only registered processes can be written into the locked page. Page/Line Process Process Process Process Lock Lock Lock Lock Process Lock

Barrier Synchronization Wait for other processes Barrier; Barrier; All processors (processes) must wait for the barrier establishment. Barrier; Established Barrier Operation: Indivisible operations like Fetch&Dec Dedicated hardware

Dedicated hardware Reaches to the barrier Not yet Reaches to the barrier 1 0 1 Open collecter or Open Drain If there is 0, the entire line becomes 0

Extended barrier synchronizations Group barrier:a certain number of s form a group for a barrier synchronization. Fuzzy barrier:barrier is established not at a line, but a zone. Line barrier vs. Area barrier

Fuzzy barrier Prepare (PREP) or Synchronize (X), then barrier is established. PREP; PREP; X X X PREP; Establish

Fuzzy barrier (An example) Write the array Z PREP Read from Z Synchronize (X) Z[j]=p; PREP; Z[i]= s; PREP; Read Z Read Z Read Z Z[k]=q; PREP; Establish 0 1 2 3

Summary Synchronization is required not only for bus connected multiprocessor but for all MIMD parallel machines. Consistency is only kept with synchronization Consistency Model Synchronization with message passing Message passing model

Glossary 2 Semaphore: セマフォ 腕木式信号機からでている 二進セマフォ (Binary Semaphore) とカウンティングセマフォ (Counting Semaphore) がある Monitor: モニタ この言葉にはいろいろな意味があるが ここでは同期操作と変数を一体化して管理する手法 オブジェクト指向の元祖のひとつ Lock/Unlock: ロック / アンロック この辺の用語は ほぼそのまま呼ばれる Fuzzy Barrier: ファジーバリア バリアの成立時期に幅がある

Exercise Write a program for sending a data from A to B,C,D only using Test & Set operations.