AOSP Mini-Conference. Linaro

1 AOSP Mini-Conference Linaro

2 Welcome
ENGINEERS AND DEVICES WORKING TOGETHER
Main difference between the mini-conference and regular Connect talks: let's be more interactive!
One additional purpose of the mini-conference: bring together the various groups inside Linaro that work on the AOSP codebase:
- LMG: probably the most obvious use of AOSP
- LHG: Android TV
- Potentially LITE: Brillo
- Kernel, Toolchain, ...: need to support both regular Linux and AOSP use
Are there other groups (member engineering teams, maybe) here? What is your use of the AOSP code base?

3 Filesystem analysis Satish Patel

4 File System analysis
Filesystems investigated: ext4, btrfs, f2fs, nilfs, squashfs
Variants: encryption enabled/disabled, compression off/zlib/lz4
File system analysis briefing (ongoing changes): 3MItxsPsU/edit?usp=sharing
Challenges
- Fixed build support for f2fs image generation (core.mk & image size alignment to 4096)
- Fixed sparse raw image generation issue
- Need to use for btrfs and nilfs
- Image generation for btrfs, nilfs, squashfs etc.
- Benchmark porting: bonnie, iozone
- Partition overload scripts and long-run impact scripts

5 Filesystems - A Brief
Feature/FS | ext4 | f2fs | btrfs | nilfs | squashfs
Introduction | Most used in Linux-based systems | Flash Friendly File System | B/Better/Butter File System | New implementation of LFS | Compressed read-only file system
I-node | Hashed B-Tree | Linear | B+ Tree | B-Tree |
Block size | Extent | Fixed | Extent | Fixed | Fixed
Type | Unix-like file structure | Log File Structure | Copy On Write | Log File Structure | UFS
Allocation | Delayed | Immediate | Delayed | Immediate | NA
Journal | Ordered, WriteBack | NA | NA | NA | NA
Used in | Ubuntu, most mobiles | Moto series | Suse Enterprise | Ubuntu, NixOS | Live CDs, Android

6 Filesystems - A Traditional Layer
Application: WebKit, Sqlite, Video/Image
OS: Memory Management
Logical File System (ext4, f2fs, btrfs etc.)
- file access
- dir operations
- file indexing and management
- security operations
Basic File System
- data operation on physical device
- buffering if required
- no management
Device Driver 1, Device Driver 2, ... Device Driver n
Storage 1, Storage 2, ... Storage n

7 Filesystems - Basic Types
- LFS: Log File Structure
- COW: Copy On Write
Image courtesy:

8 Filesystems - Test Environment
HiKey - 96Board
- 1GB RAM
- Cortex-A53 octa core
eMMC
- Popular on embedded devices
- Cheap & flexible
- Fast read & random seek
- Domains: navigation, e-readers, smartphones, industrial loggers, entertainment devices etc.
AOSP + Linaro PatchSet (branch: r55, kernel 4.4)
- F2FS, Ext4, Squashfs, btrfs, nilfs
Benchmarks
- Vellamo, RL bench, androbench
- Bonnie (ported for Android)
- Iozone (ported for Android)
- Overload and long-run test: in progress!

9 Filesystems - Challenges
- Fixed build support for f2fs image generation (core.mk & image size alignment to 4096)
- Fixed sparse raw image generation issue
- Need to use for btrfs and nilfs
- Image generation for btrfs, nilfs, squashfs etc. (raw -> format -> sparse)
- Benchmark porting: bonnie, iozone
- Partition overload scripts and long-run impact scripts
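
The "image size alignment to 4096" item can be illustrated with a small helper. This is only a sketch of the rounding arithmetic, not the actual build-system fix: the name align_up is hypothetical, and the code assumes the alignment is a power of two (as 4096 is).

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of alignment (hypothetical helper).
 * Assumes alignment is a non-zero power of two, e.g. 4096. */
uint64_t align_up(uint64_t size, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

With this helper, a 4097-byte image would be padded to 8192 bytes, while an image that is already a multiple of 4096 keeps its size.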

10 Filesystems - Results
- Each filesystem is ranked by performance for each benchmark and test
- Average rank for the iozone test (spanning various record lengths)
- A few more points to consider:
  - Performance impact as the filesystem ages
  - CPU utilization
- O_SYNC (the -+r option in iozone): requires that any write operation blocks until all data and all metadata have been written to persistent storage. This ensures file integrity (rather than the data integrity given by the O_DSYNC flag).
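
The difference between the two flags is easiest to see at the open() call. The following is a minimal sketch using only POSIX calls; the helper name write_record_sync and its parameters are hypothetical and are not part of iozone or of the test scripts used for these results.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to path with synchronous semantics (hypothetical helper).
 * use_dsync == 0: O_SYNC, write() returns after data and all metadata are stored.
 * use_dsync != 0: O_DSYNC, write() returns after the data (and only the
 *                 metadata needed to read it back) is stored.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_record_sync(const char *path, const void *buf, size_t len, int use_dsync)
{
    assert(path != NULL && buf != NULL);

    int flags = O_WRONLY | O_CREAT | O_TRUNC | (use_dsync ? O_DSYNC : O_SYNC);
    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, buf, len);
    if (close(fd) != 0)
        return -1;
    return written;
}
```

A benchmark loop would call this once per record; the per-call blocking is what makes small synchronous writes so sensitive to the filesystem design.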

11 Filesystems - iozone average (full test)
- Write: btrfs (lzo/zlib) wins
- Read: ext4 performance is comparable to btrfs
- Note: nilfs failed to complete the full iozone test

12 Filesystems - small read and write (64K)
- Small records/files: F2FS wins with the sync option
- For reads, NILFS has better performance on cached reads

13 Filesystems - 1MB file test
- Ext4 outperforms on all read operations
- F2FS has a good score (with the sync flag)

14 Filesystems - 512MB, 4MB
- Write: btrfs (lzo) wins; with the sync flag, zlib wins the race (not sure why)
- 4MB file read: ext4

15 Filesystems - bonnie results
- Lower is better
- Btrfs (lzo, zlib) gives good numbers, but at the cost of CPU: the number of kworker threads is higher (coming up next)
- F2FS/Ext4 have a fair amount of CPU usage on read/write
- F2FS outperforms on char operations: do we have a use case?

16 Filesystems - hdparm
- Squashfs is better (after btrfs)

17 Filesystems - speed variation
- Lower is better
- Btrfs wins for average speed
- But the speed (read/write) deviation is much lower for f2fs

18 Filesystems - disk access
- Disk reads are higher for f2fs (less use of buffered I/O)
- Nilfs disk reads are lower
- More writes for btrfs (might be due to background write activity for snapshot handling)
- High disk utilization in the case of nilfs
- NilFS: if we do not run GC, the system runs out of disk space

19 Filesystems - btrfs low lights
Though BTRFS has good performance:
- High CPU utilization: more kernel threads
- For small data (<1MB), btrfs underperforms f2fs and ext4. Not recommended where small I/O transactions with sync are expected, e.g. frequent calls to DB entries.
- Btrfs does not force all dirty data to disk on every fsync or O_SYNC operation (risk on power/crash recovery)
- Effect in a long-run test is yet to be tested

20 File System analysis - Summary
All relative rank graphs are available at: CLuzktJgx-K_CMgt0/edit?usp=sharing
F2FS/Ext4
- Win for small file access (4K-1MB) + DB access with disk data integrity
- Potential use case: industrial monitoring system, consumer phone, health monitoring system
NilFS
- Outperforms for SQLite operations
- The only catch is that metadata/data gets updated later, once it has been written to the log file (a kind of extended version of fdatasync over fsync)
- Can be useful for power-backed systems and continuous log recording of small data (up to 4K), but with a good amount of storage
- It quickly fills up the space if GC is not called in between. On 5GB of space, it ran out of space after 1000 runs of the iozone test.
- Not recommended for embedded systems
SquashFS
- Good buffered I/O read
- Can be used for read-only partitions (system libraries and read-only databases)

21 File System analysis - Summary
BTRFS: large files + large RAM
- LZO outperforms for block write/read operations (> 4MB)
- Potential use case:
  - In-flight entertainment system (mostly for movies/songs/images etc.)
  - Portable streaming & recording devices (should be power backed up)
- Low lights:
  - High CPU utilization (higher number of threads)
  - Not recommended where small I/O transactions with sync are expected
  - Risk on power failure recovery (not high, but it sometimes corrupts itself)
Hybrid use of different file systems on multiple partitions can improve overall performance, e.g.
- Large read/write (movies, extra downloads) on a BTRFS partition
- All small read/write (docs, images) on an f2fs/ext4 partition
- All database access (insert/update/delete) on an f2fs/nilfs partition
Note: the impact on the file system as it ages is yet to be measured

22 Filesystems - Todo List
- Perform long-run test (3-4 days, with various operations) and measure the impact
- Partition overload testing: impact of low disk availability
- Encryption impact
- Overhead of overlayfs etc. if we need to add drivers, HALs etc. for a specific piece of hardware to /system when otherwise using a common /system with HAL consolidation
- Any other?

23 Filesystems - Some points of discussion
- Any other filesystems (out-of-tree, perhaps) we should look into?
- Impact of storage technology (devices might start using NVMe)
- Best way to measure filesystem longevity

24 Thanks! Questions?

25 HAL Consolidation Rob Herring

26 HAL Consolidation - one build, many devices
Goal is one Android build/filesystem per CPU architecture while maintaining configurability for device-specific builds:
- A directory per feature, for features that are more than just a config variable
- KConfig-based configuration for features
- Supporting DB410c, HiKey, Nexus 7, QEMU, RaspberryPi 3
- Tablet/phone or TV targets
Next platforms or targets to add?
Possible next config features:
- Anything the next device needs
- Any feature Linaro is working on
- Custom compiler and compiler flags
- Kernel build integration
- malloc selection
- f2fs filesystem

27 HAL Consolidation - Graphics (Done)
- CI job for Mesa Android builds
- GBM-based gralloc implementation
- GBM map/unmap support
- Mali (HiKey flavor) support in build
- YUV planar support
  - GBM allocation and EGL import
  - CSC conversion in GPU shader for gallium (thanks, Rob Clark)
- Initial vc4 support (still some issues)
- Supporting running Android under Xen and KVM on arm64
- Various driver and build fixes

28 HAL Consolidation - Graphics (ToDo)
- drm_hwcomposer and HWC2
  - WIP drm_composer HWC2 support
- gbm_gralloc
  - Support scanout buffer alloc from KMS node
  - Gralloc 1.0 support
  - minigbm support from Google: 5
- HWC2 necessary for upstream explicit sync support
  - Being worked on by Collabora
- Overlay and YUV plane support
- Mesa
  - DRM explicit fence and EGL_ANDROID_native_fence_sync support
- Video playback w/ V4L2 h/w codec
  - Software rendering support
  - Not many h/w choices with mainline (or mainline ABI) support
  - Needs an OpenMax to V4L2 layer
  - Probably some buffer allocation issues and more YUV formats
- Mali (blob) and Mesa co-existence: do we care?

29 HAL Consolidation - WiFi/BT
- Integrated generic and QCom WiFi into build
  - Investigate moving QCom WiFi specifics into kernel
  - Needs more testing with different devices (e.g. USB WiFi)
  - How to handle firmware? Include linux-firmware?
- UART attached device kernel support
  - Treat UART attached devices the same as any other bus (USB, PCI, SDIO, SPI, etc.)
  - Moves the userspace device management (firmware load, serial config, PM, etc.) to kernel
  - Move serio framework out of drivers/input/
  - Extend serio from a character-at-a-time to a buffer-at-a-time API
  - Make tty_port usable for in-kernel drivers (i.e. serio host driver)
  - Help needed to test devices

30 New developments with AOSP and the kernel
- AOSP EAS (Energy Aware Scheduler) integration
- Sync API changes in 4.6+
- Arm64 KASLR and hardened user copy backport from upstream

31 AOSP Energy Aware Scheduler Integration John Stultz

32 EAS: A common topic at Connect
- LAS16-TR04: Using tracing to tune and optimize EAS
- LAS16-410: Window Based Load Tracking (WALT) versus PELT utilization
- LAS16-307: Benchmarking Schedutil in Android
- LAS16-105: Walkthrough of the EAS kernel adaptation to the Android Common Kernel
- BKK16-317: How to generate power models for EAS and IPA (x2)
- BKK16-311: EAS core upstreaming strategy
- BKK16-208: EAS
- SFO15-411: Energy Aware Scheduling: Power vs. Performance policy (x2)
- SFO15-302: EAS Policy
- LCU14-507: Chromebook2 EAS Enablement
- LCU14-410: How to build an Energy Model for your SoC
- LCU14-406: A QuIC Take on Energy-Aware Scheduling
- LCU14-402: Energy Aware Scheduling: Kernel summit update
- LCA14-109: Path to Energy Efficient Scheduler
- LCU13: Power-efficient scheduling, and the latest news from the kernel summit
- LCE13: Why all this sudden attention on the Linux Scheduler?

33 We've heard quite a bit about EAS
Now, how does one use it with Android?

34 Kernel side
- Need the EAS patchset
  - Currently v5.2
  - Already in common/android-3.18 and common/android-4.4
  - Includes:
    - EAS core
    - Schedfreq (cpufreq gov)
    - Schedtune (boosting mechanism)
    - WALT (PELT load-tracking replacement)
- Need energy model for board
  - Not going to cover this

35 Kernel Config
    CONFIG_CPU_FREQ_DEFAULT_GOV_SCHED=y
    CONFIG_CPU_FREQ_GOV_SCHED=y
    CONFIG_CGROUP_SCHEDTUNE=y
    CONFIG_SCHED_TUNE=y
    CONFIG_SCHED_WALT=y
    CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
    CONFIG_DEFAULT_USE_ENERGY_AWARE=y

36 AOSP Integration Components
Basic concepts
Three components:
- ActivityManager & Schedpolicy
- Init setup
- powerhal

37 Conceptual Android task types
- TOP_APP
- FOREGROUND
- BACKGROUND
- SYSTEM
- AUDIO_APP
- AUDIO_SYS

38 Activity Manager & Schedpolicy
Activity manager
- Tracks foreground and background tasks
- Adjusts things like timerslack
Schedpolicy
- Handles moving tasks between cgroups, lower-level interfaces

39 Multiple approaches used
- Base scheduler behavior
- Cpusets
- Cpuctl
- Schedtune boosting
- Interactive touch-boosting

40 Device BoardConfig.mk
    ENABLE_CPUSETS := true
    ENABLE_SCHEDBOOST := true
    ENABLE_SCHED_BOOST := false
(ENABLE_SCHED_BOOST: deprecated foreground boosting for the big.LITTLE HMP scheduler)

41 Cpusets: Limit what runs where
- Top-app
- Foreground
- Background
- System-background
- Foreground-boost (Deprecated!)

42 Cpusets Little Core Little Core Big Core Big Core

43 Cpusets Little Core Little Core Big Core Big Core Background

44 Cpusets Little Core Little Core Big Core Big Core Background Foreground

45 Cpusets Little Core Little Core Big Core Big Core Background System-Background Foreground

46 Cpusets Little Core Little Core Big Core Big Core Background System-Background Foreground-boost Foreground

47 Cpusets Little Core Little Core Big Core Big Core Background Foreground-boost System-Background Foreground Top-App

48 Init cpuset config (init.hikey.rc)
    # Foreground should contain most cores
    write /dev/cpuset/foreground/cpus 0-6
    # top-app gets all cores (7 is reserved for top-app)
    write /dev/cpuset/top-app/cpus 0-7
    # background contains a small subset (generally one little core)
    write /dev/cpuset/background/cpus 0
    # add system-background cpuset, a new cpuset for system services
    # that should not run on larger cores
    # system-background is for system tasks that should only run on
    # little cores, not on bigs to be used only by init
    write /dev/cpuset/system-background/cpus 0-3

49 Init cpuset config (init.bullhead.rc)
    # foreground gets all CPUs except CPU 3
    # CPU 3 is reserved for the top app
    write /dev/cpuset/foreground/cpus 0-2,4-5
    write /dev/cpuset/foreground/boost/cpus 4-5
    write /dev/cpuset/background/cpus 0
    write /dev/cpuset/system-background/cpus 0-2
    write /dev/cpuset/top-app/cpus 0-5
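
The cpus files take a CPU list such as 0-2,4-5. To make the notation concrete, here is a small sketch that expands such a list into a bitmask. The function name cpulist_to_mask is hypothetical, the parser is deliberately simple (CPUs 0 to 63 only), and it is not how the kernel parses these files.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand a cpuset-style CPU list ("0-2,4-5") into a bitmask with bit N set
 * for every CPU N in the list. Hypothetical helper: supports CPUs 0..63 and
 * returns 0 on a malformed list. */
uint64_t cpulist_to_mask(const char *list)
{
    assert(list != NULL);

    uint64_t mask = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long first = strtol(p, &end, 10);
        long last = first;

        if (end == p)
            return 0;               /* no digits where a CPU number was expected */
        p = end;

        if (*p == '-') {            /* a range such as "4-5" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p)
                return 0;
            p = end;
        }

        if (first < 0 || last < first || last > 63)
            return 0;
        for (long cpu = first; cpu <= last; cpu++)
            mask |= (uint64_t)1 << cpu;

        if (*p == ',')
            p++;                    /* move on to the next entry */
        else if (*p != '\0')
            return 0;               /* unexpected character */
    }
    return mask;
}
```

Read this way, the bullhead foreground list 0-2,4-5 is the mask 0x37: every CPU of 0-5 except CPU 3, which stays exclusive to top-app.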

50 Cpuctl: Restrict cputime
- bg_non_interactive cgroup
- Keeps background tasks to only a small portion of a little core

51 Cpuctl: (system/core/rootdir/init.rc:)
    # Create cgroup mount points for process groups
    mkdir /dev/cpuctl
    mount cgroup none /dev/cpuctl cpu
    chown system system /dev/cpuctl
    chown system system /dev/cpuctl/tasks
    chmod 0666 /dev/cpuctl/tasks
    write /dev/cpuctl/cpu.rt_runtime_us
    write /dev/cpuctl/cpu.rt_period_us

    mkdir /dev/cpuctl/bg_non_interactive
    chown system system /dev/cpuctl/bg_non_interactive/tasks
    chmod 0666 /dev/cpuctl/bg_non_interactive/tasks
    # 5.0 %
    write /dev/cpuctl/bg_non_interactive/cpu.shares 52
    write /dev/cpuctl/bg_non_interactive/cpu.rt_runtime_us
    write /dev/cpuctl/bg_non_interactive/cpu.rt_period_us
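
The 5.0 % comment next to the cpu.shares value of 52 follows from how shares work: they are relative weights, so a group's portion under contention is its weight divided by the sum of all competing weights. The sketch below works through that arithmetic. It assumes a single competing group at the kernel's default weight of 1024, which is an assumption and is not shown on the slide; the function name is hypothetical.

```c
#include <assert.h>

/* Percentage of CPU time a cgroup gets under full contention, given its own
 * cpu.shares weight and the combined weight of the groups it competes with.
 * Hypothetical helper for illustration only. */
double cpu_share_percent(unsigned group_shares, unsigned competing_shares)
{
    assert(group_shares + competing_shares > 0);
    return 100.0 * group_shares / (group_shares + competing_shares);
}
```

Under that assumption, cpu_share_percent(52, 1024) comes out just under 5, which matches the comment in init.rc. When nothing else wants the CPU, background tasks can still use more than that.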

52 Schedtune: Runtime Boost-Knob
- System wide: sched_cfs_boost
- Per-cgroup: schedtune.boost
- Adds a margin to load-tracking accounting, making the scheduler think there is more work to be done, which likely raises the cpufreq
Image from:
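
To give a feel for what "adds a margin" can mean numerically, here is a sketch of one possible margin computation. The formula (a percentage of the remaining headroom up to full scale) and the names CAPACITY_SCALE and boosted_util are assumptions made for illustration; they are not taken from the slides or from the kernel source.

```c
#include <assert.h>

#define CAPACITY_SCALE 1024u  /* assumed full-scale utilization value */

/* Boosted utilization = util + boost_pct percent of the remaining headroom.
 * Illustrative formula only (an assumption, not the kernel implementation). */
unsigned boosted_util(unsigned util, unsigned boost_pct)
{
    assert(util <= CAPACITY_SCALE && boost_pct <= 100);
    unsigned margin = boost_pct * (CAPACITY_SCALE - util) / 100;
    return util + margin;
}
```

With these assumptions, a task at half utilization (512) is accounted as 563 at a boost of 10 and as 716 at a boost of 40, while a task already at full scale is unchanged. The boost matters most for light tasks, which are the ones that would otherwise run at a low frequency.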

53 Schedtune: Default Boosting Foreground (everything else)

54 Init stune config (init.hikey.rc)
    #
    # EAS stune boosting interfaces
    #
    chown system system /dev/stune/foreground/schedtune.boost
    chown system system /dev/stune/foreground/schedtune.prefer_idle
    chown system system /dev/stune/schedtune.boost
    write /dev/stune/foreground/schedtune.boost 10
    write /dev/stune/foreground/schedtune.prefer_idle 1
    write /dev/stune/schedtune.boost 0

55 Android PowerHAL
Provides interactivity signals from userspace
- POWER_HINT_INTERACTION
- POWER_HINT_VSYNC
- POWER_HINT_LOW_POWER
- POWER_HINT_SUSTAINED_PERFORMANCE
- POWER_HINT_VR_MODE
Deprecated?:
- POWER_HINT_VIDEO_ENCODE
- POWER_HINT_VIDEO_DECODE

56 For old interactive cpufreq gov
Set the boostpulse_duration on init:
    # boost for 1sec
    echo > \
        /sys/devices/system/cpu/cpufreq/interactive/boostpulse_duration
On POWER_HINT_INTERACTION:
    echo 1 > /sys/devices/system/cpu/cpufreq/interactive/boostpulse

57 For EAS w/ schedtune
The kernel doesn't do deboosting!
On POWER_HINT_INTERACTION:
    echo 40 > /dev/stune/foreground/schedtune.boost
Wait some time then:
    echo 10 > /dev/stune/foreground/schedtune.boost

58 Example touch-boost implementation

    static void* schedtune_deboost_thread(void* arg);

    static void schedtune_power_init(struct hikey_power_module *hikey)
    {
        pthread_t tid;

        hikey->deboost_time = 0;
        sem_init(&hikey->signal_lock, 0, 1);
        pthread_create(&tid, NULL, schedtune_deboost_thread, hikey);
    }

    /* Called on an interaction hint: boost now, (re)arm the deboost deadline. */
    static int schedtune_boost(struct hikey_power_module *hikey)
    {
        long long now;

        pthread_mutex_lock(&hikey->lock);
        now = gettime_ns();
        if (!hikey->deboost_time) {
            /* Not currently boosted: raise the boost and wake the deboost thread. */
            schedtune_sysfs_boost(hikey, SCHEDTUNE_BOOST_INTERACTIVE);
            sem_post(&hikey->signal_lock);
        }
        hikey->deboost_time = now + SCHEDTUNE_BOOST_TIME_NS;
        pthread_mutex_unlock(&hikey->lock);
        return 0;
    }

    /* Sleeps until the deadline passes, then restores the normal boost. */
    static void* schedtune_deboost_thread(void* arg)
    {
        struct hikey_power_module *hikey = (struct hikey_power_module *)arg;

        while (1) {
            sem_wait(&hikey->signal_lock);
            while (1) {
                long long now, sleeptime = 0;

                pthread_mutex_lock(&hikey->lock);
                now = gettime_ns();
                if (hikey->deboost_time > now) {
                    /* Deadline was pushed out by a newer hint: sleep again. */
                    sleeptime = hikey->deboost_time - now;
                    pthread_mutex_unlock(&hikey->lock);
                    nanosleep_ns(sleeptime);
                    continue;
                }
                schedtune_sysfs_boost(hikey, SCHEDTUNE_BOOST_NORM);
                hikey->deboost_time = 0;
                pthread_mutex_unlock(&hikey->lock);
                break;
            }
        }
        return NULL;
    }

See full source here:

59 Other conceptual complications
- Negative boosting: use schedtune to further reduce cpufreq for background or other groups
- schedtune.prefer_idle: prefer to place tasks on idle CPUs. Gives a bit more responsiveness but costs some power. Consider for foreground tasks.

60 Thanks! Questions?

61 Sync API changes in 4.6+ John Stultz

62 Sync API changes in 4.6+
The Android Sync API in staging has been refactored and pulled mostly out of staging into the DRM fences and sync_file code.
Major credit to Gustavo Padovan for this work!

63 Good News! Proper sync/fence api in upstream kernel! One less Android specific kernel feature!

64 Bad News! Your out-of-tree vendor graphics driver is now terribly, terribly broken!

65 Old Android Sync Concept sync_timeline:

66 Old Android Sync Concept sync_timeline: sync_pt:

67 Old Android Sync Concept sync_timeline: sync_pt:

68 Old Android Sync Concept sync_timeline: sync_fence: sync_pt:

69 DRM Fences in Concept context: sync_file: fence:

70 Kernel transition from old API
Old API | New API
CONFIG_SYNC | CONFIG_SYNC_FILE
struct sync_fence | struct sync_file
struct sync_pt | struct fence
sync_fence_put() | fput(fence->file)
sync_fence_fdget() | sync_file_get_fence()
sync_fence_wait_async() | fence_add_callback()
sync_fence_cancel_async() | fence_remove_callback()
sync_timeline_create() | fence_context_alloc()
sync_timeline_signal() | fence_signal()
sync_pt_create() | fence_init()
sync_timeline_ops | fence_ops

71

72 No exact matches
- Async waits were previously done on sync_fences, which are closest to sync_files
- Now async callbacks are done on fences, which are analogous to sync_pts
- sync_timelines were objects; contexts are just a unique 64-bit id
- Drivers have to manage their own context objects
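
The point that a context is "just a unique 64-bit id" can be pictured with a userspace analogy: handing out ids needs nothing more than an atomic counter, so any per-timeline state has to live in an object the driver defines itself. The sketch below is that analogy only. The names context_alloc and next_context are hypothetical, and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Next unused context id (hypothetical userspace stand-in). */
static atomic_uint_fast64_t next_context = 1;

/* Reserve num consecutive context ids and return the first one.
 * The caller owns ids [first, first + num) and keeps any per-context
 * state in its own data structures. */
uint64_t context_alloc(unsigned num)
{
    assert(num > 0);
    return (uint64_t)atomic_fetch_add(&next_context, (uint_fast64_t)num);
}
```

A driver with three hardware queues could call context_alloc(3) once and store first, first + 1 and first + 2 in its own per-queue structures.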

73 In addition...
Most graphics drivers have their own higher-level meta-infrastructure that overlaps functionality:
- struct mali_timeline
- struct mali_timeline_point
- struct mali_timeline_fence
What do you do with something like:
    mali_timeline_sync_fence_create_and_add_tracker()

74 Bonus! Some changes for DRM fences are still in flight

75 Good luck rewriting your driver. It wouldn't be so bad if your driver was upstream.

76

77

78 Userspace libsync changes
- Gustavo's libsync tree:
- Rob Herring's DRM HWC changes:

79 References
- Eric Gilling's LPC13 talk:
- Riley Andrew's LPC14 talk:
- Gustavo's LinuxCon16 talk:
- Gustavo's blog post:

80 Thanks! Questions?

81 ION
- Who is using Ion?
- Who wants to use Ion on mainline?
- What are you using Ion for?
- Do you need kernel APIs?
- What out-of-tree Ion features are you missing?
- What help do you need?
- Can you help with testing?

82 Reducing bootup time
AOSP is increasingly being used in non-phone environments, where boot times matter much more (e.g. automotive). What can we do to improve boot times?
Some measuring to help us decide, run on:
- HiKey 2GB RAM version from LeMaker
- Android Nougat 7.0.0_r6
- Kernel: AOSP android-hikey-linaro-4.4
- HDMI, Micro-USB, serial console connected
- Soft boot with the reboot command for the 2nd boot

83 Boot Time Percentage (Total)
- From surfaceflinger service started to UI displayed: 65%
- Kernel boot time: 26.5%
- From init started to surfaceflinger service started: 8.5%
Markers used:
- "Freeing unused kernel memory" (from dmesg)
- "init: Starting service 'surfaceflinger'..." (from dmesg)
- "Boot is finished (14907 ms)" (from logcat)

84 Boot Time Percentage (Boot Progress)
- From preload_start to preload_end (23.4%)
- From pms_system_scan_start to pms_data_scan_start (15%)
- From boot_progress_start to preload_start (14.9%)
- From pms_ready to ams_ready (13.2%)
- From ams_ready to enable_screen (11.2%)
The top 5 take 77.7% in total
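
Percentages like these come from pairs of event timestamps. The following sketch shows the arithmetic for one phase; the function name phase_share is hypothetical, and any timestamps passed to it in examples are made up for illustration and are not the measured values behind the slide.

```c
#include <assert.h>

/* Share of the total boot time spent between two events, in percent.
 * All arguments are timestamps or durations in milliseconds.
 * Hypothetical helper for illustration only. */
double phase_share(long start_ms, long end_ms, long total_ms)
{
    assert(total_ms > 0 && end_ms >= start_ms);
    return 100.0 * (double)(end_ms - start_ms) / (double)total_ms;
}
```

For instance, a phase that starts at 1000 ms and ends at 2500 ms in a 6000 ms boot accounts for 25% of the total. Summing the five shares listed on the slide gives the stated 77.7%.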

85 Measurements
Target | Method | Comments
Kernel boot time | Information from dmesg | Cannot measure time for the bootloader automatically. Need extra tools for accurate measurements.
Android boot time before surfaceflinger service started | Information from dmesg, bootchart | Services like vold, debuggerd are started here
Android boot time from surfaceflinger service started to Launcher displayed | Information from logcat (including the events buffer), bootchart | Timestamps in dmesg and logcat are not the same for the same message
Others like application start time, web site loading time, media app start time? | | What else do we want to check as well?

86 Reducing bootup time
What can we do to improve boot times?
- Suspend to disk instead of complete shutdown?
- Parallel init?
- Launch extra services after UI is up?
- Better file system type for system/userdata/cache partitions?

87 Out of tree AOSP userspace patches
Keeping a number of out-of-tree patches can become more problematic than it already is: with the move to more frequent security updates and the appearance of Android One-style devices, maintaining extra patches becomes more work.
- Upstreaming is more important than ever
- Will try hard to upstream Linaro patches
- Do members need/want help upstreaming patches from their vendor trees/BSPs?
  - Is licensing sorted out?
  - Do we keep some patches Members-First?
- What can we do about patches getting stuck in the upstream review queue?
- How will we handle out-of-tree patches that can't go upstream (e.g. rejected patches that still matter to a member) in the future?
  - Patchset scripts vs. committing to git repositories?

88 AOSP transition to clang
- As of AOSP N, AOSP's primary toolchain is clang, based on a recent 4.0 snapshot
- Being able to build all of AOSP with clang was largely Linaro's work
- gcc is still used to build some HALs for old devices and the kernel
- We can build the HiKey kernel with clang now, with a few patches and a few ugly workarounds that need to be fixed
- The resulting system works, but has some stability issues that need to be debugged
- Point of discussion: do we need to patch support for building with gcc back in?

89 AOSP with upstream clang (especially TOT)
Primary reasons for this work
- Clang in the AOSP toolchain is 5+ months behind ToT upstream clang
- Enable monitoring the impact of upstream clang on AOSP (mainly for performance)
- Enable safe landing of clang's latest code onto AOSP when the time comes
Linaro's current efforts
- Downstream patches of AOSP clang are now all upstreamed (thanks to Renato and Google folks)
- AOSP master can be built successfully with upstream clang (as of July)
- Monitoring compilation of AOSP master with ToT upstream clang
  - Not yet tested for boot-up (see Future work for CI)
  - 3 clang bugs reported (1 fixed, 2 open)
  - 1 AOSP bionic patch upstreamed
Future work
- CI for building AOSP master with upstream clang is in progress, for boot-up and benchmark tests
- Continuously finding and fixing problems that prevent successful compilation, e.g. a new warning (-Waddress-of-packed-member) added in ToT clang causes compilation failure

90 VIXL: A Programmatic Assembler and Disassembler for AArch32 Anton Kirilov Linaro ART team

91 Agenda
- What is VIXL?
- Assembler
- Disassembler
- VIXL in the Android Runtime

92 What is VIXL?
- A programmatic assembler and disassembler
  - Does not process text files
  - Originally designed for JIT compilers
- Supports AArch32 (both A32 and T32) and AArch64
- Written in C++
- Uses the modified BSD license
- Used by the Android Runtime, QEMU, HHVM, etc.
- Also, simulator and debugger for AArch64
- This presentation will concentrate on AArch32

93 Useful links
- Download: git clone
- For AArch64 refer to the SFO presentation VIXL:

94 Assembler
- The basic low-level interface is the Assembler class
- Provides full control over code generation (e.g. the exact encoding used)
- Declared in aarch32/assembler-aarch32.h
- Generates A32 code by default, but this can be changed:
  - By the constructor
  - On the fly, e.g. by the UseA32()/UseT32() methods
- Possible to mix A32 and T32 instructions in the generated code

95 The Assembler class
Let's start with a simple factorial:

    unsigned factorial(unsigned x) {
        unsigned r = 1;
        while (x) {
            r *= x--;
        }
        return r;
    }

In T32 assembly:

    factorial:
        movs r1, r0
        mov  r0, #1
        it   eq
        bxeq lr
    loop:
        mul  r0, r0, r1
        subs r1, #1
        it   ne
        bne  loop
        bx   lr

96 The Assembler class
With VIXL:

    Assembler as(T32);
    Label factorial;
    Label loop;

    as.bind(&factorial);
    as.movs(r1, r0);
    as.mov(r0, 1);
    as.it(eq);
    as.bx(lr);
    as.bind(&loop);
    as.mul(r0, r0, r1);
    as.subs(r1, r1, 1);
    as.it(ne);
    as.b(&loop);
    as.bx(lr);
    as.FinalizeCode();

97 The Assembler class limitations
- No code buffer overflow check
  - The caller is responsible
- No automatic generation of large constants
  - Immediate operands of instructions such as MOV, etc.
  - Branch offsets
  - The Assembler methods will print an error message if a large constant is passed
- Consequence of being a low-level interface

98 Macro assembler
- Implemented by the MacroAssembler class
- Declared in aarch32/macro-assembler-aarch32.h
- Uses the assembler internally
- The interface is mostly the same
  - The macro assembler-specific method names are capitalized
- Provides some extra features that make programming easier and safer
  - Veneers for branch offsets that can't be encoded
  - Literal pools
  - Further examples follow
- It is the expected end-user interface

99 Macro assembler example
Source code:

    MacroAssembler masm(T32);
    masm.Add(r0, r0, 0x );
    masm.FinalizeCode();

Generated code:

    mov ip, #22136
    movt ip, #4660
    add r0, ip

100 Further macro assembler example
Performs simple optimizations:

    MacroAssembler masm(T32);
    masm.Mov(r0, 0xFFFFFF00);
    masm.FinalizeCode();

Generated code:

    mvn r0, #255

101 The UseScratchRegisterScope class
- Structured way to deal with scratch registers
- The IP register (R12) in particular should not be used directly
- Follows a standard C++ idiom:

    MacroAssembler as(T32);
    {
        UseScratchRegisterScope temps(&as);
        Register temporary = temps.Acquire();
        as.Mov(temporary, 0x );
        as.Add(r0, temporary, temporary);
    }

102 Macro assembler pitfalls
Consider the following situation (assuming we could access the IT instruction in the macro assembler):

    MacroAssembler masm(T32);
    masm.it(eq, 0x8);
    masm.Add(r1, r0, 0x );
    masm.FinalizeCode();

Generated code:

    it eq
    moveq r1, #22136
    movt r1, #4660
    add r1, r0, r1

103 The AssemblerAccurateScope class
- Helps to control the number of generated instructions
- Prevents the assembler from emitting veneers and literal pools
- In fact, in situations like these the assembler must be used
- Provides bounds checking for the assembler
- Note that the constructor uses a size in bytes, not a number of instructions

104 AssemblerAccurateScope example

    MacroAssembler masm(T32);
    {
        AssemblerAccurateScope aas(&masm,
                                   4 * k32BitT32InstructionSizeInBytes,
                                   CodeBufferCheckScope::kMaximumSize);
        masm.ittt(eq);
        masm.mov(r1, 22136);
        masm.movt(r1, 4660);
        masm.add(r1, r0, r1);
    }
    masm.FinalizeCode();

105 Assembler vs. MacroAssembler
The following table summarizes the differences:
 | Assembler | MacroAssembler
Control over the generated code | precise | relaxed
Code simplifications | no | yes
Convenience | no | yes

106 Disassembler
- Implemented in the Disassembler class
- Declared in aarch32/disasm-aarch32.h
- Strives for strict ARMv8 compliance
- The main entry points are the DecodeA32() and DecodeT32() methods
- A little bit low-level for most use cases, especially when dealing with the variable-length T32 instructions

107 The PrintDisassembler class
- Provides a more convenient interface
- Most applications will probably use it instead of using the disassembler directly
- Provides methods to disassemble a whole buffer of instructions:
  - DisassembleA32Buffer()
  - DisassembleT32Buffer()
- Also, a way to process a single instruction more conveniently (particularly for T32):
  - DecodeA32At()
  - DecodeT32At()

108 PrintDisassembler example
Continuing our assembler example:

    Assembler as(T32);
    as.bind(&factorial);
    as.movs(r1, r0);
    ...
    PrintDisassembler disasm(std::cout);
    disasm.DisassembleT32Buffer(
        as.GetStartAddress<uint16_t *>(),
        as.GetSizeOfCodeGenerated());

Output (instruction addresses and some encodings did not survive transcription):

    movs r1, r0
    f04f0001 mov r0, #1
    bf08 it eq
    bxeq lr
    fb00f001 mul r0, r0, r1
    1e49 subs r1, #1
    bf18 it ne
    e7fa bne
    bx lr

109 The DisassemblerStream class
- The main approach to customize the disassembler output
- Used internally by the disassembler
- Each instruction is broken down into components by the disassembler, e.g.:
  - Register
  - MemOperand
  - etc.
- The DisassemblerStream defines operators for processing each component
- Override the operator of interest to change the output

110 DisassemblerStream example
Assigning a special name to a register:

    class RegisterPrettyPrinter : public DisassemblerStream {
        DisassemblerStream& operator<<(const Register reg) override {
            if (reg.Is(r9)) {
                os() << "tr";
                return *this;
            } else {
                return DisassemblerStream::operator<<(reg);
            }
        }
    };

111 More examples and documentation
- Look into the examples/aarch32 directory in the VIXL source tree
- An excellent starting point for a beginner: doc/getting-started-aarch32.md

112 VIXL in the Android Runtime
- The ART team has been working on integrating VIXL into the AArch32 backend
- Leads to a safer and more extensible code base
  - Mechanisms such as the UseScratchRegisterScope class provide better detection of mistakes
  - The majority of the assembler and disassembler are automatically generated, so it should be much easier to support future ISA additions
  - Much more extensive testing

113 Thank You
#LAS16
For further information:
LAS16 keynotes and videos on: connect.linaro.org

114 Android Runtime Performance Analysis Artem Serov Linaro ART team

115 Agenda
- Introduction
- Performance measurement
- Performance analysis

116 Linaro ART Team
Android Runtime (ART)
- The managed runtime used by Java applications (Dex bytecode) and some system services on Android
- Android 6.0 or 7.0
- Hybrid Mode (AOT)
- ART JIT in Android N by Xueliang ZHONG
Linaro ART team
- Working on Android Runtime: improving the performance and stability
- Members and assignees from ARM, Spreadtrum, Mediatek

117 Art-testing
Art-testing repository
Benchmarks
- Recognized benchmarks: Caffeinemark, Benchmarksgame, Stanford, Richards, Deltablue etc.
- Microbenchmarks: new features, catch regressions
- Embeddable: CHECKED!
- Stable and reproducible: CHECKED!
- Recognized: CHECKED!
- Analyzable and flexible: CHECKED!
Framework
- $ ./run.py --target --iterations 10
- Host and target
- Statistics
- Perf tools

118 Performance: Per-patch What we re currently talking about Patch delivery life cycle Benchmarking Investigation Developing Code Review Testing Merging Perf Analysis 118

119 Performance Tracking
- Per-patch: compare performance before and after. We want to make sure that patches improve performance and don't bring unexpected degradation
- Continuous tracking: regressions and anomalies, whenever they happen; upstream changes; Linaro patches tracking (double checking)

121 Performance: Continuous Tracking 121

122 Build-scripts
- Automated process to run benchmarks: building, configuring, running
- Android root, chroot-like (/system -> /data/local/tmp/system)
- Do not depend on other AOSP projects or a GUI environment
- Device configuration: CPU frequencies and clusters (big/little)
- Running the benchmarks: ./scripts/benchmarks/benchmarks_run_target.sh --mode 32 --cpu little --iterations 10
- Stable results: CPU pinning, overheating
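The "CPU pinning" item on this slide can be illustrated with a minimal sketch. It assumes a Linux target and uses the standard `sched_getaffinity`/`sched_setaffinity` calls; the helper names (`FirstAllowedCpu`, `PinToCpu`, `AllowedCpuCount`) are hypothetical, and the actual build-scripts may well pin benchmarks differently (for example from the shell).

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux only)

// Hypothetical helper: lowest CPU index the calling thread may run on, or -1 on failure.
int FirstAllowedCpu() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &allowed)) return cpu;
  }
  return -1;
}

// Hypothetical helper: number of CPUs in the current affinity mask, or -1 on failure.
int AllowedCpuCount() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
  return CPU_COUNT(&allowed);
}

// Hypothetical helper: restrict the calling thread to a single CPU.
// Returns true on success. Keeping a benchmark on one core avoids
// migrations between big and little clusters, which add noise.
bool PinToCpu(int cpu) {
  if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
  cpu_set_t mask;
  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);
  return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

A benchmark harness would call `PinToCpu` once before the timed loop. Which core index belongs to the big or the little cluster is device specific and is not covered by this sketch.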

123 Performance: Per-patch: Automation 123

124 Performance: Manual Check
Results table (numeric values not preserved in this transcription):
- Columns: benchmark, geomean diff (%), geomean error 1 (%), geomean error 2 (%)
- Detail row: caffeinemark/loopatom
- Summary rows: intrinsics, micro, benchmarksgame, algorithm, stanford, math, caffeinemark, OVERALL
We measure time; that's why negative numbers mean improvement.
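One plausible reading of the "geomean diff (%)" column is the relative change of the geometric mean of execution times between two builds. The sketch below works under that assumption; the function names are hypothetical, and the exact formula used by the Linaro scripts is not given on the slide.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of strictly positive timings, computed in log space
// to avoid overflow when many values are multiplied together.
double GeoMean(const std::vector<double>& values) {
  assert(!values.empty());
  double log_sum = 0.0;
  for (double v : values) {
    assert(v > 0.0);
    log_sum += std::log(v);
  }
  return std::exp(log_sum / static_cast<double>(values.size()));
}

// Relative change, in percent, of the geomean time of build B against build A.
// Because time is measured, a negative result means build B is faster.
double GeoMeanDiffPercent(const std::vector<double>& times_a,
                          const std::vector<double>& times_b) {
  return (GeoMean(times_b) / GeoMean(times_a) - 1.0) * 100.0;
}
```

With this convention, timings that drop from 10 to 9 time units give a diff of about -10%, which matches the slide's rule that negative numbers mean improvement.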

126 Performance Analysis: Example caffeinemark/loopatom.java: % reduction of execution time (improvement) Task: Investigate the reason for performance difference Run perf-tools to collect data for A (before) and B (after) builds for caffeinemark/loopatom.java 126

127 Performance Analysis: Hotspots
- Hotspots: sections of code that take most of the execution time
- Typically there are a few hotspots which determine overall performance
- Levels: method level, loop level, instruction level
- Generic naive algorithm: find hotspots (profiling), analyze hotspots (source code, binary code, tools), ... PROFIT!
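The "find hotspots" step can be sketched as picking the few methods that together account for most of the samples. This is only an illustration: the `MethodSample` type, the `FindHotspots` name and the coverage threshold are assumptions, not part of the art-testing tools.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record: one method and its share of profiling samples, in percent.
struct MethodSample {
  std::string name;
  double percent;
};

// Returns the hottest methods, in descending order of sample share, until their
// cumulative share reaches coverage_percent. Since a few hotspots typically
// dominate, the result is usually short.
std::vector<MethodSample> FindHotspots(std::vector<MethodSample> samples,
                                       double coverage_percent) {
  std::sort(samples.begin(), samples.end(),
            [](const MethodSample& a, const MethodSample& b) {
              return a.percent > b.percent;
            });
  std::vector<MethodSample> hot;
  double covered = 0.0;
  for (const MethodSample& s : samples) {
    if (covered >= coverage_percent) break;
    hot.push_back(s);
    covered += s.percent;
  }
  return hot;
}
```

The same selection can be repeated at loop level and instruction level once a hot method has been found.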

128 Art-testing Perf Tools
- Performance analysis of code generated by ART (not kernel, not native libraries)
- Scripts based on linux-perf-tools: Linux profiling with performance counters, statistical profiling
- Features: profiling (hotspots identification), .cfg (IR + assembly) files generation, perf events collection
- All of the above in one click!

129 Identifying Hotspots: Methods
- Java methods: benchmark.oat, boot.oat
- Native methods: Kernel, Libart, Others
perf report output (sample percentages lost in extraction):

    %  Events  DSO  k?--  method
    %  main  data@local@tmp@bench.apk@classes.dex  [.] int benchmarks.caffeinemark.loopatom.execute()
    %  main  linker  [.] __dl__ZNK6soinfo10gnu_lookupER10SymbolName
    %  main  [kernel.kallsyms]  [k] 0xffffffc c
    %  main  linker  [.] __dl__ZNK6soinfo19find_symbol_by_nameER10S
    %  main  linker  [.] __dl__ZN10SymbolName8gnu_hashEv

130 Hotspot: .CFG File
- c1visualizer: tool to visualize ART intermediate representation (IR)
- Control flow graph (CFG), IR, Assembly
- For each method, before and after each optimization

131 Identifying Hotspots
Java source fragments (in the order extracted from the slide):

    public int execute() {
      for(int j = 0; j < FIBCOUNT; j++) {
        for(int k = 1; k < FIBCOUNT; k++) {
          j1 += l;
          if(fibs[k - 1] < fibs[k]) {
            int i1 = fibs[k - 1];
            fibs[k - 1] = fibs[k];
            fibs[k] = i1;
          }
        }
      }
      int l = FIBCOUNT + dummy;
      k1 += 2;
    }

Annotated assembly (leading number is the sample percentage where it survived; some operands were lost in extraction):

    5.43
    5.45  add.w  ip, r3, r0, lsl #2
          ldr.w  r2, [ip, #12]
          add.w  ip, r3, fp, lsl #2
          ldr.w  r4, [ip, #12]
    9.63  cmp    r2,
          bge.n  32b
          add.w  ip, r3, r0, lsl #2
    6.45  str.w  r4, [ip, #12]
          add.w  ip, r3, fp, lsl #2
    6.72  str.w  r2, [ip, #12]
    2.66  add.w  fp, fp, #
          ldr    r4, [sp, #4]
    2.65  movs   r2, #
          movs   r0, #
          ldrh.w ip, [r9]
          cmp.w  ip, #0
          beq.n  3284
          b.n    3314
          cmp    fp,
          bge.n  32cc
          add    sl,
          add.w  r8, r8, #2
          add.w  r0, fp, #
    4.90

132 IR: c1visualizer 132

133 IR: c1visualizer 133

134 Analyzing Hotspot: .CFG File
IR A (before the patch):

    i75 ArrayGet [l6,i71]
        add.w r12, r3, r0, lsl #2
        ldr.w r2, [r12, #12]
    i81 ArrayGet [l6,i51]
        add.w r12, r3, r11, lsl #2
        ldr.w r4, [r12, #12]

IR B (after the patch):

    l18 IntermediateAddress [l6,i184]
        add r4, r3, #12
    i75 ArrayGet [l185,i71]
        ldr.w r5, [r4, r2, lsl #2]
    i81 ArrayGet [l185,i51]
        ldr.w r6, [r4, r0, lsl #2]
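The difference between the two IR versions is where the address arithmetic happens. The sketch below models it with plain integers: the offset 12 and the shift by 2 are taken from the assembly on the slide, while the function names and the 32-bit address model are assumptions made for illustration only.

```cpp
#include <cstdint>

constexpr std::uint32_t kDataOffset = 12;  // from "[r12, #12]": array data starts 12 bytes in
constexpr std::uint32_t kElemShift = 2;    // from "lsl #2": 4-byte int elements

// IR A: every ArrayGet needs its own add before the load.
std::uint32_t ElementAddressA(std::uint32_t array_base, std::uint32_t index) {
  std::uint32_t tmp = array_base + (index << kElemShift);  // add.w r12, r3, r0, lsl #2
  return tmp + kDataOffset;                                // ldr.w r2, [r12, #12]
}

// IR B: the data start is computed once (IntermediateAddress) and shared.
std::uint32_t IntermediateAddress(std::uint32_t array_base) {
  return array_base + kDataOffset;                         // add r4, r3, #12
}

// IR B: each ArrayGet becomes a single load with a shifted register offset.
std::uint32_t ElementAddressB(std::uint32_t intermediate, std::uint32_t index) {
  return intermediate + (index << kElemShift);             // ldr.w r5, [r4, r2, lsl #2]
}
```

Both forms reach the same element address. In version B the per-access `add.w` disappears, so the two loads on the slide need three instructions instead of four, and the saving grows with each further access to the same array.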

135 Perf Events
- PMU counters: look into ARM Infocenter
- EPI: events per 1000 instructions
- IPC: instructions per cycle; increased from 0.97 to 1.23
- Events are sorted by EPI A
Table columns: Event, Description, Total events A, Total events B, Diff, EPI A, EPI B
Rows shown (numeric values lost in extraction):
- cycles: Hardware event
- instructions: Hardware event
- 0x14: L1 Instruction cache access
- 0xE5: load/store instruction waiting for data to calculate the address in the AGU
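The two derived metrics on this slide follow directly from their definitions. The sketch below is a minimal illustration with hypothetical function names; the counter values passed in would come from the perf event totals.

```cpp
#include <cassert>
#include <cstdint>

// EPI: occurrences of one PMU event per 1000 executed instructions.
double EventsPerKiloInstruction(std::uint64_t events, std::uint64_t instructions) {
  assert(instructions > 0);
  return static_cast<double>(events) / static_cast<double>(instructions) * 1000.0;
}

// IPC: instructions retired per CPU cycle. Higher is better.
double InstructionsPerCycle(std::uint64_t instructions, std::uint64_t cycles) {
  assert(cycles > 0);
  return static_cast<double>(instructions) / static_cast<double>(cycles);
}

// CPI: the reciprocal view, cycles spent per instruction. Lower is better.
double CyclesPerInstruction(std::uint64_t instructions, std::uint64_t cycles) {
  assert(instructions > 0);
  return static_cast<double>(cycles) / static_cast<double>(instructions);
}
```

Normalizing each counter by the instruction count makes builds A and B comparable even when they execute a different number of instructions, which is why sorting events by EPI highlights the counters with the largest impact.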

136 DS-5: Streamline Performance Analyzer
- ARM DS-5 Streamline: system-wide performance analysis tool
- Features: PMU counters; timeline; filter by processes and threads; multicore, multicluster and big.LITTLE; custom annotations; overlay charts and customize expressions; Mali GPU optimization
- Supports Linux and Android
- For Android, use the tutorial

137 Streamline: Diagram Example 137

138 Useful Links
1. Android Open Source Project
2. Android Runtime
3. Linaro benchmarks and tools repository
4. Tool to visualize ART intermediate representation
5. Linux profiling with performance counters
6. pment-studio/streamline - ARM Streamline Performance Analyzer
7. FB.html - Cortex-A53 Performance Monitor Unit Events

140 Android Runtime: Metrics
- Compilation: how long it takes to compile the app; how much RAM is consumed during app compilation
- Memory footprint: static (how much storage is required for the app binary); dynamic (how much RAM is consumed when the app is running)
- Run-time performance: the quality of the generated code
From this point on, performance = run-time performance

141 Backup: Identifying hotspots
1. Check performance difference (in %)
2. Skim over the bench sources
3. Run perf with cycles event
4. Identify the hotspots using perf report
   a. Single very hot Java leaf method
   b. Non-leaf Java method not from boot.oat
   c. Native method
   d. Java method from boot.oat
5. Examine the hotspot
   a. Validate that this particular hotspot determines the difference in total performance
   b. Split and alter big methods

142 Backup: Analyzing Hotspots
1. Get the .cfg file for the method
2. Identify the exact piece of hot code
   a. Loops
   b. Perf-annotate
3. Compare the corresponding IR (c1visualizer)
   a. Find the compiler phase where the difference occurs
4. Compare the corresponding assembly (c1visualizer)
5. Use a static binary built from assembly to validate the performance difference
6. Run perf scripts with all PMU events
   a. Total-period option
   b. CPI (cycles per instruction) reflects the performance
   c. ${Counter} / instructions * 1000 reflects the counter's impact

143 O and on: What's in AOSP's future and how can we help? New partition layout, A/B updates. What else would we LIKE to see in AOSP's future, and how can we help bring it about?

144 Anything else? Did the topics of the microconference bring up another topic we should be talking about? Did we omit an important topic? Feel free to talk about anything AOSP related now...

146 Memory allocator analysis
- Primary focus: reduce memory usage on low-memory devices
- Malloc implementations investigated: jemalloc, dlmalloc, nedmalloc, tcmalloc, musl malloc, TLSF, lockless allocator
- Memory analysis briefing: /edit#heading=h.z9n368sk0eai
- Challenges:
  - Porting of atomic routines for the ARM 64-bit platform
  - Name mangling issues
  - C99 warnings
  - Wrapper/dummy calls for bionic integration (e.g. malloc_usable_size, malloc_disable, mallinfo)
  - Other runtime issues
  - Benchmark porting (tlsf-test, t-test)
  - Fragmentation analysis script

147 Memory allocator analysis: Summary
- tcmalloc and jemalloc win for multi-threaded apps and run-time performance (a good number of small pages available at runtime)
- The static size of libc is reduced with nedmalloc and TLSF
- jemalloc-svelte does not compare well with jemalloc and tcmalloc
- nedmalloc has a support issue: it is no longer supported
- The lockless allocator is under a private license
- Note: the rank graph is generated from relative performance. For real numbers, refer to the memory analysis document.

HiKey in AOSP - Update. John Stultz

HiKey in AOSP - Update. John Stultz HiKey in AOSP - Update John Stultz Continuing Collaboration Working closely with folks at Google. Submitting changes directly to AOSP Gerrit. New Features Added Since Announcement

More information

ART JIT in Android N. Xueliang ZHONG Linaro ART Team

ART JIT in Android N. Xueliang ZHONG Linaro ART Team ART JIT in Android N Xueliang ZHONG Linaro ART Team linaro-art@linaro.org 1 Outline Android Runtime (ART) and the new challenges ART Implementation in Android N Tooling Performance Data & Findings Q &

More information

AOSP Devboard Update & Recent/Future Pain Points. John Stultz

AOSP Devboard Update & Recent/Future Pain Points. John Stultz AOSP Devboard Update & Recent/Future Pain Points John Stultz Now there are two: https://source.android.com/source/devices HiKey HiKey960 Hardware overview HiKey HiSilicon Kirin

More information

Improving the bootup speed of AOSP

Improving the bootup speed of AOSP Improving the bootup speed of AOSP Bernhard Bero Rosenkränzer CC-BY-SA 3.0 ELC 2017-02-23 Quick overview 2 different possible approaches: Reduce regular bootup time Problem: Lots of initialization

More information

LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED

LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED TOM MCREYNOLDS, VLAD BUZOV AUTOMOTIVE SOFTWARE OCTOBER 15TH, 2013 Why kernel upgrades : the problem Linux Kernel cadence doesn t match Automotive s

More information

Four Components of a Computer System

Four Components of a Computer System Four Components of a Computer System Operating System Concepts Essentials 2nd Edition 1.1 Silberschatz, Galvin and Gagne 2013 Operating System Definition OS is a resource allocator Manages all resources

More information

Mainline on form-factor devices / Improving AOSP

Mainline on form-factor devices / Improving AOSP Mainline on form-factor devices / Improving AOSP Presented by John Stultz Date Thursday 24 September 2015 Event SFO15 John Stultz Topics from Linux Plumbers Barriers to running

More information

Chapter 2. Operating-System Structures

Chapter 2. Operating-System Structures Chapter 2 Operating-System Structures 2.1 Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System Calls System Programs Operating System

More information

Android System Development Training 4-day session

Android System Development Training 4-day session Android System Development Training 4-day session Title Android System Development Training Overview Understanding the Android Internals Understanding the Android Build System Customizing Android for a

More information

ECE 598 Advanced Operating Systems Lecture 4

ECE 598 Advanced Operating Systems Lecture 4 ECE 598 Advanced Operating Systems Lecture 4 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 28 January 2016 Announcements HW#1 was due HW#2 was posted, will be tricky Let me know

More information

LCA14-412: GPGPU on ARM SoC. Thu 6 March, 2.00pm, T.Gall, G.Pitney

LCA14-412: GPGPU on ARM SoC. Thu 6 March, 2.00pm, T.Gall, G.Pitney LCA14-412: GPGPU on ARM SoC Thu 6 March, 2.00pm, T.Gall, G.Pitney Agenda Shamrock - Gil Pitney sqlite accelerated with OpenCL - Tom Gall GPGPU Goals Recognizing that: GPUs are much more energy efficient

More information

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 2: SYSTEM STRUCTURES By I-Chen Lin Textbook: Operating System Concepts 9th Ed. Chapter 2: System Structures Operating System Services User Operating System Interface System Calls Types of System

More information

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014

Profiling and Debugging OpenCL Applications with ARM Development Tools. October 2014 Profiling and Debugging OpenCL Applications with ARM Development Tools October 2014 1 Agenda 1. Introduction to GPU Compute 2. ARM Development Solutions 3. Mali GPU Architecture 4. Using ARM DS-5 Streamline

More information

<Insert Picture Here> Btrfs Filesystem

<Insert Picture Here> Btrfs Filesystem Btrfs Filesystem Chris Mason Btrfs Goals General purpose filesystem that scales to very large storage Feature focused, providing features other Linux filesystems cannot Administration

More information

Graphics Stack Update

Graphics Stack Update Graphics Stack Update Presented by Jammy Zhou Date March 9, 2016 Event BKK16 Agenda X11/Wayland/Android graphics overview Mali and Adreno driver status Linaro effort around graphics Discussion and Q&A

More information

Keeping up with LTS Linux Kernel Functional Testing on Devices

Keeping up with LTS Linux Kernel Functional Testing on Devices Keeping up with LTS Linux Kernel Functional Testing on Devices Tom Gall Director, Linaro Mobile Group Who is Linaro? Linaro is leading software collaboration in the ARM ecosystem Instead of duplicating

More information

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin)

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin) : LFS and Soft Updates Ken Birman (based on slides by Ben Atkin) Overview of talk Unix Fast File System Log-Structured System Soft Updates Conclusions 2 The Unix Fast File System Berkeley Unix (4.2BSD)

More information

Chapter 2: Operating-System Structures

Chapter 2: Operating-System Structures Chapter 2: Operating-System Structures Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System Calls System Programs Operating System

More information

Ext3/4 file systems. Don Porter CSE 506

Ext3/4 file systems. Don Porter CSE 506 Ext3/4 file systems Don Porter CSE 506 Logical Diagram Binary Formats Memory Allocators System Calls Threads User Today s Lecture Kernel RCU File System Networking Sync Memory Management Device Drivers

More information

CS 111. Operating Systems Peter Reiher

CS 111. Operating Systems Peter Reiher Operating System Principles: File Systems Operating Systems Peter Reiher Page 1 Outline File systems: Why do we need them? Why are they challenging? Basic elements of file system design Designing file

More information

Operating Systems. File Systems. Thomas Ropars.

Operating Systems. File Systems. Thomas Ropars. 1 Operating Systems File Systems Thomas Ropars thomas.ropars@univ-grenoble-alpes.fr 2017 2 References The content of these lectures is inspired by: The lecture notes of Prof. David Mazières. Operating

More information

The HiKey AOSP collaborative experience

The HiKey AOSP collaborative experience The HiKey AOSP collaborative experience Presented by John Stultz (With help from Amit Pundir, Guodong Xu, and Vishal Bhoj) Date BKK16-310 March 9, 2016 Event Linaro Connect BKK16 Outline HiKey in AOSP

More information

F28HS Hardware-Software Interface: Systems Programming

F28HS Hardware-Software Interface: Systems Programming F28HS Hardware-Software Interface: Systems Programming Hans-Wolfgang Loidl School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh Semester 2 2017/18 0 No proprietary software has

More information

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edit9on

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edit9on Chapter 2: Operating-System Structures Operating System Concepts 9 th Edit9on Silberschatz, Galvin and Gagne 2013 Chapter 2: Operating-System Structures 1. Operating System Services 2. User Operating System

More information

Operating Systems (2INC0) 2018/19. Introduction (01) Dr. Tanir Ozcelebi. Courtesy of Prof. Dr. Johan Lukkien. System Architecture and Networking Group

Operating Systems (2INC0) 2018/19. Introduction (01) Dr. Tanir Ozcelebi. Courtesy of Prof. Dr. Johan Lukkien. System Architecture and Networking Group Operating Systems (2INC0) 20/19 Introduction (01) Dr. Courtesy of Prof. Dr. Johan Lukkien System Architecture and Networking Group Course Overview Introduction to operating systems Processes, threads and

More information

<Insert Picture Here> Filesystem Features and Performance

<Insert Picture Here> Filesystem Features and Performance Filesystem Features and Performance Chris Mason Filesystems XFS Well established and stable Highly scalable under many workloads Can be slower in metadata intensive workloads Often

More information

The Embedded Linux Problem

The Embedded Linux Problem The Embedded Linux Problem Mark.gross@intel.com Android-Linux kernel Architect February 2013 outline Little about me Intro History Environment Key questions Techniques Moving modules out of tree Summary

More information

Managing build infrastructure of a Debian derivative

Managing build infrastructure of a Debian derivative Managing build infrastructure of a Debian derivative Andrej Shadura 4 February 2018 Presentation Outline Who am I Enter Apertis Build infrastructure Packaging workflows Image builds Andrej Shadura contributing

More information

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edition

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edition Chapter 2: Operating-System Structures Silberschatz, Galvin and Gagne 2013 Chapter 2: Operating-System Structures Operating System Services User Operating System Interface System Calls Types of System

More information

Mainline Explicit Fencing

Mainline Explicit Fencing Mainline Explicit Fencing A new era for graphics Gustavo Padovan Open First Agenda Introduction Android Sync Framework Mainline Explicit Fencing Current Status 2 Fencing Ensure ordering between operations

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Spring 2018 Lecture 22 File Systems Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 Disk Structure Disk can

More information

Flash filesystem benchmarks

Flash filesystem benchmarks Embedded Linux Conference Europe 21 Flash filesystem benchmarks Michael Opdenacker Free Electrons Copyright 21, Free Electrons. 1 Free FreeElectrons Electrons Free embedded Linux and kernel materials http://free

More information

What is version control? (discuss) Who has used version control? Favorite VCS? Uses of version control (read)

What is version control? (discuss) Who has used version control? Favorite VCS? Uses of version control (read) 1 For the remainder of the class today, I want to introduce you to a topic we will spend one or two more classes discussing and that is source code control or version control. What is version control?

More information

Are you Really Helped by Upstream Kernel Code?

Are you Really Helped by Upstream Kernel Code? Are you Really Helped by Upstream Kernel Code? 1 HISAO MUNAKATA RENESAS SOLUTIONS CORP hisao.munakata.vt(at)renesas.com who am I Working for Renesas (semiconductor) 2 Over 15 years real embedded Linux

More information

IVI Fast boot approach

IVI Fast boot approach IVI Fast boot approach 07/13/2016 Yuichi Kusakabe SS Engineering Group Fujitsu TEN LIMITED 1 About Myself Yuichi Kusakabe (Fujitsu TEN LIMITED) Software Engineer of IVI about 10 years (for 16-bit and 32-bit

More information

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Case Studies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics The Original UNIX File System FFS Ext2 FAT 2 UNIX FS (1)

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 11: File System Implementation Prof. Alan Mislove (amislove@ccs.neu.edu) File-System Structure File structure Logical storage unit Collection

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Allocation Methods Free-Space Management

More information

OpenACC Course. Office Hour #2 Q&A

OpenACC Course. Office Hour #2 Q&A OpenACC Course Office Hour #2 Q&A Q1: How many threads does each GPU core have? A: GPU cores execute arithmetic instructions. Each core can execute one single precision floating point instruction per cycle

More information

Running Android on the Mainline Graphics Stack. Robert

Running Android on the Mainline Graphics Stack. Robert Running Android on the Mainline Graphics Stack Robert Foss @memcpy_io Agenda Android History Android on Mainline Current Status Big Picture Android History Android History Qualcomm diff with mainline,

More information

Profiling: Understand Your Application

Profiling: Understand Your Application Profiling: Understand Your Application Michal Merta michal.merta@vsb.cz 1st of March 2018 Agenda Hardware events based sampling Some fundamental bottlenecks Overview of profiling tools perf tools Intel

More information

CS 4284 Systems Capstone

CS 4284 Systems Capstone CS 4284 Systems Capstone Disks & File Systems Godmar Back Disks & Filesystems Disk Schematics Source: Micro House PC Hardware Library Volume I: Hard Drives 3 Tracks, Sectors, Cylinders 4 Hard Disk Example

More information

Boosting Quasi-Asynchronous I/Os (QASIOs)

Boosting Quasi-Asynchronous I/Os (QASIOs) Boosting Quasi-hronous s (QASIOs) Joint work with Daeho Jeong and Youngjae Lee Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu The Problem 2 Why?

More information

Short Notes of CS201

Short Notes of CS201 #includes: Short Notes of CS201 The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with < and > if the file is a system

More information

COS 318: Operating Systems. File Systems. Topics. Evolved Data Center Storage Hierarchy. Traditional Data Center Storage Hierarchy

COS 318: Operating Systems. File Systems. Topics. Evolved Data Center Storage Hierarchy. Traditional Data Center Storage Hierarchy Topics COS 318: Operating Systems File Systems hierarchy File system abstraction File system operations File system protection 2 Traditional Data Center Hierarchy Evolved Data Center Hierarchy Clients

More information

IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including:

IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including: IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including: 1. IT Cost Containment 84 topics 2. Cloud Computing Readiness 225

More information

Crash Consistency: FSCK and Journaling. Dongkun Shin, SKKU

Crash Consistency: FSCK and Journaling. Dongkun Shin, SKKU Crash Consistency: FSCK and Journaling 1 Crash-consistency problem File system data structures must persist stored on HDD/SSD despite power loss or system crash Crash-consistency problem The system may

More information

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon School of Electrical Engineering and Computer Science Seoul National University, Korea Android apps are programmed using Java Android uses DVM instead of JVM

More information

64-bit ARM Unikernels on ukvm

64-bit ARM Unikernels on ukvm 64-bit ARM Unikernels on ukvm Wei Chen Senior Software Engineer Tokyo / Open Source Summit Japan 2017 2017-05-31 Thanks to Dan Williams, Martin Lucina, Anil Madhavapeddy and other Solo5

More information

The Btrfs Filesystem. Chris Mason

The Btrfs Filesystem. Chris Mason The Btrfs Filesystem Chris Mason The Btrfs Filesystem Jointly developed by a number of companies Oracle, Redhat, Fujitsu, Intel, SUSE, many others All data and metadata is written via copy-on-write CRCs

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 The Operating System (OS) Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke)

More information

File Systems Management and Examples

File Systems Management and Examples File Systems Management and Examples Today! Efficiency, performance, recovery! Examples Next! Distributed systems Disk space management! Once decided to store a file as sequence of blocks What s the size

More information

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto Ricardo Rocha Department of Computer Science Faculty of Sciences University of Porto Slides based on the book Operating System Concepts, 9th Edition, Abraham Silberschatz, Peter B. Galvin and Greg Gagne,

More information

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. File-System Structure File structure Logical storage unit Collection of related information File

More information

CS201 - Introduction to Programming Glossary By

CS201 - Introduction to Programming Glossary By CS201 - Introduction to Programming Glossary By #include : The #include directive instructs the preprocessor to read and include a file into a source code file. The file name is typically enclosed with

More information

ò Very reliable, best-of-breed traditional file system design ò Much like the JOS file system you are building now

ò Very reliable, best-of-breed traditional file system design ò Much like the JOS file system you are building now Ext2 review Very reliable, best-of-breed traditional file system design Ext3/4 file systems Don Porter CSE 506 Much like the JOS file system you are building now Fixed location super blocks A few direct

More information

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23 FILE SYSTEMS CS124 Operating Systems Winter 2015-2016, Lecture 23 2 Persistent Storage All programs require some form of persistent storage that lasts beyond the lifetime of an individual process Most

More information

BUD Status of Android AOSP TV Project. Khasim Syed Mohammed, Tech Lead Linaro Home Group

BUD Status of Android AOSP TV Project. Khasim Syed Mohammed, Tech Lead Linaro Home Group BUD17-118 Status of Android AOSP TV Project Khasim Syed Mohammed, Tech Lead Linaro Home Group Overview ENGINEERS AND DEVICES WORKING TOGETHER What is AOSP TV Project about? Focus and Goals of AOSP TV project

More information

Memory management. Last modified: Adaptation of Silberschatz, Galvin, Gagne slides for the textbook Applied Operating Systems Concepts

Memory management. Last modified: Adaptation of Silberschatz, Galvin, Gagne slides for the textbook Applied Operating Systems Concepts Memory management Last modified: 26.04.2016 1 Contents Background Logical and physical address spaces; address binding Overlaying, swapping Contiguous Memory Allocation Segmentation Paging Structure of

More information

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24 FILE SYSTEMS, PART 2 CS124 Operating Systems Fall 2017-2018, Lecture 24 2 Last Time: File Systems Introduced the concept of file systems Explored several ways of managing the contents of files Contiguous

More information

96Boards - TV Platform

96Boards - TV Platform 96Boards - TV Platform Presented by Mark Gregotski Developing the Specification Date BKK16-303 March 9, 2016 Event Linaro Connect BKK16 Overview Motivation for a TV Platform Specification Comparison with

More information

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur Under The Hood: Performance Tuning With Tizen Ravi Sankar Guntur How to write a Tizen App Tools already available in IDE v2.3 Dynamic Analyzer Valgrind 2 What s NEXT? Want to optimize my application App

More information

Hostless Xen Deployment

Hostless Xen Deployment Hostless Xen Deployment Xen Summit Fall 2007 David Lively dlively@virtualiron.com dave.lively@gmail.com Hostless Xen Deployment What Hostless Means Motivation System Architecture Challenges and Solutions

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 24 File Systems Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 Questions from last time How

More information

SFO17-315: OpenDataPlane Testing in Travis. Dmitry Eremin-Solenikov, Cavium Maxim Uvarov, Linaro

SFO17-315: OpenDataPlane Testing in Travis. Dmitry Eremin-Solenikov, Cavium Maxim Uvarov, Linaro SFO17-315: OpenDataPlane Testing in Travis Dmitry Eremin-Solenikov, Cavium Maxim Uvarov, Linaro What is ODP (OpenDataPlane) The ODP project is an open-source, cross-platform set of APIs for the networking

More information

Building a reference IoT product with Zephyr. Ricardo Salveti Michael Scott Tyler Baker

Building a reference IoT product with Zephyr. Ricardo Salveti Michael Scott Tyler Baker Building a reference IoT product with Zephyr Ricardo Salveti Michael Scott Tyler Baker Introduction Linaro Technologies A small team within Linaro focusing on open source end-to-end solutions Who is here?

More information

CS 318 Principles of Operating Systems

CS 318 Principles of Operating Systems CS 318 Principles of Operating Systems Fall 2017 Lecture 16: File Systems Examples Ryan Huang File Systems Examples BSD Fast File System (FFS) - What were the problems with the original Unix FS? - How

More information

qcow2 Red Hat Kevin Wolf 15 August 2011

qcow2 Red Hat Kevin Wolf 15 August 2011 qcow2 Red Hat Kevin Wolf 15 August 2011 Section 1 qcow2 format basics qcow2 format basics Overview of qcow2 features Sparse images Snapshots Internal or external Internal snapshots can contain VM state

More information

CHAPTER 8: MEMORY MANAGEMENT. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 8: MEMORY MANAGEMENT. By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 8: MEMORY MANAGEMENT By I-Chen Lin Textbook: Operating System Concepts 9th Ed. Chapter 8: Memory Management Background Swapping Contiguous Memory Allocation Segmentation Paging Structure of the

More information

LMG Lightning Talks LMG

LMG Lightning Talks LMG LMG Lightning Talks LMG linaro-android kernel topic branch updates Amit Pundir linaro-android kernel updates lsk-v3.18-android Not actively maintained by LMG. lsk-v4.4-android Weekly/Bi-weekly android-4.4

More information

CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES

CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES OBJECTIVES Detailed description of various ways of organizing memory hardware Various memory-management techniques, including paging and segmentation To provide

ION - Large pages for devices

John Einar Reitan, Android/Mobile Microconference, LPC 2016. Motivation: ARM Display + IOMMU need 2MB pages when rotating. Native page size 4kB. 64kB pages

meta-raspberrypi Documentation

Release rocko. meta-raspberrypi contributors, Sep 06, 2018. Contents: 1 meta-raspberrypi; 1.1 Quick links; 1.2 Description

Linux Filesystems and Storage Chris Mason Fusion-io

2012 Storage Developer Conference. Linux 2.4.x: Enterprise Ready! Start of SMP scalability. Many journaled

Chapter 8: Memory-Management Strategies

Background; Swapping; Contiguous Memory Allocation; Segmentation; Paging; Structure of the Page Table; Example: The Intel 32 and

VIXL. A runtime assembler C++ API

Agenda: What is VIXL? What is it used for? What's new in VIXL? Conclusion & Q&A. What is VIXL: A runtime code-generation library for ARM AArch64 now, AArch32 (Arm and

Efficient and Large Scale Program Flow Tracing in Linux. Alexander Shishkin, Intel

16.09.2013. Overview: Program flow tracing - What is it? - What is it good for? Intel Processor Trace - Features / capabilities

CS 318 Principles of Operating Systems

Fall 2018. Lecture 16: Advanced File Systems. Ryan Huang. Slides adapted from Andrea Arpaci-Dusseau's lecture. 11/6/18

22 File Structure, Disk Scheduling

Operating Systems. Readings for this topic: Silberschatz et al., Chapters 11-13; Anderson/Dahlin, Chapter 13. File: a named sequence of bytes stored on disk. From

ARM Trusted Firmware From Embedded to Enterprise. Dan Handley

Agenda: Quick recap; Project news; Security hardening; AArch32 support; Other enhancements; Translation table

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CSE P 501 Compilers. Java Implementation: JVMs, JITs &c. Hal Perkins, Summer 2004. Agenda: Java virtual machine architecture; .class files; Class loading; Execution engines; Interpreters & JITs, various strategies

Inside the PostgreSQL Shared Buffer Cache

Truviso, 07/07/2008. About this presentation: The master source for these slides is http://www.westnet.com/ gsmith/content/postgresql You can also find a machine-usable version of the source code to the later

CSE 374 Programming Concepts & Tools

Hal Perkins, Fall 2017. Lecture 11: gdb and Debugging. Administrivia: HW4 out now, due next Thursday, Oct. 26, 11 pm: C code and libraries. Some tools: gdb (debugger)

Chapter 12: File System Implementation

File-System Structure; File-System Implementation; Directory Implementation; Allocation Methods; Free-Space Management; Efficiency

Block Device Scheduling. Don Porter CSE 506

Logical Diagram (figure): Binary Formats, Memory Allocators, System Calls, Threads, User, Kernel, RCU, File System, Networking, Sync, Memory Management, Device Drivers, CPU Scheduler

Block Device Scheduling

Don Porter, CSE 506. Logical Diagram (figure): Binary Formats, RCU, Memory Management, File System, Memory Allocators, System Calls, Device Drivers, Interrupts, Net, Networking, Threads, Sync, User, Kernel

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

Lecture 7.2 - File system implementation. Adrien Krähenbühl. Design: FAT or indexed allocation? UFS, FFS & Ext2. Journaling with Ext3

OPERATING SYSTEM. Chapter 12: File System Implementation

File-System Structure; File-System Implementation; Directory Implementation; Allocation Methods; Free-Space Management

Operating System Services. User Services. System Operation Services. User Operating System Interface - CLI. A View of Operating System Services

Operating System Services: One set of services for users. The other set of services for system operations. Operating Systems Structures. Notice: This set of slides is based on the notes by Professor Perrone

Chapter 8: Main Memory

Chapter 8: Memory Management. Background; Swapping; Contiguous Memory Allocation; Segmentation; Paging; Structure of the Page Table; Example: The Intel 32 and 64-bit Architectures; Example:

Another difference is that the kernel includes only the suspend to memory mechanism, and not the suspend to hard disk, which is used on PCs.

Android is an open-source operating system for mobile devices. Nowadays, it has more than 1.4 billion monthly active users (statistic from September 2015) and the largest share on the mobile device

File System Implementation

Last modified: 16.05.2017. File-System Structure; Virtual File System and FUSE; Directory Implementation; Allocation Methods; Free-Space Management; Efficiency and Performance; Buffering

Tux3 linux filesystem project

A Shiny New Filesystem for Linux. http://tux3.org What is a next gen filesystem? Snapshots, writable and recursive Incremental backup, online Replication Good Extended Attribute

OS Structure. Kevin Webb, Swarthmore College, January 25, 2018

Kevin Webb, Swarthmore College, January 25, 2018. Relevant xkcd: One of the survivors, poking around in the ruins with the point of a spear, uncovers a singed photo of Richard Stallman. They

Chapter 11: Implementing File Systems

Operating System Concepts, 9th Edition. DM510-14. File-System Structure; File-System Implementation; Directory Implementation; Allocation

LLVMLinux: x86 Kernel Build

Presented by: Jan-Simon Möller. Presentation Date: 2012.08.30. Topics: Common issues (x86 perspective); Specific Issues with Clang/LLVM; Specific Issues with the Linux Kernel; Status

CSE 333 Lecture 9 - storage

Steve Gribble, Department of Computer Science & Engineering, University of Washington. Administrivia: Colin's away this week - Aryan will be covering his office hours (check the

Chapter 8: Main Memory. Operating System Concepts 9th Edition

Silberschatz, Galvin and Gagne, 2013. Chapter 8: Memory Management. Background; Swapping; Contiguous Memory Allocation; Segmentation; Paging; Structure of the Page Table; Example: The Intel

Chapter 2: Operating-System Structures

Operating System Services; User Operating System Interface; System Calls; Types of System Calls; System Programs; Operating System

DDMD AND AUTOMATED CONVERSION FROM C++ TO D

Daniel Murphy (aka yebblies). About me: Using D since 2009; Compiler contributor since 2011. Overview: Why convert the frontend to D; What's so hard about it
