Challenges in Mining Whole Software Universe

1 Challenges in Mining Whole Software Universe. Katsuro Inoue, Osaka University

2 Analyzing Evolution of kern_malloc (cover ratio). [Figure: copies of kern_malloc found across projects, plotted by last modified time (axis starts at 1993/01; 2012/04/01 is also marked). Sources shown include Lites 1.0 and Lites 1.1.u, Kernel Source Archive - CMU Mach, Kame, SimOS, The Rio (RAM I/O) Project, ftp in The University of Edinburgh, Mip-summer98, freebsd/sparc, ftp in Stockholm University, freebsd-cam2.1.5r, SonicOSX, Labyrinth BSD (labyrinthos), Netnice, Psumip, Reflexprotocol, NetBSD, OpenBSD, PV Xen, Oskit, Pmon, Proyecto A.T.L.D, GNU/hurd (extremelinux), Mach, Savannah Unofficial OSKit source, Chord-ns3 and Openbsd-loongsonvc 0.4. Results by G (Google Code Search) and K (Koders). Legend: file in New BSD License; file in Original BSD License.]

3 Analyzing Reuse of Outdated Libraries: Vulnerability of 50 OSS Projects Using libpng. [Figure: libpng versions found in the projects; surviving labels include v1.2.1, v1.2.5, v1.2.7, v1.2.8, v1.4.1, v1.4.2, v1.4.4, v1.4.6beta06, v1.4.8, v1.5.1, v1.5.4, v1.5.7 and v1.5.9. Legend: vulnerabilities reported; no defects reported. Result from Google Code Search and Koders.]

4 Experience and Concern. Mining source code repositories, e.g., SourceForge, GitHub, Open Hub, Google Code, Maven, ... (BlackDuck). Outcomes heavily depend on repository contents. Aren't we mining a small world? There may be many other source code contents in the universe.

9 Whole Software Universe U. The collection of all software developed by humans in the past: open source software, personally developed software, proprietary software, ... and any others. P: the set of all meaningful software. U ⊂ P (P is a countably infinite set).

10 Questions for U. A) How do we get U? B) What do we mine from U? C) How do we mine U? D) Why do we mine U?

11 A) How Do We Get U? No one knows the actual U, so we would collect many repositories and construct a subset U' ⊆ U. U' should be as large as possible, of course, and U' should reflect the characteristics of U. Challenges: collecting and unifying different repositories into U' (duplication, coherence, ...); performance and capacity for U'; updating and maintaining U'.
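The duplication challenge named on this slide can be illustrated with a minimal sketch. It assumes that "duplicate" means byte-identical file content, which is only the simplest case; the slides do not say how duplicates would be detected, and near-duplicates (clones, variations) would need other techniques. The repository names, paths and contents below are hypothetical toy data, and the collected subset is written `u_prime` here.

```python
import hashlib
from collections import defaultdict


def unify(repositories):
    """Merge several repositories into one collection keyed by content hash.

    `repositories` maps a repository name to a dict of {path: file bytes}.
    Returns {sha256 hex digest: sorted list of (repository, path) origins}.
    """
    unified = defaultdict(list)
    for repo_name, files in repositories.items():
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            unified[digest].append((repo_name, path))
    return {digest: sorted(origins) for digest, origins in unified.items()}


def duplicates(unified):
    """Return only the contents that are stored in more than one place."""
    return {d: origins for d, origins in unified.items() if len(origins) > 1}


# Hypothetical toy input: two repositories that share one identical file.
repos = {
    "repo_a": {"kern/kern_malloc.c": b"void *malloc(void);\n", "README": b"A\n"},
    "repo_b": {"sys/kern_malloc.c": b"void *malloc(void);\n", "README": b"B\n"},
}
u_prime = unify(repos)
dups = duplicates(u_prime)
```

Keying the unified collection by content hash keeps one entry per distinct file while still recording every place it was found, which is the information a later provenance question would need.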

12 B) What Do We Mine from U? Examples. Simple metrics of U over history: size (U_t1, U_t2, ...), language usage, density of U with respect to P. History and evolution of code c in U: origin version of c; closely related code c' (clone, variation, family, ...); future prediction for c.
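As a rough illustration of the "simple metrics over history" item, the sketch below computes size and language usage for snapshots of a collection taken at two times. It assumes a snapshot is just a list of file paths and that language can be guessed from the file extension; both are simplifications introduced here, and the extension table and snapshot contents are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed extension-to-language table; a real study would need a fuller one.
LANGUAGE_BY_EXTENSION = {".c": "C", ".h": "C", ".java": "Java", ".py": "Python"}


def snapshot_metrics(paths):
    """Return (number of files, language usage counter) for one snapshot."""
    usage = Counter(
        LANGUAGE_BY_EXTENSION.get(PurePosixPath(p).suffix.lower(), "other")
        for p in paths
    )
    return len(paths), usage


# Hypothetical snapshots of the collection at two times, t1 earlier than t2.
snapshots = {
    "t1": ["kern/kern_malloc.c", "sys/malloc.h"],
    "t2": ["kern/kern_malloc.c", "sys/malloc.h", "tools/Scan.java", "README"],
}
history = {t: snapshot_metrics(paths) for t, paths in snapshots.items()}
growth = history["t2"][0] - history["t1"][0]  # change in size between t1 and t2
```

File count is used as the size measure only for brevity; lines of code or bytes would fit the same structure.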

13 C) How Do We Mine U (U')? 1. Direct mining: good model, powerful machine. 2. Indirect mining: use external services; reconstruct the mining result from those external services.

14 Direct Mining. [Figure labels: U; copy of U; U'.]

15 Indirect Mining. [Figure labels: Want to know about U; Mashup Engine; Query Decomposition and Result Composition; U.]
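The indirect-mining idea (a mashup engine that decomposes a query, sends it to external services, and composes the results) might be sketched as follows. This is an illustration under stated assumptions, not the design from the talk: the two services are local stub functions standing in for external code search services, no real service API is called, and the decomposition here simply forwards the same keyword to every service.

```python
def decompose(query, services):
    """Split one query into per-service sub-queries (here: the same keyword)."""
    return [(name, query) for name in services]


def compose(partial_results):
    """Merge per-service hits into {hit: list of services that reported it}."""
    merged = {}
    for service, hits in partial_results:
        for hit in hits:
            merged.setdefault(hit, []).append(service)
    return merged


def mashup(query, services):
    """Run query decomposition, call each service, then compose the results."""
    sub_queries = decompose(query, services)
    partial = [(name, services[name](q)) for name, q in sub_queries]
    return compose(partial)


# Hypothetical stand-ins for external code search services.
def service_g(q):
    return ["lites/kern_malloc.c", "netbsd/kern_malloc.c"] if q == "kern_malloc" else []


def service_k(q):
    return ["netbsd/kern_malloc.c", "openbsd/kern_malloc.c"] if q == "kern_malloc" else []


result = mashup("kern_malloc", {"G": service_g, "K": service_k})
```

Recording which services reported each hit makes overlap between services visible, which matters when the mining result has to be reconstructed from sources that each cover only part of the universe.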

16 D) Why Do We Mine U? Objectives of mining U: reuse and knowledge transfer (we do not want to reinvent the wheel); historical archive; ...; frontier's wisdom.

17 Discussion! Is it an interesting research topic? Can we get useful research results? Is it a feasible research target?

18 Thank you


Lecture 25 Clone Detection CCFinder. EE 382V Spring 2009 Software Evolution - Instructor Miryung Kim Lecture 25 Clone Detection CCFinder Today s Agenda (1) Recap of Polymetric Views Class Presentation Suchitra (advocate) Reza (skeptic) Today s Agenda (2) CCFinder, Kamiya et al. TSE 2002 Recap of Polymetric

More information

GRUB2 and Yeeloong. From BIOS bootloader to MIPS firmware

GRUB2 and Yeeloong. From BIOS bootloader to MIPS firmware GRUB2 and Yeeloong From BIOS bootloader to MIPS firmware GRUB2 history 1995: Start of GRUB Legacy 1999: GRUB Legacy becomes GNU project 2002: PUPA (Yoshinori K Okuji) 2004: GRUB2 2004-2005: PowerPC and

More information

DATA MINING II - 1DL460. Spring 2014"

DATA MINING II - 1DL460. Spring 2014 DATA MINING II - 1DL460 Spring 2014" A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt14 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

DMI251DTV (C61) v1.0. Open Source Report. This document aims to describe the Open Source Software which are embedded in product DMI251DTV (C61)

DMI251DTV (C61) v1.0. Open Source Report. This document aims to describe the Open Source Software which are embedded in product DMI251DTV (C61) v1.0 This document aims to describe the Open Source Software which are embedded in product Copyright 2017 Technicolor Group (Connected Home Division of Technicolor Group Technicolor Delivery Technologies,

More information

It s a Unix(-like) System? An Introduction to TrueOS and Open Source Software. Copyright ixsystems, Inc. 2017

It s a Unix(-like) System? An Introduction to TrueOS and Open Source Software. Copyright ixsystems, Inc. 2017 It s a Unix(-like) System? An Introduction to TrueOS and Open Source Software Copyright ixsystems, Inc. 2017 Introduction Ken Moore - Senior software engineer at ixsystems. Core team member for TrueOS

More information

Empirical Study on Impact of Developer Collaboration on Source Code

Empirical Study on Impact of Developer Collaboration on Source Code Empirical Study on Impact of Developer Collaboration on Source Code Akshay Chopra University of Waterloo Waterloo, Ontario a22chopr@uwaterloo.ca Parul Verma University of Waterloo Waterloo, Ontario p7verma@uwaterloo.ca

More information

The NetBSD Operating. Overview

The NetBSD Operating. Overview The NetBSD Operating System Jason R. Thorpe The NetBSD Foundation, Inc. June 17, 1998 6/17/98 Jason R. Thorpe 1 Overview What is NetBSD? NetBSD Project Goals NetBSD Project Organization

More information

Dr. K. Y. Srinivasan. Jason Goldschmidt. Technical Lead NetApp Principal Architect Microsoft Corp.

Dr. K. Y. Srinivasan. Jason Goldschmidt. Technical Lead NetApp Principal Architect Microsoft Corp. Dr. K. Y. Srinivasan Principal Architect Microsoft Corp kys@microsoft.com Jason Goldschmidt Technical Lead NetApp jgoldsch@netapp.com ! Support FreeBSD running as a guest on Hyper-V! Collaboration between

More information

Managing Open Bug Repositories through Bug Report Prioritization Using SVMs

Managing Open Bug Repositories through Bug Report Prioritization Using SVMs Managing Open Bug Repositories through Bug Report Prioritization Using SVMs Jaweria Kanwal Quaid-i-Azam University, Islamabad kjaweria09@yahoo.com Onaiza Maqbool Quaid-i-Azam University, Islamabad onaiza@qau.edu.pk

More information

Computer Education Revisited: A Comprehensive System of Education Using Free Culture. Jon "maddog" Hall Executive Director Linux International

Computer Education Revisited: A Comprehensive System of Education Using Free Culture. Jon maddog Hall Executive Director Linux International Computer Education Revisited: A Comprehensive System of Education Using Free Culture by Jon "maddog" Hall Executive Director Linux International 1 of 52 Dedicated to: John Lions 2 of 52 Who Am I? Half

More information

CNT 4603, Spring 2009: Introduction

CNT 4603, Spring 2009: Introduction , : A practical hands-on approach Also higher-level concepts Expertise is distributed: system administration happens everywhere from your PC to large servers, and system administration is generally collaborative.

More information

UNIT-V WEB MINING. 3/18/2012 Prof. Asha Ambhaikar, RCET Bhilai.

UNIT-V WEB MINING. 3/18/2012 Prof. Asha Ambhaikar, RCET Bhilai. UNIT-V WEB MINING 1 Mining the World-Wide Web 2 What is Web Mining? Discovering useful information from the World-Wide Web and its usage patterns. 3 Web search engines Index-based: search the Web, index

More information

CS 550 Operating Systems Spring Operating Systems Overview

CS 550 Operating Systems Spring Operating Systems Overview 1 CS 550 Operating Systems Spring 2018 Operating Systems Overview 2 What is an OS? Applications OS Hardware A software layer between the hardware and the application programs/users which provides a virtualization

More information

pkgsrc for users and developers

pkgsrc for users and developers pkgsrc for users and developers Guillaume Lasmayous gls@netbsd.org FOSDEM Brussels, Feb. 5 2012 WTF is pkgsrc? aka package source NetBSD packaging system for 3rd party applications Initially based on FreeBSD

More information

OSGi Subsystems from theory to practice Glyn Normington. Eclipse Virgo Project Lead SpringSource/VMware

OSGi Subsystems from theory to practice Glyn Normington. Eclipse Virgo Project Lead SpringSource/VMware from theory to practice Glyn Normington Eclipse Virgo Project Lead SpringSource/VMware 1 Software rots 2 modularity helps 3 but... 4 A clean design 5 without enforcement 6 works fine for a while 7 then

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/24/2014 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu 2 High dim. data

More information

Code Clone Analysis and Application

Code Clone Analysis and Application Code Clone Analysis and Application Katsuro Inoue Osaka University Talk Structure Clone Detection CCFinder and Associate Tools Applications Summary of Code Clone Analysis and Application Clone Detection

More information