Analyzing the software development process with SyQL and Lagrein
Mirco Bianco, Alberto Sillitti, Giancarlo Succi
Center for Applied Software Engineering, Free University of Bolzano
Via della Mostra, 4, I Bolzano-Bozen
Mirco.Bianco@unibz.it, Alberto.Sillitti@unibz.it, Giancarlo.Succi@unibz.it

Abstract

Mining information from software product metrics and software process data is very hard [14]. Data collected automatically by source code metrics extractors and by software development process probes have different formats, which makes it difficult to use both at the same time. In this paper, we present a data manipulation language called System Query Language (SyQL), which overcomes problems of other similar languages and allows the user to access data stored in a relational-temporal database. Developers and managers can look at effort data and code metrics by writing very concise SQL-like queries and by using linguistic variables that are unavailable in other existing query languages of this kind. SyQL helps the user to access temporal data of the software process by providing a set of temporal constructs. Examples of problems solved using SyQL queries and Lagrein (a tool for source code analysis) are provided to show the advantage of the proposed approach.

Keywords

Query languages, data warehouses, software metrics, effort, development process.

1. Introduction

Mining information from software process and product metrics at the same time is challenging [5]. The relations between them can differ depending on the analysis to be performed.
Typically, researchers mine information from relational data warehouses in an asynchronous way, using SQL to extract the data and other data manipulation tools (Weka, RapidMiner, Matlab, etc.) to process it (filtering, clustering, etc.). These data warehouses can grow by up to 1.5 GB/day [14]; therefore the asynchronous approach is very time consuming. Moreover, the structure of the data warehouse is usually fairly complex [13]. To overcome these problems we propose a new language: System Query Language (SyQL). SyQL is a domain-specific language based on fuzzy-temporal logic for querying a data warehouse of software process data [15] through SQL-like queries. SyQL is used inside Lagrein [7] for retrieving and visualizing historical data about the software development process (code metrics, effort, bugs, etc.).

Software development takes place over time. To allow the user to consider the time aspect when evaluating software metrics, SyQL offers the possibility to filter data using temporal conditions.

The paper is organized as follows: section 2 defines the goals of this work, section 3 discusses the related work, section 4 presents our solution, section 5 gives an overview of our automatic metrics collection system, section 6 introduces the syntax of SyQL, section 7 describes how the SyQL query engine works, section 8 shows examples of visualization, and section 9 draws the conclusions and presents future directions.

2. The Goals

SyQL has been designed to achieve the following goals:
- Build an abstraction layer between the user and the tables of the data warehouse;
- Make the preparation of queries against the metrics data warehouse [14] trivial;
- Help software engineers to evaluate software along the timeline;
- Support the evaluation of the effort spent by the developers along the timeline;
- Help the user to evaluate product quality using simple logic constructs;
- Make the language extensible.
In summary, SyQL has a high aggregation capacity and supports extensible fuzzy logic and temporal functions. Fuzzy logic supports qualitative analysis on large datasets, which is sometimes more useful than quantitative analysis because the user cannot estimate the values of software metrics a priori [19]. A user who picks a wrong threshold value can miss important results. A fuzzy set therefore encapsulates the experience needed to evaluate a particular metric. By temporal functions, we mean language clauses that help the user to write shorter queries. Temporal functions are needed because the software development process evolves over time, and so queries must include temporal conditions [1][16][18][10].

3. Related work

There are several works on languages that can be used to query repositories of software data. The features that appear most relevant are: the capability to perform temporal queries on product and process metrics, the possibility to help the user filter the results through linguistic variables [17] (such as high, medium, low), and the possibility to be used in a general context. In addition, it is important to consider some other technical aspects: support for combined analysis (software metrics/effort), temporal management, fuzzy logic support, supported programming languages (the languages from which the tool is able to extract information for analysis tasks), and object orientation. In Table 1 we use these criteria to compare some of the most relevant existing work with SyQL.

Language Integrated Query (LINQ) [9] is designed to be embedded into another programming language. Queries can therefore be performed with the same expressive power from a program written in C# 3.0, VB 9.0, or another .NET language.

FuzzySQL [4] is a commercial relational database front-end; it supports fuzzy conditions and is designed to assist the user during analysis tasks.

.QL [11] is a commercial tool designed to perform code analysis tasks such as reverse engineering and the discovery of bad code smells.

DmFSQL [2] is a general-purpose, data-mining-oriented fuzzy query language implemented as an Oracle database front-end.
SCQL [6] is a domain-specific temporal query language used to retrieve information from a relational database containing information gathered from a source control system.

NDepend is an application that uses CQL (Code Query Language) to extract information from .NET projects. With this program it is possible to extract a large amount of information from the source code.

4. Our proposal

To enable the final user to perform fuzzy-temporal queries against a metrics data warehouse [15] we decided to implement a new query language, SyQL. The reasons for this choice are discussed below by showing the main differences from the languages introduced above.

The syntax of SyQL is similar to that of LINQ [9], but it is designed for different purposes. LINQ is more general and can perform queries on different data sources, while SyQL is tied to a specific data source (the metrics data warehouse [15]). Both are fully object oriented; SyQL allows the use of the fuzzy equal operator and temporal tokens, while LINQ does not.

The main difference between SyQL and FuzzySQL [4] is that FuzzySQL is a general-purpose relational database front-end, while SyQL is a specific tool for information retrieval tasks on the metrics data warehouse [14], with additional features to handle temporal analysis of the software development process.

SyQL can be used to perform software metrics and effort analysis, whereas .QL [11] can handle only software data. SyQL can perform tasks on different projects written in different programming languages, while .QL can perform analysis only on Java projects. SyQL supports fuzzy logic conditions; .QL does not.

SyQL is completely different from DmFSQL [2]. The only evident similarity between them is the fuzzy logic support, because the purposes of the two languages are different.

Both SCQL [6] and SyQL have keywords to manage temporal data.
The main difference is that SyQL is designed to be extended to handle different aspects of the development process (effort, software metrics, requirements, etc.), while SCQL is designed only to perform information retrieval tasks on software repository data. SCQL has no fuzzy logic support.

NDepend and SyQL have been designed to achieve different goals. With NDepend it is easier to keep a set of .NET projects under control, because it is highly integrated with the .NET environment. SyQL, on the other hand, is more platform independent (it also supports C/C++ and Java) and aims to help users control different aspects of the software development process. With SyQL it is possible to visualize and compare the values of a specific metric over a specified time interval (e.g., show the total number of lines of code in the last 6 months); with NDepend it is only possible to compare two different versions of the code and show the changes. SyQL makes it possible to run real effort analyses on source code (e.g., compute the total effort spent by the developers on a specific package/namespace), and it enables the user to track the bug fixing process by showing which methods were modified during a specific fixing task. This is not possible with NDepend.
Table 1: Comparison between different query languages.

| Language | Support for combined analysis (software metrics/effort) | Temporal management | Fuzzy logic | General-purpose language | Supported programming languages (for analysis tasks) | Object orientation |
|---|---|---|---|---|---|---|
| LINQ [9] | NO | NO | NO | YES | None | YES |
| FuzzySQL [4] | NO | NO | YES | YES | None | NO |
| .QL [11] | NO | NO | NO | NO | Java | YES |
| DmFSQL [2] | NO | NO | YES | YES | None | NO |
| SCQL [6] | NO | YES | NO | NO | None | NO |
| NDepend | NO | NO | NO | NO | All .NET languages | YES |
| SyQL | YES | YES | YES | NO | C/C++, Java, C#, VB.NET | YES |

5. Architecture description

Before presenting the architecture of SyQL and how the results are displayed, we give a brief introduction to our distributed non-intrusive system for collecting software metrics [14]. Figure 1 shows the role of SyQL and Lagrein in the system.

The metrics collection system is distributed: the application plugins are installed on the clients and are able to trace the user activities inside the most common IDEs (Microsoft Visual Studio, Eclipse, etc.); the Source Code Analysis component runs on a standalone machine that takes daily snapshots of the source code from the Versioning System. These components send the collected data to the Metrics Server using the Apache XML-RPC protocol implementation. The Metrics Server then organizes these data and stores them inside the relational data warehouse. The extracted information is delivered to the managers and to the developers in two possible ways: either through an automatic, statically generated report (using Eclipse BIRT) or through Lagrein/SyQL in a "dynamic/visual" way.

Figure 1: The System Architecture.

6. Language description

We introduce the structure of the language through an example.
[01] FROM Class c, Method m
[02] WHERE c.getfullname() =
[03]       m.getdefclassfullname()
[04]   AND c.geteffort(yesterday) IS High
[05] SELECT c.getfullname(),
[06]        c.geteffort(today - 1 'day'),
[07]        COUNT(m)
[08] GROUP BY c.getfullname(),
[09]          c.geteffort(today - 1 'day');

The above query returns a collection of class names, the related effort spent by the developers since yesterday, and the number of methods of each class.

The first row introduces the FromClause, which can contain one or more FromElement(s). Each of them is composed of two literals: the former identifies the concept type, the latter declares the concept name (as in SQL).

The second, third, and fourth rows introduce the WhereClause. In the example there are two conditions: an equality join condition and a fuzzy condition. The fuzzy condition evaluates the effort spent yesterday by the developers. The method c.geteffort(...) is a Java method that returns a value.

The fifth, sixth, and seventh rows show the SelectClause. This is a non-empty collection of MethodCall(s) and/or aggregation functions (such as Count, Sum, Max, Min, etc.).

The last two rows declare the GroupByClause, which is similar to the SQL one.

As in other similar query languages [9][11], we decided to put the FromClause at the beginning of the query so that auto-completion can be used in the Where, Select, and GroupBy clauses.
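The fuzzy condition `c.geteffort(yesterday) IS High` can be pictured as a membership test on a linguistic variable. The following Python sketch is only an illustration under stated assumptions: the ramp-shaped membership function, the 2-hour and 6-hour breakpoints, the 0.5 acceptance cut, and the class names and effort values are all hypothetical and are not taken from SyQL.

```python
def high_membership(value, low=2.0, high=6.0):
    """Degree (0.0 to 1.0) to which `value` belongs to the fuzzy set High.

    Assumed shape: 0 at or below `low`, 1 at or above `high`, linear in between.
    """
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def is_high(value, cut=0.5):
    """Crisp decision used to filter rows: accept if membership reaches `cut`."""
    return high_membership(value) >= cut


# Hypothetical effort per class, in hours spent yesterday.
effort_by_class = {
    "org.example.Parser": 5.5,
    "org.example.Util": 1.0,
    "org.example.Lexer": 4.0,
}

# Keep only the classes whose effort qualifies as High.
selected = sorted(name for name, hours in effort_by_class.items() if is_high(hours))
print(selected)
```

The point of the sketch is that the threshold knowledge lives in the fuzzy set rather than in the query, so the person writing the query does not need to pick a numeric cut-off.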
7. How the query engine works

7.1 Concepts and methods

Extensibility is one of the main requirements of the SyQL engine. The concepts (the non-terminal symbol FromElement) and methods (the non-terminal symbol MethodCall) used in a SyQL query are shipped in a separate library. This allows us to implement new concepts and new methods during the entire lifecycle of SyQL. Another advantage is that SyQL acts as an abstraction layer between the user and the data warehouse. We can therefore modify the schema of the data warehouse without affecting the user, provided the library is updated accordingly.

Implementing a new concept in SyQL has only one requirement: an instance of a concept must be an entry of a relation defined with a SQL statement. In this way, we can perform the mapping between the SyQL concepts and the tables. The object is materialized through a constructor that takes as input an entry of the relation defined above.

All the methods of a concept class that can appear in a SyQLExpression are annotated in one of two ways: an annotated method can become part of an external or an internal calculable condition. A method can be annotated as external if the returned value is present in one column of the relation defining the concept; otherwise it must be annotated as internal. If a condition, which is represented by an instance of SyQLRelationalExpression, contains at least one internal calculable method, it must be evaluated in the SyQL query engine; otherwise it can be evaluated by the query engine of the underlying DBMS. FuzzyExpression(s) are internal by default.

7.2 Query Execution

The SyQL query engine works on top of the DBMS (Figure 2).

Figure 2: The Data Layers.

The SyQL query engine has been implemented without the need to develop a sophisticated query planner and executor.
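The external/internal rule of Section 7.1 can be sketched as a simple partition of conditions. The Python below is an illustrative sketch, not the SyQL implementation: the `Method` and `Condition` types, the choice of which methods are external, and the flat list of AND-ed conditions are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Method:
    name: str
    external: bool  # True if the value is a column of the concept's defining relation


@dataclass(frozen=True)
class Condition:
    text: str
    methods: Tuple[Method, ...]
    fuzzy: bool = False  # fuzzy expressions are internal by default

    def is_external(self) -> bool:
        # External only if it is not fuzzy and every method it calls is external.
        return not self.fuzzy and all(m.external for m in self.methods)


def partition(conditions: List[Condition]):
    """Split AND-ed conditions into those the DBMS can evaluate and the rest."""
    pushed = [c for c in conditions if c.is_external()]
    local = [c for c in conditions if not c.is_external()]
    return pushed, local


# Hypothetical annotations: the two name getters are assumed to map to columns.
full_name = Method("getfullname", external=True)
def_class = Method("getdefclassfullname", external=True)
effort = Method("geteffort", external=False)

where = [
    Condition("c.getfullname() = m.getdefclassfullname()", (full_name, def_class)),
    Condition("c.geteffort(yesterday) IS High", (effort,), fuzzy=True),
]

pushed, local = partition(where)
print([c.text for c in pushed])  # candidates for the underlying DBMS
print([c.text for c in local])   # must be evaluated by the SyQL engine
```

In this sketch the join condition ends up on the DBMS side and the fuzzy effort condition stays with the engine, which matches the rule stated in the text.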
The idea is to push as many conditions as possible into the query engine of the underlying DBMS. This gives better time performance because the SyQL query engine does not execute any join. To do this correctly, we convert the conditions that appear in the WhereClause into an equivalent Conjunctive Normal Form (CNF) formula using Boolean algebra and De Morgan's theorem. The CNF notation is very helpful because a block of OR conditions can be processed by the underlying DBMS query engine only if all the conditions inside the block are external; otherwise the block must be evaluated by the SyQL query engine. A condition is external if all of its predicates are external; otherwise it is evaluated internally. The query execution workflow is shown in Figure 3.

Figure 3: The SyQL Query Workflow.

To make this conversion always possible, we first convert the parsed formula into an equivalent Disjunctive Normal Form (DNF) formula. We then convert it into an equivalent CNF formula by taking the Cartesian product of the conditions contained in the AND blocks.

The most critical components for performance are the internal condition evaluators. Internal conditions usually require a lot of computation, because most of them need to fetch data from the database. To address this problem we adopted two solutions: 1) sorting these conditions according to their cost, which is estimated by the developer of the SyQL libraries during the implementation; 2) evaluating these conditions in parallel, taking advantage of modern parallel/multicore hardware architectures.

8. Query visualization

SyQL queries may produce a large quantity of data, and extracting useful information from a long time series may be difficult for a human user. Inspecting a large software system (about 1,000 classes) over a period of one month (20 working days) generates about 20,000 values per selected class metric, assuming that we collect one metrics snapshot per day and do not specify any filtering condition. If the queries are performed on methods instead of classes, the number of results grows even further. Computer animation can be a useful and intuitive way to display evolving datasets [12]. We solve this problem by mapping the query results onto the metric views of Lagrein. Mapping these results is straightforward because the SyQL query engine is written in Java; the common implementation technology simplifies the integration between the two tools.

8.1 Introducing query visualization by examples

Example 1: In this example we visualize the growth (in terms of LOC) of the classes on which the developers have spent high effort during the last four days.

FROM Class c, Chron chr
WHERE chr.getdate() >= TODAY - 4 'days'
  AND chr.getdate() < TODAY
  AND c.geteffort(today - 4 'days', TODAY) IS High
SELECT c.getloc(chr.getdate());

The result of this query can be visualized either in an Evolution Matrix (Figure 4) or in an Evolution Chart. The result of the query above is a collection of ClassLOC instances. The ClassLOC class implements the interface ClassMetric.
Through this interface it is possible to retrieve the date, the class owner, and the value of the metric. In this way, it is possible to create an animated view of the growth of the classes in the last four days. It is also possible to repeat this query for all the software metrics collected by the source code analyzer (Cyclomatic Complexity, Halstead Volume, CK metrics [3]).

Figure 4: Evolution Matrix

Example 2: It is also possible to create static views of the system. In this example we select the classes with a high value of Coupling Between Objects (CBO).

FROM Class c
WHERE c.getcbo(today) IS High
SELECT c;

The result of this query (a static result) can be visualized in several views (Figure 5) available in Lagrein (e.g., Inheritance tree, Dependency graph, etc.).

9. Conclusion and future work

This paper discussed a possible approach for visualizing and mining software metrics and software process data. The whole architecture of the metrics collection system and the language structure of SyQL have been presented, and a comparison with existing systems has been provided, showing that SyQL can go further than the other existing languages. The query execution workflow has been discussed. As a proof of concept, a set of examples has been provided.

We are now using this language to build training datasets for estimating the fault-proneness of a method. We will embed these models into SyQL concept libraries, so that the language user can estimate the fault-proneness of a method directly from a SyQL query.
Figure 5: Inheritance Tree of High CBO Classes

References

[1] M. Böhlen, J. Gamper, and C. Jensen. Multidimensional aggregation for temporal data. In Advances in Database Technology - EDBT 2006.
[2] R. Carrasco, M. Vila, and F. Araque. dmFSQL: A language for data mining. In Proceedings of the 17th International Conference on Database and Expert Systems Applications.
[3] S. Chidamber and C. Kemerer. A metrics suite for object oriented design. IEEE TSE, 20(6).
[4] E. Cox. FuzzySQL a tool for finding the truth: the power of approximate database queries. PC AI, 14(1):48-51.
[5] F. Fioravanti and P. Nesi. Estimation and prediction metrics for adaptive maintenance effort of object-oriented systems. IEEE TSE, 27(12).
[6] A. Hindle and D. M. German. SCQL: a formal model and a query language for source control repositories. In Proceedings of the 2005 workshop on Mining software repositories, pages 1-5.
[7] A. Jermakovics, R. Moser, A. Sillitti, and G. Succi. Visualizing software evolution with Lagrein. In OOPSLA Companion.
[8] M. Karaila and T. Systa. Applying template metaprogramming techniques for a domain-specific visual language: An industrial experience report. In Proceedings of the 29th International Conference on Software Engineering.
[9] E. Meijer, B. Beckman, and G. Bierman. LINQ: reconciling object, relations and XML in the .NET framework. In Proceedings of the 2006 ACM SIGMOD international conference on Management of data.
[10] B. Moon and F.V. Lopez. Efficient algorithms for large-scale temporal aggregation. IEEE Transactions on Knowledge and Data Engineering, 15(3).
[11] O. d. Moor, M. Verbaere, E. Hajiyev, P. Avgustinov, T. Ekman, N. Ongkingco, D. Sereni, and J. Tibble. Keynote Address: .QL for source code analysis. In Proceedings of the Seventh IEEE International Working Conference on Source Code Analysis and Manipulation, pages 3-16.
[12] M. Pinzger, H. Gall, M. Fischer, and M. Lanza. Visualizing multiple evolution metrics. In Proceedings of the 2005 ACM Symposium on Software Visualization, pages 67-75.
[13] K. Ramamurthy, A. Sen, and A.P. Sinha. Data Warehousing Infusion and Organizational Effectiveness. IEEE Transactions on Systems, Man and Cybernetics, Part A, 38(4).
[14] M. Scotto, A. Sillitti, G. Succi, and T. Vernazza. A non-invasive approach to product metrics collection. Journal of System Architecture, 52(11).
[15] M. Scotto, A. Sillitti, G. Succi, and T. Vernazza. Noninvasive collection of software metrics: some issues and experiences. In Sharing experiences on agile methodologies in open source software development, Polimetrica Publisher, Italy, pages 31-38.
[16] J. Yang and J. Widom. Incremental computation and maintenance of temporal aggregates. The VLDB Journal, 12(3).
[17] L. A. Zadeh. The Concept of a Linguistic Variable and its Application to Approximate Reasoning. Information Science, 8.
[18] D. Zhang, A. Markowetz, V. Tsotras, D. Gunopulos, and B. Seeger. Efficient computation of temporal aggregates with range predicates. In Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.
[19] S. Zhang, J. Lu, and C. Zhang. A fuzzy logic based method to acquire user threshold of minimum-support for mining association rules. Information Sciences, 164(1-4):1-16, 2004.
More informationQuestion Bank. 4) It is the source of information later delivered to data marts.
Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
More informationTaxonomy Dimensions of Complexity Metrics
96 Int'l Conf. Software Eng. Research and Practice SERP'15 Taxonomy Dimensions of Complexity Metrics Bouchaib Falah 1, Kenneth Magel 2 1 Al Akhawayn University, Ifrane, Morocco, 2 North Dakota State University,
More informationReading part: Design-Space Exploration with Alloy
Reading part: Design-Space Exploration with Alloy Ing. Ken Vanherpen Abstract In the growing world of MDE many tools are offered to describe a (part of a) system, constrain it, and check some properties
More informationInformation Management (IM)
1 2 3 4 5 6 7 8 9 Information Management (IM) Information Management (IM) is primarily concerned with the capture, digitization, representation, organization, transformation, and presentation of information;
More informationInformation Discovery, Extraction and Integration for the Hidden Web
Information Discovery, Extraction and Integration for the Hidden Web Jiying Wang Department of Computer Science University of Science and Technology Clear Water Bay, Kowloon Hong Kong cswangjy@cs.ust.hk
More informationAn Optimization Algorithm for Physical Database Design
Proceedings of the 5th WSEAS Int. Conf. on DATA NETWORKS, COMMUNICATIONS & COMPUTERS, Bucharest, Romania, October 16-17, 2006 13 An Optimization Algorithm for Physical Database Design ADI-CRISTINA MITEA
More informationT-SQL Training: T-SQL for SQL Server for Developers
Duration: 3 days T-SQL Training Overview T-SQL for SQL Server for Developers training teaches developers all the Transact-SQL skills they need to develop queries and views, and manipulate data in a SQL
More informationSQL Server Interview Questions
This Download is from www.downloadmela.com. The main motto of this website is to provide free download links of ebooks,video tutorials,magazines,previous papers,interview related content. To download more
More informationC# Programming in the.net Framework
50150B - Version: 2.1 04 May 2018 C# Programming in the.net Framework C# Programming in the.net Framework 50150B - Version: 2.1 6 days Course Description: This six-day instructor-led course provides students
More informationCOMPUTER SCIENCE (ELECTIVE) Paper-A (100 Marks) Section-I: INTRODUCTION TO INFORMATION TECHNOLOGY Computer and its characteristics, Computer Organization & operation, Components of Computer, Input/Output
More informationData Mining Technology Based on Bayesian Network Structure Applied in Learning
, pp.67-71 http://dx.doi.org/10.14257/astl.2016.137.12 Data Mining Technology Based on Bayesian Network Structure Applied in Learning Chunhua Wang, Dong Han College of Information Engineering, Huanghuai
More informationData warehouse architecture consists of the following interconnected layers:
Architecture, in the Data warehousing world, is the concept and design of the data base and technologies that are used to load the data. A good architecture will enable scalability, high performance and
More informationEmpirical Study on Impact of Developer Collaboration on Source Code
Empirical Study on Impact of Developer Collaboration on Source Code Akshay Chopra University of Waterloo Waterloo, Ontario a22chopr@uwaterloo.ca Parul Verma University of Waterloo Waterloo, Ontario p7verma@uwaterloo.ca
More informationA New Generation PEPA Workbench
A New Generation PEPA Workbench Mirco Tribastone Stephen Gilmore Abstract We present recent developments on the implementation of a new PEPA Workbench, a cross-platform application for editing, analysing,
More informationHOW AND WHEN TO FLATTEN JAVA CLASSES?
HOW AND WHEN TO FLATTEN JAVA CLASSES? Jehad Al Dallal Department of Information Science, P.O. Box 5969, Safat 13060, Kuwait ABSTRACT Improving modularity and reusability are two key objectives in object-oriented
More informationHow to use Pivot table macro
How to use Pivot table macro Managing Pivot Tables Table Filter and Charts for Confluence add-on allows you to summarize your table data and produce its aggregated view in the form of a pivot table. You
More informationData Warehouse and Mining
Data Warehouse and Mining 1. is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions. A. Data Mining. B. Data Warehousing. C. Web Mining. D. Text
More informationYBS ORACLE FORMS APPLICATION STRATEGY IN A SOA WORLD
07/05/2015 YBS ORACLE FORMS APPLICATION STRATEGY IN A SOA WORLD Created by: Graham Brown, Application Architecture Manager Public AGENDA Background to Yorkshire Building Society History of YBS Oracle Forms
More informationPROGRAMMING IN VISUAL BASIC WITH MICROSOFT VISUAL STUDIO Course: 10550A; Duration: 5 Days; Instructor-led
CENTER OF KNOWLEDGE, PATH TO SUCCESS Website: PROGRAMMING IN VISUAL BASIC WITH MICROSOFT VISUAL STUDIO 2010 Course: 10550A; Duration: 5 Days; Instructor-led WHAT YOU WILL LEARN This course teaches you
More informationUnit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics
Unit 10 Databases Computer Concepts 2016 ENHANCED EDITION 10 Unit Contents Section A: Database Basics Section B: Database Tools Section C: Database Design Section D: SQL Section E: Big Data Unit 10: Databases
More informationSemantic-Based Web Mining Under the Framework of Agent
Semantic-Based Web Mining Under the Framework of Agent Usha Venna K Syama Sundara Rao Abstract To make automatic service discovery possible, we need to add semantics to the Web service. A semantic-based
More informationMobility Data Management and Exploration: Theory and Practice
Mobility Data Management and Exploration: Theory and Practice Chapter 4 -Mobility data management at the physical level Nikos Pelekis & Yannis Theodoridis InfoLab, University of Piraeus, Greece infolab.cs.unipi.gr
More informationAdvanced Data Management Technologies Written Exam
Advanced Data Management Technologies Written Exam 02.02.2016 First name Student number Last name Signature Instructions for Students Write your name, student number, and signature on the exam sheet. This
More informationPersonalised Learning Checklist ( ) SOUND
Personalised Learning Checklist (2015-2016) Subject: Computing Level: A2 Name: Outlined below are the topics you have studied for this course. Inside each topic area you will find a breakdown of the topic
More informationHorizontal Aggregation in SQL to Prepare Dataset for Generation of Decision Tree using C4.5 Algorithm in WEKA
Horizontal Aggregation in SQL to Prepare Dataset for Generation of Decision Tree using C4.5 Algorithm in WEKA Mayur N. Agrawal 1, Ankush M. Mahajan 2, C.D. Badgujar 3, Hemant P. Mande 4, Gireesh Dixit
More informationDatabase Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 7 Introduction to Structured Query Language (SQL) Objectives In this chapter, students will learn: The basic commands and
More informationTopicViewer: Evaluating Remodularizations Using Semantic Clustering
TopicViewer: Evaluating Remodularizations Using Semantic Clustering Gustavo Jansen de S. Santos 1, Katyusco de F. Santos 2, Marco Tulio Valente 1, Dalton D. S. Guerrero 3, Nicolas Anquetil 4 1 Federal
More informationCTI Short Learning Programme in Internet Development Specialist
CTI Short Learning Programme in Internet Development Specialist Module Descriptions 2015 1 Short Learning Programme in Internet Development Specialist (10 months full-time, 25 months part-time) Computer
More informationXML-OLAP: A Multidimensional Analysis Framework for XML Warehouses
XML-OLAP: A Multidimensional Analysis Framework for XML Warehouses Byung-Kwon Park 1,HyoilHan 2,andIl-YeolSong 2 1 Dong-A University, Busan, Korea bpark@dau.ac.kr 2 Drexel University, Philadelphia, PA
More informationSql Server Syllabus. Overview
Sql Server Syllabus Overview This SQL Server training teaches developers all the Transact-SQL skills they need to create database objects like Tables, Views, Stored procedures & Functions and triggers
More informationT.Y.B.Sc. Syllabus Under Autonomy Mathematics Applied Component(Paper-I)
T.Y.B.Sc. Syllabus Under Autonomy Mathematics Applied Component(Paper-I) Course: S.MAT. 5.03 COMPUTER PROGRAMMING AND SYSTEM ANALYSIS (JAVA PROGRAMMING & SSAD) [25 Lectures] Learning Objectives:- To learn
More informationQuery optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.
Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationTIM 50 - Business Information Systems
TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz Nov 10, 2016 Class Announcements n Database Assignment 2 posted n Due 11/22 The Database Approach to Data Management The Final Database Design
More informationEffect of Principle Component Analysis and Support Vector Machine in Software Fault Prediction
International Journal of Computer Trends and Technology (IJCTT) volume 7 number 3 Jan 2014 Effect of Principle Component Analysis and Support Vector Machine in Software Fault Prediction A. Shanthini 1,
More informationMining Software Repositories for Software Change Impact Analysis: A Case Study
Mining Software Repositories for Software Change Impact Analysis: A Case Study Lile Hattori 1, Gilson dos Santos Jr. 2, Fernando Cardoso 2, Marcus Sampaio 2 1 Faculty of Informatics University of Lugano
More informationPreparation of Data Set for Data Mining Analysis using Horizontal Aggregation in SQL
Preparation of Data Set for Data Mining Analysis using Horizontal Aggregation in SQL Vidya Bodhe P.G. Student /Department of CE KKWIEER Nasik, University of Pune, India vidya.jambhulkar@gmail.com Abstract
More informationDryadLINQ. by Yuan Yu et al., OSDI 08. Ilias Giechaskiel. January 28, Cambridge University, R212
DryadLINQ by Yuan Yu et al., OSDI 08 Ilias Giechaskiel Cambridge University, R212 ig305@cam.ac.uk January 28, 2014 Conclusions Takeaway Messages SQL cannot express iteration Unsuitable for machine learning,
More informationAnalysis of Query Processing and Optimization
Analysis of Query Processing and Optimization Nimra Memon, Muhammad Saleem Vighio, Shah Zaman Nizamani, Niaz Ahmed Memon, Adeel Riaz Memon, Umair Ramzan Shaikh Abstract Modern database management systems
More informationINTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY
[Agrawal, 2(4): April, 2013] ISSN: 2277-9655 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY An Horizontal Aggregation Approach for Preparation of Data Sets in Data Mining Mayur
More informationThis tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
More informationOptimizing Testing Performance With Data Validation Option
Optimizing Testing Performance With Data Validation Option 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording
More informationWhom Is This Book For?... xxiv How Is This Book Organized?... xxiv Additional Resources... xxvi
Foreword by Bryan Hunter xv Preface xix Acknowledgments xxi Introduction xxiii Whom Is This Book For?... xxiv How Is This Book Organized?... xxiv Additional Resources... xxvi 1 Meet F# 1 F# in Visual Studio...
More informationOracle Syllabus Course code-r10605 SQL
Oracle Syllabus Course code-r10605 SQL Writing Basic SQL SELECT Statements Basic SELECT Statement Selecting All Columns Selecting Specific Columns Writing SQL Statements Column Heading Defaults Arithmetic
More informationAN EFFICIENT ALGORITHM FOR DATABASE QUERY OPTIMIZATION IN CROWDSOURCING SYSTEM
AN EFFICIENT ALGORITHM FOR DATABASE QUERY OPTIMIZATION IN CROWDSOURCING SYSTEM Miss. Pariyarath Jesnaraj 1, Dr. K. V. Metre 2 1 Department of Computer Engineering, MET s IOE, Maharashtra, India 2 Department
More informationRelational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4
More informationChapter 3. Databases and Data Warehouses: Building Business Intelligence
Chapter 3 Databases and Data Warehouses: Building Business Intelligence How Can a Business Increase its Intelligence? Summary Overview of Main Concepts Details/Design of a Relational Database Creating
More informationCS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)
CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm
More informationSpecial Issue of IJCIM Proceedings of the
Special Issue of IJCIM Proceedings of the Eigh th-.. '' '. Jnte ' '... : matio'....' ' nal'.. '.. -... ;p ~--.' :'.... :... ej.lci! -1'--: "'..f(~aa, D-.,.,a...l~ OR elattmng tot.~av-~e-ijajil:u. ~~ Pta~.,
More informationAn Archiving System for Managing Evolution in the Data Web
An Archiving System for Managing Evolution in the Web Marios Meimaris *, George Papastefanatos and Christos Pateritsas * Institute for the Management of Information Systems, Research Center Athena, Greece
More informationData Mining and Warehousing
Data Mining and Warehousing Sangeetha K V I st MCA Adhiyamaan College of Engineering, Hosur-635109. E-mail:veerasangee1989@gmail.com Rajeshwari P I st MCA Adhiyamaan College of Engineering, Hosur-635109.
More information8) A top-to-bottom relationship among the items in a database is established by a
MULTIPLE CHOICE QUESTIONS IN DBMS (unit-1 to unit-4) 1) ER model is used in phase a) conceptual database b) schema refinement c) physical refinement d) applications and security 2) The ER model is relevant
More informationMeaning & Concepts of Databases
27 th August 2015 Unit 1 Objective Meaning & Concepts of Databases Learning outcome Students will appreciate conceptual development of Databases Section 1: What is a Database & Applications Section 2:
More informationSEF DATABASE FOUNDATION ON ORACLE COURSE CURRICULUM
On a Mission to Transform Talent SEF DATABASE FOUNDATION ON ORACLE COURSE CURRICULUM Table of Contents Module 1: Introduction to Linux & RDBMS (Duration: 1 Week)...2 Module 2: Oracle SQL (Duration: 3 Weeks)...3
More informationDOT NET Syllabus (6 Months)
DOT NET Syllabus (6 Months) THE COMMON LANGUAGE RUNTIME (C.L.R.) CLR Architecture and Services The.Net Intermediate Language (IL) Just- In- Time Compilation and CLS Disassembling.Net Application to IL
More informationIntroduction to Trajectory Clustering. By YONGLI ZHANG
Introduction to Trajectory Clustering By YONGLI ZHANG Outline 1. Problem Definition 2. Clustering Methods for Trajectory data 3. Model-based Trajectory Clustering 4. Applications 5. Conclusions 1 Problem
More informationChapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model
Chapter 3 The Multidimensional Model: Basic Concepts Introduction Multidimensional Model Multidimensional concepts Star Schema Representation Conceptual modeling using ER, UML Conceptual modeling using
More informationBugzillaMetrics - Design of an adaptable tool for evaluating user-defined metric specifications on change requests
BugzillaMetrics - A tool for evaluating metric specifications on change requests BugzillaMetrics - Design of an adaptable tool for evaluating user-defined metric specifications on change requests Lars
More informationA Data warehouse within a Federated database architecture
Association for Information Systems AIS Electronic Library (AISeL) AMCIS 1997 Proceedings Americas Conference on Information Systems (AMCIS) 8-15-1997 A Data warehouse within a Federated database architecture
More information