Which Line Is it Anyway : Clustering for Staff Line Removal in Optical Music Recognition. Sam Vinitsky

Size: px
Start display at page:

Download "Which Line Is it Anyway : Clustering for Staff Line Removal in Optical Music Recognition. Sam Vinitsky"

Transcription

1 Which Line Is it Anyway : Clustering for Staff Line Removal in Optical Music Recognition Sam Vinitsky

2 Optical Music Recognition: An Overview

3 Optical Music Recognition: An Overview

4 Sheet Music Primer

5 Sheet Music Primer: Staff Lines

6 Sheet Music Primer: Symbols

7 Back to Optical Music Recognition...

8 Applications of OMR: Handwritten Music

9 Optical Music Recognition: Pipeline

10 Optical Music Recognition: Pipeline 1. Find and remove staff lines

11 Optical Music Recognition: Pipeline Find and remove staff lines Find symbols

12 Optical Music Recognition: Pipeline = note 2 = treble clef 3 = sharp 4 = sharp 5 = three 6 = note... Find and remove staff lines Find symbols Determine type of each symbol

13 Optical Music Recognition: Pipeline = note 2 = treble clef 3 = sharp 4 = sharp 5 = three 6 = note... (G4) (first clef) (key sig with 4) (key sig with 3) (triplet for ) (E5, triplet 5) Find and remove staff lines Find symbols Determine type of each symbol Determine how symbols relate to each other

14 Optical Music Recognition: Pipeline = note 2 = treble clef 3 = sharp 4 = sharp 5 = three 6 = note... (G4) (first clef) (key sig with 4) (key sig with 3) (triplet for ) (E5, triplet 5) Find and remove staff lines Find symbols Determine type of each symbol Determine how symbols relate to each other Convert into computer readable format (musicxml, MIDI, etc...)

15 Optical Music Recognition: Pipeline = note 2 = treble clef 3 = sharp 4 = sharp 5 = three 6 = note... (G4) (first clef) (key sig with 4) (key sig with 3) (triplet for ) (E5, triplet 5) *** Find and remove staff lines *** Find symbols Determine type of each symbol Determine how symbols relate to each other Convert into computer readable format (musicxml, MIDI, etc...)

16 Staff Line Removal: Overview Input: original score image Output: staffless image

17 Reminder: Hough Transform...

18 Staff Line Removal...

19 Staff Line Removal...

20 Staff Line Removal...

21 Staff Line Removal...

22 Staff Line Removal: Algorithms Handcrafted: - Run-length Shortest Path Etc [Carter & Bacon 92] [Cardoso et al. 08] [see Dalitz et al. 08]

23 Staff Line Removal: Algorithms Handcrafted: - Run-length Shortest Path Etc [Carter & Bacon 92] [Cardoso et al. 08] [see Dalitz et al. 08] Supervised Machine Learning: (State of the Art) - Classification of pixels Deep learning on image - Convolutional Neural Networks - Generative Adversarial Networks [Calvo-Zarogoza et al. 16] [Calvo-Zarogoza et al. 17] [Bhunia et al. 18]

24 Staff Line Removal: Algorithms Handcrafted: - Run-length Shortest Path Etc [Carter & Bacon 92] [Cardoso et al. 08] [see Dalitz et al. 08] Supervised Machine Learning: (State of the Art) - Classification of pixels Deep learning on image - Convolutional Neural Networks - Generative Adversarial Networks [Calvo-Zarogoza et al. 16] [Calvo-Zarogoza et al. 17] [Bhunia et al. 18] Unsupervised Machine Learning: - Clustering of pixels [Vinitsky 18]

25 Staff Line Removal via Pixel Clustering Algorithm: (Vinitsky 18) 1. Convert each black pixel into a feature vector Image modified from

26 Staff Line Removal via Pixel Clustering Image source: Calvo-Zaragova et al., 16

27 Staff Line Removal via Pixel Clustering Algorithm: (Vinitsky 18) 1. Convert each black pixel into a feature vector Image modified from

28 Staff Line Removal via Pixel Clustering Algorithm: (Vinitsky 18) Convert each black pixel into a feature vector Cluster the feature vectors into two groups Image modified from

29 Staff Line Removal via Pixel Clustering Algorithm: (Vinitsky 18) Convert each black pixel into a feature vector Cluster the feature vectors into two groups Figure out which cluster is staff and which is non-staff Image modified from

30 Staff Line Removal: Results * = implemented ** = original algorithm

31 Staff Line Removal: Results Clustering output Ground truth staff-less

32 Staff Line Removal: Results Clustering output Ground truth staff-less

33 Future Work Clustering for Staff Line Removal: - Other clustering methods

34 Future Work Clustering for Staff Line Removal: - Other clustering methods Better feature vectors

35 Future Work Clustering for Staff Line Removal: - Other clustering methods Better feature vectors Picking the staff cluster

36 Future Work Clustering for Staff Line Removal: - Other clustering methods Better feature vectors Picking the staff cluster White pixels can be staff (due to noise)

37 Future Work Clustering for Staff Line Removal: - Other clustering methods Better feature vectors Picking the staff cluster White pixels can be staff (due to noise) Optical Music Recognition: - The rest of the pipeline...

38 References [1] Jorge Calvo-Zaragoza, Luisa Mico, and Jose Oncina. Music staff removal with supervised pixel classification. International Journal on Document Analysis and Recognition (IJDAR), 19(3): , Sep [2] Jorge Calvo-Zaragoza, Jose J. Valero-Mas, and Antonio Pertusa. End-to-end optical music recognition using neural networks. In ISMIR, [3] Jaime Cardoso, Artur Capela, Ana Rebelo, and Carlos Guedes. A connected path approach for staff detection on a music score, [4] I. Fujinaga, M. Droettboom, C. Dalitz, and B. Pranzas. A comparative study of staff removal algorithms. IEEE Transactions on Pattern Analysis & Machine Intelligence, 30: , [5] A. Konwer, A. K. Bhunia, A. Bhowmick, P. Banerjee, P. Pratim Roy, and U. Pal. Staff line Removal using Generative Adversarial Networks. ArXiv e-prints, [6] J. Novotny and J. Pokorny. Introduction to optical music recognition: Overview and practical challenges [7] N.P. Carter, R.A. Bacon: Automatic Recognition of Printed Music. In H.S. Baird, H. Bunke, K. Yamamoto (editors): Structured Document Image Analysis, pp , Springer, 1992.

Staff line Removal using Generative Adversarial Networks

Staff line Removal using Generative Adversarial Networks Staff line Removal using Generative Adversarial Networks a Aishik Konwer, a Ayan Kumar Bhunia, a Abir Bhowmick, b Ankan Kumar Bhunia, c Prithaj Banerjee, d Partha Pratim Roy, e Umapada Pal a Department

More information

STAFF LINE DETECTION AND REMOVAL WITH STABLE PATHS

STAFF LINE DETECTION AND REMOVAL WITH STABLE PATHS STAFF LINE DETECTION AND REMOVAL WITH STABLE PATHS Artur Capela, Ana Rebelo INESC Porto, Campus da FEUP, Rua Dr. Roberto Frias 378, 4200-465 Porto, Portugal gcapela@inescporto.pt, arebelo@inescporto.pt

More information

Music Document Layout Analysis through Machine Learning and Human Feedback

Music Document Layout Analysis through Machine Learning and Human Feedback Music Document Layout Analysis through Machine Learning and Human Feedback Jorge Calvo-Zaragoza, Ké Zhang, Zeyad Saleh, Gabriel Vigliensoni, Ichiro Fujinaga Schulich School of Music McGill University,

More information

Optical Music Recognition using Hidden Markov Models

Optical Music Recognition using Hidden Markov Models Optical Music Recognition using Hidden Markov Models Natalie Wilkinson April 25, 2016 1 Introduction Optical Music Recognition software (OMR) is software that has the ability to read sheet music. This

More information

1134 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 6, JUNE A STABLE PATH APPROACH FOR STAFF LINE DETECTION

1134 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 6, JUNE A STABLE PATH APPROACH FOR STAFF LINE DETECTION 1134 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 31, NO. 6, JUNE 2009 Short Papers Staff Detection with Stable Paths Jaime dos Santos Cardoso, Member, IEEE, Artur Capela, Ana Rebelo,

More information

The ICDAR/GREC 2013 Music Scores Competition: Staff Removal

The ICDAR/GREC 2013 Music Scores Competition: Staff Removal The ICDAR/GREC 2013 Music Scores Competition: Staff Removal Alicia Fornés, Van Cuong Kieu, Muriel Visani, Nicholas Journet, Dutta Anjan To cite this version: Alicia Fornés, Van Cuong Kieu, Muriel Visani,

More information

Pixel.js: Web-based Pixel Classification Correction Platform for Ground Truth Creation

Pixel.js: Web-based Pixel Classification Correction Platform for Ground Truth Creation Pixel.js: Web-based Pixel Classification Correction Platform for Ground Truth Creation Zeyad Saleh, Ké Zhang, Jorge Calvo-Zaragoza, Gabriel Vigliensoni, and Ichiro Fujinaga Distributed Digital Music Archives

More information

The ICDAR 2013 Music Scores Competition: Staff Removal

The ICDAR 2013 Music Scores Competition: Staff Removal The ICDAR 2013 Music Scores Competition: Staff Removal Muriel Visani, Van Cuong Kieu, Alicia Fornés, Nicholas Journet To cite this version: Muriel Visani, Van Cuong Kieu, Alicia Fornés, Nicholas Journet.

More information

IMPROVING DOCUMENT BINARIZATION VIA ADVERSARIAL NOISE-TEXTURE AUGMENTATION

IMPROVING DOCUMENT BINARIZATION VIA ADVERSARIAL NOISE-TEXTURE AUGMENTATION IMPROVING DOCUMENT BINARIZATION VIA ADVERSARIAL NOISE-TEXTURE AUGMENTATION Ankan Kumar Bhunia 1, Ayan Kumar Bhunia 2, Aneeshan Sain 3, Partha Pratim Roy 4 1 Jadavpur University, India 2 Nanyang Technological

More information

Staff Line Detection by Skewed Projection

Staff Line Detection by Skewed Projection Staff Line Detection by Skewed Projection Diego Nehab May 11, 2003 Abstract Most optical music recognition systems start image analysis by the detection of staff lines. This work explores simple techniques

More information

Online recognition of handwritten music symbols

Online recognition of handwritten music symbols IJDAR (2017) 20:79 89 DOI 10.1007/s10032-017-0281-y ORIGINAL PAPER Online recognition of handwritten music symbols Jiyong Oh 1 Sung Joon Son 2 Sangkuk Lee 2 Ji-Won Kwon 3 Nojun Kwak 2 Received: 28 July

More information

MIDI-Assisted Egocentric Optical Music Recognition

MIDI-Assisted Egocentric Optical Music Recognition MIDI-Assisted Egocentric Optical Music Recognition Liang Chen Indiana University Bloomington, IN chen348@indiana.edu Kun Duan GE Global Research Niskayuna, NY kun.duan@ge.com Abstract Egocentric vision

More information

Alternatives to Direct Supervision

Alternatives to Direct Supervision CreativeAI: Deep Learning for Graphics Alternatives to Direct Supervision Niloy Mitra Iasonas Kokkinos Paul Guerrero Nils Thuerey Tobias Ritschel UCL UCL UCL TUM UCL Timetable Theory and Basics State of

More information

Session 5: Raster to Vector and drawings

Session 5: Raster to Vector and drawings Session 5: Raster to Vector and drawings Chairs: Jorge Calvo-Zaragoza & Wataru Ohyama Session chairs Jorge Calvo-Zaragoza PhD candidate from 2012 to 2016 University of Alicante, Alicante (Spain) Postdoctoral

More information

Feature Visualization

Feature Visualization CreativeAI: Deep Learning for Graphics Feature Visualization Niloy Mitra Iasonas Kokkinos Paul Guerrero Nils Thuerey Tobias Ritschel UCL UCL UCL TU Munich UCL Timetable Theory and Basics State of the Art

More information

OPTICAL MEASURE RECOGNITION IN COMMON MUSIC NOTATION

OPTICAL MEASURE RECOGNITION IN COMMON MUSIC NOTATION OPTICAL MEASURE RECOGNITION IN COMMON MUSIC NOTATION Gabriel Vigliensoni, Gregory Burlet, and Ichiro Fujinaga Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) McGill University,

More information

arxiv: v1 [cs.cv] 26 May 2018

arxiv: v1 [cs.cv] 26 May 2018 DEEP WATERSHED DETECTOR FOR MUSIC OBJECT RECOGNITION Lukas Tuggener 1,3 Ismail Elezi 1,2 Jürgen Schmidhuber 3 Thilo Stadelmann 1 1 ZHAW Datalab, Zurich University of Applied Sciences, Winterthur, Switzerland

More information

Abstract. Problem Statement. Objective. Benefits

Abstract. Problem Statement. Objective. Benefits Abstract The purpose of this final year project is to create an Android mobile application that can automatically extract relevant information from pictures of receipts. Users can also load their own images

More information

Polytechnic University of Tirana

Polytechnic University of Tirana 1 Polytechnic University of Tirana Department of Computer Engineering SIBORA THEODHOR ELINDA KAJO M ECE 2 Computer Vision OCR AND BEYOND THE PRESENTATION IS ORGANISED IN 3 PARTS : 3 Introduction, previous

More information

Ryerson University CP8208. Soft Computing and Machine Intelligence. Naive Road-Detection using CNNS. Authors: Sarah Asiri - Domenic Curro

Ryerson University CP8208. Soft Computing and Machine Intelligence. Naive Road-Detection using CNNS. Authors: Sarah Asiri - Domenic Curro Ryerson University CP8208 Soft Computing and Machine Intelligence Naive Road-Detection using CNNS Authors: Sarah Asiri - Domenic Curro April 24 2016 Contents 1 Abstract 2 2 Introduction 2 3 Motivation

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

Visualization and text mining of patent and non-patent data

Visualization and text mining of patent and non-patent data of patent and non-patent data Anton Heijs Information Solutions Delft, The Netherlands http://www.treparel.com/ ICIC conference, Nice, France, 2008 Outline Introduction Applications on patent and non-patent

More information

Depth from Stereo. Dominic Cheng February 7, 2018

Depth from Stereo. Dominic Cheng February 7, 2018 Depth from Stereo Dominic Cheng February 7, 2018 Agenda 1. Introduction to stereo 2. Efficient Deep Learning for Stereo Matching (W. Luo, A. Schwing, and R. Urtasun. In CVPR 2016.) 3. Cascade Residual

More information

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal Object Detection Lecture 10.3 - Introduction to deep learning (CNN) Idar Dyrdal Deep Learning Labels Computational models composed of multiple processing layers (non-linear transformations) Used to learn

More information

Avoiding staff removal stage in Optical Music Recognition - Application to scores written in White Mensural notation

Avoiding staff removal stage in Optical Music Recognition - Application to scores written in White Mensural notation Pattern Analysis and Applications manuscript No. (will be inserted by the editor) Avoiding staff removal stage in Optical Music Recognition - Application to scores written in White Mensural notation Authors

More information

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing 3 Object Detection BVM 2018 Tutorial: Advanced Deep Learning Methods Paul F. Jaeger, of Medical Image Computing What is object detection? classification segmentation obj. detection (1 label per pixel)

More information

Measuring similarities in contextual maps as a support for handwritten classification using recurrent neural networks. Pilar Gómez-Gil, PhD ISCI 2012

Measuring similarities in contextual maps as a support for handwritten classification using recurrent neural networks. Pilar Gómez-Gil, PhD ISCI 2012 Measuring similarities in contextual maps as a support for handwritten classification using recurrent neural networks Pilar Gómez-Gil, PhD National Institute of Astrophysics, Optics and Electronics (INAOE)

More information

Unsupervised Learning

Unsupervised Learning Deep Learning for Graphics Unsupervised Learning Niloy Mitra Iasonas Kokkinos Paul Guerrero Vladimir Kim Kostas Rematas Tobias Ritschel UCL UCL/Facebook UCL Adobe Research U Washington UCL Timetable Niloy

More information

Advanced Image Processing, TNM034 Optical Music Recognition

Advanced Image Processing, TNM034 Optical Music Recognition Advanced Image Processing, TNM034 Optical Music Recognition Linköping University By: Jimmy Liikala, jimli570 Emanuel Winblad, emawi895 Toms Vulfs, tomvu491 Jenny Yu, jenyu080 1 Table of Contents Optical

More information

Seminar. Topic: Object and character Recognition

Seminar. Topic: Object and character Recognition Seminar Topic: Object and character Recognition Tse Ngang Akumawah Lehrstuhl für Praktische Informatik 3 Table of content What's OCR? Areas covered in OCR Procedure Where does clustering come in Neural

More information

The Boundary Forest Algorithm

The Boundary Forest Algorithm The Boundary Forest Algorithm Jonathan Yedidia Director of AI Research, Analog Devices Director of the Algorithmic Systems Group, Analog Garage With Charles Mathy, Nate Derbinsky, José Bento and Jonathan

More information

ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India

ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 4 ISSN : 2456-3307 Image Recognition using Machine Learning Application

More information

ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems (Supplementary Materials)

ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems (Supplementary Materials) ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems (Supplementary Materials) Yinda Zhang 1,2, Sameh Khamis 1, Christoph Rhemann 1, Julien Valentin 1, Adarsh Kowdle 1, Vladimir

More information

Handwritten Hindi Numerals Recognition System

Handwritten Hindi Numerals Recognition System CS365 Project Report Handwritten Hindi Numerals Recognition System Submitted by: Akarshan Sarkar Kritika Singh Project Mentor: Prof. Amitabha Mukerjee 1 Abstract In this project, we consider the problem

More information

VISION & LANGUAGE From Captions to Visual Concepts and Back

VISION & LANGUAGE From Captions to Visual Concepts and Back VISION & LANGUAGE From Captions to Visual Concepts and Back Brady Fowler & Kerry Jones Tuesday, February 28th 2017 CS 6501-004 VICENTE Agenda Problem Domain Object Detection Language Generation Sentence

More information

Writer Identification In Music Score Documents Without Staff-Line Removal

Writer Identification In Music Score Documents Without Staff-Line Removal Writer Identification In Music Score Documents Without Staff-Line Removal Anirban Hati, Partha P. Roy and Umapada Pal Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata,

More information

Final Report: Smart Trash Net: Waste Localization and Classification

Final Report: Smart Trash Net: Waste Localization and Classification Final Report: Smart Trash Net: Waste Localization and Classification Oluwasanya Awe oawe@stanford.edu Robel Mengistu robel@stanford.edu December 15, 2017 Vikram Sreedhar vsreed@stanford.edu Abstract Given

More information

LEECH STEP PATH FINDING ALGORITHM

LEECH STEP PATH FINDING ALGORITHM LEECH STEP PATH FINDING ALGORITHM 1 DAWPADEE B. KIRIELLA, 2 LAKSHMAN JAYARATNE 1,2 University of Colombo School of Computing, 35, Reid Avenue, Colombo 07, Sri Lanka E-mail: 1 kiriella.dawpadee@gmail.com,

More information

Machine Learning : Clustering, Self-Organizing Maps

Machine Learning : Clustering, Self-Organizing Maps Machine Learning Clustering, Self-Organizing Maps 12/12/2013 Machine Learning : Clustering, Self-Organizing Maps Clustering The task: partition a set of objects into meaningful subsets (clusters). The

More information

Text Line Detection for Heterogeneous Documents

Text Line Detection for Heterogeneous Documents Text Line Detection for Heterogeneous Documents Markus Diem, Florian Kleber and Robert Sablatnig Computer Vision Lab Vienna University of Technology Email: diem@caa.tuwien.ac.at Abstract Text line detection

More information

NEON.JS An Application of JavaScript. Gregory Burlet

NEON.JS An Application of JavaScript. Gregory Burlet NEON.JS An Application of JavaScript 1 Gregory Burlet OUTLINE JavaScript Overview variables, arrays, functions, objects JavaScript Libraries jquery, audiolib.js, Fabric.js, Kinetic.js Aesthetics Twitter

More information

Improved Optical Music Recognition (OMR)

Improved Optical Music Recognition (OMR) Improved Optical Music Recognition (OMR) Justin Greet Stanford University jgreet@stanford.edu Abstract This project focuses on identifying and ordering the notes and rests in a given measure of sheet music

More information

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA) Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of

More information

A Multiple-Choice Test Recognition System based on the Gamera Framework

A Multiple-Choice Test Recognition System based on the Gamera Framework A Multiple-Choice Test Recognition System based on the Gamera Framework arxiv:1105.3834v1 [cs.cv] 19 May 2011 Andrea Spadaccini Dipartimento di Ingegneria Informatica e delle Telecomunicazioni University

More information

Towards End-to-End Audio-Sheet-Music Retrieval

Towards End-to-End Audio-Sheet-Music Retrieval Towards End-to-End Audio-Sheet-Music Retrieval Matthias Dorfer, Andreas Arzt and Gerhard Widmer Department of Computational Perception Johannes Kepler University Linz Altenberger Str. 69, A-4040 Linz matthias.dorfer@jku.at

More information

Signature Based Document Retrieval using GHT of Background Information

Signature Based Document Retrieval using GHT of Background Information 2012 International Conference on Frontiers in Handwriting Recognition Signature Based Document Retrieval using GHT of Background Information Partha Pratim Roy Souvik Bhowmick Umapada Pal Jean Yves Ramel

More information

Master research Internship. Internship report. Segmentation and recognition of symbols for printed and handwritten music scores

Master research Internship. Internship report. Segmentation and recognition of symbols for printed and handwritten music scores Master research Internship Internship report Segmentation and recognition of symbols for printed and handwritten music scores Domain: Document and Text Processing, Computer Vision and Pattern Recognition

More information

Vision is inferential. (

Vision is inferential. ( Announcements Final: Thursday, December 15, 8am, here. Review Session, Wednesday, Dec 14, 1pm, AV Williams 4424. Review sheet with practice problems on-line. Hints for Final Focus on core techniques/ideas:

More information

CEA LIST s participation to the Scalable Concept Image Annotation task of ImageCLEF 2015

CEA LIST s participation to the Scalable Concept Image Annotation task of ImageCLEF 2015 CEA LIST s participation to the Scalable Concept Image Annotation task of ImageCLEF 2015 Etienne Gadeski, Hervé Le Borgne, and Adrian Popescu CEA, LIST, Laboratory of Vision and Content Engineering, France

More information

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet Classification with Deep Convolutional Neural Networks ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky Ilya Sutskever Geoffrey Hinton University of Toronto Canada Paper with same name to appear in NIPS 2012 Main idea Architecture

More information

CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN

CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN CS6501: Deep Learning for Visual Recognition Object Detection I: RCNN, Fast-RCNN, Faster-RCNN Today s Class Object Detection The RCNN Object Detector (2014) The Fast RCNN Object Detector (2015) The Faster

More information

Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains

Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains Jiahao Pang 1 Wenxiu Sun 1 Chengxi Yang 1 Jimmy Ren 1 Ruichao Xiao 1 Jin Zeng 1 Liang Lin 1,2 1 SenseTime Research

More information

Gradient of the lower bound

Gradient of the lower bound Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level

More information

Unsupervised Feature Learning for Optical Character Recognition

Unsupervised Feature Learning for Optical Character Recognition Unsupervised Feature Learning for Optical Character Recognition Devendra K Sahu and C. V. Jawahar Center for Visual Information Technology, IIIT Hyderabad, India. Abstract Most of the popular optical character

More information

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington 3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World

More information

Extracting Layers and Recognizing Features for Automatic Map Understanding. Yao-Yi Chiang

Extracting Layers and Recognizing Features for Automatic Map Understanding. Yao-Yi Chiang Extracting Layers and Recognizing Features for Automatic Map Understanding Yao-Yi Chiang 0 Outline Introduction/ Problem Motivation Map Processing Overview Map Decomposition Feature Recognition Discussion

More information

Yiqi Yan. May 10, 2017

Yiqi Yan. May 10, 2017 Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field

More information

CS 223B Computer Vision Problem Set 3

CS 223B Computer Vision Problem Set 3 CS 223B Computer Vision Problem Set 3 Due: Feb. 22 nd, 2011 1 Probabilistic Recursion for Tracking In this problem you will derive a method for tracking a point of interest through a sequence of images.

More information

Web-Scale Image Search and Their Applications

Web-Scale Image Search and Their Applications Web-Scale Image Search and Their Applications Sung-Eui Yoon KAIST http://sglab.kaist.ac.kr Project Guidelines: Project Topics Any topics related to the course theme are okay You can find topics by browsing

More information

A THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES

A THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONSAND ROBOTICS ISSN 2320-7345 A THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES 1 Neha, 2 Anil Saroliya, 3 Varun Sharma 1,

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

arxiv: v1 [cs.ne] 28 Nov 2017

arxiv: v1 [cs.ne] 28 Nov 2017 Block Neural Network Avoids Catastrophic Forgetting When Learning Multiple Task arxiv:1711.10204v1 [cs.ne] 28 Nov 2017 Guglielmo Montone montone.guglielmo@gmail.com Alexander V. Terekhov avterekhov@gmail.com

More information

Stochastic Simulation with Generative Adversarial Networks

Stochastic Simulation with Generative Adversarial Networks Stochastic Simulation with Generative Adversarial Networks Lukas Mosser, Olivier Dubrule, Martin J. Blunt lukas.mosser15@imperial.ac.uk, o.dubrule@imperial.ac.uk, m.blunt@imperial.ac.uk (Deep) Generative

More information

Deep Face Recognition. Nathan Sun

Deep Face Recognition. Nathan Sun Deep Face Recognition Nathan Sun Why Facial Recognition? Picture ID or video tracking Higher Security for Facial Recognition Software Immensely useful to police in tracking suspects Your face will be an

More information

Real-time Object Detection CS 229 Course Project

Real-time Object Detection CS 229 Course Project Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

Counting Vehicles with Cameras

Counting Vehicles with Cameras Counting Vehicles with Cameras Luca Ciampi 1, Giuseppe Amato 1, Fabrizio Falchi 1, Claudio Gennaro 1, and Fausto Rabitti 1 Institute of Information, Science and Technologies of the National Research Council

More information

Motion Detection. Final project by. Neta Sokolovsky

Motion Detection. Final project by. Neta Sokolovsky Motion Detection Final project by Neta Sokolovsky Introduction The goal of this project is to recognize a motion of objects found in the two given images. This functionality is useful in the video processing

More information

ArcGIS Pro: Image Segmentation, Classification, and Machine Learning. Jeff Liedtke and Han Hu

ArcGIS Pro: Image Segmentation, Classification, and Machine Learning. Jeff Liedtke and Han Hu ArcGIS Pro: Image Segmentation, Classification, and Machine Learning Jeff Liedtke and Han Hu Overview of Image Classification in ArcGIS Pro Overview of the classification workflow Classification tools

More information

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand

More information

Droid Notes Notebook on a phone

Droid Notes Notebook on a phone Droid Notes Notebook on a phone 28 Gurpreet Gill 1, Pratik Parookaran 2, Judson William 3, Samir Gulave 4 1 Department of Information Technology, KJs Educational Institute s Trinity College of Engineering

More information

Extraction and Recognition of Alphanumeric Characters from Vehicle Number Plate

Extraction and Recognition of Alphanumeric Characters from Vehicle Number Plate Extraction and Recognition of Alphanumeric Characters from Vehicle Number Plate Surekha.R.Gondkar 1, C.S Mala 2, Alina Susan George 3, Beauty Pandey 4, Megha H.V 5 Associate Professor, Department of Telecommunication

More information

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple

More information

Deep Learning for Robust Normal Estimation in Unstructured Point Clouds. Alexandre Boulch. Renaud Marlet

Deep Learning for Robust Normal Estimation in Unstructured Point Clouds. Alexandre Boulch. Renaud Marlet Deep Learning for Robust Normal Estimation in Unstructured Point Clouds Alexandre Boulch Renaud Marlet Normal estimation in point clouds Normal: 3D normalized vector At each point: local orientation of

More information

Several pattern recognition approaches for region-based image analysis

Several pattern recognition approaches for region-based image analysis Several pattern recognition approaches for region-based image analysis Tudor Barbu Institute of Computer Science, Iaşi, Romania Abstract The objective of this paper is to describe some pattern recognition

More information

Deconvolution Networks

Deconvolution Networks Deconvolution Networks Johan Brynolfsson Mathematical Statistics Centre for Mathematical Sciences Lund University December 6th 2016 1 / 27 Deconvolution Neural Networks 2 / 27 Image Deconvolution True

More information

AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation

AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation Introduction Supplementary material In the supplementary material, we present additional qualitative results of the proposed AdaDepth

More information

SHREDDED DOCUMENT RECONSTRUCTION. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San José State University

SHREDDED DOCUMENT RECONSTRUCTION. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San José State University SHREDDED DOCUMENT RECONSTRUCTION 1 SHREDDED DOCUMENT RECONSTRUCTION CS 297 Report Presented to Dr. Chris Pollett Department of Computer Science San José State University In Partial Fulfillment Of the Requirements

More information

arxiv: v1 [cs.cv] 9 Aug 2017

arxiv: v1 [cs.cv] 9 Aug 2017 Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images Soumyadeep Dey, Jayanta Mukherjee, Shamik Sural, and Amit Vijay Nandedkar arxiv:1708.02831v1 [cs.cv] 9 Aug 2017 Department

More information

2. Basic Task of Pattern Classification

2. Basic Task of Pattern Classification 2. Basic Task of Pattern Classification Definition of the Task Informal Definition: Telling things apart 3 Definition: http://www.webopedia.com/term/p/pattern_recognition.html pattern recognition Last

More information

Synscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet.

Synscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet. Synscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet 7D Labs VINNOVA https://7dlabs.com Photo-realistic image synthesis

More information

MNIST handwritten digits

MNIST handwritten digits MNIST handwritten digits Description and using Mazdak Fatahi Oct. 2014 Abstract Every scientific work needs some measurements. To compare the performance and accuracy of handwriting recognition methods

More information

A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling For Handwriting Recognition

A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling For Handwriting Recognition A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling For Handwriting Recognition Théodore Bluche, Hermann Ney, Christopher Kermorvant SLSP 14, Grenoble October

More information

CS5670: Computer Vision

CS5670: Computer Vision CS5670: Computer Vision Noah Snavely Lecture 33: Recognition Basics Slides from Andrej Karpathy and Fei-Fei Li http://vision.stanford.edu/teaching/cs231n/ Announcements Quiz moved to Tuesday Project 4

More information

S7348: Deep Learning in Ford's Autonomous Vehicles. Bryan Goodman Argo AI 9 May 2017

S7348: Deep Learning in Ford's Autonomous Vehicles. Bryan Goodman Argo AI 9 May 2017 S7348: Deep Learning in Ford's Autonomous Vehicles Bryan Goodman Argo AI 9 May 2017 1 Ford s 12 Year History in Autonomous Driving Today: examples from Stereo image processing Object detection Using RNN

More information

DEEP LEARNING OF COMPRESSED SENSING OPERATORS WITH STRUCTURAL SIMILARITY (SSIM) LOSS

DEEP LEARNING OF COMPRESSED SENSING OPERATORS WITH STRUCTURAL SIMILARITY (SSIM) LOSS DEEP LEARNING OF COMPRESSED SENSING OPERATORS WITH STRUCTURAL SIMILARITY (SSIM) LOSS ABSTRACT Compressed sensing (CS) is a signal processing framework for efficiently reconstructing a signal from a small

More information

MoonRiver: Deep Neural Network in C++

MoonRiver: Deep Neural Network in C++ MoonRiver: Deep Neural Network in C++ Chung-Yi Weng Computer Science & Engineering University of Washington chungyi@cs.washington.edu Abstract Artificial intelligence resurges with its dramatic improvement

More information

How to give another person access to an applicant you ve created

How to give another person access to an applicant you ve created How to give another person access to an applicant you ve created 1 Table of Contents Introduction... 3 How to link a user to an applicant... 4 How to unlink a user from an applicant... 8 Amending user

More information

DEEP LEARNING APPROACH FOR REMOTE SENSING IMAGE ANALYSIS

DEEP LEARNING APPROACH FOR REMOTE SENSING IMAGE ANALYSIS DEEP LEARNING APPROACH FOR REMOTE SENSING IMAGE ANALYSIS Amina Ben Hamida*,** Alexandre Benoit*, Patrick Lambert*, Chokri Ben Amar** * LISTIC, Université Savoie Mont Blanc, France {amina.ben-hamida,alexandre.benoit,

More information

Partitioning of the Degradation Space for OCR Training

Partitioning of the Degradation Space for OCR Training Boise State University ScholarWorks Electrical and Computer Engineering Faculty Publications and Presentations Department of Electrical and Computer Engineering 1-16-2006 Partitioning of the Degradation

More information

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach

Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach Segmentation of Isolated and Touching characters in Handwritten Gurumukhi Word using Clustering approach Akashdeep Kaur Dr.Shaveta Rani Dr. Paramjeet Singh M.Tech Student (Associate Professor) (Associate

More information

A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points

A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points Tomohiro Nakai, Koichi Kise, Masakazu Iwamura Graduate School of Engineering, Osaka

More information

Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials

Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials Yuanjun Xiong 1 Kai Zhu 1 Dahua Lin 1 Xiaoou Tang 1,2 1 Department of Information Engineering, The Chinese University

More information

High Performance Layout Analysis of Arabic and Urdu Document Images

High Performance Layout Analysis of Arabic and Urdu Document Images High Performance Layout Analysis of Arabic and Urdu Document Images Syed Saqib Bukhari 1, Faisal Shafait 2, and Thomas M. Breuel 1 1 Technical University of Kaiserslautern, Germany 2 German Research Center

More information

Optimizing (Hiding Staves) in PrintMusic 2011 (with a little help from PrintMusic 2010)

Optimizing (Hiding Staves) in PrintMusic 2011 (with a little help from PrintMusic 2010) Optimizing (Hiding Staves) in PrintMusic 2011 (with a little help from PrintMusic 2010) Unfortunately, PrintMusic 2011 no longer provides an option for hiding staves. It will, however, support hidden staves

More information

ECE 484 Digital Image Processing Lec 17 - Part II Review & Final Projects Topics

ECE 484 Digital Image Processing Lec 17 - Part II Review & Final Projects Topics ECE 484 Digital Image Processing Lec 17 - Part II Review & Final Projects opics Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346. http://l.web.umkc.edu/lizhu slides created with

More information

5th Vienna Deep Learning Meetup

5th Vienna Deep Learning Meetup 5th Vienna Meetup 22 September 2016 @ Automic Software Hosts: Josef Puchinger, Thomas Lidy, Jan Schlüter 5th Vienna Meetup Agenda: Welcome (Thomas Lidy) & The Future of Automation (Josef Puchinger, Automic

More information

Detecting Salient Contours Using Orientation Energy Distribution. Part I: Thresholding Based on. Response Distribution

Detecting Salient Contours Using Orientation Energy Distribution. Part I: Thresholding Based on. Response Distribution Detecting Salient Contours Using Orientation Energy Distribution The Problem: How Does the Visual System Detect Salient Contours? CPSC 636 Slide12, Spring 212 Yoonsuck Choe Co-work with S. Sarma and H.-C.

More information

Character Segmentation for Telugu Image Document using Multiple Histogram Projections

Character Segmentation for Telugu Image Document using Multiple Histogram Projections Global Journal of Computer Science and Technology Graphics & Vision Volume 13 Issue 5 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc.

More information

Part Localization by Exploiting Deep Convolutional Networks

Part Localization by Exploiting Deep Convolutional Networks Part Localization by Exploiting Deep Convolutional Networks Marcel Simon, Erik Rodner, and Joachim Denzler Computer Vision Group, Friedrich Schiller University of Jena, Germany www.inf-cv.uni-jena.de Abstract.

More information