Refactored Moses. Hieu Hoang MT Marathon Prague August 2013
|
|
- Deborah Francine Robbins
- 5 years ago
- Views:
Transcription
1 Refactored Moses Hieu Hoang MT Marathon Prague August 2013
2 What did you Refactor? Decoder Feature Func?on Framework moses.ini format Simplify class structure Deleted func?onality NOT decoding algorithms Tuning NOT tuning algorithm EMS (Experiment Management System)
3 Why did you Refactor? Feature Func?on Framework easier to implement new features use sparse features Simplify class structure easier to develop with Moses Delete func?onality easier to refactor code very linle dele?on
4 Why did you Refactor? Feature Func?on Framework easier to implement new features use sparse features Simplify class structure easier to develop with Moses Delete func?onality easier to refactor code very linle dele?on
5 Why did you Refactor? Feature Func?on Framework easier to implement new features use sparse features Simplify class structure easier to develop with Moses Delete func?onality easier to refactor code very linle dele?on
6 Language Model IN THE BEGINNING LanguageModel LanguageModelSingleFactor LanguageModelMul?Factor SRI IRST KEN
7 Language Model THEN. LM LMImplementa?on LMRefCount LMSingleFactor LMMul?Factor KEN LMPointerState SRI IRST
8 Language Model NOW LM LMImplementa?on KEN LMSingleFactor LMMul?Factor SRI IRST
9 Phrase Tables IN THE BEGINNING PhraseDic?onary Memory TreeAdaptor SCFG OnDisk
10 Phrase Tables THEN. PhraseDic?onary PDFeature Memory TreeAdaptor RuleTableTrie OnDisk Compact SCFG RuleTableUTrie
11 Phrase Tables NOW PhraseDic?onary TreeAdaptor RuleTableTrie OnDisk Compact Memory RuleTableUTrie
12 Why did you Refactor? Feature Func?on Framework easier to implement new features use sparse features Simplify class structure easier to develop with Moses Delete func?onality easier to refactor code very linle dele?on
13 Deleted 1. Transla?on Systems mul?ple engines in 1 decoding process replaced with alterna?ve weights func?on 2. Distributed Language Model send LM queries to different machines replace with mul?pass/asynchronous decoding? 3. Con?nue Par?al Transla?on start from non- empty hypothesis
14 Why did you Refactor? Feature Func?on Framework easier to implement new features use sparse features Simplify class structure easier to develop with Moses Delete func?onality easier to refactor code very linle dele?on
15 Adding new Feature Func?on THEN. Add entry to Parameter class Read parameter in Sta?cData add variable to hold new feature load feature Bespoke ini file: [new- feature- sec?on] file.name [weight- new- feature]
16 Adding new Feature Func?on Sta?cData Now. Simple framework to load feature ini file: [feature] WordPenalty Distor?on NEW- FEATURE file=path factor=0 [weight] WordPenalty0= Distor?on0= 0.14 NEW- FEATURE0= 0.54
17 MERT THEN. Only know about certain feature func?ons Hardcoded array = qw(d=weight- d lm=weight- l tm=weight- t w=weight- w g=weight- genera?on lex=weight- lex I=weight- i dlm=weight- dlm pp=weight- pp wt=weight- wt pb=weight- pb lex=weight- lex glm=weight- glm); - requires feature name and abbrevia?on
18 MERT THEN. Only know about certain feature func?ons Hardcoded array = qw(d=weight- d lm=weight- l tm=weight- t w=weight- w g=weight- genera?on lex=weight- lex I=weight- i dlm=weight- dlm pp=weight- pp wt=weight- wt pb=weight- pb lex=weight- lex glm=weight- glm); - requires feature name and abbrevia?on Deleted array Ask decoder for feature func?on abbrevia?ons deleted
19 Timeline of a Transla?on Rule File Load Memory Apply to input sentence Transla?on Op?on Hypothesis Search
20 Timeline of a Transla?on Rule File Load Source phrase Target phrase Memory Apply to input sentence Input sentence Input path Transla?on Op?on Search Transla?on context Segmenta?on Hypothesis
21 Timeline of a Transla?on Rule File Load Once Memory Apply to input sentence Per occurrence in sentence Transla?on Op?on Search Per hypothesis Hypothesis
22 File Feature Func?on API Loading Xà je suis X 1 I am X 1 Source phrase: je suis X Access to: 1 Target phrase: I am X 1 void Evaluate(source,!! target,! scores,!! estimated future scores)! Memory Feature func?ons that use this: Word Penalty Phrase penalty Language model (par?al)
23 Memory Feature Func?on API Apply to input sentence Input: je PRO suis VB 25 NUM ans NNS.. Input subphrase: je PRO suis VB 25 NUM Access to: Input scores: void Evaluate(input,!!! input path,! scores)! Transla?on Op?on Feature func?ons that uses this: Input feature Bag- of- word features.
24 Transla?on Op?on Feature Func?on API Hypothesis: Search enumerated tuples defines the search Access to: Current hypothesis Previous hypotheses Segmenta?on Hypothesis Stateless features: void Evaluate(hypo, scores)! void EvaluateChart(hypo, scores)!! Stateful features: state Evaluate(hypo,!!! previous state,!!! scores)! state EvaluateChart(hypo!!!! previous state,!!!! scores)!
25 Transla?on Op?on Hypothesis Feature Func?on API Decoding Feature func?ons that uses this: All stateful features Language models Distor?on model Lexicalized distor?on Some stateless features Global lexicalized model Word transla?on
26 Feature Func?on Loading: Apply to Input: Search: void Evaluate(source,!! target,! scores,!! estimated future scores)! void Evaluate(input,!!! input path,! scores)! Stateless features: void Evaluate(hypo, scores)! void EvaluateChart(hypo, scores)!! Stateful features: state Evaluate(hypo,!!! previous state,!!! scores)! state EvaluateChart(hypo!!!! previous state,!!!! scores)!
27 Strange Features func?ons (1) Language model implement 2 Evaluate() 1. Loading evaluate full n- grams es?mate future cost par?al n- grams 2. Decoding reprise de la session resump?on of the session evaluate overlapping n- grams
28 Strange Feature Func?ons (2) Phrase- tables Unknown Word Penalty Genera?on Model integral part of decoding process Uses no Evaluate() scores assign by decoder
29 Register a Feature Func?on Register in moses/ff/factory.cpp add entry MOSES_FNAME(ClassName);
30 Language Model Inherit from LanguageModelSingleFactor class LanguageModelSingleFactor :! {! LMResult GetValue(!!const std::vector<const Word*> &contextfactor,!state* finalstate = NULL) const = 0;! }!
31 Phrase Table Inherit from PhraseDic?onary Legacy API: class PhraseDictionary :...! {! public:! TargetPhraseCollection *GetTargetPhraseCollectionLEGACY(!!!Phrase &src) const;!! protected:! const TargetPhraseCollection!*GetTargetPhraseCollectionNonCacheLEGACY(Phrase &src);!! }!
32 Phrase Table New API: class PhraseDictionary :...! {! public:! void GetTargetPhraseCollectionBatch(InputPathList &);! }! InputPathList Input Sentence: Input Path List: je suis 25 ans. je je suis je suis 25 suis suis 25
33 Phrase- table Op?miza?on Input Sentence: je suis 25 ans. Input Path List: je je suis je suis 25 suis suis suis Phrase- table trie 0 je suis
34 Phrase- table Lalce Input Input Path List: einen einen wenbewerb einen wenbewerbs einen wenbewerbsbedingten einen wenbewerb bedingten einen wenbewerbs bedingten einen wenbewerb bedingten preissturz einen wenbewerbs bedingten preis einen wenbewerbsbedingten preissturz einen wenbewerbsbedingten preis.
35 New API: Ac?ve Parsing CKY+ Phrase Table Syntax Decoding class PhraseDictionary :...! {! virtual ChartRuleLookupManager *CreateRuleLookupManager(! const ChartParser &,! const ChartCellCollectionBase &) = 0;! }! class ChartRuleLookupManager! {! void GetChartRuleCollection(! const WordsRange &range,! ChartParserCallback &outcoll) = 0;! }!
36 Results Lalce decoding use all phrase- table implementa?ons Syntax model (nearly ) One in- memory phrase- table implementa?on One on- disk phrase- table Syntax models AND phrase- based model Zen s binary implementa?on phrase- based only deprecated to be removed
37 Pass regression tests tests have also changed BLEU unchanged subject to random variability Results Transla?on Quality # Description v th July # Description v th July En-es 1 pb De-en 1 pb hiero hiero (1) + CreateOnDiskPt (1) + POS de Es-en 1 pb (3) + POS en hiero (1) + CreateOnDiskPt (1) + CreateOnDiskPt (3) + CreateOnDiskPt 15.7 En-cs 1 pb (4) + CreateOnDiskPt hiero En-fr 1 pb truecase (1) + placeholders pb recase (1) + CreateOnDiskPt (2) + hiero (2) + Ken's incre algo (2) + POS en Cs-en 1 pb (2) + kbmira hiero (2) + pro (1) + placeholders (2) + CreateOnDiskPt (1) + CreateOnDiskPt Fr-en 1 pb truecase (2) + Ken's incre algo pb recased En-de 1 pb (2) + hiero hiero (2) + factors (1) + POS de (2) + kbmira (3) + POS en (2) + pro (1) + CreateOnDiskPt (2) + CreateOnDiskPt 22.46
38 Phrase- based Results Speed speed (sec) speed (sec) 1 v master + caching 1, % With compact pt: 3 v % 4 master (with caching) % Hierarchical Pop Limit Loading v 1.0 1,044 1,393 2,620 4, th Aug Reduction (excl load) -65% -26% -21% -18%
39 Comparison with the decoders
Moses version 2.1 Release Notes
Moses version 2.1 Release Notes Overview The document is the release notes for the Moses SMT toolkit, version 2.1. It describes the changes to toolkit since version 1.0 from January 2013. The aim during
More informationMoses Version 3.0 Release Notes
Moses Version 3.0 Release Notes Overview The document contains the release notes for the Moses SMT toolkit, version 3.0. It describes the changes to the toolkit since version 2.0 from January 2014. In
More informationMOSES. Statistical Machine Translation System. User Manual and Code Guide. Philipp Koehn University of Edinburgh
MOSES Statistical Machine Translation System User Manual and Code Guide Philipp Koehn pkoehn@inf.ed.ac.uk University of Edinburgh Abstract This document serves as user manual and code guide for the Moses
More informationJoint Decoding with Multiple Translation Models
Joint Decoding with Multiple Translation Models Yang Liu, Haitao Mi, Yang Feng, and Qun Liu Institute of Computing Technology, Chinese Academy of ciences {yliu,htmi,fengyang,liuqun}@ict.ac.cn 8/10/2009
More informationDeliverable D1.1. Moses Specification. Work Package: WP1: Moses Coordination and Integration Lead Author: Barry Haddow Due Date: May 1st, 2012
MOSES CORE Deliverable D1.1 Moses Specification Work Package: WP1: Moses Coordination and Integration Lead Author: Barry Haddow Due Date: May 1st, 2012 August 15, 2013 2 Note This document was extracted
More informationNTT SMT System for IWSLT Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs.
NTT SMT System for IWSLT 2008 Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs., Japan Overview 2-stage translation system k-best translation
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 94 SEPTEMBER An Experimental Management System. Philipp Koehn
The Prague Bulletin of Mathematical Linguistics NUMBER 94 SEPTEMBER 2010 87 96 An Experimental Management System Philipp Koehn University of Edinburgh Abstract We describe Experiment.perl, an experimental
More informationA General-Purpose Rule Extractor for SCFG-Based Machine Translation
A General-Purpose Rule Extractor for SCFG-Based Machine Translation Greg Hanneman, Michelle Burroughs, and Alon Lavie Language Technologies Institute Carnegie Mellon University Fifth Workshop on Syntax
More informationCS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University
CS6200 Informa.on Retrieval David Smith College of Computer and Informa.on Science Northeastern University Indexing Process Indexes Indexes are data structures designed to make search faster Text search
More informationFeatures for Syntax-based Moses: Project Report
Features for Syntax-based Moses: Project Report Eleftherios Avramidis, Arefeh Kazemi, Tom Vanallemeersch, Phil Williams, Rico Sennrich, Maria Nadejde, Matthias Huck 13 September 2014 Features for Syntax-based
More informationMOSES. Statistical Machine Translation System. User Manual and Code Guide. Philipp Koehn University of Edinburgh
MOSES Statistical Machine Translation System User Manual and Code Guide Philipp Koehn pkoehn@inf.ed.ac.uk University of Edinburgh Abstract This document serves as user manual and code guide for the Moses
More informationHierarchical Phrase-Based Translation with WFSTs. Weighted Finite State Transducers
Hierarchical Phrase-Based Translation with Weighted Finite State Transducers Gonzalo Iglesias 1 Adrià de Gispert 2 Eduardo R. Banga 1 William Byrne 2 1 Department of Signal Processing and Communications
More informationTHE CUED NIST 2009 ARABIC-ENGLISH SMT SYSTEM
THE CUED NIST 2009 ARABIC-ENGLISH SMT SYSTEM Adrià de Gispert, Gonzalo Iglesias, Graeme Blackwood, Jamie Brunning, Bill Byrne NIST Open MT 2009 Evaluation Workshop Ottawa, The CUED SMT system Lattice-based
More informationStatistical Machine Translation Part IV Log-Linear Models
Statistical Machine Translation art IV Log-Linear Models Alexander Fraser Institute for Natural Language rocessing University of Stuttgart 2011.11.25 Seminar: Statistical MT Where we have been We have
More informationSearch Engines. Informa1on Retrieval in Prac1ce. Annotations by Michael L. Nelson
Search Engines Informa1on Retrieval in Prac1ce Annotations by Michael L. Nelson All slides Addison Wesley, 2008 Indexes Indexes are data structures designed to make search faster Text search has unique
More informationBetter Word Alignment with Supervised ITG Models. Aria Haghighi, John Blitzer, John DeNero, and Dan Klein UC Berkeley
Better Word Alignment with Supervised ITG Models Aria Haghighi, John Blitzer, John DeNero, and Dan Klein UC Berkeley Word Alignment s Indonesia parliament court in arraigned speaker Word Alignment s Indonesia
More informationiems Interactive Experiment Management System Final Report
iems Interactive Experiment Management System Final Report Pēteris Ņikiforovs Introduction Interactive Experiment Management System (Interactive EMS or iems) is an experiment management system with a graphical
More informationKenLM: Faster and Smaller Language Model Queries
KenLM: Faster and Smaller Language Model Queries Kenneth heafield@cs.cmu.edu Carnegie Mellon July 30, 2011 kheafield.com/code/kenlm What KenLM Does Answer language model queries using less time and memory.
More informationSparse Feature Learning
Sparse Feature Learning Philipp Koehn 1 March 2016 Multiple Component Models 1 Translation Model Language Model Reordering Model Component Weights 2 Language Model.05 Translation Model.26.04.19.1 Reordering
More informationK-best Parsing Algorithms
K-best Parsing Algorithms Liang Huang University of Pennsylvania joint work with David Chiang (USC Information Sciences Institute) k-best Parsing Liang Huang (Penn) k-best parsing 2 k-best Parsing I saw
More informationOutline GIZA++ Moses. Demo. Steps Output files. Training pipeline Decoder
GIZA++ and Moses Outline GIZA++ Steps Output files Moses Training pipeline Decoder Demo GIZA++ A statistical machine translation toolkit used to train IBM Models 1-5 (moses only uses output of IBM Model-1)
More informationThe HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10
The HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10 Patrick Simianer, Gesa Stupperich, Laura Jehl, Katharina Wäschle, Artem Sokolov, Stefan Riezler Institute for Computational Linguistics,
More informationCMU-UKA Syntax Augmented Machine Translation
Outline CMU-UKA Syntax Augmented Machine Translation Ashish Venugopal, Andreas Zollmann, Stephan Vogel, Alex Waibel InterACT, LTI, Carnegie Mellon University Pittsburgh, PA Outline Outline 1 2 3 4 Issues
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY
The Prague Bulletin of Mathematical Linguistics NUMBER 9 JANUARY 009 79 88 Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems Omar F. Zaidan Abstract
More informationA Space-Efficient Phrase Table Implementation Using Minimal Perfect Hash Functions
A Space-Efficient Phrase Table Implementation Using Minimal Perfect Hash Functions Marcin Junczys-Dowmunt Faculty of Mathematics and Computer Science Adam Mickiewicz University ul. Umultowska 87, 61-614
More informationCS 406: Syntax Directed Translation
CS 406: Syntax Directed Translation Stefan D. Bruda Winter 2015 SYNTAX DIRECTED TRANSLATION Syntax-directed translation the source language translation is completely driven by the parser The parsing process
More informationApply. A. Michelle Lawing Ecosystem Science and Management Texas A&M University College Sta,on, TX
Apply A. Michelle Lawing Ecosystem Science and Management Texas A&M University College Sta,on, TX 77843 alawing@tamu.edu Schedule for today My presenta,on Review New stuff Mixed, Fixed, and Random Models
More informationA Quantitative Analysis of Reordering Phenomena
A Quantitative Analysis of Reordering Phenomena Alexandra Birch Phil Blunsom Miles Osborne a.c.birch-mayne@sms.ed.ac.uk pblunsom@inf.ed.ac.uk miles@inf.ed.ac.uk University of Edinburgh 10 Crichton Street
More informationOpera&ng Systems ECE344
Opera&ng Systems ECE344 Lecture 8: Paging Ding Yuan Lecture Overview Today we ll cover more paging mechanisms: Op&miza&ons Managing page tables (space) Efficient transla&ons (TLBs) (&me) Demand paged virtual
More informationMachine Translation Zoo
Machine Translation Zoo Tree-to-tree transfer and Discriminative learning Martin Popel ÚFAL (Institute of Formal and Applied Linguistics) Charles University in Prague May 5th 2013, Seminar of Formal Linguistics,
More informationSearch Engines. Informa1on Retrieval in Prac1ce. Annota1ons by Michael L. Nelson
Search Engines Informa1on Retrieval in Prac1ce Annota1ons by Michael L. Nelson All slides Addison Wesley, 2008 Evalua1on Evalua1on is key to building effec$ve and efficient search engines measurement usually
More informationRegMT System for Machine Translation, System Combination, and Evaluation
RegMT System for Machine Translation, System Combination, and Evaluation Ergun Biçici Koç University 34450 Sariyer, Istanbul, Turkey ebicici@ku.edu.tr Deniz Yuret Koç University 34450 Sariyer, Istanbul,
More informationD1.6 Toolkit and Report for Translator Adaptation to New Languages (Final Version)
www.kconnect.eu D1.6 Toolkit and Report for Translator Adaptation to New Languages (Final Version) Deliverable number D1.6 Dissemination level Public Delivery date 16 September 2016 Status Author(s) Final
More informationIntroduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.
Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How
More informationLoonyBin: Making Empirical MT Reproduciple, Efficient, and Less Annoying
LoonyBin: Making Empirical MT Reproduciple, Efficient, and Less Annoying Jonathan Clark Alon Lavie CMU Machine Translation Lunch Tuesday, January 19, 2009 Outline A (Brief) Guilt Trip What goes into a
More informationExperimenting in MT: Moses Toolkit and Eman
Experimenting in MT: Moses Toolkit and Eman Aleš Tamchyna, Ondřej Bojar Institute of Formal and Applied Linguistics Faculty of Mathematics and Physics Charles University, Prague Tue Sept 8, 2015 1 / 30
More informationTransla'on, Protec'on, and Virtual Memory. 2/25/16 CS 152 Sec'on 6 Colin Schmidt
Transla'on, Protec'on, and Virtual Memory 2/25/16 CS 152 Sec'on 6 Colin Schmidt Agenda Protec'on Transla'on Virtual Memory Lab 1 Feedback Ques'ons/Open-ended discussion Hand back Lab 1 Protec'on Why? Supervisor
More informationDiagnostic evaluation of MT with DELiC4MT
Diagnostic evaluation of MT with DELiC4MT Final report MT Marathon 2012 Edinburgh, 5th September 2012 Walid Aransa, Luong Ngoc Quang, Antonio Toral Overview Automatic Diagonistic Evaluation on Linguistic
More informationSEMANTIC ANALYSIS TYPES AND DECLARATIONS
SEMANTIC ANALYSIS CS 403: Type Checking Stefan D. Bruda Winter 2015 Parsing only verifies that the program consists of tokens arranged in a syntactically valid combination now we move to check whether
More informationINF5820/INF9820 LANGUAGE TECHNOLOGICAL APPLICATIONS. Jan Tore Lønning, Lecture 8, 12 Oct
1 INF5820/INF9820 LANGUAGE TECHNOLOGICAL APPLICATIONS Jan Tore Lønning, Lecture 8, 12 Oct. 2016 jtl@ifi.uio.no Today 2 Preparing bitext Parameter tuning Reranking Some linguistic issues STMT so far 3 We
More informationWays to implement a language
Interpreters Implemen+ng PLs Most of the course is learning fundamental concepts for using PLs Syntax vs. seman+cs vs. idioms Powerful constructs like closures, first- class objects, iterators (streams),
More informationInforma(on Retrieval
Introduc)on to Informa)on Retrieval CS3245 Informa(on Retrieval Lecture 7: Scoring, Term Weigh9ng and the Vector Space Model 7 Last Time: Index Construc9on Sort- based indexing Blocked Sort- Based Indexing
More informationChapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)
Chapter 3: Describing Syntax and Semantics Introduction Formal methods of describing syntax (BNF) We can analyze syntax of a computer program on two levels: 1. Lexical level 2. Syntactic level Lexical
More information11. a b c d e. 12. a b c d e. 13. a b c d e. 14. a b c d e. 15. a b c d e
CS-3160 Concepts of Programming Languages Spring 2015 EXAM #1 (Chapters 1-6) Name: SCORES MC: /75 PROB #1: /15 PROB #2: /10 TOTAL: /100 Multiple Choice Responses Each multiple choice question in the separate
More informationAdvanced Topics in MNIT. Lecture 1 (27 Aug 2015) CADSL
Compiler Construction Virendra Singh Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/ E-mail:
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 98 OCTOBER
The Prague Bulletin of Mathematical Linguistics NUMBER 98 OCTOBER 2012 63 74 Phrasal Rank-Encoding: Exploiting Phrase Redundancy and Translational Relations for Phrase Table Compression Marcin Junczys-Dowmunt
More informationEXPERIMENTAL SETUP: MOSES
EXPERIMENTAL SETUP: MOSES Moses is a statistical machine translation system that allows us to automatically train translation models for any language pair. All we need is a collection of translated texts
More informationBing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer
Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web
More informationLanguage Support for Stream. Processing. Robert Soulé
Language Support for Stream Processing Robert Soulé Introduction Proposal Work Plan Outline Formalism Translators & Abstractions Optimizations Related Work Conclusion 2 Introduction Stream processing is
More informationInforma(on Retrieval
Introduc)on to Informa)on Retrieval CS3245 Informa(on Retrieval Lecture 7: Scoring, Term Weigh9ng and the Vector Space Model 7 Last Time: Index Compression Collec9on and vocabulary sta9s9cs: Heaps and
More informationMul$ple Upstream Interface Support for IGMP/MLD Proxy
93 rd IETF, July 2015, Prague, Czech Republic Mul$ple Upstream Interface Support for dra
More informationA Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation
A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation Benjamin Marie Atsushi Fujita National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho,
More informationWhat were his cri+cisms? Classical Methodologies:
1 2 Classifica+on In this scheme there are several methodologies, such as Process- oriented, Blended, Object Oriented, Rapid development, People oriented and Organisa+onal oriented. According to David
More informationCS 288: Statistical NLP Assignment 1: Language Modeling
CS 288: Statistical NLP Assignment 1: Language Modeling Due September 12, 2014 Collaboration Policy You are allowed to discuss the assignment with other students and collaborate on developing algorithms
More informationIntroduc)on to. CS60092: Informa0on Retrieval
Introduc)on to CS60092: Informa0on Retrieval Ch. 4 Index construc)on How do we construct an index? What strategies can we use with limited main memory? Sec. 4.1 Hardware basics Many design decisions in
More informationAn Unsupervised Model for Joint Phrase Alignment and Extraction
An Unsupervised Model for Joint Phrase Alignment and Extraction Graham Neubig 1,2, Taro Watanabe 2, Eiichiro Sumita 2, Shinsuke Mori 1, Tatsuya Kawahara 1 1 Graduate School of Informatics, Kyoto University
More informationCompila(on /15a Lecture 6. Seman(c Analysis Noam Rinetzky
Compila(on 0368-3133 2014/15a Lecture 6 Seman(c Analysis Noam Rinetzky 1 You are here Source text txt Process text input characters Lexical Analysis tokens Annotated AST Syntax Analysis AST Seman(c Analysis
More informationCompiler Optimization Intermediate Representation
Compiler Optimization Intermediate Representation Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology
More informationPBML. logo. The Prague Bulletin of Mathematical Linguistics NUMBER??? FEBRUARY Improved Minimum Error Rate Training in Moses
PBML logo The Prague Bulletin of Mathematical Linguistics NUMBER??? FEBRUARY 2009 1 11 Improved Minimum Error Rate Training in Moses Nicola Bertoldi, Barry Haddow, Jean-Baptiste Fouet Abstract We describe
More information: Advanced Compiler Design. 8.0 Instruc?on scheduling
6-80: Advanced Compiler Design 8.0 Instruc?on scheduling Thomas R. Gross Computer Science Department ETH Zurich, Switzerland Overview 8. Instruc?on scheduling basics 8. Scheduling for ILP processors 8.
More informationMachine Translation and Discriminative Models
Machine Translation and Discriminative Models Tree-to-tree transfer and Discriminative learning Martin Popel ÚFAL (Institute of Formal and Applied Linguistics) Charles University in Prague March 23rd 2015,
More informationCS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University
CS6200 Informa.on Retrieval David Smith College of Computer and Informa.on Science Northeastern University Previously: Indexing Process Query Process Informa.on Needs An informa(on need is the underlying
More informationTuning. Philipp Koehn presented by Gaurav Kumar. 28 September 2017
Tuning Philipp Koehn presented by Gaurav Kumar 28 September 2017 The Story so Far: Generative Models 1 The definition of translation probability follows a mathematical derivation argmax e p(e f) = argmax
More informationLeft-to-Right Tree-to-String Decoding with Prediction
Left-to-Right Tree-to-String Decoding with Prediction Yang Feng Yang Liu Qun Liu Trevor Cohn Department of Computer Science The University of Sheffield, Sheffield, UK {y.feng, t.cohn}@sheffield.ac.uk State
More informationhashfs Applying Hashing to Op2mize File Systems for Small File Reads
hashfs Applying Hashing to Op2mize File Systems for Small File Reads Paul Lensing, Dirk Meister, André Brinkmann Paderborn Center for Parallel Compu2ng University of Paderborn Mo2va2on and Problem Design
More informationForest-Based Search Algorithms
Forest-Based Search Algorithms for Parsing and Machine Translation Liang Huang University of Pennsylvania Google Research, March 14th, 2008 Search in NLP is not trivial! I saw her duck. Aravind Joshi 2
More informationHomework 1. Leaderboard. Read through, submit the default output. Time for questions on Tuesday
Homework 1 Leaderboard Read through, submit the default output Time for questions on Tuesday Agenda Focus on Homework 1 Review IBM Models 1 & 2 Inference (compute best alignment from a corpus given model
More informationForest Reranking for Machine Translation with the Perceptron Algorithm
Forest Reranking for Machine Translation with the Perceptron Algorithm Zhifei Li and Sanjeev Khudanpur Center for Language and Speech Processing and Department of Computer Science Johns Hopkins University,
More informationIntroduc)on to Probabilis)c Latent Seman)c Analysis. NYP Predic)ve Analy)cs Meetup June 10, 2010
Introduc)on to Probabilis)c Latent Seman)c Analysis NYP Predic)ve Analy)cs Meetup June 10, 2010 PLSA A type of latent variable model with observed count data and nominal latent variable(s). Despite the
More informationJane: User s Manual. David Vilar, Daniel Stein, Matthias Huck, Joern Wuebker, Markus Freitag, Stephan Peitz, Malte Nuhn, Jan-Thorsten Peter
Jane: User s Manual David Vilar, Daniel Stein, Matthias Huck, Joern Wuebker, Markus Freitag, Stephan Peitz, Malte Nuhn, Jan-Thorsten Peter February 4, 2013 Contents 1 Introduction 3 2 Installation 5 2.1
More informationProblem Score Max Score 1 Syntax directed translation & type
CMSC430 Spring 2014 Midterm 2 Name Instructions You have 75 minutes for to take this exam. This exam has a total of 100 points. An average of 45 seconds per point. This is a closed book exam. No notes
More informationRelated Course Objec6ves
Syntax 9/18/17 1 Related Course Objec6ves Develop grammars and parsers of programming languages 9/18/17 2 Syntax And Seman6cs Programming language syntax: how programs look, their form and structure Syntax
More informationVirtual Memory: Concepts
Virtual Memory: Concepts 5-23 / 8-23: Introduc=on to Computer Systems 6 th Lecture, Mar. 8, 24 Instructors: Anthony Rowe, Seth Goldstein, and Gregory Kesden Today VM Movaon and Address spaces ) VM as a
More informationLearning Non-linear Features for Machine Translation Using Gradient Boosting Machines
Learning Non-linear Features for Machine Translation Using Gradient Boosting Machines Kristina Toutanova Microsoft Research Redmond, WA 98502 kristout@microsoft.com Byung-Gyu Ahn Johns Hopkins University
More information1/10/16. RPC and Clocks. Tom Anderson. Last Time. Synchroniza>on RPC. Lab 1 RPC
RPC and Clocks Tom Anderson Go Synchroniza>on RPC Lab 1 RPC Last Time 1 Topics MapReduce Fault tolerance Discussion RPC At least once At most once Exactly once Lamport Clocks Mo>va>on MapReduce Fault Tolerance
More informationPhysics 120B: Lecture 11. Timers and Scheduled Interrupts
Physics 120B: Lecture 11 Timers and Scheduled Interrupts Timer Basics The ATMega 328 has three @mers available to it (Arduino Mega has 6) max frequency of each is 16 MHz, on Arduino TIMER0 is an 8- bit
More informationStart downloading our playground for SMT: wget
Before we start... Start downloading Moses: wget http://ufal.mff.cuni.cz/~tamchyna/mosesgiza.64bit.tar.gz Start downloading our playground for SMT: wget http://ufal.mff.cuni.cz/eman/download/playground.tar
More informationMarcello Federico MMT Srl / FBK Trento, Italy
Marcello Federico MMT Srl / FBK Trento, Italy Proceedings for AMTA 2018 Workshop: Translation Quality Estimation and Automatic Post-Editing Boston, March 21, 2018 Page 207 Symbiotic Human and Machine Translation
More informationBundle Protocol Specifica2on 22 July 2015
Bundle Protocol Specifica2on 22 July 2015 Sco$ Burleigh Jet Propulsion Laboratory California Ins;tute of Technology 23 July 2015 This research was carried out at the, under a contract with the National
More informationRobust Identification of Fuzzy Duplicates
Robust Identification of Fuzzy Duplicates ì Authors: Surajit Chaudhuri (Microso3 Research) Venkatesh Gan; (Microso3 Research) Rajeev Motwani (Stanford University) Publica;on: 21 st Interna;onal Conference
More informationEfficient Minimum Error Rate Training and Minimum Bayes-Risk Decoding for Translation Hypergraphs and Lattices
Efficient Minimum Error Rate Training and Minimum Bayes-Risk Decoding for Translation Hypergraphs and Lattices Shankar Kumar 1 and Wolfgang Macherey 1 and Chris Dyer 2 and Franz Och 1 1 Google Inc. 1600
More informationNeural Networks: Learning. Cost func5on. Machine Learning
Neural Networks: Learning Cost func5on Machine Learning Neural Network (Classifica2on) total no. of layers in network no. of units (not coun5ng bias unit) in layer Layer 1 Layer 2 Layer 3 Layer 4 Binary
More informationGes$one Avanzata dell Informazione Part A Full- Text Informa$on Management. Full- Text Indexing
Ges$one Avanzata dell Informazione Part A Full- Text Informa$on Management Full- Text Indexing Contents } Introduction } Inverted Indices } Construction } Searching 2 GAvI - Full- Text Informa$on Management:
More informationApproximate Large Margin Methods for Structured Prediction
: Approximate Large Margin Methods for Structured Prediction Hal Daumé III and Daniel Marcu Information Sciences Institute University of Southern California {hdaume,marcu}@isi.edu Slide 1 Structured Prediction
More informationExploiting Internal and External Semantics for the Clustering of Short Texts Using World Knowledge
Exploiting Internal and External Semantics for the Using World Knowledge, 1,2 Nan Sun, 1 Chao Zhang, 1 Tat-Seng Chua 1 1 School of Computing National University of Singapore 2 School of Computer Science
More informationGrammar Rules in Prolog!!
Grammar Rules in Prolog GR-1 Backus-Naur Form (BNF) BNF is a common grammar used to define programming languages» Developed in the late 1950 s Because grammars are used to describe a language they are
More informationCS395T Visual Recogni5on and Search. Gautam S. Muralidhar
CS395T Visual Recogni5on and Search Gautam S. Muralidhar Today s Theme Unsupervised discovery of images Main mo5va5on behind unsupervised discovery is that supervision is expensive Common tasks include
More informationMachine Translation in Academia and in the Commercial World: a Contrastive Perspective
Machine Translation in Academia and in the Commercial World: a Contrastive Perspective Alon Lavie Research Professor LTI, Carnegie Mellon University Co-founder, President and CTO Safaba Translation Solutions
More informationCS106X Handout 35 Winter 2018 March 12 th, 2018 CS106X Midterm Examination
CS106X Handout 35 Winter 2018 March 12 th, 2018 CS106X Midterm Examination This is an open-book, open-note, closed-electronic-device exam. You needn t write #includes, and you may (and you re even encouraged
More informationThinking Induc,vely. COS 326 David Walker Princeton University
Thinking Induc,vely COS 326 David Walker Princeton University slides copyright 2017 David Walker permission granted to reuse these slides for non-commercial educa,onal purposes Administra,on 2 Assignment
More informationCSE 12 Spring 2016 Week One, Lecture Two
CSE 12 Spring 2016 Week One, Lecture Two Homework One and Two: hw2: Discuss in section today - Introduction to C - Review of basic programming principles - Building from fgetc and fputc - Input and output
More informationVirtual Memory: Concepts
Virtual Memory: Concepts Instructor: Dr. Hyunyoung Lee Based on slides provided by Randy Bryant and Dave O Hallaron Today Address spaces VM as a tool for caching VM as a tool for memory management VM as
More informationCoveo Platform 7.0. Alfresco One Connector Guide
Coveo Platform 7.0 Alfresco One Connector Guide Notice The content in this document represents the current view of Coveo as of the date of publication. Because Coveo continually responds to changing market
More informationOpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System
To be presented at IceTAL (Reykjavík, 16-18 Aug. 2010) OpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System Sandipan Dandapat, Mikel L. Forcada, Declan Groves, Sergio Penkale,
More informationRAD, Rules, and Compatibility: What's Coming in Kuali Rice 2.0
software development simplified RAD, Rules, and Compatibility: What's Coming in Kuali Rice 2.0 Eric Westfall - Indiana University JASIG 2011 For those who don t know Kuali Rice consists of mul8ple sub-
More informationBetter Synchronous Binarization for Machine Translation
Better Synchronous Binarization for Machine Translation Tong Xiao *, Mu Li +, Dongdong Zhang +, Jingbo Zhu *, Ming Zhou + * Natural Language Processing Lab Northeastern University Shenyang, China, 110004
More informationRandom Restarts in Minimum Error Rate Training for Statistical Machine Translation
Random Restarts in Minimum Error Rate Training for Statistical Machine Translation Robert C. Moore and Chris Quirk Microsoft Research Redmond, WA 98052, USA bobmoore@microsoft.com, chrisq@microsoft.com
More informationThe Prague Bulletin of Mathematical Linguistics NUMBER 104 OCTOBER
The Prague Bulletin of Mathematical Linguistics NUMBER 104 OCTOBER 2015 39 50 Sampling Phrase Tables for the Moses Statistical Machine Translation System Ulrich Germann University of Edinburgh Abstract
More informationHomework 1 Simple code genera/on. Luca Della Toffola Compiler Design HS15
Homework 1 Simple code genera/on Luca Della Toffola Compiler Design HS15 1 Administra1ve issues Has everyone found a team- mate? Mailing- list: cd1@lists.inf.ethz.ch Please subscribe if we forgot you 2
More informationPart I: Data Mining Foundations
Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?
More information