How To enrich transcribed documents with mark-up

Version v1.4.0 (22_02_2018_15:07)
Last update 30.09.2018

This guide will show you how to add mark-up to documents which are already transcribed in Transkribus. This gives you the opportunity to define persons, places and abbreviations. You can add customized tagging categories and search for individual tags in your documents. Additionally, the tags can be exported in different formats. More information about the export of tags can be found in the How to Export Documents from Transkribus guide.

Download the Transkribus Expert Client, or make sure you are using the latest version:
- https://transkribus.eu/

Consult the Transkribus Wiki for further information and other How to Guides:
- https://transkribus.eu/wiki/

Transkribus and the technology behind it are made available via the following projects and sites:
- https://read.transkribus.eu/
- https://transcriptorium.eu/
- https://github.com/transkribus/

Contact:
- The Transkribus Team: email@transkribus.eu

Contents

Introduction
Tagging interface
Create your own tags
Adding tags
Historical letters and abbreviation signs
Illegible text
Deletions
Black out text
Searching for tags
Metadata
Editorial Declaration
Credits

The READ project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 674943.

Introduction

The tagging interface in Transkribus enables you to:
- Assign tags to important words or phrases in your document.
- Search for individual tags or tag categories.
- Export the tags you added in different file formats so that you can go on working with them outside of Transkribus.

Tagging interface

- The tagging interface can be found by clicking the Metadata tab, and then the Textual tab.

Figure 1 The Textual tab

- If you click the Show all button at the bottom of the Textual tab, all the predefined tags will be shown. You can start working with these right away.

Figure 2 Show all predefined tags

Figure 3 Predefined tags in Transkribus

Create your own tags

- To create your own tag categories, click the Customize button in the Tags tab. The Tag configuration window will open up.

Figure 4 Create your own tags

- With the Create new tag button you can add your own tags.
- Once you have created a new tag, it will appear when you click the Show all button.
- In the Tag configuration window predefined tags are shown in italics, customized ones are shown without italicisation.

Adding tags

- If you want to tag a word or phrase there are (at least) three ways to do it:

Highlight the text in the Text Editor field and afterwards click on the green + button of the tag you want to apply.

Figure 5 Highlight the word to be tagged

Figure 6 Choosing the right tag

Alternatively, you can highlight the word or phrase and then make a right click with your mouse. Under All tags the suitable one can then be chosen.

Figure 7 Tag a word or phrase with right mouse click

Finally, if there are tag categories you use frequently, you can create a shortcut for them in order to speed up your work. To do so, within the Textual tab, click the Customize button in the Tags tab. In the Tag Specifications section, you can now add your preferred shortcut in the Shortcut column.

Figure 8 Add shortcuts for frequently used tags

- You can also add a shortcut relating to the properties of your tags, e.g. for expanding abbreviations or adding a standardised country name to a place tag.
  - Click the Customize button in the Tags tab.
  - In the Tag configuration window click the desired tag. The details relating to that tag will appear in the Properties section.
  - Click Add property to add the property you would like.
  - Then click Add tag specification.
  - Now your tag and its property (e.g. an expansion for an abbreviation) will appear in the Tag Specification section of the window.
  - Add the shortcut you would like to use.
  - Now you can add the tag and its property by simply highlighting the word or phrase in the Text Editor field and then pressing the shortcut.

Figure 9 How to add a fixed abbreviation

- If you tagged something by mistake you can undo it by highlighting the word or phrase again, right clicking with your mouse and then pressing the Delete button. The program will give you two options:
  - Delete only the highlighted tag
  - Delete all the tags for the current collection
- Note: Tags can be applied to text on region, line, word, or even character level. To apply tags to a segmentation element, click on a text or line region in the Canvas image viewer and follow the above instructions.
- Users can apply as many tags as necessary to the text.
- In the Textual tab Transkribus will give you an overview of the tags you have put in your document.

Figure 10 Overview of tags

Historical letters and abbreviation signs

- In modern documents the handling of abbreviations is less important, but in historical documents it is a complex and challenging task.
- In earlier time periods words were often heavily abbreviated, in the hope of writing faster or saving paper. In some documents more than 20 or 30% of all words are abbreviated as shown in the figure below:

Figure 11 Examples of typical abbreviations in Latin text of the Middle Ages (cf. Wikipedia: https://en.wikipedia.org/wiki/scribal_abbreviation)

- There are several options for transcribing abbreviated text:

Option 1: Expand abbreviations in the usual way. Neural networks are often able to learn to recognise and reproduce expansions. E.g. Latin prefixes and suffixes such as cum, con or us and rum are learned easily by the machine. This means that you just need to provide an expanded version of the text in your transcription.

Option 2: Keep to the rule mentioned above: as long as you can recognize the base character, transcribe the base character. This rule is especially suited to historians and people interested in the content of a document and those who want to provide training data for the HTR engine.

Note: When it comes to HTR training, tags are not relevant yet. Developments in Named Entity Recognition technology should make the automated recognition of tags possible in the future. Therefore the correct transcription for the examples above would simply be: pdr qq cus qr

Note: In the future HTR engines may also learn to automatically expand these abbreviations (or to supply the correct abbreviation for an expansion) so that computer assisted transcription may be supported.

Option 3: If you are also interested in using Unicode characters which are near to the special graphemes of the original document, then you can transcribe the text by utilizing the full power of Unicode. In this case the transcription of the examples above could look like the following:
  - pˀ: LATIN SMALL LETTER P COMBINING OGONEK ABOVE
  - ᵭ: LATIN SMALL LETTER D WITH MIDDLE TILDE
  - o: LATIN SMALL LETTER O
  - ꝝ: LATIN SMALL LETTER RUM ROTUNDA. Also LATIN SMALL LETTER R ROTUNDA may be used to represent this letter.

Note: In real-world cases it is often hard to decide which diacritic, modifier letter or Unicode character may be the right one. You may consult the MUFI website to get more information on this issue (cf. section References): http://folk.uib.no/hnooh/mufi/

Unicode and other special characters can be found via the Virtual keyboards button in the Text Editor menu.

Figure 12 Virtual keyboards button
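When deciding which Unicode character to use, it can help to check a candidate's official name and code point outside of Transkribus. The following is a minimal sketch using Python's standard `unicodedata` module; it is not part of Transkribus, and the helper name `describe` is an assumption made for illustration. It prints each character in the "NAME (U+XXXX)" form that is also used for Editorial Declaration entries.

```python
import unicodedata

# Character names quoted in this guide.
NAMES = [
    "LATIN SMALL LETTER D WITH MIDDLE TILDE",
    "LATIN SMALL LETTER RUM ROTUNDA",
    "LATIN SMALL LETTER R ROTUNDA",
    "LATIN SMALL LETTER LONG S",
]


def describe(text):
    """Return one 'NAME (U+XXXX)' entry per code point in `text` (hypothetical helper)."""
    return [
        f"{unicodedata.name(ch, 'UNNAMED CHARACTER')} (U+{ord(ch):04X})"
        for ch in text
    ]


for name in NAMES:
    # lookup() raises KeyError if the name is unknown to this Python build.
    char = unicodedata.lookup(name)
    print(char, describe(char)[0])
```

Because `describe` works per code point, passing a base letter followed by a combining mark returns two entries, which makes it visible when a grapheme is really a sequence rather than a single character.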

Figure 13 Virtual keyboards window

- Of course mixed models will often be useful. E.g. frequently occurring historical characters may be transcribed with their correct Unicode letter, whereas characters which were used just by a specific writer may be transcribed with their base character. You should note such editorial decisions in the Editorial Declaration in the Document tab, within the Metadata tab, so that your transcription rules are transparent to other users.

Example: LATIN SMALL LETTER RUM ROTUNDA is regularly used in medieval and early modern texts. Therefore it might be useful to introduce this letter to an HTR model which deals exclusively with medieval documents and is dedicated to processing large amounts of such documents.

Illegible text

- Text which cannot be transcribed since it is illegible can be marked with the tags unclear or gap.
- If the text is unclear, highlight it in the text editor field and tag it as unclear.
- If text is impossible to read, click your cursor where the text appears in the text editor field and add the gap tag.
- You may also add alternatives or suggestions for the illegible word in the Properties section of the tag.

Deletions

- If you discover deleted text you have several options:

Option 1: The text which is deleted is still readable, or at least large portions are readable. In this case transcribe the text as well as possible and mark it as strike through. You can find the strike through button in the Text Editor menu.

Figure 14 Strike through button

Note: HTR engines are able to decipher strike through text and the more examples they have, the better.

Option 2: The text which is deleted is illegible, or only small parts can be read. In this case use the gap tag to indicate that there is some text which is illegible.

Black out text

- The blackening tag can be used to redact sensitive information in the export formats. Typically this is used to hide personal data in a document which is made publicly available.
- The blackening tag is used in conjunction with the blackening region which must be added with the segmentation tools.
- To blacken part of your text:
  - Use the drop down menu on the + segmentation element button on the Canvas menu and select Blackening.
  - Use the Blackening region to mark the word or section that you want to hide. Note: Click the Item visibility button on the Main menu and select Render blackenings to display the blackened sections on a page.
  - Highlight the corresponding word in the Text Editor field and select the Blackening tag. In the export of the document the text will be replaced by: [ ].
  - When you export your document, make sure that Do blackening is selected.

Note: In METS and TEI files the word or phrase is blacked out but the information behind the blackened section is kept. In other file formats, the text behind the blacked out section is completely obscured.
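The effect of the blackening tag on exported text can be pictured with a small sketch. This is only an illustration of the idea, not how Transkribus implements its export: the function `black_out`, the representation of a tag as an (offset, length) pair, and the sample line are all assumptions made for this example. The placeholder defaults to the `[ ]` shown in this guide.

```python
def black_out(text, spans, placeholder="[ ]"):
    """Replace each (offset, length) span of `text` with `placeholder`.

    Hypothetical helper: spans are assumed not to overlap.
    """
    # Work from the last span to the first so earlier offsets stay valid.
    for offset, length in sorted(spans, reverse=True):
        text = text[:offset] + placeholder + text[offset + length:]
    return text


# Invented sample line; the name starts at offset 12 and is 10 characters long.
line = "Letter from Anna Huber to the council"
print(black_out(line, [(12, 10)]))
```

Processing the spans from right to left is a deliberate choice: replacing a span changes the length of the text, so working backwards keeps the remaining offsets correct.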

Figure 15 Select "Do blackening" to hide image regions and text in exported files

Searching for tags

- If you need to search for distinct tags click the binoculars button in the Textual tab.

Figure 16 Binoculars button for tag search

- In the window which will open up you can define your search:
  - Choose where you would like to search (current collection, current page …)
  - Line or word level
  - In the Name field put the name of the tag
  - In the Text field put the written text
  - Press the Search! button
  - The search results will appear at the bottom of the window.

Figure 17 Search for.. window for tag search

- To quickly add an expansion or another property to a word which appears several times in the text:
  - Sort the search results by Value. This is done by simply clicking on Value.
  - Mark the similar words by clicking them while holding the Control button on your keyboard.
  - Then click the Assign tag values button and type in the property that should be added.
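The bulk-assignment steps above (sort by value, select the identical words, assign one property to all of them) can be sketched in a few lines of Python. This is a conceptual model only: the `Tag` class, both function names and the sample data are assumptions made for illustration and do not reflect Transkribus's internal data structures.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical stand-in for one search result row."""
    name: str   # tag category, e.g. "abbrev"
    value: str  # the tagged text as written
    properties: dict = field(default_factory=dict)


def group_by_value(tags, name):
    """Group the tags of one category by their tagged text, sorted by value."""
    groups = defaultdict(list)
    for tag in tags:
        if tag.name == name:
            groups[tag.value].append(tag)
    return dict(sorted(groups.items()))


def assign_property(selection, key, value):
    """Set the same property on every selected tag and return how many changed."""
    for tag in selection:
        tag.properties[key] = value
    return len(selection)


# Invented sample data; the expansion is illustrative, not taken from the guide.
tags = [Tag("abbrev", "qq"), Tag("person", "Anna"), Tag("abbrev", "pdr"), Tag("abbrev", "qq")]
groups = group_by_value(tags, "abbrev")
changed = assign_property(groups["qq"], "expansion", "quoque")
print(changed, [t.properties for t in tags])
```

Grouping by value first mirrors the sorting step in the search window: identical words end up next to each other, so one assignment covers every occurrence.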

Figure 18 Speeding up your work by adding properties to more words or phrases at the same time

Metadata

- We are currently supporting only a very simple description of documents since we assume that in a Digital Edition most of the metadata would reside on an external server and be linked to the document. Every document has its unique ID and can also be accessed via the REST services provided by the Transkribus platform (https://transkribus.eu/wiki/).
- The following fields are currently available in the Document tab, within the Metadata tab:
  - Title
  - Author
  - Uploaded
  - Genre
  - Writer
  - Language
  - Script type
  - Date of writing
  - Description

Editorial Declaration

- Since there are always several ways to produce a correct transcript of a text it is important to be transparent about the way in which the transcription was undertaken.
- For this purpose we have included a special feature in Transkribus, called Editorial Declaration. This is found in the Document tab, within the Metadata tab.
- As with the tagging system, the Editorial Declaration offers a set of predefined features and options. Moreover you are able to create your own descriptions and to store them together with your document.
- It is especially important to list special characters and their use in the Editorial Declaration using the form: Character Set Extension: LATIN SMALL LETTER LONG S (U+017F)

Figure 19 Create your Editorial Declaration button

Credits

We would like to thank the many users who have contributed their feedback to help improve the Transkribus software. Transkribus is made available to the public as part of H2020 e-infrastructure Project READ (Recognition and Enrichment of Archival Documents) which received funding from the European Commission under grant agreement No 674943.