Experimental Validation of TranScribe Prototype Design


Hao Shi, Velian Pandeliev

ABSTRACT
In this paper we describe an experiment comparing a new transcription design prototype to an existing commercial application. In a within-subjects design, participants transcribed short interview segments using both applications, which were compared on quantitative efficiency measures such as time taken, and on subjective user ratings assessed with Likert scales on an exit questionnaire. Discouragingly, most measures found no significant improvement of our prototype over the commercial product, although user satisfaction was significantly higher for the prototype. Over longer transcription periods and with additional user-supporting features, we expect our application to yield greater efficiency and even higher satisfaction, although some design features, such as shortcut key placement, will have to be improved.

INTRODUCTION
We have been interested in developing a software transcription tool that helps occasional transcribers produce accurate transcriptions efficiently. A heuristic evaluation gave us confidence that we had identified the major issues with our design, and we created a functional prototype in C#, which allowed us to evaluate the design not only through expert reviews but in a rigorous quantitative study. Our improvement decisions trace back to the beginning of the process, when we gathered user requirements on a commercial transcription tool called Express Scribe. The problems identified in that requirements study guided the design of TranScribe, and so the natural next step in the design process was to use Express Scribe as the benchmark for our prototype in terms of efficiency, user satisfaction and the principal design questions we faced.
The study described below was designed as a direct comparison between the two transcription applications, which has the obvious drawback of comparing a fully tested commercial application to a functional prototype coded over the course of a week. Nevertheless, the prototype of TranScribe (see Appendix A) implemented the same audio playback and text editing features as Express Scribe, and was built to minimize confounds in the experiment.

METHOD
Participants
Six adults (2 female, 4 male) with normal or corrected-to-normal visual acuity were recruited for the experiment.

Design
In this experiment, our prototype was compared directly to Express Scribe, the commercial transcription application that was evaluated for usability issues at the beginning of this project. In a repeated-measures design, each participant was asked to transcribe two one-minute segments of the same interview, once using Express Scribe and once using TranScribe. The two segments always appeared in the same order, but the order of the transcription applications was counterbalanced to minimise primacy and fatigue effects. Each transcription was timed as a measure of efficiency. Our previous experiments determined that users resort to the mouse when they cannot find a way to accomplish a task efficiently with the keyboard, which is why the number of mouse clicks was recorded for each condition as an indication of interrupted typing flow. Finally, a 7-point Likert scale questionnaire obtained subjective measures of the ease and enjoyment of using each application, of their hotkey placements, and of the participant's overall preference between the two (see Appendix A).
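The counterbalancing of application order can be sketched as follows. This is an illustrative assumption: the report states only that the orders were counterbalanced, not how they were allocated to the six participants, so the simple alternating assignment below is hypothetical.

```python
# Illustrative sketch of counterbalancing application order across
# participants. The alternating assignment is an assumption; the
# report does not state how orders were allocated.
APPS = ("Express Scribe", "TranScribe")

def counterbalanced_orders(n_participants):
    """Alternate the application order so each order is used equally often."""
    orders = []
    for p in range(n_participants):
        first, second = APPS if p % 2 == 0 else reversed(APPS)
        orders.append((first, second))
    return orders

orders = counterbalanced_orders(6)
print(orders)
```

With six participants, each of the two orders occurs three times, so any primacy or fatigue effect is balanced across conditions.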
Since TranScribe was created specifically to address problems encountered with Express Scribe, it was hypothesized that transcriptions done in TranScribe would be faster than those done in Express Scribe, that TranScribe would score higher on overall usability and shortcut key placement, and that it would be preferred by participants overall.

Apparatus
Experiments were conducted on a 15'' widescreen portable computer with a screen resolution of 1280x800, using an external keyboard and mouse. Green painter's tape was applied to the function keys that served as shortcuts in Express Scribe and TranScribe, and the keys were labeled appropriately. Participants were fitted with a headset, and the volume level was adjusted individually during the practice transcription. Both applications allowed audio playback at two speeds (50% and 100%) while preserving pitch, and had fine-grained rewind and fast forward functionality. They had different shortcut key maps (see Fig. 1). Transcriptions were timed to the nearest second with the stopwatch function of a portable media player, and the number of mouse clicks was counted by one of the experimenters.

Procedure
Participants were seated in front of the experimental PC in a quiet conference room. They were given an instructions form and asked to sign an informed consent form (see Appendix B). An experimenter described the steps of the experiment, and participants were introduced to the first transcription interface. They were allowed to familiarize themselves with the playback controls and the flow of the interface on a practice audio file from a different interview. They then completed the transcription. They were asked to identify the speakers as A and B, and to produce a transcription of good quality efficiently. The same steps were repeated for the other transcription application.

Figure 1. Express Scribe (above) and TranScribe (below) playback shortcuts.

RESULTS
Efficiency
The mean time to complete a transcription in Express Scribe was 8 minutes and 30 seconds; in TranScribe it was 8 minutes and 52 seconds. The difference was found to be non-significant using the t-test statistic (paired t(10) = 0.137, p > 0.89). Mouse click analysis was complicated by very sparse use of the mouse (generally fewer than 3 clicks per condition) and one large outlier. Nevertheless, means were analysed and the difference in mouse clicks was found to be non-significant (paired t(10) = 1.025, p = 0.330).

Subjective Data
Table 1 shows the median Likert scale values for each condition, and Table 2 the mean values. The differences were analysed using the t-test statistic.

Table 1. Median Likert scale (1-7) values of overall use and shortcut satisfaction.

Table 2. Mean Likert scale (1-7) values of overall use and shortcut satisfaction.
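The paired t statistic used for these comparisons can be reproduced with a short script. The per-participant times below are hypothetical placeholders, since only the condition means were reported, not the raw measurements.

```python
# Minimal sketch of the paired t-test applied to the timing data.
# The per-participant times are hypothetical placeholders; only the
# condition means were reported in the paper.
from math import sqrt
from statistics import mean, stdev

def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical transcription times in seconds (one pair per participant).
es_times = [505, 512, 498, 530, 520, 495]  # Express Scribe
ts_times = [520, 540, 500, 525, 560, 510]  # TranScribe

t_stat, df = paired_t(es_times, ts_times)
print(round(t_stat, 3), df)
```

A two-tailed p-value is then read from a t distribution with df degrees of freedom, for example via a statistics package if one is available.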
While the shortcut preference was not found to be significant (paired t(10) = 0.170, p = 0.868), participants did express a significant preference for TranScribe (m_TS = 4.833) over Express Scribe (m_ES = 3.833) overall (paired t(10) = 2.301, p < 0.05). The final question on the questionnaire asked participants to express a preference for one application or the other directly. It used a modified Likert scale (see Appendix A) to eliminate bias towards TranScribe. The mean and median preference reported was +0.5 towards TranScribe.

DISCUSSION
At the beginning of this project we conducted a usability analysis of Express Scribe which included measures of the amount of time participants took to transcribe a minute of an audio interview. While no direct comparison can be made to those ratios (partly because the interviews chosen in the first experiment were heavily British-accented, which would impact the overall time), we set out to create an application that would improve on the efficiency of Express Scribe through a more intuitive interface and better shortcut keys, which were the principal subjective complaints about the Express Scribe design. For this reason, additional features such as timestamp insertion, bookmarks and speaker identification were not tested in the functional prototype, even though they would impact the overall transcription time in situations where they would be necessary. We wanted to focus on discovering whether

TranScribe offered a significant efficiency improvement over Express Scribe in the areas of general interface design and shortcut key placement. Our chief measure of efficiency was time. As reported, TranScribe did not significantly improve transcription time. Moreover, mean times for TranScribe were in fact worse than for Express Scribe. This result is discouraging, and we must look at other factors to identify the reasons for it. Mouse clicks were initially believed to be an appropriate measure of interrupted keyboard flow, although the observed scarcity of mouse use diminished the experimenters' confidence in that measure. Its lack of significance could be seen as consistent with the main efficiency measure, or it could be the result of a poorly framed measurement. A much better indicator of interrupted keyboard flow, something we had hoped to minimise, was the reported usefulness of our shortcut map. In the usability assessment of Express Scribe, our evaluators reported issues with the shortcut keys, including separate play and pause buttons, counterintuitive placement, a lack of mapping between the on-screen button order and the shortcut key order, and the finger travel distance to the Function key row. Our design moved the shortcut keys to the number pad to address these issues and provided a closer mapping between the keyboard and the screen (see Figure 2). However, participants reported that the finger travel to the number pad was longer and more inconvenient than to the Function key row. Observations also indicated that participants moved their hands only as far as they had to, often pressing the number pad buttons with their pinky fingers, which are likely to be less frequently used and to have poorer dexterity.
The subjective satisfaction difference between the two shortcut maps was not significant, but it also favoured the Express Scribe placement, which leads us to conclude that, at best, our shortcut key placement is no better than Express Scribe's, and that we should use the Function key row in future iterations. However, when participants assessed the overall ease of use and enjoyment of the two products, they were significantly more satisfied with TranScribe's design. They also expressed a non-significant yet encouraging slight preference towards using our software despite the shortcut issues. We take this as an indication that TranScribe, even in its prototypical form devoid of additional labour-saving functions, is an improvement upon Express Scribe, at least in user satisfaction and enjoyment. It is possible that these effects would be amplified over longer periods of use. If the iteration timeline allowed, we would have liked to run another experiment comparing longer use of Express Scribe to a more full-featured version of TranScribe that also used the Function row keys.

Figure 2. TranScribe on-screen and keyboard audio controls.

It is also worth noting that, while the chi-square statistic is recommended over the t-test for use with Likert scales, it was deemed unsuitable for our data because it requires a minimum of five instances of each value to be properly applied [3].

CONCLUSION
While our experiment did not elevate our product to the status of the next revolutionary step in audio transcription, it did serve to validate several design decisions and to identify design problems such as our shortcut key placement. It is curious that this issue was mentioned in the heuristic evaluation but was deemed low severity by our evaluators. This demonstrates that some design issues require a quantitative experiment to be properly diagnosed and assessed for severity.
We are nevertheless concerned by the lack of significant improvement on most of our measures, and we are working on improving the user's efficiency to match their level of satisfaction with our product.

ACKNOWLEDGMENTS
We thank the six participants who volunteered for our experiment.

REFERENCES
1. Balsamiq Mockup Tool. http://www.balsamiq.com/
2. Gravetter, F. J. & Wallnau, L. B. Essentials of Statistics for the Behavioral Sciences. 5th ed. Wadsworth Publishing, 200
3. SPSS Techniques Series: Statistics on Likert Scale surveys. University of North Iowa. http://www.uni.edu/its/us/document/stats/spss2.html
4. Pandeliev, V. & Shi, H. User Requirements for Audio Transcription. CSC251 Assignment 1.

Appendix A: Prototype Screenshot

Appendix B: Instructions, Informed Consent and Questionnaire

Instructions: This study aims to determine the efficiency of a new user-improved transcription design. You will be asked to transcribe two one-minute sections from an interview, one using a commercially available transcription application and one using a newly designed prototype. Please try to be as efficient as possible for both transcriptions. Before each segment, a short practice session will familiarize you with each transcription product. Keyboard shortcuts are indicated in green on the keyboard. This study poses no significant physical, emotional or health risks to participants. Any data collected will be coded to preserve your anonymity, and you are free to withdraw at any point without penalty. Please sign below to indicate your consent to participate in this study.

Participant: Witness: Date:

Participant Number:
ES Order: Time: Clicks:
What did you think of using Express Scribe for transcription?
How did you find the Express Scribe hotkey placement?
TS Order: Time: Clicks:
What did you think of using TranScribe for transcription?
How did you find the TranScribe hotkey placement?
Which transcription software would you prefer to use?
3 2 1 0 1 2 3
Definitely ES / Either / Definitely TS